Handle edge cases
* I’m not entirely sure, but there might be cases where `file_path` is a directory, in which case `file_size` would be `None`.
* The code didn’t cover the case where the entity already has a `fileSize` property. In that case, `file_size` would be `None` even though `file_path` is a file. I’m not entirely sure when this happens; probably when reingesting an existing file.

Both cases were caught by tests :)
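
The two edge cases can be reproduced in isolation. What follows is a minimal sketch, not the ingestors code: `StubEntity` is a hypothetical stand-in for the real entity object (only `has` and `add` are modeled), `record_file_size` is an illustrative helper name, and plain `pathlib.Path` is assumed in place of `ensure_path`.

```python
import tempfile
from pathlib import Path


class StubEntity:
    """Hypothetical stand-in for the entity; models only has() and add()."""

    def __init__(self):
        self.props = {}

    def has(self, name):
        return bool(self.props.get(name))

    def add(self, name, value):
        self.props.setdefault(name, []).append(value)


def record_file_size(file_path, entity):
    """Return the size in bytes for regular files, None otherwise.

    The size is computed whenever the path is a file, but it is only
    written to the entity if the entity has no fileSize value yet.
    """
    file_path = Path(file_path)
    file_size = None

    if file_path.is_file():
        file_size = file_path.stat().st_size  # size in bytes

    if file_size is not None and not entity.has("fileSize"):
        entity.add("fileSize", file_size)

    return file_size


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_bytes(b"hello")

    # Case 1: the path is a directory, so no size is recorded.
    dir_entity = StubEntity()
    dir_size = record_file_size(tmp, dir_entity)

    # Case 2: the entity already carries fileSize (possibly on reingest).
    # The size is still computed; the existing value is left alone.
    seen_entity = StubEntity()
    seen_entity.add("fileSize", 123)
    seen_size = record_file_size(sample, seen_entity)
```

Separating "compute the size" from "store the size" is what lets the second case return a real byte count while leaving the existing property untouched.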
tillprochaska committed Nov 14, 2023
1 parent 31fb41e commit a5ec577
Showing 1 changed file with 7 additions and 2 deletions.
9 changes: 7 additions & 2 deletions ingestors/manager.py
@@ -162,8 +162,11 @@ def ingest(self, file_path, entity, **kwargs):
         """Main execution step of an ingestor."""
         file_path = ensure_path(file_path)
         file_size = None
-        if file_path.is_file() and not entity.has("fileSize"):
+
+        if file_path.is_file():
             file_size = file_path.stat().st_size  # size in bytes
+
+        if file_size is not None and not entity.has("fileSize"):
             entity.add("fileSize", file_size)

         now = datetime.now()
@@ -187,7 +190,9 @@ def ingest(self, file_path, entity, **kwargs):

             INGEST_SUCCEEDED.labels(ingestor_name).inc()
             INGEST_DURATION.labels(ingestor_name).observe(duration)
-            INGEST_INGESTED_BYTES.labels(ingestor_name).inc(file_size)
+
+            if file_size is not None:
+                INGEST_INGESTED_BYTES.labels(ingestor_name).inc(file_size)

             entity.set("processingStatus", self.STATUS_SUCCESS)
         except ProcessingException as pexc:
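
The guard in the second hunk presumably exists because a byte counter cannot meaningfully be incremented by `None`. A minimal sketch of that pattern follows, using a hypothetical `StubCounter` in place of the real metric object; it only mimics an `inc(amount)` method that adds a number, and `track_ingested_bytes` is an illustrative name.

```python
class StubCounter:
    """Hypothetical stand-in for a metrics counter with an inc() method."""

    def __init__(self):
        self.value = 0

    def inc(self, amount=1):
        # Adding None to an int raises TypeError.
        self.value += amount


def track_ingested_bytes(counter, file_size):
    """Increment the counter only when a file size is known."""
    if file_size is not None:
        counter.inc(file_size)


ingested_bytes = StubCounter()
track_ingested_bytes(ingested_bytes, 2048)  # regular file: counted
track_ingested_bytes(ingested_bytes, None)  # no size known: skipped
```

Skipping the increment, rather than counting zero bytes, keeps the metric limited to inputs whose size was actually measured.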