Conversation

Contributor

@lhoupert lhoupert commented Dec 9, 2025

Main Changes:

  • Adds a new Python script scripts/query_stac.py that queries a STAC API to identify new items in a source collection within a configurable time window
  • The script checks if items have already been processed in a target collection and outputs a JSON list of items that need processing
  • Includes comprehensive unit tests (tests/unit/test_query_stac.py) with mock STAC clients covering various scenarios including error handling and edge cases
  • Adds minimal JSON fixtures for reproducible testing of source and target collections
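
The selection logic described above can be sketched roughly as follows. This is not the code from scripts/query_stac.py: the function names, the `window_hours` parameter, and the in-memory lists standing in for the source and target collections are all hypothetical, and a real run would fetch items from the STAC API instead.

```python
from datetime import datetime, timedelta, timezone


def parse_datetime(value: str) -> datetime:
    """Parse an RFC 3339 timestamp such as '2025-12-09T10:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def items_to_process(source_items, target_ids, end_time, window_hours):
    """Return source items inside the time window that are not in the target."""
    start_time = end_time - timedelta(hours=window_hours)
    selected = []
    for item in source_items:
        item_time = parse_datetime(item["properties"]["datetime"])
        in_window = start_time <= item_time <= end_time
        # Skip anything outside the window or already present in the target.
        if in_window and item["id"] not in target_ids:
            selected.append(
                {"id": item["id"], "datetime": item["properties"]["datetime"]}
            )
    return selected


# Hypothetical stand-ins for the source collection and the target item IDs.
source_items = [
    {"id": "item-a", "properties": {"datetime": "2025-12-09T10:00:00Z"}},
    {"id": "item-b", "properties": {"datetime": "2025-12-09T11:00:00Z"}},
    {"id": "item-c", "properties": {"datetime": "2025-12-01T00:00:00Z"}},
]
target_ids = {"item-a"}
end_time = datetime(2025, 12, 9, 12, 0, tzinfo=timezone.utc)

result = items_to_process(source_items, target_ids, end_time, window_hours=6)
```

With these sample values, only `item-b` is selected: `item-a` already exists in the target and `item-c` falls outside the six-hour window.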

@lhoupert lhoupert changed the title from "Feat add query stac for automate cron workflow" to "Feat: add query stac for automate cron workflow" Dec 9, 2025
@lhoupert lhoupert marked this pull request as ready for review December 9, 2025 15:27
Contributor

@emmanuelmathot emmanuelmathot left a comment

Good. Just use the logger.

@lhoupert
Contributor Author

Change done, except for:

    # Output ONLY JSON to stdout (for Argo withParam)
    sys.stdout.write(json.dumps(items_to_process))
    sys.stdout.flush()

I need this on standard output so that Argo can parse it here:
https://github.com/EOPF-Explorer/platform-deploy/blob/7e817c4451e625fcfdd48ee015a9816c8c6ab0af/workspaces/devseed-staging/data-pipeline/cronwf/eopf-explorer-cronwf-recent-data-processor.yaml#L85
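
To illustrate the split between log messages and the JSON payload, here is a minimal sketch. It assumes the logger is configured to write to stderr; the logger name, the `emit_result` helper, and its `out` parameter are hypothetical and only there to make the example testable.

```python
import io
import json
import logging
import sys

# Route log messages to stderr so they never mix with the JSON on stdout.
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logger = logging.getLogger("query_stac_sketch")  # hypothetical logger name


def emit_result(items, out=None):
    """Log a summary to stderr and write only the JSON list to the output stream."""
    if out is None:
        out = sys.stdout
    logger.info("Found %d item(s) to process", len(items))
    out.write(json.dumps(items))
    out.flush()


# Capture the output in a buffer to check that it is nothing but valid JSON.
buffer = io.StringIO()
emit_result([{"id": "item-b"}], out=buffer)
captured = buffer.getvalue()
```

Because the stream contains nothing but the serialized list, a consumer that reads stdout can pass it directly to a JSON parser.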

@lhoupert lhoupert merged commit 59f4151 into main Dec 10, 2025
2 checks passed
3 participants