-
Notifications
You must be signed in to change notification settings - Fork 563
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Remove incomplete marker for job finished only after metadata is written #3197
Conversation
📝 WalkthroughWalkthroughThe changes in this pull request primarily affect the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot"s finite context window. It"s strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (1)
snakemake/persistence.py (1)
355-355
: Consider adding error handling around metadata operations.While the change fixes the immediate issue, consider these improvements:
- Wrap metadata operations in try-except to ensure incomplete marker is only removed if metadata is successfully written
- Consider making file operations async for better performance
Example implementation:
async def finished(self, job): if not self.dag.workflow.execution_settings.keep_metadata: # do not store metadata if not requested return + try: code = self._code(job.rule) # ... existing code ... for f in job.output: # ... existing code ... self._record( self._metadata_path, { # ... existing metadata ... }, f, ) + self._remove_incomplete_marker(job) + except Exception as e: + logger.error(f"Failed to write metadata: {e}") + raise - self._remove_incomplete_marker(job)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
snakemake/persistence.py
(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
snakemake/persistence.py (1)
Pattern **/*.py
: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the self
argument of methods.
Do not suggest type annotation of the cls
argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a return
statement.
🔇 Additional comments (1)
snakemake/persistence.py (1)
355-355
: LGTM! This change fixes the metadata recording issue.
Moving the _remove_incomplete_marker
call after metadata is written ensures that the job"s start time is properly preserved in the metadata. This is crucial for accurate runtime reporting in plots.
Fix: This PR adds back the removal of the incomplete marker file also when keep_metadata is False (i.e. --drop-metadata is active). The removal calls were merged and put at the start of the function in #3162, but this caused loss of starttime info in the metadata record. In #3197 this was fixed, but by moving the removal call to the end of the function, the incomplete tracker file was not removed anymore when --drop-metadata was active. This PR adds the call to the removal function back also for that case, and adds a note to inform why the code is structured this way. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Bug Fixes** - Improved handling of job metadata by ensuring the incomplete marker is removed only after successful metadata record creation, preserving job start time. - **Documentation** - Added clarifying comments to explain changes regarding the incomplete marker removal. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Johannes Köster <[email protected]>
🤖 I have created a release *beep* *boop* --- ## [8.25.4](v8.25.3...v8.25.4) (2024-11-27) ### Bug Fixes * clean env vars in apptainer ([#3199](#3199)) ([76d5329](76d5329)) * ensure that intermediate files in job groups do not cause spurious mtime errors when checking for consistency with output files ([#3220](#3220)) ([4ba2bdf](4ba2bdf)) * Remove incomplete marker also when drop-metadata is active ([#3215](#3215)) ([a4f2e5c](a4f2e5c)) * Remove incomplete marker for job finished only after metadata is written ([#3197](#3197)) ([6567e5f](6567e5f)) * Support versioned URLs in Asset class and fix missing versions in Snakemake report ([#3203](#3203)) ([f086f6c](f086f6c)) * update rust-script usage to recent version (v0.35.0) [#3183](#3183) ([#3208](#3208)) ([43885d7](43885d7)) ### Documentation * clarify continuously updated input section ([#3219](#3219)) ([72a6994](72a6994)) * Fix typo in CHANGELOG.md ([#3198](#3198)) ([0e445ed](0e445ed)) * refer to Merkle trees instead of "blockchain" in caching.rst ([#3216](#3216)) ([282e5d9](282e5d9)) * remove twitter in favor of bluesky and mastodon ([#3217](#3217)) ([231c6df](231c6df)) * use "dictionary" not "array" wording in config docs ([#3156](#3156)) ([17aed41](17aed41)) --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please). Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This PR makes Snakemake remove the incomplete marker for a job thats finished only after the metadata is written. Otherwise the jobs starttime for the report was missing in the metadata and caused problems in the plot for the runtimes of individual jobs.
Summary by CodeRabbit
Bug Fixes
Refactor
finished
method for better clarity and maintainability while retaining existing functionality.