-
Notifications
You must be signed in to change notification settings - Fork 505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cached recipes #1906
base: master
Are you sure you want to change the base?
Cached recipes #1906
Conversation
Nice! Some feedback:
|
This looks awesome! I still think it could be better to have an explicit "rerun recipe if the value of this thing changes" function so the user can determine what is important, rather than having Don't think this would be too bad if the indicator function also returns the value, so you could do: config:
# There is probably a snappier name than rerun_if_changed
echo "running for configuration {{ rerun_if_changed(env("PROJECT_CONFIG")) }}"
# ... |
This is definitely an interesting idea, but I'd like to explore how things look without explicit annotations. If it works well without having to annotate anything, that would be ideal. |
Having a call to I also disallowed backtick interpolation because currently the command would be executed twice (one of the backlog problems). Even when it is evaluated once, it won't be run in the same execution order as
The structure itself is After this comment, I'll make it so there is a unique cache file per justfile path and working directory combo. The filename on my system is
I'm actually using the filename customization to not pollute the contributor's workspace during tests, so I'll leave it for now. |
I think it's fine to just not detect this and the recipe doesn't get cached.
Let's punt on linewise recipes and only allow
I think we should aim for an initial very simple version which has shortcomings and then fix those one by one. For now, what do you think about just having a single My thinking is that:
We can initially land this feature behind |
Alrighty. I'm just trying to prevent open issues like "
I am very opposed to this. While it sounds good in principle to be able to read the previous cached recipes, most of the time the hash is gonna change a lot because the recipe contains That said, I would not mind if the exact format changed somewhat. For example, I wouldn't mind having the cache file/directory in the working directory instead of the user's general cache directory.
The recipe name and/or justfile path should also be included in the cache input. If we do put the cache file/directory in the working directory, that would not need to be part of the input. I can't think of anything else.
👍 |
Don't forget to put version check into cache. Eventually it will be version 2 of the cache, and someone will run older just with newer cache. Expected behavior: either invalidate cache, or refuse to run. The main issue is that it's impossible to add behavior to older versions (they are old by definition), so safeguard should be from from the very beginning. |
This reverts commit a649e1d.
@casey Two quick things:
(Still need to handle recipe dependencies and add docs, but this should be done other than that) |
@amarao I decided to not include an explicit version in the cache format to simplify things and instead let the schema be the version. Obviously, we still handle outdated/invalid schemas by invalidating the cache. Besides, running |
I still have the two previous questions that need to be answered, but it's ready besides that. Whenever you have time for this full-time unpaid job, of course :) One of those questions is about double evaluation, but that feels much easier to handle in the context of how The only other thing to consider (that I know about) is that this might heavily conflict with #1562, however, that seems to be paused at the moment due to the author's lack of time. |
I think I'd rather constrain it to shebang recipes at the moment. Linewise recipes seem much more complicated. One issue is that linewise recipes are evaluated line-by-line, so an error in an early line will prevent a later line from running. If a later line relies on an earlier line, for example because it creates a file that is read in an expression in a later line, then evaluating it ahead of time would produce an error. We could, in the case of cached recipes, evaluate linewise recipe lines ahead of time, and only re-run the recipe if no lines have changed, which is probably what we want to do, but would be a larger change, and I'd rather get this PR in sooner rather than later.
Ultimately, I think I'd like to use redb an embedded pure-rust key/value store, for the cache. I'm happy to do that myself though, so whatever format is easiest for you for now and easiest to review is fine. Since
I'm not sure I understand this point. Can't recipes be evaluated only once, before running, and the evaluated recipe body used both for cache checks and execution? |
There are some conflicts, but they shouldn't be too bad to resolve. Let me know when this is ready for review! |
Converted to a draft just so I can keep track of what's ready to go. Feel free to undraft! |
Sounds good! I recently changed jobs, but I'll try to get back to this this week.
True, but that's only if the user is using backtick expressions to mutate things "during compile time" which should be discouraged. This is one reason I wanted to disallow backtick expressions initially until In any case, I agree that cached line-by-line recipes should have the same semantics as shebang recipes (all expressions evaluated at start before passed to shell). Ideally, cached line-by-line recipes would run exactly the same way as cached shebang recipes, but you can still reap the benefits of changing line-echos with
I initially thought I would do two evaluation runs to implement EDIT: So much for that deadline, haha. I think I'll have time to look at this again soon. |
Closes #867, #1861, cc #1685
Putting this out there to get some preliminary feedback, ask questions, and take advantage of self-inflicted peer pressure 😁
What's in this PR?
Adds a
[cached]
recipe attribute that marks that recipe as cacheable. A cacheable recipe only reruns when the body of the recipe changes after variables and other{{interpolations}}
are evaluated. See examples in #867.TODOs
--no-cache
to force run cached recipes.--delete-cache
to delete the cache file for this project.sha256_file()
andblake3_file()
to take directories and/or globs, or add a more explicitcache_files()
/hash_files()
with this functionality.cache_env_vars()
to allow caching by a glob of different environment variables, e.g.CARGO_.*
.cache_ignore()
to ignore certain content while caching that will always/often change such asuuid()
andjust_pid()
.[cached("expression")]
that 'stratify' a recipe so you can cache a recipe by independent targets.Decisions, decisions... Edge cases and questions
uuid()
andjust_pid()
, are there other impure interpolations that you can think of off the top of your head? I've already scanned the README, but wanted to double check.&& after
) do not need to be cached and do not run if the recipe matches the last run. I don't think they need to run because they are typically clean-up jobs and are only needed if the recipe actually ran. If the user wants them to run every time, they should set up a separate recipe that depends on the main recipe and subsequent in that order.