Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add command jj annotate/blame #4176

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

allonsy
Copy link
Collaborator

@allonsy allonsy commented Jul 29, 2024

This change adds a new annotate command (in git this is blame).
In order to annotate a file, call: jj annotate <FILE_PATH>.
In response, jj will traverse the commit graph, trying to find the source of the change for each line.
The output is something similar to this:

>>> jj annotate file.txt
tqqpzkzp bd94545f Alec Snyder 2024-07-29 18:39:13 1: this is an initial line
nqvnwumz b3a07c79 Alec Snyder 2024-07-29 18:39:35 2: this line is commit 1
wlpkktol 5f7de224 Alec Snyder 2024-07-29 18:40:01 3: this line is commit 2

I've also added a new module to jj_lib which exposes a function get_annotation_for_file which
returns a line by line evaluation of the the given file in a more general interface.

A few caveats

  • directories aren't supported
  • conflicted files aren't supported as I'm not exactly sure what to print out in these cases.

Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added tests to cover my changes

@allonsy
Copy link
Collaborator Author

allonsy commented Jul 29, 2024

As far as the test cases I've thrown at it, it matches with git blame but if anyone finds a counterexample, I'm happy to take a look and see where my bugs are

Copy link
Collaborator

@PhilipMetzger PhilipMetzger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a initial comment, it should also mention that this starts work on #2988.

CHANGELOG.md Show resolved Hide resolved
@@ -111,6 112,7 @@ To get started, see the tutorial at https://github.com/martinvonz/jj/blob/main/d
###### **Subcommands:**

* `abandon` — Abandon a revision
* `annotate` — Annotate a file

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you describe it without using the command name?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I followed the structure of abandon but I can modify it

Copy link
Collaborator

@ilyagr ilyagr Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For name ideas, credit is somewhat popular in #2988. I also like attribute a bit more than annotate if we want a neutral tone. Perhaps if there's a discussion about that, it can happen in that issue.

If people are not super-excited about these, annotate is a fine default choice for a name. I hope it will change later, but it doesn't have to.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've got a few options

  • annotate: I generally like it
  • attribute: also fine with this
  • credit: not my favorite, it doesn't really describe to me what it does
  • blame: just silly (leftover from git). No reason to be so negative in the command.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just thought of map, and then rereading attribute I thought of byline.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

map isn't bad actually. I'm starting to like it more and more.
Especially if we eventually add a TUI to this, it will feel like navigating a file map line by line.

Anyone else want to weigh in? I don't really feel strongly here about the name, but we should decide something before merging

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for annotate because it's familiar.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I vote for "annotate" because it's familiar.
I vote against "attribute" because it's too easy to read as a noun, and is (more?) likely to collide with other things users are thinking/talking about.

"Map" is interesting, but I'm not really feeling it. There's a whole class of words that are kind of like this in my mind: graph, log, history. They tend to be a bit overloaded in this context :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also prefer annotate. It's available in:

  • svn (also has blame and praise synonyms),
  • p4 (also has blame synonym, I think),
  • hg (also has blame synonym)
  • git (which uses a different output format from git blame, but fundamentally has the same behavior, I believe)

Probably others.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with annotate

lib/src/annotate.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@PhilipMetzger PhilipMetzger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a second review.

lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
Comment on lines 186 to 219
results: &mut PartialResults,
) -> Result<(), AnnotateError> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Why is this not fn process_commits(...) -> Result<PartialResults, AnnotateError)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can, but is there a specific reason for why you think it's better?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't linear control flow easier to read instead of having output parameters?

Copy link
Collaborator

@ilyagr ilyagr Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, it's not too bad to have & mut parameters if the function receives a partially-processed object and processes it further, but then it should be the first argument since modifying this object is the most important thing the function does.

Even in this case, I would often reach for a function that would take a PartialResults (not a reference) and return a PartialResults. But (perhaps unlike Philip), I don't see this as much better, I don't have a clear rule about how to choose. I think that would work here too, so I'm not against changing it, and perhaps I'm missing some reason it would be better.

Separately, re-designing the logic to decrease the need for such functions is a good thing, if possible. I'm not immediately sure if that could be done here.

A big no-no for me (which I think is not the case here) is if a function takes an empty object as &mut and uses &mut purely as an output.

lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
cli/tests/test_annotate_command.rs Outdated Show resolved Hide resolved
@ilyagr
Copy link
Collaborator

ilyagr commented Jul 29, 2024

conflicted files aren't supported as I'm not exactly sure what to print out in these cases.

I think that it should figure out which side of the conflict comes from which parent of the commit. Any sides that don't come from a parent (e.g. after a rebase) can be attributed as being added to the commit itself.

Infrastructure related to #4062 would be helpful here.

Of course, none of this is necessary for a first version of annotate functionality.

let original_content_lines: Vec<String> =
original_contents.lines().map(|s| s.unwrap()).collect();
let mut result_lines = Vec::new();
for (idx, line) in original_content_lines.into_iter().enumerate() {
Copy link
Collaborator

@ilyagr ilyagr Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might have missed something, but can't you just inline original_content_lines (and eliminate the vector entirely, just have ... .map(...).enumerate()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At a glance, the underlying design seems reasonable to me so far, but the code seems like it will be a bit difficult to extend. (I'm mostly interested in extending it to support conflicts and some sort of an absorb command, if we decide to do that using blame). I'm wondering what we can do to make it more extensible/maintainable/modular (a good sign of it becoming more modular would be if we could split this into several commits in a reasonable way; this seems hard to do as-is), but I don't immediately have a clear suggestion overall.

I'm not sure how much of this burden has to lie on you and this PR, and how much could be cleaned up over time later; I wonder what others think.

In any case, thank you for moving the needle on this feature!

lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
}

fn get_next_commit(&self) -> Option<CommitId> {
return self.local_line_map.keys().next().cloned();
Copy link
Collaborator

@yuja yuja Jul 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of picking one randomly, I think we should visit parents in topological order, and terminate if there are no more lines to look for.

Mercurial appears to do DFS-based walk.

EDIT: Perhaps, RevsetExpression::ancestors() can be used without changing the underlying annotation logic. I'll be nice if we can filter out uninteresting revisions, but there isn't a good API to map commit.parents() within the filtered graph. It's also unclear how copy-tracing will be integrated.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can convert this pretty easily into a DFS walk (I think). Right now, it acts kind of like of BFS search. The question is, is there a performance reason to do DFS in this case over the random choice I do now? I'm curious to know if a DFS approach terminates faster. Of course it probably depends on the graph and how the commits are. I'm guessing for both my approach and a DFS approach (and throw in a BFS approach for good measure), I can come up with examples that are bad and good for performance in each case. However, in the common cases, were graphs are fairly linear, what is the best approach performance-wise?

In terms of graph pruning, I tried to make a few optimizations. Namely, if the parent and child share the same file id, I immediately move on to the parent of the parent so we don't have to perform a diff.
I wonder if there are other ways to filter the graph even further.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think DFS vs BFS matters here. The thing we'll need to guarantee is topological ordering (all descendant lines should have been propagated to the current commit iiuic), and this property is guaranteed if we use revset API. It will also help terminate early.

BTW, jj annotate lib/src/repo.rs panics with the current implementation. It might be because of merge commits.

In terms of graph pruning, I tried to make a few optimizations. Namely, if the parent and child share the same file id, I immediately move on to the parent of the parent

The default index backend wouldn't have any data to optimize it further. (We might want to add bloom filter or something to optimize log PATH query.) I assume the Google one will have optimized path for file-oriented query.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch on the bug. Just pushed a fix.
It was when you have a file with a merge in its history where one side (of the merge) was ignored rather than integrated all changes.
I've also added a test case to catch this going forward

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Now I understand why random choice works here. Since forward_to_new_commit() removes matching lines from the old map, the same matching lines will never propagate to the other parents. So the first parent is always prioritized (= the order is stabilized.)

Still I think topological order is better for optimization. With the current implementation, ancestors of fork point can be visited more than once, where contents are fetched and diffed again.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I hesitated about using ancestors() since it most cases, we don't have to check every ancestor so we waste time computing the graph.
In the current implementation, it only goes as deep so as it needs and once it finds changes for every single line, it stops iterating.
I think where I'm unsatisfied with my implementation is, as you mentioned, some elements of non-determinism due to the get_next_commit function. In this case, if two parent commits commit the exact same change to a file, it is not deterministic which commit is taken as the source. This isn't necessarily incorrect, as in that case it is indeed ambiguous which commit is the source commit. However, it would be nice to be able to say which branch is always taken.
This non-determinism is solved by using variants to ancestors() and whatnot, if, perhaps, a little slower.

Overall, here is what I'm going to do: based on the comments here, I'm going to try out a few implementations mentioned here and benchmark them across a few different files and cases: Some small files, some large files, some that only need to go shallow in the graph, and some that need to go deep into the commit graph and I'll see how each implementation performs.

I'll try the current the implementation, one with RevsetAncestors, and with RevsetAncestors() filtered to the specific files (of course with iter_graph since it should be lazy).
I'll report back with the results

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, if two parent commits commit the exact same change to a file, it is not deterministic which commit is taken as the source.

Can you use the commit time?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally I hesitated about using ancestors() since it most cases, we don't have to check every ancestor so we waste time computing the graph.

Oh, you mean we can omit the second parent if all lines are propagated to the first parent? That's indeed correct. However, I think visiting ancestors (by using the index) is relatively cheap compared to diffing file contents multiple times. And it might avoid abstracting the whole annotation machinery by backends. As Martin said, the current implementation is unusable at Google scale.

file(PATH) query is slow with the current default index implementation, but I hope it will be improved later.

This isn't necessarily incorrect, as in that case it is indeed ambiguous which commit is the source commit. However, it would be nice to be able to say which branch is always taken.

It's also important for test stability.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, So I've uploaded a new commit with a revset implementation that uses ancestors a file filter. It's overall a little faster (maybe 10% faster) when we have to go deeper into the graph. For shallow graphs its a little more expensive but probably not noticeable and not really the cases we are trying to optimize for.
This new approach is stable in terms of traversing the graph and may be more readable.

@martinvonz how do you think this implementation would perform on large google-like repos? Is there something I'm missing in terms of revsetExpression optimizations?

Overall, what do we think of this approach? I made the changes in a separate commit so we can just choose which one we like better

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinvonz how do you think this implementation would perform on large google-like repos?

Looks good, thanks!

results.swap_full_commit_id(current_commit_id, parent_commit_id);
return Ok(());
}
let current_contents = get_file_contents(store, file_name, current_file_id)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: might be better to cache original content in local_line_map instead of fetching again.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean cache the contents of each commit still in local_line_map?
Or do you actually mean the original content (content of the working copy)?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cache the content of each commit. I assume it will be consumed soon (as long as the graph is linear.) Cache of the original content might also be good.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a file cache

@martinvonz
Copy link
Owner

Thanks for working on this! I have not looked at the implementation yet, but I'll just quickly note that I think a TUI for the functionality would be really useful. I quite often end up trying to find where the previous version of a line came from, and quite often I end up repeating that process several times. GitHub has a "Blame prior to change abc123..." button for each line (or span of lines) for that. It's probably not a big deal to redo the calculation when the user asks for that option, but ideally we could reuse some of the information we already calculated.

@allonsy
Copy link
Collaborator Author

allonsy commented Jul 30, 2024

Thanks for working on this! I have not looked at the implementation yet, but I'll just quickly note that I think a TUI for the functionality would be really useful. I quite often end up trying to find where the previous version of a line came from, and quite often I end up repeating that process several times. GitHub has a "Blame prior to change abc123..." button for each line (or span of lines) for that. It's probably not a big deal to redo the calculation when the user asks for that option, but ideally we could reuse some of the information we already calculated.

That's a nice idea. I'm hoping to merge this in and maybe in a future PR I can implement a TUI? Unless you think implementing a TUI would radically change things enough to not make it worth it to merge this in now?

@martinvonz
Copy link
Owner

I'm hoping to merge this in and maybe in a future PR I can implement a TUI?

Sounds good. (And I didn't mean to ask you to create a TUI, but sounds great if you're interested in doing that.)

Unless you think implementing a TUI would radically change things enough to not make it worth it to merge this in now?

Nope. I just mentioned it in case it would help any future work in this area.

@allonsy
Copy link
Collaborator Author

allonsy commented Jul 30, 2024

conflicted files aren't supported as I'm not exactly sure what to print out in these cases.

I think that it should figure out which side of the conflict comes from which parent of the commit. Any sides that don't come from a parent (e.g. after a rebase) can be attributed as being added to the commit itself.

Infrastructure related to #4062 would be helpful here.

Of course, none of this is necessary for a first version of annotate functionality.

I agree with this for conflicts, I initially tried this but gave up for the first iteration as currently the conflict writers (lib/src/conflict.rs) doesn't have the commit information. So I'd have to forward that through the callers. Definitely doable but a little out of scope for this PR I think

@allonsy allonsy force-pushed the annotate branch 3 times, most recently from 8090b5f to f62815a Compare July 30, 2024 18:40
@allonsy allonsy force-pushed the annotate branch 2 times, most recently from 684840d to ae48961 Compare August 6, 2024 16:55
Copy link
Owner

@martinvonz martinvonz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not done reviewing yet, but here are my comments so far

CHANGELOG.md Outdated Show resolved Hide resolved
cli/src/commands/annotate.rs Outdated Show resolved Hide resolved
cli/src/commands/annotate.rs Outdated Show resolved Hide resolved
cli/src/commands/annotate.rs Outdated Show resolved Hide resolved
cli/src/commands/annotate.rs Outdated Show resolved Hide resolved
cli/src/commands/annotate.rs Outdated Show resolved Hide resolved
cli/src/commands/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@PhilipMetzger PhilipMetzger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG to me after this round, but you should wait for a final approval from either Martin or Yuya.

Copy link
Collaborator

@PhilipMetzger PhilipMetzger Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional: Use the affected component format for the commit, e.g cli: Add `jj annotate/blame` and mention begins work on #2988 either in the PR or the commit.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good reminder

@@ -69,6 70,7 @@ use crate::ui::Ui;
#[derive(clap::Parser, Clone, Debug)]
enum Command {
Abandon(abandon::AbandonArgs),
Annotate(annotate::AnnotateArgs),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martinvonz Does annotate deserve the builtin alias for blame? Or should we just use the config file approach here.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't personally mind the name blame, but I know that some people do, and that's the main reason we made annotate the canonical name. If we include a blame alias by default, I think it's likely that most people will use that instead. So, if we're trying to nudge people to use annotate instead, it seems counterproductive to include the alias. But I don't care strongly either way.

Copy link
Collaborator

@PhilipMetzger PhilipMetzger Aug 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then we should update the suggested aliases in the docs to mention it but not add a core alias here, if that works for everyone.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the canonical name, annotate might be a subcommand of file.
#2988 (comment)

(I would probably add jj annotate alias just like jj cat, though.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So are we set on moving this command to jj file annotate with an alias of jj annotate?

lib/src/annotate.rs Outdated Show resolved Hide resolved
Comment on lines 114 to 124
let old_map = self.local_line_map.get_mut(old_commit_id);
if old_map.is_none() {
return;
}
let old_map = old_map.unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all such patterns in this commit

Suggested change
let old_map = self.local_line_map.get_mut(old_commit_id);
if old_map.is_none() {
return;
}
let old_map = old_map.unwrap();
if let Some(old_map) = self.local_line_map.get_mut(old_commit_id) {
//...
} else {
return;
}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That means that I have to wrap the whole rest of the function inside the if-let. Is that preferred?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its easier to read, but others should chime in.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer it mostly because it avoids the .unwrap() calls

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would use let Some(old_map) = .. else { return; }; here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the code to now use let Some. I think it looks better now actually, thanks for the suggestions

lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
@allonsy allonsy force-pushed the annotate branch 4 times, most recently from b446272 to 4c5831a Compare August 7, 2024 17:15
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: revert unintended formatting changes?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1

Also, the section about annotate now needs to be moved up into the Unreleased section.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated. I also changed my editor to not automatically format that file (source of the issue) going forward.

cli/src/commands/annotate.rs Outdated Show resolved Hide resolved
Comment on lines 81 to 84
let annotate_commit_summary_text = settings
.config()
.get_string("templates.annotate_commit_summary")?;
let template = workspace_command.parse_commit_template(&annotate_commit_summary_text)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: perhaps, template parsing can be moved to caller so the error will be reported earlier.

It will also help when we add --template argument.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

cli/src/commands/annotate.rs Outdated Show resolved Hide resolved
cli/src/commands/mod.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
for hunk in diff.hunks() {
match hunk {
DiffHunk::Matching(common_string) => {
for _ in common_string.lines() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: .split_inclusive(|b| *b == b'\n') for consistency

lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
lib/src/annotate.rs Outdated Show resolved Hide resolved
@allonsy allonsy force-pushed the annotate branch 2 times, most recently from 176b802 to 6eaff71 Compare August 13, 2024 18:11
if file_value.is_absent() {
return Err(user_error(format!("No such path: {ui_path}")));
}
if let Some(TreeValue::Tree(_)) = file_value.as_normal() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe we can use file_value.is_tree() here? It's not exactly the same, but I think it's okay to start annotation from conflicts.

return Err(user_error(format!("No such path: {ui_path}")));
}
if let Some(TreeValue::Tree(_)) = file_value.as_normal() {
return Err(user_error(format!("file path is not a file: {ui_path}")));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: capitalize error message. Copying from jj file show, it could be rephrased as "Path exists but is not a file"

fn render_annotations(
repo: &dyn Repo,
ui: &mut Ui,
template_render: TemplateRenderer<Commit>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: pass by reference

Comment on lines 41 to 51
/// the requested file path was not found
#[error("Unable to locate file: {0}")]
FileNotFound(String),
/// the file type is incorrect. Usually a directory was given but a regular
/// file is required
#[error("File {0} must be a regular file, not a directory")]
UnsupportedFileType(String),
/// the file is in a conflicted state and can therefore not be annotated
/// properly
#[error("File {0} is conflicted at commit: {1}")]
Conflicted(String, String),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: UnsupportedFileType and Conflicted are no longer used. Maybe FileNotFound can also replaced with an empty result.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the unused errors. I'm hesitant about removing FileNotFound because this is a library function.
From the UI, it already checks that the file exists, but, if other functions call this with a non-existent file, I think it makes sense to throw an error rather than empty results right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Annotation process stops when the file doesn't exist (or replaced by directory) in ancestor tree, so I think it's consistent that annotation of absent file stops immediately.

We can of course special case the starting path, though. I don't have a strong opinion about that.

Comment on lines 269 to 288
for parent_edge in edges {
if parent_edge.edge_type != GraphEdgeType::Missing {
let parent_commit = repo.store().get_commit(&parent_edge.target)?;
let parent_tree = parent_commit.tree()?;
let parent_file_id = get_file_id(parent_commit.id(), &parent_tree, file_name)?;

if let Some(parent_file_id) = parent_file_id {
process_file_ids(
results,
repo.store(),
file_name,
&current_file_id,
current_commit_id,
&parent_file_id,
parent_commit.id(),
)?;
}
}
}
results.drain_remaining_for_commit_id(current_commit_id);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case this comment was buried, the intent is to remove local_line_map.get_mut(old_commit_id)/local_line_map.get_mut(new_commit_id) hash map lookup inside the lines loop.

A new module is added to jj_lib which exposes a function
get_annotation_for_file. This annotates the given file line by line with
commit information according to the commit that made the most recent
change to the line.
Similarly, a new command is added to the CLI called `jj annotate` which
accepts a file path. It then prints out line by line the commit
information for the line and the line itself. Specific commit
information can be configured via the templates.annotate_commit_summary
config variable
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants