Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1533802 - Sticky nesting for C , Rust, JS r=kats #212

Merged
merged 10 commits into from
Jun 4, 2019

Conversation

asutherland
Copy link
Contributor

@asutherland asutherland commented Apr 26, 2019

This implements position: sticky display of lines for "source" analysis lines
for which we have a "nestingRange" available. Support for emitting
"nestingRange" is added to the C (clang), Rust, and JS analyzer logic.

However, the Rust analyzer changes are very limited because the "save-analysis"
data which forms the basis of the Rust analysis is extremely limited. It
effectively already is being provided with the result of an analysis pass which
has had all AST information stripped away. There did not seem to be any
readily available mechanism in the "RLS" to attempt to reconstitute an AST and
relate it to the analysis data. The best hack I was able to come up with was
to attempt to recursively build up a span covering all of the 'def' children of
a 'def' or an 'impl'. This means that nesting for rust is largely limited to
"struct" definitions and for inherent "impl" blocks, with the nestingRange only
covering up until the start of the last "fn" in the "impl".

@asutherland asutherland force-pushed the scope-nesting branch 2 times, most recently from 4590db2 to 84e59c0 Compare April 26, 2019 05:51
@asutherland asutherland force-pushed the scope-nesting branch 2 times, most recently from 91c55b6 to d4934e1 Compare May 8, 2019 20:41
@asutherland
Copy link
Contributor Author

So this seems to be sufficiently working for C /Rust/JS. As noted in the commit message (and which I've updated into the root comment), the rust stuff is pretty hacky but I didn't see any other immediately available options. I'll try and spin a new dev run now so I know how to do that.

@asutherland
Copy link
Contributor Author

And I'll stop force-pushing from here on out.

@asutherland asutherland changed the title Bug 1533802 - Prototype sticky nesting for C r=kats Bug 1533802 - Sticky nesting for C , Rust, JS r=kats May 8, 2019
Copy link
Contributor

@emilio emilio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I've mostly looked at the rust bits so far)

tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
@asutherland
Copy link
Contributor Author

Thanks for all the rust feedback! I'm still very much at the "cargo cult what I see in the existing code and randomly google for what I don't see", so I deeply appreciate the feedback. (Also, apologies for all the beginner mistakes. I have the Blandy/Orendorff book, but have clearly not gotten very far into it. Arguably it would have been wiser to take some time at the start of the rust indexing delve in order to level up my understanding and decrease my stumbling.)

Copy link
Contributor

@staktrace staktrace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good overall! I didn't review rust-indexer.rs in too much detail since it looks like Emilio already did. Most comments are just nits, I think the only functional ones are for merging the source lines properly, and gracefully handling missing source files.

clang-plugin/MozsearchIndexer.cpp Outdated Show resolved Hide resolved
clang-plugin/MozsearchIndexer.cpp Outdated Show resolved Hide resolved
clang-plugin/MozsearchIndexer.cpp Outdated Show resolved Hide resolved
clang-plugin/MozsearchIndexer.cpp Show resolved Hide resolved
docs/analysis.md Outdated Show resolved Hide resolved
tools/src/file_format/analysis.rs Show resolved Hide resolved
tools/src/format.rs Outdated Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Show resolved Hide resolved
tools/src/bin/rust-indexer.rs Outdated Show resolved Hide resolved
@asutherland
Copy link
Contributor Author

Okay, so now there are 2 parts: part 1 is running rustfmt on only the rust code I end up touching and part 2 is just my changes (with rustfmt run again afterwards). I'm going to try and do the dev.searchfox.org thing now (or get it started)>

@asutherland asutherland dismissed staktrace’s stale review May 16, 2019 11:50

I'm pretty sure I fixed everything here and I think all my blatant force-pushing confused the github UI. (Or the use of the atom editor to resolve items.)

…emilio

This implements position: sticky display of lines for "source" analysis lines
for which we have a "nestingRange" available.  Support for emitting
"nestingRange" is added to the C   (clang), Rust, and JS analyzer logic.

However, the Rust analyzer changes are very limited because the "save-analysis"
data which forms the basis of the Rust analysis is extremely limited.  It
effectively already is being provided with the result of an analysis pass which
has had all AST information stripped away.  There did not seem to be any
readily available mechanism in the "RLS" to attempt to reconstitute an AST and
relate it to the analysis data.  The best hack I was able to come up with was
to attempt to recursively build up a span covering all of the 'def' children of
a 'def' or an 'impl'.  This means that nesting for rust is largely limited to
"struct" definitions and for inherent "impl" blocks, with the nestingRange only
covering up until the start of the last "fn" in the "impl".
@staktrace
Copy link
Contributor

I'm going to try and do the dev.searchfox.org thing now (or get it started)

So sadly this didn't work because the update script that checks out the repo assumes your github clone is called "mozsearch" and not "searchfox"

$ tail update-log
  sudo install js /usr/local/bin
  sudo install libnspr4.so libplc4.so libplds4.so /usr/local/lib
  sudo ldconfig
  popd
~
  rm -rf mozsearch
  git clone -b scope-nesting https://github.com/asutherland/searchfox
Cloning into 'searchfox'...
  pushd mozsearch
./update.sh: line 44: pushd: mozsearch: No such file or directory

I'll shut down the indexer and file a bug to fix this.

@asutherland
Copy link
Contributor Author

Thanks for investigating this and fixing the underlying issue! Your strong ownership of and professionalism on searchfox (/ mozsearch ;) are inspiring.

When I refactored the nesting_range to point at the braces, I forgot to update
the formatting logic that decides when to push a symbol onto the nesting stack.
As a result, symbols would only be pushed when the opening brace was on the
same line as the symbol.  Now we only require that the opening brace doesn't
happen before the symbol.
@asutherland
Copy link
Contributor Author

@staktrace @emilio So there's a clean run up at https://dev.searchfox.org/ now where the C looks very good, the JS looks good[1], and the rust is better than nothing. It looks like my "impl" hack isn't working, so I'll iterate on that, potentially adding another commit and triggering another "dev" channel re-indexing when I do. I do think we'll need to pursue explicitly enhancing the save-analysis stuff to include significantly more AST info before the rust stuff is actually useful.

1: https://dev.searchfox.org/mozilla-central/source/browser/modules/Sanitizer.jsm#46 should have Sanitizer be sticky.

@staktrace
Copy link
Contributor

I just tried it out on some C code and it does indeed work pretty well. Then I went to nsTArray.h and found some weirdness... if you to e.g. https://dev.searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/xpcom/ds/nsTArray.h#1692 there's a whole stack of declarations that are still stuck even though they shouldn't be.

@staktrace
Copy link
Contributor

(Maybe because the function at https://dev.searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/xpcom/ds/nsTArray.h#2391 is declared and implemented separately in the same file and we're unioning the ranges?)

@emilio
Copy link
Contributor

emilio commented May 23, 2019

I see the sticky line half-under the search bar. The search bar is a bit higher than the CSS variable.

The hard-coded pixel value will also likely not play well with any kind of zoom.

…fix it.

This allows us to stop manually offsetting the "position: sticky" lines to
compensate for the header, which was not reliable using px (and I didn't want
to get into things like rem/etc. or dynamic sizing using JS).

The only real downside to this is that the scrollbar no longer goes to the top
of the content area, but I'm not sure that's a downside.
@asutherland
Copy link
Contributor Author

asutherland commented May 27, 2019

tl;dr: The nestingRange is being emitted for declarations when it should only be emitted for definitions and the emission isn't sanity checking that the range is in the same file. I think the union logic is probably right in general, but because we use the "loc" as the start of the region for nesting and only require that the nestingRange.start is >= loc, when we erroneously tie the nestingRange to a declaration even in the same file, we effectively end up unioning the ranges. (This was a change I made so that the nestingRange could point at the braces for cool feature creep purposes, which changed our constraints in format.rs.)

I will make sure we only emit nestingRange if the "source" we're emitting is the definition instance. I'll also add defensive sanity-checking that the loc and nestingRange are in the same file.

Investigation

Aside: Holy cow, so "wc -l nsTArray.h" for the analysis says there are 1,936,965 lines in there. Which makes sense because we have target entries for every permutation of T for nsTArray<T>.

Grepping that down to just "source" lines shows that we indeed then union all of those permutations into a single "source" which is very hard to scroll through. Installing "jq" and then doing grep "source" nsTArray.h | jq -c 'select(.nestingRange) | {loc, pretty, nestingRange}' | less provides a more sane experience.

Looking at {"loc":"00316:16-36","pretty":"function Gecko_ClearPODTArray","nestingRange":"1280:48-1287:0"} for example, it appears that the problem is that at line 316 we have a declaration in nsTArray.h of extern "C" void Gecko_ClearPODTArray(void* aArray, size_t aElementSize, size_t aElementAlign); but the definition is in GeckoBindings on lines 1279-1287, with the opening brace on 1280.

We already had range validation for PeekRange, so we can just reuse that for
NestingRange.  I ensured that the range validation protected us from the
cross-file glitch on its own before adding the definition check.
Update some documentation and the help.html to aid in this not happening again.
@asutherland
Copy link
Contributor Author

So the other variation I'm trying to deal with now is that nsTArray.h:1337

  template <class Item, typename ActualAlloc = Alloc>
  elem_type* ReplaceElementsAt(index_type aStart, size_type aCount,
                               const Item* aArray, size_type aArrayLen);

is turning into "syntax":"decl,def,function" from a simplified line of

{"loc":"01337:13-30","pretty":"function nsTArray_Impl::ReplaceElementsAt","syntax":"decl,def,function","nestingRange":"2207:18-2222:0"}

as produced via grep "source" nsTArray.h | jq -c 'select(.nestingRange) | {loc, pretty, syntax, nestingRange}' | less.

So presumably we're merging "source" entries that have syntax values of "decl,function" and "def,function". The clang indexer can't actually output a syntax of "decl,def,function" on its own that I'm aware of. I've been trying to re-create the problem in big_header.h after seeing the import list at the top of nsTArray.h and deciding adding nsTArray.h to the test repo was not going to be sufficiently straightforward.

Oh, and here's the direct clang ast-dump of the header (aka not through a build system):

| |-ClassTemplateDecl 0x24ab0f0 <line:1336:3, line:1339:7> line:1337:12 is_base_of
| | |-TemplateTypeParmDecl 0x24aaf68 <line:1336:12, col:21> col:21 referenced typename depth 0 index 0 _Base
| | |-TemplateTypeParmDecl 0x24aafd8 <col:28, col:37> col:37 referenced typename depth 0 index 1 _Derived
| | `-CXXRecordDecl 0x24ab068 <line:1337:5, line:1339:7> line:1337:12 struct is_base_of definition
| |   |-DefinitionData empty aggregate standard_layout trivially_copyable pod trivial literal has_constexpr_non_copy_move_ctor can_const_default_init
| |   | |-DefaultConstructor exists trivial constexpr needs_implicit defaulted_is_constexpr
| |   | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param
| |   | |-MoveConstructor exists simple trivial needs_implicit
| |   | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param
| |   | |-MoveAssignment exists simple trivial needs_implicit
| |   | `-Destructor simple irrelevant trivial needs_implicit
| |   `-CXXRecordDecl 0x24ab3d8 <col:5, col:12> col:12 implicit struct is_base_of

@asutherland
Copy link
Contributor Author

Okay, managed to reproduce the following non-merged analysis output:

{"loc":"00064:12-52","source":1,"nestingRange":"81:35-89:0","syntax":"def,function","pretty":"function WhatsYourVector_impl::forwardDeclaredTemplateThingInlinedBelow","sym":"_ZN20WhatsYourVector_impl40forwardDeclaredTemplateThingInlinedBelowEPKT_"}
{"loc":"00064:12-52","source":1,"syntax":"decl,function","pretty":"function WhatsYourVector_impl::forwardDeclaredTemplateThingInlinedBelow","sym":"_ZN20WhatsYourVector_impl40forwardDeclaredTemplateThingInlinedBelowEPKT_"}
{"loc":"00064:12-52","source":1,"syntax":"decl,function","pretty":"function WhatsYourVector_impl::forwardDeclaredTemplateThingInlinedBelow","sym":"_ZN20WhatsYourVector_implIN7outerNS5HumanE9CoolAllocE40forwardDeclaredTemplateThingInlinedBelowEPKT_"}
{"loc":"00064:12-52","source":1,"syntax":"decl,function","pretty":"function WhatsYourVector_impl::forwardDeclaredTemplateThingInlinedBelow","sym":"_ZN20WhatsYourVector_implIN7outerNS9SuperheroE9CoolAllocE40forwardDeclaredTemplateThingInlinedBelowEPKT_"}

From the following mess. Interestingly, the decl's are somewhat superfluous, although obviously the def is the unexpected, no good, very bad type definition:

class CoolAlloc {
public:
  static void ChewGumAndAllocateMemory() {
    // nom nom nom
  }
};

template <class T>
class JokeBase {
  int walkIntoBar() {
    int i = 0;
    return i--;
  }
};

template <class T, class Alloc>
class WhatsYourVector_impl
  : public JokeBase<T> {
public:
  typedef T* elem_type;

  elem_type mStorage;

  WhatsYourVector_impl(elem_type thing)
  : mStorage(thing) {
  }

  int forwardDeclaredThingInlinedBelow();

  template <class Item, typename ActualAlloc = Alloc>
  elem_type forwardDeclaredTemplateThingInlinedBelow(
    const Item* aThing);

};

template <class T, class Alloc>
int WhatsYourVector_impl<T, Alloc>::forwardDeclaredThingInlinedBelow() {
  int i = 0;
  i  ;
  i--;
  return i;
}


template <typename T, class Alloc>
template <class Item, typename ActualAlloc>
auto WhatsYourVector_impl<T, Alloc>::forwardDeclaredTemplateThingInlinedBelow(
  const Item* aThing) -> elem_type {
  int i = 0;
  if (aThing)  {
    i  ;
  }
  i--;
  ActualAlloc::ChewGumAndAllocateMemory();
  return mStorage;
}

template <class T>
class WhatsYourVector: public WhatsYourVector_impl<T, CoolAlloc> {
public:
  typedef T* elem_type;

  WhatsYourVector(elem_type thing)
  : WhatsYourVector_impl<T, CoolAlloc>(thing) {
  }

};

@asutherland
Copy link
Contributor Author

And here's what the AST excerpt for that looks like:

| | |-FunctionTemplateDecl 0x3a10240 <line:63:3, line:65:23> line:64:13 forwardDeclaredTemplateThingInlinedBelow
| | | |-TemplateTypeParmDecl 0x3a0ff10 <line:63:13, col:19> col:19 class depth 0 index 0 Item
| | | |-TemplateTypeParmDecl 0x3a0ff78 <col:25, col:48> col:34 typename depth 0 index 1 ActualAlloc
| | | | `-TemplateArgument type 'CoolAlloc':'CoolAlloc'
| | | |-CXXMethodDecl 0x3a101a8 <line:64:3, line:65:23> line:64:13 forwardDeclaredTemplateThingInlinedBelow 'WhatsYourVector_impl<outerNS::Superhero, CoolAlloc>::elem_type (const Item *)'
| | | | `-ParmVarDecl 0x3a100b8 <line:65:5, col:17> col:17 aThing 'const Item *'
| | | `-CXXMethodDecl 0x3a155b8 <line:80:1, line:89:1> line:64:13 used forwardDeclaredTemplateThingInlinedBelow 'WhatsYourVector_impl<outerNS::Superhero, CoolAlloc>::elem_type (const outerNS::Couch *)'
| | |   |-TemplateArgument type 'outerNS::Couch'
| | |   |-TemplateArgument type 'CoolAlloc'
| | |   |-ParmVarDecl 0x3a154c8 <line:65:5, col:17> col:17 used aThing 'const outerNS::Couch *'
| | |   `-CompoundStmt 0x3a1d980 <line:81:36, line:89:1>
| | |     |-DeclStmt 0x3a1d720 <line:82:3, col:12>
| | |     | `-VarDecl 0x3a1d6c0 <col:3, col:11> col:7 used i 'int' cinit
| | |     |   `-IntegerLiteral 0x3a04d60 <col:11> 'int' 0
| | |     |-IfStmt 0x3a1d7d8 <line:83:3, line:85:3>
| | |     | |-ImplicitCastExpr 0x3a1d770 <line:83:7> 'bool' <PointerToBoolean>
| | |     | | `-ImplicitCastExpr 0x3a1d758 <col:7> 'const outerNS::Couch *' <LValueToRValue>
| | |     | |   `-DeclRefExpr 0x3a1d738 <col:7> 'const outerNS::Couch *' lvalue ParmVar 0x3a154c8 'aThing' 'const outerNS::Couch *'
| | |     | `-CompoundStmt 0x3a1d7c0 <col:16, line:85:3>
| | |     |   `-UnaryOperator 0x3a1d7a8 <line:84:5, col:6> 'int' postfix '  '
| | |     |     `-DeclRefExpr 0x3a1d788 <col:5> 'int' lvalue Var 0x3a1d6c0 'i' 'int'
| | |     |-UnaryOperator 0x3a1d810 <line:86:3, col:4> 'int' postfix '--'
| | |     | `-DeclRefExpr 0x3a1d7f0 <col:3> 'int' lvalue Var 0x3a1d6c0 'i' 'int'
| | |     |-CallExpr 0x3a1d8f8 <line:87:3, col:41> 'void'
| | |     | `-ImplicitCastExpr 0x3a1d8e0 <col:3, col:16> 'void (*)()' <FunctionToPointerDecay>
| | |     |   `-DeclRefExpr 0x3a1d8b0 <col:3, col:16> 'void ()' lvalue CXXMethod 0x3a015f8 'ChewGumAndAllocateMemory' 'void ()'
| | |     `-ReturnStmt 0x3a1d970 <line:88:3, col:10>
| | |       `-ImplicitCastExpr 0x3a1d958 <col:10> 'WhatsYourVector_impl<outerNS::Superhero, CoolAlloc>::elem_type':'outerNS::Superhero *' <LValueToRValue>
| | |         `-MemberExpr 0x3a1d928 <col:10> 'WhatsYourVector_impl<outerNS::Superhero, CoolAlloc>::elem_type':'outerNS::Superhero *' lvalue ->mStorage 0x3a0fc30
| | |           `-CXXThisExpr 0x3a1d918 <col:10> 'WhatsYourVector_impl<outerNS::Superhero, CoolAlloc> *' this

compare with the AST for its non-template friend just above it:

| | |-CXXMethodDecl 0x3a0fe68 <line:61:3, col:40> col:7 forwardDeclaredThingInlinedBelow 'int ()'

… nesting purposes.

See discussion in the pull request, but in general the situation this addresses is:
- A template method is declared (on a templated class, not sure if that matters.)
- The actual body of the template method is outside the template class definition.
  While normal humans like myself think of that body as the point of definition...
- When the template gets instantiated, clang generates a CXXMethodDecl at the point
  of declaration that looks entirely like a definition.  (And all of the source
  locations reference the sane-human body location, which is semantically
  discontinuous from the location of the template method declaration.)
@asutherland
Copy link
Contributor Author

That seems to have done it locally, I'm starting the try server dance.

@asutherland
Copy link
Contributor Author

@staktrace @emilio Okay, the try server dance was successful. nsTArray.h now looks fine, in particular the example line :kats gave of https://dev.searchfox.org/mozilla-central/source/xpcom/ds/nsTArray.h#1692

I addressed the searchbar positioning issues by changing from using position: fixed to using flexbox and have tested it in Firefox on linux/os x/windows and it seems fine with that and zoom used.

Copy link
Contributor

@staktrace staktrace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly skimming the additional changes, and playing around with the dev deployment, everything looks good. I'm happy to merge this unless @emilio has comments

* positions for each nested sticky level, and it was hard to know for sure how
* big the search box would actually be without involving JS.
*
* So our solition is to use flexbox to create a non-scrolling fixed-header,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: solition

// - This is a definition AND
// - This isn't a template instantiation. Function templates'
// instantiations can end up as a definition with a Loc at their point
// of declaration but with the CompountStmt of the template's
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: CompountStmt

@asutherland
Copy link
Contributor Author

In preparation for landing, I've prepared https://phabricator.services.mozilla.com/D33701 to include the clang indexer changes.

While it would potentially be cool to uplift it to branches other than nightly, in practice, we don't have to because the only schema change was to add the nestingRange. No nestingRange means no nesting. We can uplift as appropriate, although if older branches don't use clang 8 maybe they will get sad.

@asutherland asutherland merged commit 4d95395 into mozsearch:master Jun 4, 2019
@asutherland asutherland deleted the scope-nesting branch June 4, 2019 21:41
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Jun 5, 2019
These are the changes from mozsearch/mozsearch#212
(including the typo fixes requested today.

Differential Revision: https://phabricator.services.mozilla.com/D33701

--HG--
extra : moz-landing-system : lando
xeonchen pushed a commit to xeonchen/gecko that referenced this pull request Jun 5, 2019
These are the changes from mozsearch/mozsearch#212
(including the typo fixes requested today.

Differential Revision: https://phabricator.services.mozilla.com/D33701
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this pull request Oct 4, 2019
These are the changes from mozsearch/mozsearch#212
(including the typo fixes requested today.

Differential Revision: https://phabricator.services.mozilla.com/D33701

UltraBlame original commit: 0342dc8a3781dbb73501802c83aa5cfdf4c548a2
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Oct 4, 2019
These are the changes from mozsearch/mozsearch#212
(including the typo fixes requested today.

Differential Revision: https://phabricator.services.mozilla.com/D33701

UltraBlame original commit: 0342dc8a3781dbb73501802c83aa5cfdf4c548a2
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Oct 4, 2019
These are the changes from mozsearch/mozsearch#212
(including the typo fixes requested today.

Differential Revision: https://phabricator.services.mozilla.com/D33701

UltraBlame original commit: 0342dc8a3781dbb73501802c83aa5cfdf4c548a2
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants