Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix jsx formatting #42671

Merged
merged 5 commits into from
Mar 13, 2021
Merged

Fix jsx formatting #42671

merged 5 commits into from
Mar 13, 2021

Conversation

armanio123
Copy link
Member

@armanio123 armanio123 commented Feb 5, 2021

Fixes #41925

This PR changes on how we scan JSX during formatting. It now limits the jsx text range to the last non-whitespace character (allows the formatting of </div> to be correct)

<div>|
    text|
</div>

Fixes the following issues:

  • Jsx not enclosed on parenthesis didn't format correctly. e.g.:
const a = 
    <div>
    foo
          </div>
  • Identifiers on the middle of jsx strings didn't format. e.g.:
const b = (
    <div>
  {     foo  }
          </div>
);
  • Last line on a jsx text didn't format. e.g.:
const c = (
    <div>
    foo
  {     foobar  }
  bar
          </div>
);

@typescript-bot typescript-bot added the For Uncommitted Bug PR for untriaged, rejected, closed or missing bug label Feb 5, 2021
@weswigham
Copy link
Member

Issue link in the OP links to another PR - typo?

@typescript-bot typescript-bot added For Milestone Bug PRs that fix a bug with a specific milestone and removed For Uncommitted Bug PR for untriaged, rejected, closed or missing bug labels Feb 5, 2021
@armanio123
Copy link
Member Author

Issue link in the OP links to another PR - typo?

Typo. Got it fixed now. Give it another try.

Comment on lines 762 to 770
const siblings = parent.getChildren(sourceFile);
const currentIndex = findIndex(siblings, arg => arg.pos === child.pos);
const previousNode = siblings[currentIndex - 1];
let tempNode: Node;
let previousNode: Node | undefined;
forEachChild(parent, childNode => {
if (childNode.pos === child.pos) {
previousNode = tempNode;
return true;
}
tempNode = childNode;
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this change for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It fixes a bug for when the JsxElement contains multiple nodes. It used to be that only the first node would be formatted.

In the scenario below only some text would be formatted before the fix:

const foo = <div>
             some text
                            {bar}
                    more text
</div>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I mean, I don't understand how the new code manages to do something different from the old code. They look like they should do the same thing, except that forEachChild doesn’t include non-node children (like punctuation tokens). But it seems like that doesn’t matter in your example.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Before it used to search for the siblings of JsxText using the getChildren method of the parent. This will result in not being able to find the correct sibling as is will be looking for JsxOpeningElement, JsxClosingElement and in this example a SyntaxList. Non of them is really a sibling.

Instead, I have changed it to use ForEachChild which now the search will include the missing JsxExpression and JsxText, thus being able to correctly format the source code.

Just for visualization
With getChildren:
image

With forEachChild:
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I didn’t realize getChildren() was returning a SyntaxList. That makes sense. Just as a small readability optimization, I would write it like this:

if (isJsxText(child)) {
    const range: TextRange = { pos: child.getStart(), end: child.getEnd() };
    if (range.pos !== range.end) { // don't indent zero-width jsx text
        const previousNode = child.parent.children[child.parent.children.indexOf(child) - 1] || child.parent.openingElement;

You can simplify a lot because you know that child is JsxText here.

Copy link
Member

@andrewbranch andrewbranch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so, I think this approach is generally correct. This PR basically fixes two very loosely related bugs at the same time:

  1. The crash in the linked bug, along with the whitespace not being trimmed from inside the braces in JsxExpressions, is the really bad bug here caused by the formatting scanner returning the completely wrong token kind. That was fixed by your change to shouldReScanJsxText and seems to be exactly the right fix.
  2. You’ve also fixed indentation of multiline JsxText. This seems unrelated to me, but I’m guessing the reason it’s in this PR is because fixing the bug in (1) caused an existing indentation test to fail.

If I understand correctly, the formatter basically has no mechanism to adjust the indentation within multiline tokens. It generally assumes that line breaks and indentation only fall between tokens, but that’s not true for JsxText, which can span multiple lines. So there are two ways you could approach this problem:

  1. Build some special logic into the formatter to split JsxText into lines and rewrite process indentation for each line.
  2. Make the scanner pretend that one single line JsxText token is actually a series of consecutive JsxText tokens separated by trivia. That way, the core formatter logic Just Works™.

You’ve chosen the second option here. I don’t really have a strong opinion on which way is better, though I think option (2) is definitely more creative and unexpected, therefore needs to be well-documented in comments. What I’m interested in is seeing if it’s possible to move this token-faking from the proper scanner into the formatting scanner. Would it be possible to let the scanner return something multiline, but then split it up by line in the formatting scanner after the fact? I’m just a little hesitant to mix this formatting concern into a function also used for normal scanning.

@armanio123
Copy link
Member Author

You’ve also fixed indentation of multiline JsxText. This seems unrelated to me, but I’m guessing the reason it’s in this PR is because fixing the bug in (1) caused an existing indentation test to fail.

It's exactly what happened. Once the crash was fixed indentation still didn't worked as expected, so I started fixing that.

Build some special logic into the formatter to split JsxText into lines and rewrite process indentation for each line.

That's not entirely true, as on indentMultilineCommentOrJsxText function there is already special logic for indenting each of the JsxText lines. Also, I tried this approach first, but I found it more challenging to change it for this scenario since it was not designed to behave like this.

Make the scanner pretend that one single line JsxText token is actually a series of consecutive JsxText tokens separated by trivia. That way, the core formatter logic Just Works™.

Only the last portion of the JsxText is considered trivia if it is a break line followed by whitespace. Not all of the tokens will behave like this due to whitespace restrictions on jsx text.
In this example the JsxClosingElement </div> will not contain any trivia:

const foo = <div>text with trailing with space                          </div>

Would it be possible to let the scanner return something multiline, but then split it up by line in the formatting scanner after the fact?

I consider this as well. I could let the scanner work as is and in the formatting scanner iterate back the startPos of the scanner until a non-whitespace is found. I choose not to go this path tho, as I wasn't sure of the rules of whitespace I need to follow. I can take a deeper look if you feel really strong about it, just let me know.

Working on doing a better job in documentation.

@andrewbranch
Copy link
Member

andrewbranch commented Feb 27, 2021

Make the scanner pretend that one single line JsxText token is actually a series of consecutive JsxText tokens separated by trivia. That way, the core formatter logic Just Works™.

Oops, I said this wrong. I meant

Make the scanner pretend that one multiline JsxText token is actually a series of consecutive JsxText tokens (one per line), separated by trivia.

E.g.

const foo = <div>
  one
    two
      three
  </div>

You want to treat one, two, and three as separate tokens in the formatting scanner, right? Just checking if I understood that correctly.

I can take a deeper look if you feel really strong about it, just let me know.

Working on doing a better job in documentation.

I actually think your code comments are good, they’re just hard to find because they’re two whole files away from the formatter. You wouldn’t really know in formatting.ts or formattingScanner.ts that you were going to create “synthesized” tokens, so to speak, unless you actually went into the scanner itself. This is part of the reason why I do think it’s worthwhile to at least investigate moving that logic out of the scanner.

Copy link
Member

@andrewbranch andrewbranch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, I love how much simpler this got. Nice work! A couple suggestions but overall good to go.

@@ -2242,7 2242,7 @@ namespace ts {
return token = SyntaxKind.QuestionToken;
}

function scanJsxToken(): JsxTokenSyntaxKind {
function scanJsxToken(isFormatting?: boolean): JsxTokenSyntaxKind {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest renaming and flipping this boolean (and the one in reScan) so it’s explicit about what it’s doing:

Suggested change
function scanJsxToken(isFormatting?: boolean): JsxTokenSyntaxKind {
function scanJsxToken(allowMultilineJsxText = true): JsxTokenSyntaxKind {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed.

@@ -2401,7 2395,7 @@ namespace ts {
if (text.charCodeAt(pos) === CharacterCodes.lineFeed) {
pos ;
}
// falls through
// falls through
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stray auto-format

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate this ones. Fixed.

@andrewbranch
Copy link
Member

API baselines need to be updated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
For Milestone Bug PRs that fix a bug with a specific milestone
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

TypeScript Server Error: "Debug Failure. False expression: Token end is child end"
5 participants