Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

swc as a javascript parser #13425

Open
kdy1 opened this issue May 21, 2021 · 22 comments
Open

swc as a javascript parser #13425

kdy1 opened this issue May 21, 2021 · 22 comments

Comments

@kdy1
Copy link

kdy1 commented May 21, 2021

Feature request

Follow up of #13408 (reply in thread)

I'm author of the swc project and I'm looking for way to improve the performance of webpack.

Parsing something in javascript is a very expensive task, so I think using the parser of swc with libuv worker thread will improve performance a lot. (Also, it does not block js thread, so other codes can run while parsing.)

But currently, webpack does not allow overriding dependency analysis and it seems like I have to reimplement the whole JavascriptPlugin. If there's an easier way to improve performance, it would be great.

I'll create a standalone parser package if required.

What is the expected behavior?

Returning dependency graph from a loader.

What is motivation or use case for adding/changing the behavior?

Huge performance improvements.

How should this be implemented in your opinion?

By returning graph object or list from a loader.

Are you willing to work on this yourself?

Yes.

@alexander-akait
Copy link
Member

@kdy1 Let's start, does swc is using estree compatibility tree?

Because I think we can add hook(s) on parser stage so you can change acorn parser on swc. Or you can provide more information? Do you want avoid double parsing right?

@kdy1
Copy link
Author

kdy1 commented May 27, 2021

Parsing in javascript is slow, and blocks the js thread for a while, so I want to see if it's possible to offload parsing to nodejs thread pool using swc. I think double parsing with swc is not a big problem, as it the not block js thread.

But I'm not sure if it's estree compatible. It can emit the ast of babel, but I'm not sure if the babel ast is estree.

add hook(s) on parser stage so you can change acorn parser on swc.

If this is possible, it would make the job much easier
I was actually considering other things like forking JavascriptPlugin

@alexander-akait
Copy link
Member

But I'm not sure if it's estree compatible. It can emit the ast of babel, but I'm not sure if the babel ast is estree.

Should be yes (due https://github.com/estree/estree#estree-steering-committee, but need check), but here other problem webpack understand only stage 4, so if you return new/unknown AST node the result will be unpredictable.

Parsing in javascript is slow, and blocks the js thread for a while, so I want to see if it's possible to offload parsing to nodejs thread pool using swc. I think double parsing with swc is not a big problem, as it the not block js thread.

Can you describe it deeply? Do you want handle js parsring/code generation in multi threading? Maybe you can provide place where you need inject cwd?

@kdy1
Copy link
Author

kdy1 commented May 28, 2021

here other problem webpack understand only stage 4, so if you return new/unknown AST node the result will be unpredictable.

swc supports stage 3 and swc has compatibility transforms so it would not be a problem.

Can you describe it deeply? Do you want handle js parsring/code generation in multi threading? Maybe you can provide place where you need inject cwd?

What I want to do is reducing time used by the parser.
Using other threads is a kind of way to achieve it.

And by

I think double parsing with swc is not a big problem, as it the not block js thread.

I mean parsing twice would not affect performance greatly.
Parsing twice, once from the loader and once from the alternative JavascriptPlugin would result in acceptable performance.

@alexander-akait
Copy link
Member

What I want to do is reducing time used by the parser.
I mean parsing twice would not affect performance greatly.

But why do not use loader in this case? I'm a little confused, maybe you can show it using pseudo code on JavascriptPlugin?

@kdy1
Copy link
Author

kdy1 commented May 28, 2021

Oh, you are right.
Loader is not required at all in this case.

@webpack-bot
Copy link
Contributor

This issue had no activity for at least three months.

It's subject to automatic issue closing if there is no activity in the next 15 days.

@webpack-bot
Copy link
Contributor

Issue was closed because of inactivity.

If you think this is still a valid issue, please file a new issue with additional information.

@alexander-akait
Copy link
Member

@kdy1 still valid?

@kdy1
Copy link
Author

kdy1 commented Sep 12, 2021

@alexander-akait Of course, I'd be happy to make it faster.

@vankop vankop changed the title Allow returning dependency data from loader swc as a javascript parser Nov 17, 2021
@markjm
Copy link
Contributor

markjm commented Jan 15, 2022

For at least initial/demonstration purposes, I think tacking on to swc-loader makes sense here (otherwise, we would have to make a bunch of changes to asyncify the parser parsing - not impossible, but isnt necessarily blocking).

Assuming ast's are compatible (which small test indicates they arent, but they may be close)

The simplest way to get this working using swc-loader, IMO:

  1. Modify (either always, or by opt-in) the return type of @swc/core transform[Sync] to include an ast property (essentially just the output of swc.parse)
  2. Return the parsed AST property as the fourth argument in swc-loader callback { webpackAST: JSON.parse(output.ast)) }
    https://github.com/swc-project/swc-loader/blob/da457e55ac9bc8365e434de95d62324a3bf41cd2/src/index.js#L92
  3. Setting this "known" property will allow webpack to skip its built-in acorn parsing, so we are off to the races!

The main incompatible property set that stands out right now, though, is start/end/loc/range properties. I dont see an option to get the exact loc info from swc, and the start/end properties are encapsulated in a seperate span object
image

Additionally, webpack expects comment blocks to be parsed and available in the AST, which seems to also be missing from swc
@kdy1 lemme know if you need assistance from me, as I am also interested in seeing this happen

edit:i also have done some reading on the issues related to sending back ASTs and see the perf hits, so thats definitely also something to be considered, because the whole JavascriptParser class probabably cant be moved to native due to all the hooks usage

@kdy1
Copy link
Author

kdy1 commented Jan 15, 2022

Yeah actually I made rust module that can generate acorn AST from source code. I'll check how it performs using swc-loader first. Of course, it has start/end/loc/comments/etc..
We need benchmark, but definitely worth trying.

@mohsen1
Copy link
Contributor

mohsen1 commented Feb 2, 2022

@kdy1 as @markjm mentioned AST has to have comments embedded in it. Does you parser include comments too?

@kdy1
Copy link
Author

kdy1 commented Feb 3, 2022

@mohsen1 The swc codebase has rust code to create babel/acorn ast and of course those have comments stored in the struct.
Additionally, acorn version is designed to work with webpack (because next.js uses webpack)

I'll add a way to get webpack ast from node js. Filed swc-project/swc#3437

@kdy1
Copy link
Author

kdy1 commented Feb 3, 2022

I have one question. Should I add the API to @swc/core?
I'm worried about the size.

@swc/core is quite big because it has lots of code and even a wasm runtime.

@coderaiser
Copy link

@kdy1 @swc/core is great place, it already has parseSync, so why not :).

About comments, does it supported or not? I’m confused a bit 🤔 (swc-project/swc#2964 (comment))

@alexander-akait
Copy link
Member

@coderaiser You can don't worry, because if decide to do it (and most likely it will), it will be experiment, so you can report about any problems in swc

@kdy1
Copy link
Author

kdy1 commented Feb 9, 2022

It's supported and rust type definitions for acorn/babel AST store comments inline.
But I'm not sure about the binary size and I think I should be careful about adding an API.

@coderaiser
Copy link

@alexander-akait I also waiting for ability to use rust-based solution to produce estree or Babel AST.

I’m working on linter 🐊Putout that can fix everything it can find and on tool that converts swc AST to Babel and SWC AST has no comments support, I’m also working on tool that converts ESTree to Babel which does it’s job perfectly.

So, thank you, but I cannot not worry :), since it’s related to software I’m working on.

@kdy1 Here is one of the ways how API can look like:

const {parseSync} = require('@swc/core');
const ast = parseSync(`const hello = 'world'`, {
    type: ‘estree | babel | swc (default)
});

No need to use any experimental flags, such API is obvious and straight forward, and of course any bug will be reported to swc repo :).

Worry not about bundle size, if it will be less then one that TypeScript has it will be already amazing 😏. But practical usage is more useful then very long term planning.

@alexander-akait
Copy link
Member

@coderaiser Yep, it is not hard task, we need add new options to allow setup parser in webpack (and mark it as experimental so we can do breaking changes between versions) and add cross tests to check all our tests (I think we can catch some problems on early stage).

For for your interest we are working on rust based linter solution https://github.com/swc-project/swc/tree/main/crates/swc_ecma_lints, so if you want to be more faster and safely you can help ⭐

@coderaiser
Copy link

coderaiser commented Feb 9, 2022

For for your interest we are working on rust based linter solution https://github.com/swc-project/swc/tree/main/crates/swc_ecma_lints, so if you want to be more faster and safely you can help ⭐.

I am interested. I’m not as good in rust as in JavaScript 😅, but I have two questions:

🤷 Why so much code for such simple task, as warning about DebuggerStatement?

☝️49 Lines! Compare it to 7 lines of remove-debugger:

'use strict';


module.exports.report = () => 'Unexpected "debugger" statement';


module.exports.replace = () => ({
    debugger: '',
});

It reminded me the case with hasOwn, when 🐊Putout implementation had 6 lines vs 112 of ESLint 😏. So don’t tell me that this is because of Rust syntax 😸.

🤷 Where is the fix?

Look, I don’t think that program should tell me what to do, what it really should do: is does things 🤖! So ability to fix is mandatory for linter, that’s why 🐊Putout exists.

🎈Anyways…

…Anyways, I have something to suggest you: 🐊PutoutScript: JavaScript with superpowers: identifiers with additional meaning 🎉.

There is already similar concepts like one that ruleguard uses and a lot more.

@nyngwang
Copy link

Just a little question about swc: If I use swc-loader as a babel-loader replacement, what are the trade-offs excluding the compilation speed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Status: Done
Development

No branches or pull requests

8 participants