Proposal: Detect ESM syntax in every ambiguous file #50064
Comments
I'd like to add a Pro:
Per general discussion: if this is considered an error recovery mechanism rather than a discrete choice by the module author, swapping the default type to ESM is likely not an issue, as these files would still fall into the error category. This would also make it sensible to have the ability to warn upon encountering this, and to allow fixing in an iterative manner. If this is framed as an explicit design choice for userland authors rather than a recovery, it would be something authors see as reliable and would likely never get warnings about. I'd generally like to avoid that framing, so that a swap-over and/or removal of the ambiguity is seen as a good thing.
I would also note that, in DX terms, parsing files a second time is cheaper than restarting a process and forcing updates to the code before doing so.
Performance impacts are likely something that could also be left to package authors; generally, authors do like their libraries to be more performant. Additionally, this impact occurs only once, during startup. The errors from import would generally be near the top of the file, bailing out of the parse early as well. The impact is potentially lessened further if we can reuse the code cache between the parses.
Something we would need to decide is whether we’d want to print a warning on double parse. Something like “Node.js tried to run …
I think we wouldn’t want to warn on dependencies, as it would be annoying to see warnings triggered by third-party code. We could use the “under …
The benefit here is that the user is made aware that they’re running their code more slowly than it needs to be run, and how to fix it. The downside is that this is another bit of friction for new users; it makes it seem like they’re doing something wrong (which in a sense, they are), which means that any ESM-first beginner tutorials would need to address this. Also, is Node really “ESM-first” if it prints a warning when you don’t opt into ESM?
We should also do a benchmark to see just how much of a penalty this is. If it’s trivial, or if we can find a way to make it trivial, maybe it’s not worth warning about.
This comment was marked as resolved.
I think this proposal would slow things down for the average user, as well as print a warning for everybody, since we all have quite old dependencies that are unlikely to get an update.
The warning isn’t part of the proposal per se; we don’t necessarily need to print any warnings at all, for anyone. It was just an idea of something that we might additionally want to consider. I wouldn’t want any warning that was triggered by dependencies; it would need to somehow be scoped to user code.
As for performance, we don’t know until we test. By definition this new mode would do detection only for files that currently error, so no app that works today would get any slower. The question is how slow this “ESM by detection” mode is, compared against the baseline of “explicit ESM” (i.e. …
One last thing about performance: module loading time generally only affects initial startup time, for the typical app that isn’t doing dynamic …
These should remain unaffected by this specific proposal, since it only affects a well-known error condition and doesn't propagate results from one file to others.
I don't think we need a WASM module for the proposed use case. V8 should be fast enough because we are looking for early SyntaxErrors, instead of a particular legal way of property assignment. I doubt a WASM dependency can be faster than V8 in this regard; the V8 parser (or, to be precise, the V8 preparser that's used for compiling top-level code) is already built for looking for early syntax errors quickly. It's also likely that these files would have an …
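One rough way to check the "import is usually at the top, so the parse bails out early" intuition is to time failed CommonJS compiles directly. The sketch below is only an illustration under stated assumptions: `timeFailedCompile` and the synthetic sources are hypothetical names made up here, `vm.compileFunction` with the usual CommonJS wrapper parameters stands in for the real loader, and a loop like this is not a substitute for a proper benchmark of Node's startup path.

```javascript
'use strict';
const vm = require('node:vm');

// Parameter names of the CommonJS module wrapper.
const CJS_PARAMS = ['exports', 'require', 'module', '__filename', '__dirname'];

// Hypothetical helper: compile `source` as CommonJS `iterations` times and
// report how many attempts failed plus the mean time per attempt.
function timeFailedCompile(source, iterations = 100) {
  let failures = 0;
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) {
    try {
      vm.compileFunction(source, CJS_PARAMS);
    } catch {
      failures++; // the SyntaxError that would trigger the ESM retry
    }
  }
  const elapsedNs = Number(process.hrtime.bigint() - start);
  return { failures, meanMicros: elapsedNs / iterations / 1000 };
}

// Synthetic module bodies: many small functions, with the import statement
// placed either first or last.
const filler = Array.from(
  { length: 5000 },
  (_, i) => `function f${i}() { return ${i} + 1; }`
).join('\n');
const importAtTop = `import fs from 'node:fs';\n${filler}\n`;
const importAtBottom = `${filler}\nimport fs from 'node:fs';\n`;

const top = timeFailedCompile(importAtTop);
const bottom = timeFailedCompile(importAtBottom);
console.log('import at top   :', top);
console.log('import at bottom:', bottom);
```

Comparing the two `meanMicros` values on a given machine gives a first impression of how much the position of the offending statement matters; the actual numbers will vary by Node version and hardware.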
I think having slightly slow ambiguous ESM is better than no ambiguous ESM, if it's a fallback like this which should not impact CJS performance unless it fails with a SyntaxError. It gives us a path to be able to drop CJS someday in the future, if that possibility ever comes. (I personally have my doubts we will ever be able to drop CJS, but I see no harm in setting ourselves up to be able to if it ever becomes a reasonable possibility.)
I expect the error to be super early most of the time, as import syntax is usually on top; I'm not sure how much of a concrete slowdown it is. I also would expect no warnings unless a flag is set … when such a flag is set I’d expect all warnings, including …
This has the same goal of informing users; it is not clear that they would benefit from such a warning only for their own entry point of choice, since they could benefit as well for their dependency tree, imho. Other than this, I’d love to see this happening ❤️
Why is that? (I think I know but I'd like to see it explained explicitly.)
I wonder how this would interact with the customisation hooks. It seems to me that it should happen when two conditions are present:
Because you can't require an ES module.
Not if the parent module is ESM; if the module in question was imported. A CommonJS module can …
I think realistically we should maybe have a new …
The explicit hints via extension or type in package.json are the opt-out already, right?
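To make "explicit hints are the opt-out" concrete, here is a minimal sketch of how a file could be classified as ambiguous. It is an assumption-laden illustration, not Node's resolution code: `nearestPackageJson` and `isAmbiguous` are hypothetical helper names, the lookup simply walks up parent directories, and CLI flags such as `--input-type` and `--experimental-default-type` are not modeled.

```javascript
'use strict';
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: walk up from the file's directory and return the path
// of the nearest package.json, or null if none exists up to the root.
function nearestPackageJson(filePath) {
  let dir = path.dirname(path.resolve(filePath));
  while (true) {
    const candidate = path.join(dir, 'package.json');
    if (fs.existsSync(candidate)) return candidate;
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}

// A file is treated as ambiguous here when nothing explicit says which module
// system it uses: no .mjs/.cjs extension, and either no package.json at all
// or a nearest package.json without a "type" field.
function isAmbiguous(filePath) {
  const ext = path.extname(filePath);
  if (ext === '.mjs' || ext === '.cjs') return false;
  const pkgPath = nearestPackageJson(filePath);
  if (pkgPath === null) return true;
  const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
  return pkg.type === undefined;
}
```

Under this sketch, adding `"type": "commonjs"` or `"type": "module"` to the nearest package.json, or renaming the file to `.cjs` or `.mjs`, is enough to take it out of the detection path.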
I meant how could the hook author force detection to happen or not happen, not how could the application or library author opt out. Generally customization hooks should be able to do whatever overrides or custom behaviors the hook author desires.
My point is that if a dependency uses dynamic import you can’t add a new format down that road … I think this feature, enabled by default, with a flag that shows warnings at all levels if desired, would both solve concerns and help developers move forward. If there is a default that acts differently in node modules, with a new format to add at runtime, the branching becomes rather unbearable and it will be harder to understand anything happening behind the scenes, imho.
I’m not sure what this means. The “fallback to evaluate as ESM” only occurs on syntaxes that currently error in CommonJS. Since …
This sounds like a separate feature that could be implemented as a follow-up. It’s also probably achievable via customization hooks.
The reason this proposal includes dependencies is to avoid the hazard where a library works for its author, because the author is working in “ambiguous detection” mode, but then the library wouldn’t work for any consumers who install it if the detection stops at the …
tl;dr - I'm in favor of this proposal pretty much exactly how @GeoffreyBooth has written it here, and rejecting the suggestion I made above.
Good point, I was just thinking the same thing not long after posting that 😅 I think a better way to refine the suggestion (which I'm thinking I'm actually not advocating, but it's worth exploring to see why it's a bad idea) is that the setting of default type should be package-specific rather than module-specific. Ie:
More poking at this: What happens in this case?
// esm
import b from './bar.js'
// esm
export * from './baz.js'
Would the autodetection happen at the entry point, then evaluate all the other modules as ESM? Or is it going to autodetect each module, and ultimately treat them as if they were named …
It seems like if the state of the entry point sets the default …
If it was package-specific, then it would at least behave the same way locally as when installed. But, for example in the case above, …
I was initially leaning against this proposal in favor of #50043 because I was assuming that a) there’s no way this version could be performant, and b) there surely must be edge cases that would make this approach nonviable. I’ve since been persuaded that maybe this could be performant, or at least there’s no obvious reason why it couldn’t be, so it’s probably worth the effort of attempting an implementation to measure how fast or slow it is. And I keep trying to think of edge cases, and start writing out scenarios like “but what about …
The closest I can come to a case that breaks is the ambiguous-syntax file with no …
It's not nearly as rare as you'd think; there are a ton of heavily used packages from over a decade ago that only work in sloppy mode, many of which have authors who are dead or burnt out and retired, and thus will never be updated. If one of these files is run in strict mode, it will break and not be fixable.
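For readers unfamiliar with the sloppy-mode hazard: ES modules are always strict, so code that relies on sloppy-only syntax cannot simply be re-evaluated as ESM. A small illustration follows, using a `'use strict'` directive as a stand-in for module strictness; `runsAs` and `legacySource` are hypothetical names made up for this sketch.

```javascript
// Sloppy-only code: the `with` statement is a SyntaxError in strict mode.
const legacySource = 'with ({ answer: 42 }) { return answer; }';

// Hypothetical helper: run `source` as a function body, optionally strict.
// Functions created with the Function constructor are sloppy unless their
// body starts with a 'use strict' directive.
function runsAs(source, strict) {
  const body = strict ? `'use strict';\n${source}` : source;
  try {
    return { ok: true, value: new Function(body)() };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

console.log(runsAs(legacySource, false)); // { ok: true, value: 42 }
console.log(runsAs(legacySource, true));  // { ok: false, error: 'SyntaxError' }
```

This is why the thread treats "evaluate as ESM" strictly as a fallback for files that already fail as CommonJS, rather than as a new default for every ambiguous file.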
I’ve started a branch for this at https://github.com/GeoffreyBooth/node/tree/ambiguous-detection. I’m not sure whether this or #50043 is the more viable approach, but I think it’s probably worth trying to implement this one first, to answer the question of how significant the performance impact is when running detection on every ambiguous file. If you’d like to help me implement this please ping on https://openjs-foundation.slack.com/archives/C053UCCP940
I assume you’re referring to package.json
{
  "name": "pkg",
  "exports": {
    ".": {
      "import": {
        "node": "./main.js",
        "default": "./main.mjs"
      },
      "require": "./main.js"
    }
  }
}
In this case, would
import "pkg" // exports["."].import.node -> detects as ESM
require("pkg") // exports["."].require -> detects as CJS -> executes again?
No, I’m referring to …
Ok, very glad I misinterpreted that 😅
Landed in #50096.
Building off of #50043 (comment), this is an alternate proposal to #50043. This aims to achieve the same goals of allowing ESM syntax in “loose” files, where there are no package.json files present; and in files whose nearest parent package.json lacks a type field. This would permit ESM syntax without needing to opt in, while avoiding breaking existing scripts and tutorials. Like #50043, this proposal would avoid any breaking changes, and the aim is to eventually make this enabled by default without requiring a flag.
The new behavior would be as follows:
- Node would evaluate ambiguous files (no .mjs or .cjs extension, either no package.json or one that lacks a type field) as CommonJS, as it already does today. If a file throws a SyntaxError for something that is only allowed in ES modules (import or export statement, import.meta, top-level await), Node would try again to evaluate the file as an ES module.
- This would apply to all files, including those under node_modules.
- This would apply to files referenced via import. It would not apply to files referenced via require.
- Explicit markers would opt out of this detection: a .mjs or .cjs extension, a package.json type field, --input-type or --experimental-default-type.
Pros, in comparison with #50043:
Cons:
- … package.json module scope.
@nodejs/loaders @nodejs/tsc
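The "try as CommonJS, retry as ESM on an ESM-only SyntaxError" behavior described in this issue can be sketched roughly as follows. This is a simplified illustration and not the implementation that eventually landed: `detectFormat` and `ESM_ONLY_ERRORS` are hypothetical names, the matching relies on the assumption that the running V8 produces exactly these error messages, and top-level `await` is not covered because it needs extra handling beyond message matching.

```javascript
'use strict';
const vm = require('node:vm');

// Parameter names of the CommonJS module wrapper.
const CJS_PARAMS = ['exports', 'require', 'module', '__filename', '__dirname'];

// Assumption: V8 messages raised for syntax that is only valid in ES modules.
const ESM_ONLY_ERRORS = [
  'Cannot use import statement outside a module',
  "Unexpected token 'export'",
  "Cannot use 'import.meta' outside a module",
];

// Hypothetical helper: decide how an ambiguous source should be evaluated.
// Returns 'commonjs' if it compiles as CommonJS, 'module' if it fails only
// because of ESM-only syntax, and rethrows any other error unchanged.
function detectFormat(source, filename = 'ambiguous.js') {
  try {
    vm.compileFunction(source, CJS_PARAMS, { filename });
    return 'commonjs';
  } catch (err) {
    if (err.name === 'SyntaxError' && ESM_ONLY_ERRORS.includes(err.message)) {
      return 'module'; // the caller would now re-evaluate the file as ESM
    }
    throw err; // a genuine syntax error in either module system
  }
}

console.log(detectFormat("const fs = require('node:fs');")); // commonjs
console.log(detectFormat("import fs from 'node:fs';"));      // module
```

Because the CommonJS attempt always comes first, sources that already work today take the same path as before; only sources that would currently throw pay for the second evaluation.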