-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Should HTML parsers created for fragment parsing (i.e. innerHTML) or DOMParser spin the event loop in "stop parsing"? #8646
Comments
Does this mean DOMParser currently permits forcing arbitrary queued microtasks to be flushed “synchronously” (from the POV of the flush-inducing code) at will? I’d have expected (naively, anyway — I may not be picturing this right) the microtask checkpoint algorithm to be a noop any time user code is already on the execution stack. |
Just wondering, is it possible that there is something "that delays the load event" here? If I'm not mistaken, scripting is disabled by lack of a browsing context, so I guess the two first lists of scripts from step 5 and step 7 should always already be empty, but I'm not quite sure about this step 8, though it seems that most requests are also blocked because the Document is not "fully active". Asking because this reminds me of #6230 where it was already spotted that browsers apparently don't really spin the event loop when the condition is already met. I guess the same could be happening here (assuming in your tests you don't have the microtask-checkpoint). Note: I agree that a strict reading of the specs indeed would ask for it to be visited, and incur a microtask-checkpoint, and it also seems true that most of this "the end" seems off for these parsers. |
This will examine the algorithm known as "the end" from the HTML specification, which executes when parsing HTML markup has completed, and it's potential to observably run script or change certain attributes. This currently executes in our engine when parsing HTML received from the internet during navigation, using document.{open,write,close}, setting the innerHTML attribute or using DOMParser. The latter two are only possible by executing script. This has been causing some issues in our engine, which will be shown later, so we are considering removing the call to "the end" for these two cases. Spoiler: the implications of running "the end" for DOMParser will be considered in the future. It is the only script-created HTML/XML parser remaining after this commit that uses HTMLParser::run(AK::URL const&) and thus "the end", including it's XML variant implemented as XMLDocumentBuilder::document_end(). This will only focus on setting the innerHTML attribute, which falls under "HTML fragment parsing", which starts here in the specification: https://html.spec.whatwg.org/multipage/parsing.html#parsing-html-fragments https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L3491 While you may notice our HTMLParser::parse_html_fragment returns `void` and assume this means no scripts are executed because of our use of `WebIDL::ExceptionOr<T>` and `JS::ThrowCompletionOr<T>`, note that dispatched events will execute arbitrary script via a callback, catch any exceptions, report them and not propagate them. This means that while a function does not return an exception type, it can still potentially execute script. A breakdown of the steps of "the end" in the context of HTML fragment parsing and its observability follows: https://html.spec.whatwg.org/multipage/parsing.html#the-end https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L221 1. No-op, as we don't currently have speculative HTML parsing. Even if we did, we would instantly return after stopping the speculative HTML parser anyway. 2. No-op, document.{open,write,close} are not accessible from the temporary document. 3. No-op, document.readyState, window.navigation.timing and the readystatechange event are not accessible from the created temporary document. 4. This is presumably done so that reentrant invocation of the HTML parser from document.{write,close} during the firing of the events after step 4 ends up parsing from a clean state. This is a no-op, as the events after step 4 do not fire and are not accessible. 5. No-op, we set HTMLScriptElement::m_already_started to true when creating it whilst parsing an HTML fragment, which causes HTMLScriptElement::prepare_script to instantly bail, meaning `scripts_to_execute_when_parsing_has_finished` is always empty. 6. No-op, tasks are considered not runnable when the document does not have a browsing context, which is always the case in fragment parsing. Additionally, window.navigation.timing and the DOMContentLoaded event aren't reachable from the temporary document. 7. Almost a no-op, `scripts_to_execute_as_soon_as_possible` is always empty for the same reason as step 4. However, this step uses an unconditional `spin_until` call, which _is_ observable and causes one of the alluded to issues, which will be talked about later. 8. No-op, as delaying the load event has no purpose in this case, as the task in step 9 will set the current document readiness to "complete" and then return immediately after, as the temporary document has no browsing context, skipping the Window load event. However, this step causes another alluded to issue, which will be talked about later. 9. No-op, for the same reason as step 6. Additionally, document.readyState is not accessible from the temporary document and the temporary document has no browsing context, so navigation timing, the Window load event, the pageshow event, the Document load event and the `<iframe>` load steps are not executed at all. 10. No-op, as this flag is only set from window.print(), which is not accessible for this document. 11. No-op, as the temporary document is not accessible from anything else and will be immediately destroyed after HTML fragment parsing. Additionally, browsing context containers (`<iframe>`, `<frame>` and `<object>`) cannot run in documents with no browsing context: - `<iframe>` and `<frame>` use "create a new child navigable": https://html.spec.whatwg.org/multipage/document-sequences.html#create-a-new-child-navigable https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/BrowsingContextContainer.cpp#L43-L45 > 2. Let group be element's node document's browsing context's top-level browsing context's group. This requires the element's node document's browsing context to be non-null, but it is always null with the temporary document created for HTML fragment parsing. This is protected against here for `<iframe>`: https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:the-iframe-element-6 https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLIFrameElement.cpp#L45 > When an iframe element element is inserted into a document whose browsing context is non-null, the user agent must run these steps: 1. Create a new child navigable for element. This is currently not protected against for `<frame>` in the specification: https://html.spec.whatwg.org/multipage/obsolete.html#active-frame-element > A frame element is said to be an active frame element when it is in a document. > When a frame element element is created as an active frame element, or becomes an active frame element after not having been one, the user agent must run these steps: > 1. Create a new child navigable for element. However, since this would cause a null dereference, this is actually a specification issue. See: whatwg/html#9136 - `<object>` uses "queue an element task" and has a browsing context null check. https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-object-element:queue-an-element-task https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp#L58 https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp#L105 > ...the user agent must queue an element task on the DOM manipulation task source given the object element to run the following steps to (re)determine what the object element represents. As established above, tasks are not runnable in documents with null browsing contexts. However, for avoidance of doubt, it checks if the document's browsing context is null, and if so, it falls back to representing the element's children and gets rid of any child navigable the `<object>` element may have. > 2. If the element has an ancestor media element, or has an ancestor object element that is not showing its fallback content, or if the element is not in a document whose browsing context is non-null, or if the element's node document is not fully active, or if the element is still in the stack of open elements of an HTML parser or XML parser, or if the element is not being rendered, then jump to the step below labeled fallback. > 4. Fallback: The object element represents the element's children. This is the element's fallback content. Destroy a child navigable given the element. This check also protects against an `<object>` element being adopted from a document which has a browsing context to one that doesn't during the time between the element task being queued and then executed. This means a browsing context container cannot be ran, meaning browsing context containers cannot access their parent document and access the properties and events mentioned in steps 1-11 above, or use document.{open,write,close} on the parent document. Another potential avenue of running script via HTML fragment parsing is via custom elements being in the markup, which need to be synchronously upgraded. For example: ``` <custom-element></custom-element> ``` However, this is already protected against in the spec: https://html.spec.whatwg.org/multipage/parsing.html#create-an-element-for-the-token https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L643 > 7. If definition is non-null and the parser was not created as part of the HTML fragment parsing algorithm, then let will execute script be true. Otherwise, let it be false. It is protected against overall by disabling custom elements via returning `null` for all custom element definition lookups if the document has no browsing context, which is the case for the temporary document: https://html.spec.whatwg.org/multipage/custom-elements.html#look-up-a-custom-element-definition https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Document.cpp#L2106-L2108 > 2. If document's browsing context is null, return null. This is because the document doesn't have an associated Window, meaning there will be no associated CustomElementRegistry object. After running the HTML fragment parser, all of the child nodes are removed the temporary document and then adopted into the context element's node document. Skipping the `pre_remove` steps as they are not relevant in this case, let's first examine Node::remove()'s potential to execute script, then examine Document::adopt_node() after. https://dom.spec.whatwg.org/#concept-node-remove https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Node.cpp#L534 1-7. Does not run any script, it just keeps a copy of some data that will be needed later in the algorithm and directly modifies live range attributes. However, since this relies on Range objects containing the temporary document, the Range steps are no-ops. 8. Though this uses the temporary document, it does not contain any NodeIterator objects as no script should have run, thus this callback will not be entered. Even if the document _did_ have associated NodeIterators, NodeIterator::run_pre_removing_steps does not execute any script. 9-11. Does not run any script, it just keeps a copy of some data that will be needed later in the algorithm and performs direct tree mutation to remove the node from the node tree. 12-14. "assign slottables" and step 13 queue mutation observer microtasks via "signal a slot change". However, since this is done _after_ running "the end", the "spin the event loop" steps in that algorithm does not affect this. Remember that queued microtasks due not execute during this algorithm for the next few steps. Sidenote: Microtasks are supposed to be executed when the JavaScript execution context stack is empty. Since HTMLParser::parse_html_fragment is only called from script, the stack will never be empty whilst it is running, so microtasks will not run until some time after we exit this function. 15. This could potentially run script, let's have a look at the removal steps we currently have implemented in our engine: - HTMLIFrameElement::removed_from() https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:the-iframe-element-7 https://github.com/SerenityOS/serenity/blob/44cf92616e59bda951b67cdae78a6361bdd76f7a/Userland/Libraries/LibWeb/HTML/HTMLIFrameElement.cpp#L102 Since browsing context containers cannot create child browsing contexts (as shown above), this code will do nothing. This will also hold true when we implement HTMLFrameElement::removed_from() in the future. - FormAssociatedElement::removed_from() https://github.com/SerenityOS/serenity/blob/44cf92616e59bda951b67cdae78a6361bdd76f7a/Userland/Libraries/LibWeb/HTML/FormAssociatedElement.h#L36 This calls `form_node_was_removed` which can then potentially call `reset_form_owner`. However, `reset_form_owner` only does tree traversal to find the appropriate form owner and does not execute any script. After calling `form_node_was_removed` it then calls `form_associated_element_was_removed`, which is a virtual function that no one currently overrides, meaning no script is executed. - HTMLBaseElement::removed_from() https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLBaseElement.cpp#L45 This will call `Document::update_base_element` to do tree traversal to find out the new first `<base>` element with an href attribute and thus does not execute any script. - HTMLStyleElement::removed_from() https://html.spec.whatwg.org/multipage/semantics.html#update-a-style-block https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLStyleElement.cpp#L49 This will call `update_a_style_block`, which will parse the `<style>` element's text content as CSS and create a style sheet from it. This does not execute any script. In summary, step 15 does not currently execute any script and ideally shouldn't in the future when we implement more `removed_from` steps. 16. Does not run any script, just saves a copy of a variable. 17. Queues a "disconnectedCallback" custom elements callback. This will execute script in the future, but not here. 18. Performs step 15 and 17 in combination for each of the node's descendants. This will not execute any script. 19. Does not run any script, it performs a requirement of mutation observers by adding certain things to a list. 20. Does not execute any script, as mutation observer callbacks are done via microtasks. 21. This will not execute script, as the parent is always the temporary document in HTML fragment parsing. There is no Document children changed steps, so this step is a no-op. We then do layout invalidation which is our own addition, but this also does not execute any script. In short, removing a node does not execute any script. It could execute script in the future, but since this is done by tasks, it will not execute until we are outside of HTMLParser::parse_html_fragment. Let's look at adopting a node: https://dom.spec.whatwg.org/#concept-node-adopt https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Document.cpp#L1414 1. Does not run script, it just keeps a reference to the temporary document. 2. No-op, we removed the node above. 3.1. Does not execute script, it simply updates all descendants of the removed node to be in the context element's node document. 3.2. Does not execute script, see node removal step 17. 3.3. This could potentially execute script, let's have a look at the adopting steps we have implemented in our engine: - HTMLTemplateElement::adopted_from() https://html.spec.whatwg.org/multipage/scripting.html#the-template-element:concept-node-adopt-ext https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLTemplateElement.cpp#L38 This simply adopts the `<template>` element's DocumentFragment node into its inert document. This does not execute any script. We then have our own addition of adopting NodeIterators over to the context element's document, but this does not execute any script. In short, adopting a node does not execute any script. After adopting the nodes to the context element's document, HTML fragment parsing is complete and the temporary document is no longer accessible at all. Document and element event handlers are also not accessible, even if the event bubbles. This is simply because the temporary document is not accessible, so tree traversal, IDL event handler attributes and EventTarget#addEventListener are not accessible, on the document or any descendants. Document is also not an Element, so element event handler attributes do not apply. In summary, this establishes that HTML fragment parsers should not run any user script or internal C code that relies on things set up by "the end". This means that the attributes set up and events fired by "the end" are not observable in this case. This may have not explored every single possible avenue, but the general assertion should still hold. However, this assertion is violated by "the end" containing two unconditional "spin the event loop" invocations and causes issues with live web content, so we seek to avoid them. As WebKit, Blink and Gecko have been able to get away with doing fast path optimizations for HTML fragment parsing which don't setup navigation timing, run events, etc. it is presumed we are able to get away with not running "the end" for HTML fragment parsing as well. WebKit: https://github.com/WebKit/WebKit/blob/c69be377e17c2977681fef9113d13d91b62d1ee4/Source/WebCore/dom/DocumentFragment.cpp#L90-L98 Blink: https://github.com/chromium/chromium/blob/15444426f98a99830338697cce54e686e988815c/third_party/blink/renderer/core/editing/serializers/serialization.cc#L681-L702 Gecko: https://github.com/mozilla/gecko-dev/blob/6fc2f6d5335fb6f70f780b5fea5ed77b0719c3b5/dom/base/FragmentOrElement.cpp#L1991-L2002 Removing the call to "the end" fixes at least a couple of issues: - Inserting `<img>` elements via innerHTML causes us to spin forever. This regressed in SerenityOS@2413de7 This is because `m_load_event_delayer.clear()` is performed inside an element task callback. Because of the reasons stated above, this will never execute. This caused us to spin forever on step 8 of "the end", which is delaying the load event. This affected Google Docs and Google Maps, never allowing them to progress after performing this action. I have also seen it cause a Scorecard Research `<img>` beacon in a `<noscript>` element inserted via innerHTML to spin forever. This presumably affects many more sites as well. Given that the Window load event is not fired for HTML fragment parsers, spinning the event loop to delay the load event does not change anything, meaning this step can be skipped entirely. - Microtask timing is messed up by the unconditional `spin_until`s on steps 7 and 8. "Spin the event loop" causes an unconditional microtask checkpoint: https://html.spec.whatwg.org/multipage/webappapis.html#spin-the-event-loop https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/EventLoop/EventLoop.cpp#L54 > 3. Let old stack be a copy of the JavaScript execution context stack. > 4. Empty the JavaScript execution context stack. > 5. Perform a microtask checkpoint. > 6.2.1. Replace the JavaScript execution context stack with old stack. This broke YouTube with the introduction of custom elements, as custom elements use microtasks to upgrade elements and call callbacks. See whatwg/html#8646 for a full example reduced from YouTube's JavaScript. Another potential fix for this issue is to remove the above steps from "spin the event loop". However, since we have another issue with the use of "spin the event loop", it would be best to just avoid both calls to it. Considering all of the above, removing the call to "the end" is the way forward for HTML fragment parsing, as all of it should be a no-op. This is done by not using HTMLParser::run(AK::URL const&) but instead HTMLParser::run(). Because of certain things that use URLs that need to parse relative to the document's URL, such as `<img>`'s src attribute, we still set the temporary document's URL to the context element's node document URL. We then detach the parser from the temporary document, as HTMLParser::run(AK::URL const&) was doing for us. The end.
This will examine the algorithm known as "the end" from the HTML specification, which executes when parsing HTML markup has completed, and it's potential to observably run script or change certain attributes. This currently executes in our engine when parsing HTML received from the internet during navigation, using document.{open,write,close}, setting the innerHTML attribute or using DOMParser. The latter two are only possible by executing script. This has been causing some issues in our engine, which will be shown later, so we are considering removing the call to "the end" for these two cases. Spoiler: the implications of running "the end" for DOMParser will be considered in the future. It is the only script-created HTML/XML parser remaining after this commit that uses "the end", including it's XML variant implemented as XMLDocumentBuilder::document_end(). This will only focus on setting the innerHTML attribute, which falls under "HTML fragment parsing", which starts here in the specification: https://html.spec.whatwg.org/multipage/parsing.html#parsing-html-fragments https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L3491 While you may notice our HTMLParser::parse_html_fragment returns `void` and assume this means no scripts are executed because of our use of `WebIDL::ExceptionOr<T>` and `JS::ThrowCompletionOr<T>`, note that dispatched events will execute arbitrary script via a callback, catch any exceptions, report them and not propagate them. This means that while a function does not return an exception type, it can still potentially execute script. A breakdown of the steps of "the end" in the context of HTML fragment parsing and its observability follows: https://html.spec.whatwg.org/multipage/parsing.html#the-end https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L221 1. No-op, as we don't currently have speculative HTML parsing. Even if we did, we would instantly return after stopping the speculative HTML parser anyway. 2. No-op, document.{open,write,close} are not accessible from the temporary document. 3. No-op, document.readyState, window.navigation.timing and the readystatechange event are not accessible from the created temporary document. 4. This is presumably done so that reentrant invocation of the HTML parser from document.{write,close} during the firing of the events after step 4 ends up parsing from a clean state. This is a no-op, as the events after step 4 do not fire and are not accessible. 5. No-op, we set HTMLScriptElement::m_already_started to true when creating it whilst parsing an HTML fragment, which causes HTMLScriptElement::prepare_script to instantly bail, meaning `scripts_to_execute_when_parsing_has_finished` is always empty. 6. No-op, tasks are considered not runnable when the document does not have a browsing context, which is always the case in fragment parsing. Additionally, window.navigation.timing and the DOMContentLoaded event aren't reachable from the temporary document. 7. Almost a no-op, `scripts_to_execute_as_soon_as_possible` is always empty for the same reason as step 4. However, this step uses an unconditional `spin_until` call, which _is_ observable and causes one of the alluded to issues, which will be talked about later. 8. No-op, as delaying the load event has no purpose in this case, as the task in step 9 will set the current document readiness to "complete" and then return immediately after, as the temporary document has no browsing context, skipping the Window load event. However, this step causes another alluded to issue, which will be talked about later. 9. No-op, for the same reason as step 6. Additionally, document.readyState is not accessible from the temporary document and the temporary document has no browsing context, so navigation timing, the Window load event, the pageshow event, the Document load event and the `<iframe>` load steps are not executed at all. 10. No-op, as this flag is only set from window.print(), which is not accessible for this document. 11. No-op, as the temporary document is not accessible from anything else and will be immediately destroyed after HTML fragment parsing. Additionally, browsing context containers (`<iframe>`, `<frame>` and `<object>`) cannot run in documents with no browsing context: - `<iframe>` and `<frame>` use "create a new child navigable": https://html.spec.whatwg.org/multipage/document-sequences.html#create-a-new-child-navigable https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/BrowsingContextContainer.cpp#L43-L45 > 2. Let group be element's node document's browsing context's top-level browsing context's group. This requires the element's node document's browsing context to be non-null, but it is always null with the temporary document created for HTML fragment parsing. This is protected against here for `<iframe>`: https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:the-iframe-element-6 https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLIFrameElement.cpp#L45 > When an iframe element element is inserted into a document whose browsing context is non-null, the user agent must run these steps: 1. Create a new child navigable for element. This is currently not protected against for `<frame>` in the specification: https://html.spec.whatwg.org/multipage/obsolete.html#active-frame-element > A frame element is said to be an active frame element when it is in a document. > When a frame element element is created as an active frame element, or becomes an active frame element after not having been one, the user agent must run these steps: > 1. Create a new child navigable for element. However, since this would cause a null dereference, this is actually a specification issue. See: whatwg/html#9136 - `<object>` uses "queue an element task" and has a browsing context null check. https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-object-element:queue-an-element-task https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp#L58 https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp#L105 > ...the user agent must queue an element task on the DOM manipulation task source given the object element to run the following steps to (re)determine what the object element represents. As established above, tasks are not runnable in documents with null browsing contexts. However, for avoidance of doubt, it checks if the document's browsing context is null, and if so, it falls back to representing the element's children and gets rid of any child navigable the `<object>` element may have. > 2. If the element has an ancestor media element, or has an ancestor object element that is not showing its fallback content, or if the element is not in a document whose browsing context is non-null, or if the element's node document is not fully active, or if the element is still in the stack of open elements of an HTML parser or XML parser, or if the element is not being rendered, then jump to the step below labeled fallback. > 4. Fallback: The object element represents the element's children. This is the element's fallback content. Destroy a child navigable given the element. This check also protects against an `<object>` element being adopted from a document which has a browsing context to one that doesn't during the time between the element task being queued and then executed. This means a browsing context container cannot be ran, meaning browsing context containers cannot access their parent document and access the properties and events mentioned in steps 1-11 above, or use document.{open,write,close} on the parent document. Another potential avenue of running script via HTML fragment parsing is via custom elements being in the markup, which need to be synchronously upgraded. For example: ``` <custom-element></custom-element> ``` However, this is already protected against in the spec: https://html.spec.whatwg.org/multipage/parsing.html#create-an-element-for-the-token https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L643 > 7. If definition is non-null and the parser was not created as part of the HTML fragment parsing algorithm, then let will execute script be true. Otherwise, let it be false. It is protected against overall by disabling custom elements via returning `null` for all custom element definition lookups if the document has no browsing context, which is the case for the temporary document: https://html.spec.whatwg.org/multipage/custom-elements.html#look-up-a-custom-element-definition https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Document.cpp#L2106-L2108 > 2. If document's browsing context is null, return null. This is because the document doesn't have an associated Window, meaning there will be no associated CustomElementRegistry object. After running the HTML fragment parser, all of the child nodes are removed the temporary document and then adopted into the context element's node document. Skipping the `pre_remove` steps as they are not relevant in this case, let's first examine Node::remove()'s potential to execute script, then examine Document::adopt_node() after. https://dom.spec.whatwg.org/#concept-node-remove https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Node.cpp#L534 1-7. Does not run any script, it just keeps a copy of some data that will be needed later in the algorithm and directly modifies live range attributes. However, since this relies on Range objects containing the temporary document, the Range steps are no-ops. 8. Though this uses the temporary document, it does not contain any NodeIterator objects as no script should have run, thus this callback will not be entered. Even if the document _did_ have associated NodeIterators, NodeIterator::run_pre_removing_steps does not execute any script. 9-11. Does not run any script, it just keeps a copy of some data that will be needed later in the algorithm and performs direct tree mutation to remove the node from the node tree. 12-14. "assign slottables" and step 13 queue mutation observer microtasks via "signal a slot change". However, since this is done _after_ running "the end", the "spin the event loop" steps in that algorithm does not affect this. Remember that queued microtasks due not execute during this algorithm for the next few steps. Sidenote: Microtasks are supposed to be executed when the JavaScript execution context stack is empty. Since HTMLParser::parse_html_fragment is only called from script, the stack will never be empty whilst it is running, so microtasks will not run until some time after we exit this function. 15. This could potentially run script, let's have a look at the removal steps we currently have implemented in our engine: - HTMLIFrameElement::removed_from() https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:the-iframe-element-7 https://github.com/SerenityOS/serenity/blob/44cf92616e59bda951b67cdae78a6361bdd76f7a/Userland/Libraries/LibWeb/HTML/HTMLIFrameElement.cpp#L102 Since browsing context containers cannot create child browsing contexts (as shown above), this code will do nothing. This will also hold true when we implement HTMLFrameElement::removed_from() in the future. - FormAssociatedElement::removed_from() https://github.com/SerenityOS/serenity/blob/44cf92616e59bda951b67cdae78a6361bdd76f7a/Userland/Libraries/LibWeb/HTML/FormAssociatedElement.h#L36 This calls `form_node_was_removed` which can then potentially call `reset_form_owner`. However, `reset_form_owner` only does tree traversal to find the appropriate form owner and does not execute any script. After calling `form_node_was_removed` it then calls `form_associated_element_was_removed`, which is a virtual function that no one currently overrides, meaning no script is executed. - HTMLBaseElement::removed_from() https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLBaseElement.cpp#L45 This will call `Document::update_base_element` to do tree traversal to find out the new first `<base>` element with an href attribute and thus does not execute any script. - HTMLStyleElement::removed_from() https://html.spec.whatwg.org/multipage/semantics.html#update-a-style-block https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLStyleElement.cpp#L49 This will call `update_a_style_block`, which will parse the `<style>` element's text content as CSS and create a style sheet from it. This does not execute any script. In summary, step 15 does not currently execute any script and ideally shouldn't in the future when we implement more `removed_from` steps. 16. Does not run any script, just saves a copy of a variable. 17. Queues a "disconnectedCallback" custom elements callback. This will execute script in the future, but not here. 18. Performs step 15 and 17 in combination for each of the node's descendants. This will not execute any script. 19. Does not run any script, it performs a requirement of mutation observers by adding certain things to a list. 20. Does not execute any script, as mutation observer callbacks are done via microtasks. 21. This will not execute script, as the parent is always the temporary document in HTML fragment parsing. There is no Document children changed steps, so this step is a no-op. We then do layout invalidation which is our own addition, but this also does not execute any script. In short, removing a node does not execute any script. It could execute script in the future, but since this is done by tasks, it will not execute until we are outside of HTMLParser::parse_html_fragment. Let's look at adopting a node: https://dom.spec.whatwg.org/#concept-node-adopt https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Document.cpp#L1414 1. Does not run script, it just keeps a reference to the temporary document. 2. No-op, we removed the node above. 3.1. Does not execute script, it simply updates all descendants of the removed node to be in the context element's node document. 3.2. Does not execute script, see node removal step 17. 3.3. This could potentially execute script, let's have a look at the adopting steps we have implemented in our engine: - HTMLTemplateElement::adopted_from() https://html.spec.whatwg.org/multipage/scripting.html#the-template-element:concept-node-adopt-ext https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLTemplateElement.cpp#L38 This simply adopts the `<template>` element's DocumentFragment node into its inert document. This does not execute any script. We then have our own addition of adopting NodeIterators over to the context element's document, but this does not execute any script. In short, adopting a node does not execute any script. After adopting the nodes to the context element's document, HTML fragment parsing is complete and the temporary document is no longer accessible at all. Document and element event handlers are also not accessible, even if the event bubbles. This is simply because the temporary document is not accessible, so tree traversal, IDL event handler attributes and EventTarget#addEventListener are not accessible, on the document or any descendants. Document is also not an Element, so element event handler attributes do not apply. In summary, this establishes that HTML fragment parsers should not run any user script or internal C code that relies on things set up by "the end". This means that the attributes set up and events fired by "the end" are not observable in this case. This may have not explored every single possible avenue, but the general assertion should still hold. However, this assertion is violated by "the end" containing two unconditional "spin the event loop" invocations and causes issues with live web content, so we seek to avoid them. As WebKit, Blink and Gecko have been able to get away with doing fast path optimizations for HTML fragment parsing which don't setup navigation timing, run events, etc. it is presumed we are able to get away with not running "the end" for HTML fragment parsing as well. WebKit: https://github.com/WebKit/WebKit/blob/c69be377e17c2977681fef9113d13d91b62d1ee4/Source/WebCore/dom/DocumentFragment.cpp#L90-L98 Blink: https://github.com/chromium/chromium/blob/15444426f98a99830338697cce54e686e988815c/third_party/blink/renderer/core/editing/serializers/serialization.cc#L681-L702 Gecko: https://github.com/mozilla/gecko-dev/blob/6fc2f6d5335fb6f70f780b5fea5ed77b0719c3b5/dom/base/FragmentOrElement.cpp#L1991-L2002 Removing the call to "the end" fixes at least a couple of issues: - Inserting `<img>` elements via innerHTML causes us to spin forever. This regressed in SerenityOS@2413de7 This is because `m_load_event_delayer.clear()` is performed inside an element task callback. Because of the reasons stated above, this will never execute. This caused us to spin forever on step 8 of "the end", which is delaying the load event. This affected Google Docs and Google Maps, never allowing them to progress after performing this action. I have also seen it cause a Scorecard Research `<img>` beacon in a `<noscript>` element inserted via innerHTML to spin forever. This presumably affects many more sites as well. Given that the Window load event is not fired for HTML fragment parsers, spinning the event loop to delay the load event does not change anything, meaning this step can be skipped entirely. - Microtask timing is messed up by the unconditional `spin_until`s on steps 7 and 8. "Spin the event loop" causes an unconditional microtask checkpoint: https://html.spec.whatwg.org/multipage/webappapis.html#spin-the-event-loop https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/EventLoop/EventLoop.cpp#L54 > 3. Let old stack be a copy of the JavaScript execution context stack. > 4. Empty the JavaScript execution context stack. > 5. Perform a microtask checkpoint. > 6.2.1. Replace the JavaScript execution context stack with old stack. This broke YouTube with the introduction of custom elements, as custom elements use microtasks to upgrade elements and call callbacks. See whatwg/html#8646 for a full example reduced from YouTube's JavaScript. Another potential fix for this issue is to remove the above steps from "spin the event loop". However, since we have another issue with the use of "spin the event loop", it would be best to just avoid both calls to it. Considering all of the above, removing the call to "the end" is the way forward for HTML fragment parsing, as all of it should be a no-op. This is done by not simply returning from "the end" if the HTML parser was created for HTML fragment parsing. The end.
This will examine the algorithm known as "the end" from the HTML specification, which executes when parsing HTML markup has completed, and it's potential to observably run script or change certain attributes. This currently executes in our engine when parsing HTML received from the internet during navigation, using document.{open,write,close}, setting the innerHTML attribute or using DOMParser. The latter two are only possible by executing script. This has been causing some issues in our engine, which will be shown later, so we are considering removing the call to "the end" for these two cases. Spoiler: the implications of running "the end" for DOMParser will be considered in the future. It is the only script-created HTML/XML parser remaining after this commit that uses "the end", including it's XML variant implemented as XMLDocumentBuilder::document_end(). This will only focus on setting the innerHTML attribute, which falls under "HTML fragment parsing", which starts here in the specification: https://html.spec.whatwg.org/multipage/parsing.html#parsing-html-fragments https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L3491 While you may notice our HTMLParser::parse_html_fragment returns `void` and assume this means no scripts are executed because of our use of `WebIDL::ExceptionOr<T>` and `JS::ThrowCompletionOr<T>`, note that dispatched events will execute arbitrary script via a callback, catch any exceptions, report them and not propagate them. This means that while a function does not return an exception type, it can still potentially execute script. A breakdown of the steps of "the end" in the context of HTML fragment parsing and its observability follows: https://html.spec.whatwg.org/multipage/parsing.html#the-end https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L221 1. No-op, as we don't currently have speculative HTML parsing. Even if we did, we would instantly return after stopping the speculative HTML parser anyway. 2. No-op, document.{open,write,close} are not accessible from the temporary document. 3. No-op, document.readyState, window.navigation.timing and the readystatechange event are not accessible from the created temporary document. 4. This is presumably done so that reentrant invocation of the HTML parser from document.{write,close} during the firing of the events after step 4 ends up parsing from a clean state. This is a no-op, as the events after step 4 do not fire and are not accessible. 5. No-op, we set HTMLScriptElement::m_already_started to true when creating it whilst parsing an HTML fragment, which causes HTMLScriptElement::prepare_script to instantly bail, meaning `scripts_to_execute_when_parsing_has_finished` is always empty. 6. No-op, tasks are considered not runnable when the document does not have a browsing context, which is always the case in fragment parsing. Additionally, window.navigation.timing and the DOMContentLoaded event aren't reachable from the temporary document. 7. Almost a no-op, `scripts_to_execute_as_soon_as_possible` is always empty for the same reason as step 4. However, this step uses an unconditional `spin_until` call, which _is_ observable and causes one of the alluded to issues, which will be talked about later. 8. No-op, as delaying the load event has no purpose in this case, as the task in step 9 will set the current document readiness to "complete" and then return immediately after, as the temporary document has no browsing context, skipping the Window load event. However, this step causes another alluded to issue, which will be talked about later. 9. No-op, for the same reason as step 6. Additionally, document.readyState is not accessible from the temporary document and the temporary document has no browsing context, so navigation timing, the Window load event, the pageshow event, the Document load event and the `<iframe>` load steps are not executed at all. 10. No-op, as this flag is only set from window.print(), which is not accessible for this document. 11. No-op, as the temporary document is not accessible from anything else and will be immediately destroyed after HTML fragment parsing. Additionally, browsing context containers (`<iframe>`, `<frame>` and `<object>`) cannot run in documents with no browsing context: - `<iframe>` and `<frame>` use "create a new child navigable": https://html.spec.whatwg.org/multipage/document-sequences.html#create-a-new-child-navigable https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/BrowsingContextContainer.cpp#L43-L45 > 2. Let group be element's node document's browsing context's top-level browsing context's group. This requires the element's node document's browsing context to be non-null, but it is always null with the temporary document created for HTML fragment parsing. This is protected against here for `<iframe>`: https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:the-iframe-element-6 https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLIFrameElement.cpp#L45 > When an iframe element element is inserted into a document whose browsing context is non-null, the user agent must run these steps: 1. Create a new child navigable for element. This is currently not protected against for `<frame>` in the specification: https://html.spec.whatwg.org/multipage/obsolete.html#active-frame-element > A frame element is said to be an active frame element when it is in a document. > When a frame element element is created as an active frame element, or becomes an active frame element after not having been one, the user agent must run these steps: > 1. Create a new child navigable for element. However, since this would cause a null dereference, this is actually a specification issue. See: whatwg/html#9136 - `<object>` uses "queue an element task" and has a browsing context null check. https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-object-element:queue-an-element-task https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp#L58 https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp#L105 > ...the user agent must queue an element task on the DOM manipulation task source given the object element to run the following steps to (re)determine what the object element represents. As established above, tasks are not runnable in documents with null browsing contexts. However, for avoidance of doubt, it checks if the document's browsing context is null, and if so, it falls back to representing the element's children and gets rid of any child navigable the `<object>` element may have. > 2. If the element has an ancestor media element, or has an ancestor object element that is not showing its fallback content, or if the element is not in a document whose browsing context is non-null, or if the element's node document is not fully active, or if the element is still in the stack of open elements of an HTML parser or XML parser, or if the element is not being rendered, then jump to the step below labeled fallback. > 4. Fallback: The object element represents the element's children. This is the element's fallback content. Destroy a child navigable given the element. This check also protects against an `<object>` element being adopted from a document which has a browsing context to one that doesn't during the time between the element task being queued and then executed. This means a browsing context container cannot be ran, meaning browsing context containers cannot access their parent document and access the properties and events mentioned in steps 1-11 above, or use document.{open,write,close} on the parent document. Another potential avenue of running script via HTML fragment parsing is via custom elements being in the markup, which need to be synchronously upgraded. For example: ``` <custom-element></custom-element> ``` However, this is already protected against in the spec: https://html.spec.whatwg.org/multipage/parsing.html#create-an-element-for-the-token https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L643 > 7. If definition is non-null and the parser was not created as part of the HTML fragment parsing algorithm, then let will execute script be true. Otherwise, let it be false. It is protected against overall by disabling custom elements via returning `null` for all custom element definition lookups if the document has no browsing context, which is the case for the temporary document: https://html.spec.whatwg.org/multipage/custom-elements.html#look-up-a-custom-element-definition https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Document.cpp#L2106-L2108 > 2. If document's browsing context is null, return null. This is because the document doesn't have an associated Window, meaning there will be no associated CustomElementRegistry object. After running the HTML fragment parser, all of the child nodes are removed the temporary document and then adopted into the context element's node document. Skipping the `pre_remove` steps as they are not relevant in this case, let's first examine Node::remove()'s potential to execute script, then examine Document::adopt_node() after. https://dom.spec.whatwg.org/#concept-node-remove https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Node.cpp#L534 1-7. Does not run any script, it just keeps a copy of some data that will be needed later in the algorithm and directly modifies live range attributes. However, since this relies on Range objects containing the temporary document, the Range steps are no-ops. 8. Though this uses the temporary document, it does not contain any NodeIterator objects as no script should have run, thus this callback will not be entered. Even if the document _did_ have associated NodeIterators, NodeIterator::run_pre_removing_steps does not execute any script. 9-11. Does not run any script, it just keeps a copy of some data that will be needed later in the algorithm and performs direct tree mutation to remove the node from the node tree. 12-14. "assign slottables" and step 13 queue mutation observer microtasks via "signal a slot change". However, since this is done _after_ running "the end", the "spin the event loop" steps in that algorithm does not affect this. Remember that queued microtasks due not execute during this algorithm for the next few steps. Sidenote: Microtasks are supposed to be executed when the JavaScript execution context stack is empty. Since HTMLParser::parse_html_fragment is only called from script, the stack will never be empty whilst it is running, so microtasks will not run until some time after we exit this function. 15. This could potentially run script, let's have a look at the removal steps we currently have implemented in our engine: - HTMLIFrameElement::removed_from() https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:the-iframe-element-7 https://github.com/SerenityOS/serenity/blob/44cf92616e59bda951b67cdae78a6361bdd76f7a/Userland/Libraries/LibWeb/HTML/HTMLIFrameElement.cpp#L102 Since browsing context containers cannot create child browsing contexts (as shown above), this code will do nothing. This will also hold true when we implement HTMLFrameElement::removed_from() in the future. - FormAssociatedElement::removed_from() https://github.com/SerenityOS/serenity/blob/44cf92616e59bda951b67cdae78a6361bdd76f7a/Userland/Libraries/LibWeb/HTML/FormAssociatedElement.h#L36 This calls `form_node_was_removed` which can then potentially call `reset_form_owner`. However, `reset_form_owner` only does tree traversal to find the appropriate form owner and does not execute any script. After calling `form_node_was_removed` it then calls `form_associated_element_was_removed`, which is a virtual function that no one currently overrides, meaning no script is executed. - HTMLBaseElement::removed_from() https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLBaseElement.cpp#L45 This will call `Document::update_base_element` to do tree traversal to find out the new first `<base>` element with an href attribute and thus does not execute any script. - HTMLStyleElement::removed_from() https://html.spec.whatwg.org/multipage/semantics.html#update-a-style-block https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLStyleElement.cpp#L49 This will call `update_a_style_block`, which will parse the `<style>` element's text content as CSS and create a style sheet from it. This does not execute any script. In summary, step 15 does not currently execute any script and ideally shouldn't in the future when we implement more `removed_from` steps. 16. Does not run any script, just saves a copy of a variable. 17. Queues a "disconnectedCallback" custom elements callback. This will execute script in the future, but not here. 18. Performs step 15 and 17 in combination for each of the node's descendants. This will not execute any script. 19. Does not run any script, it performs a requirement of mutation observers by adding certain things to a list. 20. Does not execute any script, as mutation observer callbacks are done via microtasks. 21. This will not execute script, as the parent is always the temporary document in HTML fragment parsing. There is no Document children changed steps, so this step is a no-op. We then do layout invalidation which is our own addition, but this also does not execute any script. In short, removing a node does not execute any script. It could execute script in the future, but since this is done by tasks, it will not execute until we are outside of HTMLParser::parse_html_fragment. Let's look at adopting a node: https://dom.spec.whatwg.org/#concept-node-adopt https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Document.cpp#L1414 1. Does not run script, it just keeps a reference to the temporary document. 2. No-op, we removed the node above. 3.1. Does not execute script, it simply updates all descendants of the removed node to be in the context element's node document. 3.2. Does not execute script, see node removal step 17. 3.3. This could potentially execute script, let's have a look at the adopting steps we have implemented in our engine: - HTMLTemplateElement::adopted_from() https://html.spec.whatwg.org/multipage/scripting.html#the-template-element:concept-node-adopt-ext https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLTemplateElement.cpp#L38 This simply adopts the `<template>` element's DocumentFragment node into its inert document. This does not execute any script. We then have our own addition of adopting NodeIterators over to the context element's document, but this does not execute any script. In short, adopting a node does not execute any script. After adopting the nodes to the context element's document, HTML fragment parsing is complete and the temporary document is no longer accessible at all. Document and element event handlers are also not accessible, even if the event bubbles. This is simply because the temporary document is not accessible, so tree traversal, IDL event handler attributes and EventTarget#addEventListener are not accessible, on the document or any descendants. Document is also not an Element, so element event handler attributes do not apply. In summary, this establishes that HTML fragment parsers should not run any user script or internal C code that relies on things set up by "the end". This means that the attributes set up and events fired by "the end" are not observable in this case. This may have not explored every single possible avenue, but the general assertion should still hold. However, this assertion is violated by "the end" containing two unconditional "spin the event loop" invocations and causes issues with live web content, so we seek to avoid them. As WebKit, Blink and Gecko have been able to get away with doing fast path optimizations for HTML fragment parsing which don't setup navigation timing, run events, etc. it is presumed we are able to get away with not running "the end" for HTML fragment parsing as well. WebKit: https://github.com/WebKit/WebKit/blob/c69be377e17c2977681fef9113d13d91b62d1ee4/Source/WebCore/dom/DocumentFragment.cpp#L90-L98 Blink: https://github.com/chromium/chromium/blob/15444426f98a99830338697cce54e686e988815c/third_party/blink/renderer/core/editing/serializers/serialization.cc#L681-L702 Gecko: https://github.com/mozilla/gecko-dev/blob/6fc2f6d5335fb6f70f780b5fea5ed77b0719c3b5/dom/base/FragmentOrElement.cpp#L1991-L2002 Removing the call to "the end" fixes at least a couple of issues: - Inserting `<img>` elements via innerHTML causes us to spin forever. This regressed in SerenityOS@2413de7 This is because `m_load_event_delayer.clear()` is performed inside an element task callback. Because of the reasons stated above, this will never execute. This caused us to spin forever on step 8 of "the end", which is delaying the load event. This affected Google Docs and Google Maps, never allowing them to progress after performing this action. I have also seen it cause a Scorecard Research `<img>` beacon in a `<noscript>` element inserted via innerHTML to spin forever. This presumably affects many more sites as well. Given that the Window load event is not fired for HTML fragment parsers, spinning the event loop to delay the load event does not change anything, meaning this step can be skipped entirely. - Microtask timing is messed up by the unconditional `spin_until`s on steps 7 and 8. "Spin the event loop" causes an unconditional microtask checkpoint: https://html.spec.whatwg.org/multipage/webappapis.html#spin-the-event-loop https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/EventLoop/EventLoop.cpp#L54 > 3. Let old stack be a copy of the JavaScript execution context stack. > 4. Empty the JavaScript execution context stack. > 5. Perform a microtask checkpoint. > 6.2.1. Replace the JavaScript execution context stack with old stack. This broke YouTube with the introduction of custom elements, as custom elements use microtasks to upgrade elements and call callbacks. See whatwg/html#8646 for a full example reduced from YouTube's JavaScript. Another potential fix for this issue is to remove the above steps from "spin the event loop". However, since we have another issue with the use of "spin the event loop", it would be best to just avoid both calls to it. Considering all of the above, removing the call to "the end" is the way forward for HTML fragment parsing, as all of it should be a no-op. This is done by not simply returning from "the end" if the HTML parser was created for HTML fragment parsing. The end.
This will examine the algorithm known as "the end" from the HTML specification, which executes when parsing HTML markup has completed, and it's potential to observably run script or change certain attributes. This currently executes in our engine when parsing HTML received from the internet during navigation, using document.{open,write,close}, setting the innerHTML attribute or using DOMParser. The latter two are only possible by executing script. This has been causing some issues in our engine, which will be shown later, so we are considering removing the call to "the end" for these two cases. Spoiler: the implications of running "the end" for DOMParser will be considered in the future. It is the only script-created HTML/XML parser remaining after this commit that uses "the end", including it's XML variant implemented as XMLDocumentBuilder::document_end(). This will only focus on setting the innerHTML attribute, which falls under "HTML fragment parsing", which starts here in the specification: https://html.spec.whatwg.org/multipage/parsing.html#parsing-html-fragments https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L3491 While you may notice our HTMLParser::parse_html_fragment returns `void` and assume this means no scripts are executed because of our use of `WebIDL::ExceptionOr<T>` and `JS::ThrowCompletionOr<T>`, note that dispatched events will execute arbitrary script via a callback, catch any exceptions, report them and not propagate them. This means that while a function does not return an exception type, it can still potentially execute script. A breakdown of the steps of "the end" in the context of HTML fragment parsing and its observability follows: https://html.spec.whatwg.org/multipage/parsing.html#the-end https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L221 1. No-op, as we don't currently have speculative HTML parsing. Even if we did, we would instantly return after stopping the speculative HTML parser anyway. 2. No-op, document.{open,write,close} are not accessible from the temporary document. 3. No-op, document.readyState, window.navigation.timing and the readystatechange event are not accessible from the created temporary document. 4. This is presumably done so that reentrant invocation of the HTML parser from document.{write,close} during the firing of the events after step 4 ends up parsing from a clean state. This is a no-op, as the events after step 4 do not fire and are not accessible. 5. No-op, we set HTMLScriptElement::m_already_started to true when creating it whilst parsing an HTML fragment, which causes HTMLScriptElement::prepare_script to instantly bail, meaning `scripts_to_execute_when_parsing_has_finished` is always empty. 6. No-op, tasks are considered not runnable when the document does not have a browsing context, which is always the case in fragment parsing. Additionally, window.navigation.timing and the DOMContentLoaded event aren't reachable from the temporary document. 7. Almost a no-op, `scripts_to_execute_as_soon_as_possible` is always empty for the same reason as step 4. However, this step uses an unconditional `spin_until` call, which _is_ observable and causes one of the alluded to issues, which will be talked about later. 8. No-op, as delaying the load event has no purpose in this case, as the task in step 9 will set the current document readiness to "complete" and then return immediately after, as the temporary document has no browsing context, skipping the Window load event. However, this step causes another alluded to issue, which will be talked about later. 9. No-op, for the same reason as step 6. Additionally, document.readyState is not accessible from the temporary document and the temporary document has no browsing context, so navigation timing, the Window load event, the pageshow event, the Document load event and the `<iframe>` load steps are not executed at all. 10. No-op, as this flag is only set from window.print(), which is not accessible for this document. 11. No-op, as the temporary document is not accessible from anything else and will be immediately destroyed after HTML fragment parsing. Additionally, browsing context containers (`<iframe>`, `<frame>` and `<object>`) cannot run in documents with no browsing context: - `<iframe>` and `<frame>` use "create a new child navigable": https://html.spec.whatwg.org/multipage/document-sequences.html#create-a-new-child-navigable https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/BrowsingContextContainer.cpp#L43-L45 > 2. Let group be element's node document's browsing context's top-level browsing context's group. This requires the element's node document's browsing context to be non-null, but it is always null with the temporary document created for HTML fragment parsing. This is protected against here for `<iframe>`: https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:the-iframe-element-6 https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLIFrameElement.cpp#L45 > When an iframe element element is inserted into a document whose browsing context is non-null, the user agent must run these steps: 1. Create a new child navigable for element. This is currently not protected against for `<frame>` in the specification: https://html.spec.whatwg.org/multipage/obsolete.html#active-frame-element > A frame element is said to be an active frame element when it is in a document. > When a frame element element is created as an active frame element, or becomes an active frame element after not having been one, the user agent must run these steps: > 1. Create a new child navigable for element. However, since this would cause a null dereference, this is actually a specification issue. See: whatwg/html#9136 - `<object>` uses "queue an element task" and has a browsing context null check. https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-object-element:queue-an-element-task https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp#L58 https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLObjectElement.cpp#L105 > ...the user agent must queue an element task on the DOM manipulation task source given the object element to run the following steps to (re)determine what the object element represents. As established above, tasks are not runnable in documents with null browsing contexts. However, for avoidance of doubt, it checks if the document's browsing context is null, and if so, it falls back to representing the element's children and gets rid of any child navigable the `<object>` element may have. > 2. If the element has an ancestor media element, or has an ancestor object element that is not showing its fallback content, or if the element is not in a document whose browsing context is non-null, or if the element's node document is not fully active, or if the element is still in the stack of open elements of an HTML parser or XML parser, or if the element is not being rendered, then jump to the step below labeled fallback. > 4. Fallback: The object element represents the element's children. This is the element's fallback content. Destroy a child navigable given the element. This check also protects against an `<object>` element being adopted from a document which has a browsing context to one that doesn't during the time between the element task being queued and then executed. This means a browsing context container cannot be ran, meaning browsing context containers cannot access their parent document and access the properties and events mentioned in steps 1-11 above, or use document.{open,write,close} on the parent document. Another potential avenue of running script via HTML fragment parsing is via custom elements being in the markup, which need to be synchronously upgraded. For example: ``` <custom-element></custom-element> ``` However, this is already protected against in the spec: https://html.spec.whatwg.org/multipage/parsing.html#create-an-element-for-the-token https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/Parser/HTMLParser.cpp#L643 > 7. If definition is non-null and the parser was not created as part of the HTML fragment parsing algorithm, then let will execute script be true. Otherwise, let it be false. It is protected against overall by disabling custom elements via returning `null` for all custom element definition lookups if the document has no browsing context, which is the case for the temporary document: https://html.spec.whatwg.org/multipage/custom-elements.html#look-up-a-custom-element-definition https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Document.cpp#L2106-L2108 > 2. If document's browsing context is null, return null. This is because the document doesn't have an associated Window, meaning there will be no associated CustomElementRegistry object. After running the HTML fragment parser, all of the child nodes are removed the temporary document and then adopted into the context element's node document. Skipping the `pre_remove` steps as they are not relevant in this case, let's first examine Node::remove()'s potential to execute script, then examine Document::adopt_node() after. https://dom.spec.whatwg.org/#concept-node-remove https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Node.cpp#L534 1-7. Does not run any script, it just keeps a copy of some data that will be needed later in the algorithm and directly modifies live range attributes. However, since this relies on Range objects containing the temporary document, the Range steps are no-ops. 8. Though this uses the temporary document, it does not contain any NodeIterator objects as no script should have run, thus this callback will not be entered. Even if the document _did_ have associated NodeIterators, NodeIterator::run_pre_removing_steps does not execute any script. 9-11. Does not run any script, it just keeps a copy of some data that will be needed later in the algorithm and performs direct tree mutation to remove the node from the node tree. 12-14. "assign slottables" and step 13 queue mutation observer microtasks via "signal a slot change". However, since this is done _after_ running "the end", the "spin the event loop" steps in that algorithm does not affect this. Remember that queued microtasks due not execute during this algorithm for the next few steps. Sidenote: Microtasks are supposed to be executed when the JavaScript execution context stack is empty. Since HTMLParser::parse_html_fragment is only called from script, the stack will never be empty whilst it is running, so microtasks will not run until some time after we exit this function. 15. This could potentially run script, let's have a look at the removal steps we currently have implemented in our engine: - HTMLIFrameElement::removed_from() https://html.spec.whatwg.org/multipage/iframe-embed-object.html#the-iframe-element:the-iframe-element-7 https://github.com/SerenityOS/serenity/blob/44cf92616e59bda951b67cdae78a6361bdd76f7a/Userland/Libraries/LibWeb/HTML/HTMLIFrameElement.cpp#L102 Since browsing context containers cannot create child browsing contexts (as shown above), this code will do nothing. This will also hold true when we implement HTMLFrameElement::removed_from() in the future. - FormAssociatedElement::removed_from() https://github.com/SerenityOS/serenity/blob/44cf92616e59bda951b67cdae78a6361bdd76f7a/Userland/Libraries/LibWeb/HTML/FormAssociatedElement.h#L36 This calls `form_node_was_removed` which can then potentially call `reset_form_owner`. However, `reset_form_owner` only does tree traversal to find the appropriate form owner and does not execute any script. After calling `form_node_was_removed` it then calls `form_associated_element_was_removed`, which is a virtual function that no one currently overrides, meaning no script is executed. - HTMLBaseElement::removed_from() https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLBaseElement.cpp#L45 This will call `Document::update_base_element` to do tree traversal to find out the new first `<base>` element with an href attribute and thus does not execute any script. - HTMLStyleElement::removed_from() https://html.spec.whatwg.org/multipage/semantics.html#update-a-style-block https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLStyleElement.cpp#L49 This will call `update_a_style_block`, which will parse the `<style>` element's text content as CSS and create a style sheet from it. This does not execute any script. In summary, step 15 does not currently execute any script and ideally shouldn't in the future when we implement more `removed_from` steps. 16. Does not run any script, just saves a copy of a variable. 17. Queues a "disconnectedCallback" custom elements callback. This will execute script in the future, but not here. 18. Performs step 15 and 17 in combination for each of the node's descendants. This will not execute any script. 19. Does not run any script, it performs a requirement of mutation observers by adding certain things to a list. 20. Does not execute any script, as mutation observer callbacks are done via microtasks. 21. This will not execute script, as the parent is always the temporary document in HTML fragment parsing. There is no Document children changed steps, so this step is a no-op. We then do layout invalidation which is our own addition, but this also does not execute any script. In short, removing a node does not execute any script. It could execute script in the future, but since this is done by tasks, it will not execute until we are outside of HTMLParser::parse_html_fragment. Let's look at adopting a node: https://dom.spec.whatwg.org/#concept-node-adopt https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/DOM/Document.cpp#L1414 1. Does not run script, it just keeps a reference to the temporary document. 2. No-op, we removed the node above. 3.1. Does not execute script, it simply updates all descendants of the removed node to be in the context element's node document. 3.2. Does not execute script, see node removal step 17. 3.3. This could potentially execute script, let's have a look at the adopting steps we have implemented in our engine: - HTMLTemplateElement::adopted_from() https://html.spec.whatwg.org/multipage/scripting.html#the-template-element:concept-node-adopt-ext https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/HTMLTemplateElement.cpp#L38 This simply adopts the `<template>` element's DocumentFragment node into its inert document. This does not execute any script. We then have our own addition of adopting NodeIterators over to the context element's document, but this does not execute any script. In short, adopting a node does not execute any script. After adopting the nodes to the context element's document, HTML fragment parsing is complete and the temporary document is no longer accessible at all. Document and element event handlers are also not accessible, even if the event bubbles. This is simply because the temporary document is not accessible, so tree traversal, IDL event handler attributes and EventTarget#addEventListener are not accessible, on the document or any descendants. Document is also not an Element, so element event handler attributes do not apply. In summary, this establishes that HTML fragment parsers should not run any user script or internal C code that relies on things set up by "the end". This means that the attributes set up and events fired by "the end" are not observable in this case. This may have not explored every single possible avenue, but the general assertion should still hold. However, this assertion is violated by "the end" containing two unconditional "spin the event loop" invocations and causes issues with live web content, so we seek to avoid them. As WebKit, Blink and Gecko have been able to get away with doing fast path optimizations for HTML fragment parsing which don't setup navigation timing, run events, etc. it is presumed we are able to get away with not running "the end" for HTML fragment parsing as well. WebKit: https://github.com/WebKit/WebKit/blob/c69be377e17c2977681fef9113d13d91b62d1ee4/Source/WebCore/dom/DocumentFragment.cpp#L90-L98 Blink: https://github.com/chromium/chromium/blob/15444426f98a99830338697cce54e686e988815c/third_party/blink/renderer/core/editing/serializers/serialization.cc#L681-L702 Gecko: https://github.com/mozilla/gecko-dev/blob/6fc2f6d5335fb6f70f780b5fea5ed77b0719c3b5/dom/base/FragmentOrElement.cpp#L1991-L2002 Removing the call to "the end" fixes at least a couple of issues: - Inserting `<img>` elements via innerHTML causes us to spin forever. This regressed in 2413de7 This is because `m_load_event_delayer.clear()` is performed inside an element task callback. Because of the reasons stated above, this will never execute. This caused us to spin forever on step 8 of "the end", which is delaying the load event. This affected Google Docs and Google Maps, never allowing them to progress after performing this action. I have also seen it cause a Scorecard Research `<img>` beacon in a `<noscript>` element inserted via innerHTML to spin forever. This presumably affects many more sites as well. Given that the Window load event is not fired for HTML fragment parsers, spinning the event loop to delay the load event does not change anything, meaning this step can be skipped entirely. - Microtask timing is messed up by the unconditional `spin_until`s on steps 7 and 8. "Spin the event loop" causes an unconditional microtask checkpoint: https://html.spec.whatwg.org/multipage/webappapis.html#spin-the-event-loop https://github.com/SerenityOS/serenity/blob/44dd8247647474df95137452b3c9cad9b83326be/Userland/Libraries/LibWeb/HTML/EventLoop/EventLoop.cpp#L54 > 3. Let old stack be a copy of the JavaScript execution context stack. > 4. Empty the JavaScript execution context stack. > 5. Perform a microtask checkpoint. > 6.2.1. Replace the JavaScript execution context stack with old stack. This broke YouTube with the introduction of custom elements, as custom elements use microtasks to upgrade elements and call callbacks. See whatwg/html#8646 for a full example reduced from YouTube's JavaScript. Another potential fix for this issue is to remove the above steps from "spin the event loop". However, since we have another issue with the use of "spin the event loop", it would be best to just avoid both calls to it. Considering all of the above, removing the call to "the end" is the way forward for HTML fragment parsing, as all of it should be a no-op. This is done by not simply returning from "the end" if the HTML parser was created for HTML fragment parsing. The end.
When the HTML parser stops parsing, it performs one conditional event loop spin on step 5.1 and two unconditional event loop spins on step 7 and 8. Spinning the event loop performs a microtask checkpoint on step 5.
However, this causes microtasks to execute at the wrong time. Take this example with custom elements:
This affects YouTube, which sets innerHTML of a
<template>
element in some custom element constructors before setting up class attributes, causing unhandled exceptions in connectedCallback when it tries to use the attributes it assumes were setup in the constructor.It appears WebKit and Blink skip most of
the end
for parsing fragments, roughly doing steps 2 and 4. It doesn't appear they skip any steps for DOMParser, as they don't usecreateFragmentForInnerOuterHTML
orcreateFragmentForMarkup
for DOMParser.WebKit:
https://github.com/WebKit/WebKit/blob/44374328f96d4c74b57ce233a54d72406492774b/Source/WebCore/html/parser/HTMLDocumentParser.cpp#L140-L142
https://github.com/WebKit/WebKit/blob/46837e56d6f57f7e52bf92c9534cc7683909eee5/Source/WebCore/html/parser/HTMLTreeBuilder.cpp#L3024
https://github.com/WebKit/WebKit/blob/a36f6efecc490ae267d0a2103e86f1f0db533c62/Source/WebCore/xml/DOMParser.cpp#L41
Blink:
https://source.chromium.org/chromium/chromium/src/ /main:third_party/blink/renderer/core/html/parser/html_document_parser.cc;l=553-555;drc=f9ebf0deab3efad7c0db6843af187bb36f0ccb4c?q=html_document_parser&ss=chromium/chromium/src
https://source.chromium.org/chromium/chromium/src/ /main:third_party/blink/renderer/core/html/parser/html_tree_builder.cc;l=2950;drc=f9ebf0deab3efad7c0db6843af187bb36f0ccb4c?q=html_tree_bui&ss=chromium/chromium/src
https://source.chromium.org/chromium/chromium/src/ /main:third_party/blink/renderer/core/xml/dom_parser.cc;l=32;drc=f9ebf0deab3efad7c0db6843af187bb36f0ccb4c?q=dom_parser.cc&ss=chromium/chromium/src
I have not looked into what Gecko does for this.
There are other issues with
the end
in these cases, such as unnecessary events likereadystatechange
,DOMContentLoaded
andload
which are not observable and the WebDriver steps assuming Document's browsing context is non-null. However, I focused on spinning the event loop as it is observable by user code.The text was updated successfully, but these errors were encountered: