Page MenuHomePhabricator

Provide a robust way of logging events without blocking until network request completes; use sendBeacon
Closed, ResolvedPublic

Description

Use HTML5 storage as an intermediary queue so if an image doesn't successfully fire, we have a chance to pull the log entry out of storage and try again.

Duplicate firings are a possibility (the browser could fire without us getting confirmation), so we will probably have to generate a unique ID and de-duplicate on the server.


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=52287

Event Timeline

bzimport raised the priority of this task from to High.Nov 22 2014, 12:55 AM
bzimport set Reference to bz42815.
bzimport added a subscriber: Unknown Object (MLST).

The localStorage API is synchronous, so it's possible to reliably set an item in localStorage in an on click handler without having to resort to voodoo.

(According to http://www.nczonline.net/blog/2012/03/07/in-defense-of-localstorage/, IE 8's implementation is asynchronous. But I'm sure we could hack around it somehow.)

There are serious questions to answer, though:

  • What if the link navigates away from Wikipedia entirely? You might simply assume that the user will come back eventually, and you'll fire the event then. But then the server timestamp is no longer an accurate indicator of when an event took place. We'd have to reintroduce client-side timestamps, I think.
  • Does it make sense to implement this in ext.eventLogging.core.js? We could rewrite mw.eventLog.logEvent to write to localStorage by default, and poll it asynchronously using setInterval to actually log events to the server.

Ideas?

If we had CORS, we could use $.ajax( { asynchronous: false, ... } ).

Batching events will make us much more reliant on client-side timestamps and thus open up a can of worms. A robust alternative to $.fn.stall is needed, but localStorage is just one possible approach. I'm going to create a separate bug for $.fn.stall and close this as WONTFIX for now.

(In reply to comment #3)

I'm going to create a separate bug for $.fn.stall and
close this as WONTFIX for now.

Ugh, I changed my mind about this but submitted the comment by accident when I changed the title. This should be the bug for tracking synchronous event logging.

I confirmed that $.ajax({ async: false }) requests work. You only need CORS if you want to handle the response, which we don't care about: we only care that the request is made against bits, which it is.

I don't think we really want synchronous network requests. They block the entire event loop and can cause other problems (https://developer.mozilla.org/en-US/docs/Web/API/XMLHttpRequest/Synchronous_and_Asynchronous_Requests?redirectlocale=en-US&redirectslug=DOM/XMLHttpRequest/Synchronous_and_Asynchronous_Requests).

I think we can distinguish between persistent logging and synchronous logging. 'Persistent' means it will try again until it's logged successfully (or it reaches a maximum number of attempts). 'Synchronous' means it logs, blocking the event loop, immediately when logEvent is called.

The Beacon API proposal from the W3C Web Performance Working Group, published last month, takes a stab at solving the exact problem that this bug describes: https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/Beacon/Overview.html

The proposal itself is just that -- a proposal. I don't think any browser vendors implement it yet. But it's highly useful, because it tells us that this is indeed a thorny problem, not just something we haven't figured out, and because it gives a concise overview of existing techniques & their limitations. This part of the proposal is worth quoting in full:

"User agents will typically ignore asynchronous XMLHttpRequests made in an unload handler. To solve this problem, analytics and diagnostics code will typically make a synchronous XMLHttpRequest in an unload or beforeunload handler to submit the data. The synchronous XMLHttpRequest forces the user agent to delay unloading the document, and makes the next navigation appear to be slower. There is little the next page can do to avoid this perception of poor page load performance.

"In addition, there are other poor patterns used to ensure that data is submitted. One such pattern to delay the unload in order to submit data is to create an Image element and set its src attribute within the unload handler. As most user agents will delay the unload to complete the pending image load, data can be submitted during the unload. Another technique is to create a no-op loop for several seconds within the unload handler to delay the unload and submit data to a server.

"Not only do these techniques represent poor coding patterns, some of them are unreliable and also result in the perception of poor page load performance for the next navigation."

[moving from MediaWiki extensions to Analytics product - see bug 61946]

Bumping the priority on this and renaming somewhat, since this seems to be the preferred approach (for performance reasons) for moving forward on the link click problem. See discussion at https://bugzilla.wikimedia.org/show_bug.cgi?id=52287 .

I propose using jQuery jStorage, which is a localStorage wrapper (already in core) with support for globalStorage and IE userData as a fallback.

(In reply to Matthew Flaschen from comment #9)

I propose using jQuery jStorage, which is a localStorage wrapper (already in
core) with support for globalStorage and IE userData as a fallback.

To be clearer, jStorage has support for globalStorage and userData too. We wouldn't need to add that.

(In reply to Ori Livneh from comment #7)

The Beacon API proposal from the W3C Web Performance Working Group,
published last month, takes a stab at solving the exact problem that this
bug describes:
https://dvcs.w3.org/hg/webperf/raw-file/tip/specs/Beacon/Overview.html

The proposal itself is just that -- a proposal. I don't think any browser
vendors implement it yet.

Firefox landed it recently in stable:
https://developer.mozilla.org/en-US/docs/Web/API/navigator.sendBeacon
https://developer.mozilla.org/en-US/Firefox/Releases/31

Chrome landed it in canary (Chrome 37), soon to be in stable:
https://code.google.com/p/chromium/issues/detail?id=360603

(In reply to Krinkle from comment #11)

Firefox landed it recently in stable:
https://developer.mozilla.org/en-US/docs/Web/API/navigator.sendBeacon
https://developer.mozilla.org/en-US/Firefox/Releases/31

Chrome landed it in canary (Chrome 37), soon to be in stable:
https://code.google.com/p/chromium/issues/detail?id=360603

This seems like a good choice then. We can use sendBeacon when available, and use jStorage when it's not. However, ideally, the issue with localStorage being used up should be solved first, or the fallback will be less effective (or have to use cookies).

Change 162194 had a related patch set uploaded by Prtksxna:
Add an experimental function to log events using the sendBeacon API.

https://gerrit.wikimedia.org/r/162194

Change 162194 merged by jenkins-bot:
Add experimental function to log events using the sendBeacon API

https://gerrit.wikimedia.org/r/162194

Change 175953 had a related patch set uploaded (by Mattflaschen):
Add experiment for testing sendBeacon reliability

https://gerrit.wikimedia.org/r/175953

Patch-For-Review

Change 175953 merged by jenkins-bot:
Add experiment for testing sendBeacon reliability

https://gerrit.wikimedia.org/r/175953

Change 178408 had a related patch set uploaded (by Mattflaschen):
Add experiment for testing sendBeacon reliability

https://gerrit.wikimedia.org/r/178408

Patch-For-Review

Change 178409 had a related patch set uploaded (by Mattflaschen):
Add experiment for testing sendBeacon reliability

https://gerrit.wikimedia.org/r/178409

Patch-For-Review

Change 178408 merged by jenkins-bot:
Add experiment for testing sendBeacon reliability

https://gerrit.wikimedia.org/r/178408

Change 178409 merged by jenkins-bot:
Add experiment for testing sendBeacon reliability

https://gerrit.wikimedia.org/r/178409

sendBeacon is done now and the local storage fallback is what remains to be implemented, right?

Nuria renamed this task from Provide a robust way of logging events without blocking until network request completes; use sendBeacon with client-side storage fallback to Provide a robust way of logging events without blocking until network request completes; use sendBeacon.Jan 23 2015, 6:10 PM

@Gilles: we have no plans to implement the local storage fallback at this time.

Send beacon fallback (when supported) is available and deployed to production.

sendBeacon is the default now when available, the old method is the fallback

Is there a follow-on task to implement the client-side storage fallback? I couldn't see one.

No, at this time we do not have plans to implement a client-side storage mechanism.

The MobileFrontend extension has some code that deals with deferring logging events until the next full page load. We could take a stab at cleaning it up and moving it into the EventLogging extension for others to use.

The MobileFrontend extension has some code that deals with deferring logging events until the next full page load. We >could take a stab at cleaning it up and moving it into the EventLogging extension for others to use

There is no way to do to truly w/o events asynchronously w/o sendBeacon. Our position is that you should log your event, if it is a page transition it will only be log reliably if the browser supports sendBeacon. That should be acceptable, sendbeacon support is on the raise and given that a big percentage of our traffic comes from chrome we will be getting quite a bit of data.
Please see: http://www.mediawiki.org/wiki/Extension:EventLogging/SendBeacon#Browsers_that_support_sendBeacon_as_expected

Our position is that you should log your event, if it is a page transition it will only be log reliably if the browser supports sendBeacon.

Just to check: is it still the case that there is no recommended workaround for logging an event when when navigating away from a page, for browsers that don't support sendBeacon? For Fundraising, we do need data from all browsers. Thanks!

P.S. Note also that in the FR use case the users would navigate to a different site.

Yes, logEvent uses only two paths internally, sendBeacon (if available) and the img element. Only sendBeacon is designed for this use case.

There is a workaround we used to use in the GettingStarted extension: https://git.wikimedia.org/blob/mediawiki/extensions/GettingStarted.git/b66689ef823c759067d4f16c8a894758b727db60/resources/ext.gettingstarted.return.js#L227

However, this has a negative performance impact you should be aware of. It basically worked by:

  • Delaying link navigation until logging completes (but at most 0.5 s, and you could change that number)
  • Attempting to log
  • Navigating after either the timeout expires or logging completes (whichever comes first).

Although it looks the new EventLogging API no longer exposes when logging is complete (since you don't get that information by definition from native sendBeacon).

Does the GettingStarted workaround even work? Looks like the LogEvent promise is always resolved or rejected when the function returns: https://git.wikimedia.org/blob/mediawiki/extensions/EventLogging.git/2c563bc9bc3f281a567c9654af0b5aadd00cbbd2/modules/ext.eventLogging.core.js#L262 , so you always navigate immediately.

This oughtta wait to resolve till the img tag loads: https://gerrit.wikimedia.org/r/238623/

Change 270137 had a related patch set uploaded (by Ori.livneh):
Add handling for unload events

https://gerrit.wikimedia.org/r/270137

Closing as changes are final for sendBeacon

Replied at https://gerrit.wikimedia.org/r/#/c/238623/ . This is still a valid issue to consider (the answer might be WONTFIX, but if so we should clearly document why).

This is still a valid issue to consider (the answer might be WONTFIX, but if so we should clearly document why).

Never mind, I see @Nuria addressed this above.

Change 379468 had a related patch set uploaded (by Zoranzoki21; owner: Mattflaschen):
[mediawiki/extensions/GuidedTour@master] Fire-and-forget EventLogging for link clicks

https://gerrit.wikimedia.org/r/379468

Change 379468 merged by jenkins-bot:
[mediawiki/extensions/GuidedTour@master] Fire-and-forget EventLogging for link clicks

https://gerrit.wikimedia.org/r/379468