Page MenuHomePhabricator

JobRunner transaction fname for Job::run() can mismatch __METHOD__ in a subclass
Open, HighPublic

Description

If a Job class has a subclass that has the same run() method as its superclass, and tries to commit during the job, things can get very confused.

In JobRunner::executeJob (introduced in 6c73b32fd5b2a0c17d0fb32935804b6457ccd992):

			$fnameTrxOwner = get_class( $job ) . '::run'; // give run() outer scope
			$lbFactory->beginMasterChanges( $fnameTrxOwner );

If for instance there's a job class WebVideoTranscodeJob and a subclass WebVideoTranscodeJobHighPriority (doing some exploratory work on T155098), then the METHOD value passed as $fname inside 'WebVideoTranscodeJobHighPriority::run' is actually 'WebVideoTranscodeJob::run'. This results in the LoadBalancer throwing an error:

DBTransactionError: WebVideoTranscodeJob::run: transaction round 'WebVideoTranscodeJobHighPriority::run' still running.

This at least happens if the job tries to close out the existing transaction, as the transcode jobs do.

I can probably work around this on my end by using separate command queues but not a subclass...

Affected:

Event Timeline

(Have worked around it, but it could bite other folks.)

Looks like the main use case where this becomes a problem is when the Job's run() method wants to commit the transaction earlier, e.g. not to wait until the end of the run() method (at which point the job runner will perform the transaction).

@aaron Would it make sense to facilitate this in some way? E.g. should we discourage jobs from obtaining their own LBFactory reference and instead inject it somehow? In addition, we could provide them with the means to start/end an atomic section, which automatically forego the outer transaction if it wasn't used for anything.

Or perhaps a way to declaratively opt-out from the default transaction. E.g. through a boolean method or protected property like useOuterTransaction. Disabling this would give them a CLI/Maintenance-like environment that defaults to auto-commit, and allows them to manually begin/commit where needed (e.g. using lbfactory begin/commit, or db start/endAtomic).

Krinkle renamed this task from JobRunner sets up transaction fname that may not match __METHOD__ from a Job subclass to JobRunner transaction fname for Job::run() can mismatch __METHOD__ in a subclass.Jul 7 2017, 10:21 PM
Krinkle triaged this task as High priority.

You can always do what extensions/CentralAuth/includes/LocalRenameJob/LocalRenameJob.php does AFAIK.

You can always do what extensions/CentralAuth/includes/LocalRenameJob/LocalRenameJob.php does AFAIK.

abstract class LocalRenameJob extends Job { public function run() {
	$fnameTrxOwner = get_class( $this ) . '::' . __FUNCTION__; // T145596
	$factory = MediaWikiServices::getInstance()->getDBLoadBalancerFactory();
	$factory->commitMasterChanges( $fnameTrxOwner );

	/* .. */
} }

I'm not sure we should encourage this. It means classes 1) assume an active transaction, 2) rely on a specific owner name, 3) commit a transaction they did not start. Aside from point 1, these seem like anti-patterns. What about upstreaming this solution? e.g. Job::commitDefaultTransaction().

Alternatively, the opt-out solution might also be interesting.

JobRunner always starts an LBFactory transaction.

Most solutions seem to require the Job code author having to remember to "use this fname" or"use the transaction ticket" or "use this special commit function for jobs". I'd rather this just be documented unless some easier solution could be found, maybe have JobRunner tell LBFactory to alias the parent run() methods using http://php.net/manual/en/function.class-parents.php. Ownership should be run(), which that method passing tickets/fnames to any function that can act on it's behalf, so functions other than run() trying to commit should be passed these things anyway.