Database upgrade MariaDB 10: 600 seconds timeout
Closed, ResolvedPublic

Assigned To

Authored By

	Incola
	Aug 4 2014, 5:27 PM

Description

I'm one of the maintainers of the Lists tool: http://tools.wmflabs.org/lists/

This tool executes a series of queries every day. After each query, it runs a second query to record some statistical information.

If the primary query runs for more than 600 seconds, the secondary one fails with the error "General error: 2006 MySQL server has gone away".

This issue began after the migration to MariaDB 10.

Version: unspecified
Severity: normal

Details

Reference: bz69110

Related Objects

Mentioned In: T122658: MySQL connections die in less than 30 seconds using tools-login tunnels

Event Timeline

• bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:34 AM

• bzimport added a project: Cloud-VPS.

• bzimport set Reference to bz69110.

Incola created this task. Aug 4 2014, 5:27 PM

Duplicate of bug 68753?

I've found that the problem is related to the primary queries: some of them are now slower than before and exceed the 600-second limit.

For example, the query [1] used to run in 170 seconds and now takes more than 1000 seconds.

[1] http://tools.wmflabs.org/lists/itwiki/Voci/Voci_senza_uscita

Please post examples of both queries.

The query is at the bottom of the page linked above. The query that fails is not important; the problem is that this query takes too long to run.

I realize you think the speed is the problem, which I agree is an issue. However there is no "over 600s" type kill mechanism, so I'm interested in establishing two things:

Why the first query is slow. Thanks, I see the example now.

Why the second query dies and whether it is, in fact, related to the speed of the first query, or to something else unexpected. Hence I asked to see it too...

Incola: "not important" doesn't really exist when trying to find steps to reproduce. :)

The second query is something like:

insert into executions (query_id, time, duration, results) values (23, '2014-08-05 14:01:05', 1879, 5290)
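As a hedged illustration of the statement above: the timestamp has to reach the server as a string value, and the usual way to guarantee that is to bind all values as parameters. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for the MariaDB server so that it runs anywhere. The table layout and the `record_execution` helper are assumptions made for the example, not the Lists tool's actual code.

```python
import sqlite3


def record_execution(conn, query_id, started_at, duration, results):
    """Insert one statistics row using bound parameters (hypothetical helper)."""
    conn.execute(
        "INSERT INTO executions (query_id, time, duration, results) "
        "VALUES (?, ?, ?, ?)",
        (query_id, started_at, duration, results),
    )
    conn.commit()


# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE executions ("
    "query_id INTEGER, time TEXT, duration INTEGER, results INTEGER)"
)

# Values taken from the example statement in the thread.
record_execution(conn, 23, "2014-08-05 14:01:05", 1879, 5290)

rows = conn.execute(
    "SELECT query_id, time, duration, results FROM executions"
).fetchall()
print(rows)
```

Python drivers for MySQL/MariaDB generally use `%s` rather than `?` as the placeholder, so the SQL string would need that one change when pointed at a real server.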

The first query is heavily dependent on disk IO. It runs in ~1000s on both MariaDB 10 and 5.5 if the data is cold, or if any other concurrent query is also bottlenecked on disk. This should be reviewed once the switch back to SSD is done (to be scheduled very shortly after labsdb1003 migrates).

Regarding the second query dying or losing connection, which still seems odd, it would be useful to know:

1. If the first query always completes regardless of slow runtime, or sometimes fails/is-killed itself.

2. If there is any delay between issuing the two queries on the same DB connection (seconds, minutes, etc ..).

3. If there is any transaction in use, either via explicit BEGIN or AUTO_COMMIT=0.

4. What client connector or library is used, and whether it could have any custom timeout settings.

1. The first query always runs correctly.

2. They are on different connections.

3. I don't know: I'm not the original author of the code, and I don't know how the framework it uses works.

4. The first query runs via a shell command invoked by a PHP script; the second one runs directly from the PHP script. The script is this one: https://git.wikimedia.org/blob/labs/tools/lists/d291a438ef6e1aa0e4630d501cd9a28bedb014cc/app/commands/ExecCrontab.php
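One possible reading of this setup, offered as an assumption and not as a confirmed diagnosis: if the script opens its own connection before launching the long-running shell command, that connection sits idle for the whole run, and a server-side idle timeout could close it before the statistics insert is issued. A minimal sketch of a defensive pattern is to reopen the connection whenever it has been idle for longer than some threshold. `LazyConnection`, `connect`, and `max_idle` are hypothetical names, and the fake clock and fake connection factory exist only so the example runs without a database.

```python
import time


class LazyConnection:
    """Hand out a connection, reopening it after a long idle period."""

    def __init__(self, connect, max_idle, clock=time.monotonic):
        self._connect = connect      # callable returning a new connection
        self._max_idle = max_idle    # seconds of idleness tolerated
        self._clock = clock
        self._conn = None
        self._last_used = None

    def get(self):
        now = self._clock()
        stale = (
            self._conn is None
            or now - self._last_used > self._max_idle
        )
        if stale:
            # Real code would also try to close the old handle here.
            self._conn = self._connect()
        self._last_used = now
        return self._conn


# Demonstration with a fake clock and a fake connection factory.
opened = []


def fake_connect():
    opened.append(object())
    return opened[-1]


fake_now = [0.0]
lazy = LazyConnection(fake_connect, max_idle=600, clock=lambda: fake_now[0])

first = lazy.get()       # connection opened before the primary query
fake_now[0] = 1879.0     # pretend the primary query took 1879 seconds
second = lazy.get()      # idle for too long, so a new connection is opened
```

The threshold of 600 seconds mirrors the limit reported in this task; whether the server actually enforces such an idle timeout would have to be checked against its configuration.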

After the switch back to SSD, no errors are reported and the queries run with their previous timings.

Aklapper added a project: Cloud-Services. Apr 29 2015, 12:18 PM

• Springle closed this task as Resolved. Aug 28 2015, 4:23 AM

valhallasw mentioned this in T122658: MySQL connections die in less than 30 seconds using tools-login tunnels. Dec 30 2015, 5:14 PM
