vdsm hangs deadlocked after SD detach/reattach cycle #319

Open
ahadas opened this issue Sep 23, 2022 · 0 comments

ahadas commented Sep 23, 2022

From time to time vdsm ends up deadlocked and completely unresponsive in OST. The problem is observed on el8stream, and it is always host-0 that is affected. OST reports it as a 'test_use_ovn_provider' failure, but a quick look at 'vdsm.log' shows that the problem happens earlier: 'vdsm.log' has entries only up to some point in time, while 'messages' and other log files show that the host stayed up for about 8 more minutes.

After attaching to the 'vdsm' process with gdb, we can see that all the threads are waiting on locks.
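
For anyone who wants to collect the same thread dump, here is a minimal sketch of how it could be scripted. The helper names `gdb_backtrace_command` and `dump_threads` are made up for illustration and are not part of vdsm or OST. It assumes gdb is installed on the host and that the caller is allowed to attach to the process; 'py-bt' additionally assumes the Python gdb extensions are available.

```python
import subprocess


def gdb_backtrace_command(pid, python_frames=False):
    """Build a gdb command line that dumps all thread backtraces and exits.

    Hypothetical helper for illustration only.
    """
    # 'bt' prints native (C) frames; 'py-bt' prints Python frames, but only
    # works when the Python gdb extensions are loaded (assumption).
    bt = "py-bt" if python_frames else "bt"
    return ["gdb", "-p", str(pid), "-batch", "-ex", f"thread apply all {bt}"]


def dump_threads(pid, python_frames=False):
    """Attach to the process with gdb and return the backtrace text."""
    result = subprocess.run(
        gdb_backtrace_command(pid, python_frames),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero; the caller inspects the output
    )
    return result.stdout
```

Calling `dump_threads(pid)` with the pid of the vdsm process should return one backtrace per thread, which makes it easier to see which locks the threads are blocked on.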

The deadlock timing always aligns with the SD detach/reattach tests:
https://github.com/oVirt/ovirt-system-tests/blob/master/basic-suite-master/test-scenarios/test_007_sd_reattach.py

Even after analyzing a couple of such failures, it is hard to pinpoint one specific cause, but the last entries in 'vdsm.log' always come from the storage parts of vdsm.

Version-Release number of selected component (if applicable):
Latest vdsm version.

How reproducible:
Rarely, ~1 in 10 runs.

Steps to Reproduce:

  1. Run basic suite master on el8stream
  2. Check if it failed on 'test_use_ovn_provider'
  3. Check if 'vdsm.log' entries end a couple of minutes earlier than those from e.g. 'messages'
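
The check in step 3 could be automated roughly as follows. This is only a sketch under stated assumptions: 'vdsm.log' lines are assumed to start with a `YYYY-MM-DD HH:MM:SS` timestamp, 'messages' lines with the classic syslog `Mon DD HH:MM:SS` prefix (which has no year, so the year is passed in), and both logs are assumed to use the same timezone. The names `last_timestamp` and `log_gap_seconds` are hypothetical.

```python
import re
from datetime import datetime

# Assumed timestamp prefixes; adjust if the actual log formats differ.
VDSM_TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
SYSLOG_TS = re.compile(r"^([A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2})")


def last_timestamp(lines, pattern, fmt, year=None):
    """Return the timestamp of the last line that matches pattern, or None."""
    last = None
    for line in lines:
        match = pattern.match(line)
        if not match:
            continue  # e.g. traceback continuation lines
        text = match.group(1)
        if year is not None:
            # Syslog timestamps carry no year, so prepend the one supplied.
            last = datetime.strptime(f"{year} {text}", f"%Y {fmt}")
        else:
            last = datetime.strptime(text, fmt)
    return last


def log_gap_seconds(vdsm_lines, messages_lines, year):
    """Seconds between the last vdsm.log entry and the last messages entry."""
    vdsm_last = last_timestamp(vdsm_lines, VDSM_TS, "%Y-%m-%d %H:%M:%S")
    messages_last = last_timestamp(
        messages_lines, SYSLOG_TS, "%b %d %H:%M:%S", year=year
    )
    if vdsm_last is None or messages_last is None:
        return None
    return (messages_last - vdsm_last).total_seconds()
```

A gap of several minutes while the host was still running would match the symptom described above; a gap near zero suggests the failure has a different cause.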

Actual results:
vdsm ends up in a deadlock.

Expected results:
vdsm continues to operate normally.

Original bz: https://bugzilla.redhat.com/show_bug.cgi?id=2111187

@ahadas ahadas added the storage label Sep 23, 2022