View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0010248 | ardour | bugs | public | 2026-03-13 09:00 | 2026-03-13 09:00 |
| Reporter | wykwit | Assigned To | |||
| Priority | normal | Severity | crash | Reproducibility | sometimes |
| Status | new | Resolution | open | ||
| Platform | Arch | OS | Linux | OS Version | (any) |
| Product Version | 9.2 | ||||
| Summary | 0010248: Memory corruption due to faulty PBD::Cond::wait mutex handling | ||||
| Description | I've been using Ardour 9.2 lately and experiencing many seemingly random crashes. After investigating a little, and with LLM help, I've traced the issue to a faulty implementation of `PBD::Cond::wait` and `PBD::Cond::wait_for` that, if my understanding is correct, boils down to a temporary lock never releasing its ownership of the mutex, so the mutex gets unlocked on return. I'm not submitting a patch here, since that would go against the project terms, but it should be a pretty straightforward fix. Below I attach a summary from my LLM session that hopefully sheds more light on the issue and a possible resolution without giving away any tainted code. --- Observed crash signatures from multiple coredumps: - `SIGSEGV` in `ArdourWaveView::WaveViewThreads::_dequeue_draw_request()` - `SIGABRT` in `WaveViewThreads::_dequeue_draw_request()` on `assert (!_queue_mutex.trylock())` - `SIGSEGV` in canvas/render code such as `ArdourCanvas::DumbLookupTable::get()`, `Item::visible()`, and `bounding_box()` - occasional heap-corruption-style crashes in `g_malloc`, `operator new`, and `Cairo::Context::create()` After checking a series of coredumps, the common pattern was always WaveView worker-thread activity. That led to `libs/pbd/pbd/mutex.h`, where `PBD::Cond::wait()` and `wait_for()` create a `std::unique_lock` with `std::adopt_lock`, call `wait()`, and then return without releasing the temporary lock's ownership of the mutex. That means the `unique_lock` destructor unlocks the mutex before returning to the caller, even though callers assume they still hold the mutex after `Cond::wait()` returns. 
This breaks code such as `WaveViewThreads::_thread_proc()` / `_dequeue_draw_request()`: - the thread enters `Cond::wait()` with `_queue_mutex` locked - it wakes up and returns with `_queue_mutex` unintentionally unlocked - it then continues accessing `_queue` as if it still owns the lock That explains both the direct assertion failure in `_dequeue_draw_request()` and the other crash variants: once the WaveView request queue is raced/corrupted, the resulting memory corruption later shows up in unrelated render/canvas/heap code. Because all observed crash signatures stem from the same broken mutex-ownership contract and the resulting queue corruption, fixing `PBD::Cond::wait()` should fix the whole crash family, not just the assertion case. --- I hope this can be helpful. Thanks for going through this bug report and for your amazing work on Ardour! | ||||
| Steps To Reproduce | I was able to reproduce the WaveView crash a few times: 1. Work with 5+ tracks grouped together, each 4 minutes long, split with a very sensitive Rhythm Ferret into hundreds of short clips. 2. Try to highlight all of them by selecting the last clip on a track, then scrolling back to the start and Shift+clicking the other tracks. I tried this 4-5 times and it crashed every time before I applied the fix. | ||||
| Additional Information | My version of Ardour 9.2 was built from source using the ardour-git PKGBUILD on Arch Linux. It crashed the same way on both Niri (Wayland) and i3 (X11). | ||||
| Tags | No tags attached. | ||||
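
The ownership problem described above can be illustrated with a small standalone sketch. This is not Ardour's code and not a patch: `CondSketch`, `wait_buggy`, `wait_fixed`, `signal`, `held_elsewhere`, and `still_locked_after_wait` are hypothetical names made up for this example, and the sketch assumes the wrapper is built on `std::mutex` and `std::condition_variable` as the description suggests. It only shows how a `std::unique_lock` constructed with `std::adopt_lock` behaves with and without a call to `release()`.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a condition wrapper; NOT Ardour's PBD::Cond.
class CondSketch {
public:
    // Caller must hold m. On return, m is UNLOCKED (the reported problem):
    // the temporary unique_lock still owns m, so its destructor unlocks it.
    void wait_buggy(std::mutex& m) {
        std::unique_lock<std::mutex> lk(m, std::adopt_lock);
        cv_.wait(lk, [this] { return ready_; });
    }

    // Caller must hold m. On return, m is still locked by the caller:
    // release() drops ownership without unlocking the mutex.
    void wait_fixed(std::mutex& m) {
        std::unique_lock<std::mutex> lk(m, std::adopt_lock);
        cv_.wait(lk, [this] { return ready_; });
        assert(lk.owns_lock());
        lk.release();
    }

    // Sets the condition under m, then wakes any waiter.
    void signal(std::mutex& m) {
        {
            std::lock_guard<std::mutex> guard(m);
            ready_ = true;
        }
        cv_.notify_all();
    }

private:
    std::condition_variable cv_;
    bool ready_ = false; // protected by the mutex passed to wait/signal
};

// Probes m from a second thread, because calling try_lock() on a
// std::mutex from the thread that already owns it is undefined behavior.
bool held_elsewhere(std::mutex& m) {
    bool held = true;
    std::thread probe([&] {
        if (m.try_lock()) {
            held = false;
            m.unlock();
        }
    });
    probe.join();
    return held;
}

// Returns whether the calling thread still holds the mutex after waiting.
bool still_locked_after_wait(bool use_fixed) {
    std::mutex m;
    CondSketch cond;

    m.lock();
    std::thread signaller([&] { cond.signal(m); });
    if (use_fixed) {
        cond.wait_fixed(m);
    } else {
        cond.wait_buggy(m);
    }
    signaller.join();

    const bool held = held_elsewhere(m);
    if (held) {
        m.unlock(); // only unlock if we really still own it
    }
    return held;
}
```

Under these assumptions, `still_locked_after_wait(true)` should report that the mutex is still held, while `still_locked_after_wait(false)` should report that it is not, which matches the `assert (!_queue_mutex.trylock())` failure listed in the description. The same reasoning would apply to a timed variant such as `wait_for()`.
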
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2026-03-13 09:00 | wykwit | New Issue |