Primary root cause
- In Spaces, Trackio forces SQLite into DELETE journal mode on every connection, not just once at DB creation. See trl-internal/lib/python3.11/site-packages/trackio/sqlite_storage.py:43 and trl-internal/lib/python3.11/site-packages/trackio/sqlite_storage.py:101.
- Write paths do take a per-project file lock, for example in init_db and bulk_log. See trl-internal/lib/python3.11/site-packages/trackio/sqlite_storage.py:143 and trl-internal/lib/python3.11/site-packages/trackio/sqlite_storage.py:623.
- Read/UI paths do not take that lock. get_alerts and get_logs open DB connections directly, and those connections still run _configure_sqlite_pragmas(). See trl-internal/lib/python3.11/site-packages/trackio/sqlite_storage.py:809 and trl-internal/lib/python3.11/site-packages/trackio/sqlite_storage.py:967. Your container traces are failing in that pragma path before the actual SELECT.
- So the UI polling endpoints are not really “read-only” from a DB-behavior perspective. They open unsynchronized connections and run a journal-setting pragma while concurrent writes are happening.
That gives a plausible failure chain:
- Concurrent log writes enter bulk_log.
- One writer acquires the Trackio file lock and then blocks inside SQLite because of concurrent readers / connection setup on the same DB.
- Other writers wait on the Trackio file lock and hit "Could not acquire database lock after 10 seconds".
- Once the DB gets into a bad state, read paths start throwing sqlite3.DatabaseError with the messages "database disk image is malformed" and "file is not a database".
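The lock-timeout part of that chain can be reproduced in miniature without SQLite at all. This is a toy simulation, not Trackio's locking code: a threading.Lock stands in for the per-project file lock, a sleep stands in for the writer stalled inside SQLite, and the timings are scaled down from 10 seconds.

```python
import threading
import time

LOCK_TIMEOUT = 0.2  # stand-in for the 10-second lock timeout
STALL = 0.6         # how long the lock holder is stuck "inside SQLite"

file_lock = threading.Lock()      # stand-in for the per-project file lock
holder_ready = threading.Event()  # signals that the first writer holds the lock
outcomes = {}
outcomes_guard = threading.Lock()


def record(name, outcome):
    with outcomes_guard:
        outcomes[name] = outcome


def stalled_writer():
    # Acquires the lock, then blocks for longer than the other writers will wait.
    with file_lock:
        holder_ready.set()
        time.sleep(STALL)
    record("writer-0", "ok")


def waiting_writer(name):
    holder_ready.wait()
    if file_lock.acquire(timeout=LOCK_TIMEOUT):
        file_lock.release()
        record(name, "ok")
    else:
        # Corresponds to "Could not acquire database lock after 10 seconds".
        record(name, "timeout")


threads = [threading.Thread(target=stalled_writer)]
threads += [threading.Thread(target=waiting_writer, args=(f"writer-{i}",)) for i in range(1, 4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(outcomes)
```

The point of the toy is that only one writer needs to stall for every other writer to fail. The waiting writers never touch the database, yet they are the ones that surface the error.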
Why Trackio doesn’t recover
- The server retry queue only catches sqlite3.OperationalError. See trl-internal/lib/python3.11/site-packages/trackio/server.py:39 and trl-internal/lib/python3.11/site-packages/trackio/server.py:417.
- Your actual failures are OSError and sqlite3.DatabaseError, so they bypass the queue entirely and bubble up.
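The reason those errors slip past is the exception hierarchy: sqlite3.OperationalError is a subclass of sqlite3.DatabaseError, not the other way round, and OSError is unrelated to both. The sketch below illustrates this with two hypothetical handlers, narrow_retry and broad_retry. Neither is Trackio's actual retry queue, and the error messages are only the ones quoted above.

```python
import sqlite3


def narrow_retry(operation):
    """Mimics a handler that only queues sqlite3.OperationalError for retry."""
    try:
        operation()
        return "ok"
    except sqlite3.OperationalError:
        return "queued for retry"


def broad_retry(operation):
    """Hypothetical alternative that also sees DatabaseError and OSError."""
    try:
        operation()
        return "ok"
    except (sqlite3.DatabaseError, OSError):
        return "queued for retry"


def failing(exc):
    # Build an operation that always raises the given exception.
    def operation():
        raise exc
    return operation


def outcome(handler, exc):
    # Report whether the handler absorbed the error or let it escape.
    try:
        return handler(failing(exc))
    except Exception as escaped:
        return f"escaped: {type(escaped).__name__}"


errors = [
    sqlite3.OperationalError("database is locked"),
    sqlite3.DatabaseError("database disk image is malformed"),
    OSError("Could not acquire database lock after 10 seconds"),
]
report = {
    type(e).__name__: (outcome(narrow_retry, e), outcome(broad_retry, e))
    for e in errors
}
for name, (narrow, broad) in report.items():
    print(f"{name}: narrow={narrow!r} broad={broad!r}")
```

Widening the except clause is only reasonable for the lock timeout. Retrying against a malformed database will not repair it, so a real fix would likely treat the two cases differently.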
Secondary corruption vector
- If this Space has dataset sync enabled, Trackio has another unsafe path: export_to_parquet() and import_from_parquet() touch the same DB without taking the same process lock. See trl-internal/lib/python3.11/site-packages/trackio/sqlite_storage.py:330, trl-internal/lib/python3.11/site-packages/trackio/sqlite_storage.py:442, and trl-internal/lib/python3.11/site-packages/trackio/sqlite_storage.py:545.
- import_from_parquet() rewrites tables with if_exists="replace" and no process lock. If that path is active, it can turn lock contention into actual DB corruption very easily.
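A safer shape for a table rewrite is to hold the same lock the writers use and swap the table inside a single transaction. The sketch below is an assumption about what such a fix could look like, not Trackio's import code: project_lock is a hypothetical in-process stand-in for the per-project file lock, and the metrics table and its columns are invented for the demo.

```python
import contextlib
import os
import sqlite3
import tempfile
import threading

_process_lock = threading.Lock()  # hypothetical stand-in for the per-project file lock


@contextlib.contextmanager
def project_lock():
    with _process_lock:
        yield


def replace_table_atomically(db_path, rows):
    """Rebuild the (made-up) metrics table under the lock, in one transaction."""
    with project_lock():
        # isolation_level=None puts sqlite3 in autocommit so BEGIN/COMMIT are explicit.
        conn = sqlite3.connect(db_path, isolation_level=None)
        try:
            conn.execute("BEGIN IMMEDIATE")  # take the write lock up front
            conn.execute("DROP TABLE IF EXISTS metrics_new")
            conn.execute("CREATE TABLE metrics_new (step INTEGER, value REAL)")
            conn.executemany("INSERT INTO metrics_new VALUES (?, ?)", rows)
            conn.execute("DROP TABLE IF EXISTS metrics")
            conn.execute("ALTER TABLE metrics_new RENAME TO metrics")
            conn.execute("COMMIT")
        except Exception:
            if conn.in_transaction:
                conn.execute("ROLLBACK")  # the old table survives a failed import
            raise
        finally:
            conn.close()


def read_rows(db_path):
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT step, value FROM metrics ORDER BY step").fetchall()
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    replace_table_atomically(path, [(1, 0.5)])
    replace_table_atomically(path, [(1, 0.4), (2, 0.3)])
    final_rows = read_rows(path)
    print(final_rows)
```

Because SQLite schema changes are transactional, a failure partway through leaves the previous table in place rather than a half-replaced one. The lock still matters: the transaction protects the file, while the lock keeps the import from competing with bulk_log in the first place.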
Bottom line
The core design bug is: Trackio serializes writes with its own file lock, but it does not serialize reads, and in Spaces every connection still mutates SQLite connection state with PRAGMA journal_mode = DELETE. Under heavy concurrent logging plus UI polling, that is enough to produce exactly the lock timeout and malformed-database pattern you saw.
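For the read side, one option worth trying is to open polling connections in SQLite's read-only URI mode and skip the pragma setup entirely. This is a sketch under that assumption. open_read_only is a hypothetical helper, not something Trackio provides, and the logs table is invented for the demo.

```python
import os
import pathlib
import sqlite3
import tempfile


def open_read_only(db_path):
    """Hypothetical reader helper: read-only URI connection, no journal pragma."""
    uri = pathlib.Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True, timeout=10.0)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    writer = sqlite3.connect(path)
    writer.execute("CREATE TABLE logs (step INTEGER, value REAL)")
    writer.execute("INSERT INTO logs VALUES (1, 0.5)")
    writer.commit()
    writer.close()

    reader = open_read_only(path)
    rows = reader.execute("SELECT step, value FROM logs").fetchall()
    try:
        # Any write attempted through the read-only handle is rejected by SQLite.
        reader.execute("INSERT INTO logs VALUES (2, 0.7)")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True
    reader.close()

    print(rows, write_blocked)
```

A read-only connection still takes SQLite's own shared locks, so this does not serialize reads against the Trackio file lock. It only guarantees that the UI path cannot change the database or its journal mode.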
If you want, I can turn this into a concrete minimal patch against the installed Trackio code to validate the hypothesis.

Debug script