This document summarizes the common workflow for obtaining, processing, and querying the CBDB SQLite database.
- Operating systems: macOS, Linux, and other Unix-like systems such as Windows WSL.
- Required tools: `python3`, `sqlite3`, `wget`, and `7z`. You may also need `unrar` to unpack older releases.
- Debian/Ubuntu and other Debian-based distributions: run `sudo apt update`, then `sudo apt install -y sqlite3 python3 wget p7zip-full`
- macOS (via Homebrew): run `brew update`, then `brew install sqlite python@3 wget p7zip`
Install `unrar` if you need to handle older archives that were compressed with RAR.
Use tools such as pyenv if you need to maintain multiple Python versions and create project-specific environments.
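As a quick sanity check after installation, a small Python helper can report which of the required tools are not yet on `PATH`. This is only a sketch: the names `REQUIRED`, `OPTIONAL`, and `missing_tools` are illustrative and not part of CBDB, and the tool list simply mirrors the one above.

```python
import shutil

# Tool names taken from the requirements list above.
REQUIRED = ("python3", "sqlite3", "wget", "7z")
# Only needed for older releases that were packed with RAR.
OPTIONAL = ("unrar",)


def missing_tools(names, which=shutil.which):
    """Return the entries of `names` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [name for name in names if which(name) is None]
```

Calling `missing_tools(REQUIRED)` returns an empty list when everything is installed; any names it returns can be installed with the package manager commands above.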
- Fetch the latest database archive `latest.7z` (current dataset release date: 2025-05-20): `wget https://github.com/cbdb-project/cbdb_sqlite/raw/refs/heads/master/latest.7z`
Historical releases are available at: https://huggingface.co/datasets/cbdb/cbdb-sqlite/tree/main/history
- Extract the archive with 7-Zip or unrar, depending on the file extension: `7z x latest.7z`
  After extraction the database is placed in the project root. You can rename it to `CBDB_20250520.db` and remove the archive: run `mv latest.db CBDB_20250520.db`, then `rm latest.7z`
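Before deleting the archive, it can be worth confirming that the extracted file really is a SQLite database. The sketch below checks the 16-byte header that every SQLite 3 file starts with and then performs the same rename as the `mv` command. The function names `is_sqlite_file` and `rename_release` are hypothetical helpers, and the `CBDB_<YYYYMMDD>.db` pattern is inferred from the example filename above.

```python
from pathlib import Path

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the file at `path` starts with the SQLite 3 header."""
    with open(path, "rb") as fh:
        return fh.read(16) == SQLITE_MAGIC


def rename_release(src, release_date):
    """Rename `src` to CBDB_<YYYYMMDD>.db in the same directory.

    `release_date` is an ISO date such as "2025-05-20".
    Returns the new path; refuses to overwrite an existing file.
    """
    src = Path(src)
    if not is_sqlite_file(src):
        raise ValueError(f"{src} does not look like a SQLite 3 database")
    dest = src.with_name(f"CBDB_{release_date.replace('-', '')}.db")
    if dest.exists():
        raise FileExistsError(dest)
    src.rename(dest)
    return dest
```

For example, `rename_release("latest.db", "2025-05-20")` would produce `CBDB_20250520.db` next to the original file.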
Use the interactive sqlite3 shell for quick data inspection: `sqlite3 CBDB_20250520.db`. Exit the sqlite3 shell with `.quit`.
You can also run individual queries directly:
- Table schema: `sqlite3 CBDB_20250520.db '.schema BIOG_MAIN'`
- Person count: `sqlite3 CBDB_20250520.db 'SELECT COUNT(*) FROM BIOG_MAIN;'`
- Fuzzy name search: `sqlite3 CBDB_20250520.db "SELECT c_personid, c_alt_name, c_alt_name_chn FROM ALTNAME_DATA WHERE c_alt_name_chn LIKE '%王%' LIMIT 20;"` (the SQL string literal uses single quotes, since recent sqlite3 shells no longer accept double-quoted strings by default)
- Person-to-place relationships: `sqlite3 CBDB_20250520.db 'SELECT * FROM BIOG_ADDR_DATA WHERE c_personid = 100;'`
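The same queries can be run from Python with the standard `sqlite3` module, which avoids shell quoting issues and lets values be passed as parameters. The following is a minimal sketch: the table and column names come from the commands above, while the function names `search_alt_names` and `addresses_for` are illustrative only.

```python
import sqlite3


def search_alt_names(conn, fragment, limit=20):
    """Fuzzy search on the Chinese alternative name column.

    The search fragment is bound as a parameter rather than pasted into the SQL.
    """
    sql = (
        "SELECT c_personid, c_alt_name, c_alt_name_chn "
        "FROM ALTNAME_DATA "
        "WHERE c_alt_name_chn LIKE ? LIMIT ?"
    )
    return conn.execute(sql, (f"%{fragment}%", limit)).fetchall()


def addresses_for(conn, person_id):
    """Return all BIOG_ADDR_DATA rows for one person id."""
    sql = "SELECT * FROM BIOG_ADDR_DATA WHERE c_personid = ?"
    return conn.execute(sql, (person_id,)).fetchall()
```

With a connection opened via `sqlite3.connect("CBDB_20250520.db")`, calling `search_alt_names(conn, "王")` corresponds to the fuzzy name search above and `addresses_for(conn, 100)` to the person-to-place query.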
- Database is locked: ensure no lingering `sqlite3` or Python process is holding the file. Kill the offending process if necessary.
- 7-Zip is missing: the latest release uses 7-Zip compression. Install it with `apt install p7zip-full` on Ubuntu or `brew install p7zip` on macOS.
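For scripts that only read from the database, one way to lower the chance of lock errors is to open the file read-only and give SQLite a busy timeout, so the connection waits briefly instead of failing immediately. This is a sketch under assumptions: `open_readonly` is a hypothetical helper, the 10-second timeout is an arbitrary choice, and a process that holds a write lock for longer than the timeout will still cause an error.

```python
import sqlite3
from pathlib import Path


def open_readonly(db_path, timeout=10.0):
    """Open a SQLite database in read-only mode.

    Uses SQLite's URI filename syntax (mode=ro), so the file is never
    created or modified. `timeout` is how many seconds to wait on a lock.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True, timeout=timeout)
```

A connection from `open_readonly("CBDB_20250520.db")` can run `SELECT` statements as usual, while any attempt to write raises `sqlite3.OperationalError`.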