Convert USJ (JSON) Bible files back to the BSB word-level interlinear TSV format used by bereanbible.com.
This is the reverse companion to bsb2usfm, which converts the TSV into USFM/USJ/USX.
The BSB interlinear table (bsb_tables.tsv) is a 23-column, word-level TSV where each row represents a single word aligned to the original Hebrew or Greek. The forward converter (bsb2usfm) produces USJ files from this table. This repository converts those USJ files back to TSV format and provides tools to merge, verify, and diff the results against the original source.
Two editions are supported:
| Edition | ID | Scope | Source |
|---|---|---|---|
| Berean Standard Bible | BSB | Full Bible (OT + NT) | bereanbible.com |
| Majority Standard Bible | MSB | New Testament only | majoritybible.com |
Copy one or more .usj files (even a single book or partial book) into the appropriate directory:
- content_usj/bereanbible/ -- for BSB (OT or NT books)
- content_usj/majoritybible/ -- for MSB (NT books only)
These can be generated by bsb2usfm or any tool that produces USJ output.
make all # run pipeline for all editions that have content
make bereanbible # download BSB source + convert + merge
make majoritybible # download MSB source + convert + merge

make all automatically detects which edition directories contain .usj files and runs only those pipelines. You can also target a specific edition directly.
Each edition target downloads and caches the source TSV (re-downloading only when the remote file has been updated), converts the USJ files to TSV, and runs the full merge pipeline. Output goes to output/bereanbible/ or output/majoritybible/.
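The "re-download only when the remote file has been updated" behavior can be illustrated by comparing the cached file's modification time with the server's Last-Modified header. This is a minimal sketch under that assumption, not the Makefile's actual logic (which relies on curl); the function name `needs_download` is hypothetical.

```python
import os
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def needs_download(cache_path, last_modified_header):
    """Return True when the cached file is missing or older than the remote copy.

    last_modified_header is the raw HTTP Last-Modified value, or None when
    the server did not send one (in which case we re-download to be safe).
    """
    if not os.path.exists(cache_path):
        return True
    if not last_modified_header:
        return True
    remote_time = parsedate_to_datetime(last_modified_header)
    if remote_time.tzinfo is None:
        # Treat a header without zone information as UTC.
        remote_time = remote_time.replace(tzinfo=timezone.utc)
    local_time = datetime.fromtimestamp(os.path.getmtime(cache_path), tz=timezone.utc)
    return remote_time > local_time
```

A caller would fetch the header with a HEAD request first and download only when this returns True.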
Note: Running make majoritybible will warn about any OT books (which are not part of the MSB) and will fail if no NT books are found.
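The NT-only check in the note above could look roughly like the following. It is a sketch, not the repository's code: it assumes each parsed USJ document carries a top-level `book` node with a `code` field (as in the USJ format), and the helper names `book_code` and `split_for_msb` are hypothetical.

```python
# USFM three-letter codes for the 27 New Testament books.
NT_BOOKS = {
    "MAT", "MRK", "LUK", "JHN", "ACT", "ROM", "1CO", "2CO", "GAL",
    "EPH", "PHP", "COL", "1TH", "2TH", "1TI", "2TI", "TIT", "PHM",
    "HEB", "JAS", "1PE", "2PE", "1JN", "2JN", "3JN", "JUD", "REV",
}


def book_code(usj):
    """Return the book code from a parsed USJ document, or None if absent."""
    for node in usj.get("content", []):
        if isinstance(node, dict) and node.get("type") == "book":
            return node.get("code")
    return None


def split_for_msb(usj_docs):
    """Partition parsed USJ documents into (nt, skipped) lists of book codes."""
    nt, skipped = [], []
    for doc in usj_docs:
        code = book_code(doc)
        (nt if code in NT_BOOKS else skipped).append(code)
    if not nt:
        # Mirrors the documented behavior: no NT books is an error.
        raise ValueError("no NT books found for the MSB edition")
    return nt, skipped
```

The `skipped` list corresponds to the books the pipeline would warn about.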
The Makefile runs these steps in order:
- Download source TSV into cache/<edition>/source.tsv
- Convert USJ to TSV (usj2tsv.py -> output/<edition>/output.tsv)
- Merge with source (merge_tsv.py -> output/<edition>/merged.tsv)
- Restore footnotes (merge_footnotes_from_tsv.py -> output/<edition>/merged_fn.tsv)
- Insert missing words (insert_missing_words_from_tsv.py -> output/<edition>/merged_full.tsv)
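The chain of intermediate files in the steps above can be summarized in a few lines of Python. This sketch only models which file each stage writes and which file the next stage picks up. It does not reproduce the scripts' real command-line arguments (the merge stages also read the cached source TSV), and `pipeline_plan` is a hypothetical helper.

```python
# (script, output file) pairs taken from the step list above.
STAGES = [
    ("usj2tsv.py", "output.tsv"),
    ("merge_tsv.py", "merged.tsv"),
    ("merge_footnotes_from_tsv.py", "merged_fn.tsv"),
    ("insert_missing_words_from_tsv.py", "merged_full.tsv"),
]


def pipeline_plan(edition):
    """Return (script, primary input, output) tuples for one edition."""
    plan = []
    previous = f"content_usj/{edition}"
    for script, filename in STAGES:
        output = f"output/{edition}/{filename}"
        plan.append((script, previous, output))
        previous = output  # each stage feeds the next one
    return plan


if __name__ == "__main__":
    for script, source, target in pipeline_plan("bereanbible"):
        print(f"{script}: {source} -> {target}")
```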
python3 verify_roundtrip.py output/bereanbible/merged_full.tsv content_usj/bereanbible/ --bsb2usfm ../bsb2usfm/bsb2usfm.py
python3 diff_fast.py output/bereanbible/merged_full.tsv cache/bereanbible/source.tsv
python3 diff_order.py output/bereanbible/merged_full.tsv cache/bereanbible/source.tsv

| Script | Purpose |
|---|---|
| usj2tsv.py | Main converter: reads USJ files and produces a 23-column TSV |
| merge_tsv.py | Merges USJ-derived TSV with the original source TSV (source-driven) |
| merge_footnotes_from_tsv.py | Restores original footnote styling (fq/fqa) from source TSV |
| insert_missing_words_from_tsv.py | Re-inserts placeholder word rows absent from the USJ variant |
| verify_roundtrip.py | Roundtrip verification: merged TSV -> bsb2usfm -> USJ -> diff |
| diff_fast.py | Fast column-by-column diff between two TSV files |
| diff_order.py | Verifies row ordering between two TSV files |
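To make "column-by-column diff" concrete, here is a minimal sketch of the idea: compare two TSV files row by row and count mismatching cells per column index. It is not the implementation of diff_fast.py, whose internals are not documented here; `column_diff_counts` is a hypothetical name.

```python
import csv
import io
from collections import Counter


def column_diff_counts(left, right):
    """Count differing cells per column index between two TSV streams.

    Rows are compared pairwise by position; rows present in only one file
    are reported under the key "extra_rows".
    """
    left_rows = list(csv.reader(left, delimiter="\t", quoting=csv.QUOTE_NONE))
    right_rows = list(csv.reader(right, delimiter="\t", quoting=csv.QUOTE_NONE))
    counts = Counter()
    for a, b in zip(left_rows, right_rows):
        # Pad the shorter row so ragged lines still compare cell by cell.
        width = max(len(a), len(b))
        a = a + [""] * (width - len(a))
        b = b + [""] * (width - len(b))
        for index in range(width):
            if a[index] != b[index]:
                counts[index] += 1
    extra = abs(len(left_rows) - len(right_rows))
    if extra:
        counts["extra_rows"] = extra
    return dict(counts)


if __name__ == "__main__":
    merged = io.StringIO("a\tb\tc\nd\te\tf\n")
    source = io.StringIO("a\tX\tc\nd\te\tf\n")
    print(column_diff_counts(merged, source))
```

Per-column counts make it easy to see whether a mismatch is confined to, say, the footnotes column or spread across the whole row.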
The 23-column format matches bsb_tables.tsv from bereanbible.com:
| # | Column | Description |
|---|---|---|
| 0 | Heb Sort | Hebrew word sort order |
| 1 | Greek Sort | Greek word sort order |
| 2 | BSB Sort | BSB translation word order |
| 3 | Verse | Verse number within chapter |
| 4 | Language | "Hebrew" or "Greek" |
| 5 | WLC / Nestle Base | Original language text (plain) |
| 6 | WLC / Nestle Base (variants) | Original language text with textual apparatus |
| 7 | Translit | Transliteration |
| 8 | Parsing | Short grammatical parsing |
| 9 | Parsing | Long grammatical parsing |
| 10 | Str Heb | Strong's Hebrew number |
| 11 | Str Grk | Strong's Greek number |
| 12 | VerseId | Full verse reference (e.g. "Genesis 1:1") |
| 13 | Hdg | Section heading (HTML) |
| 14 | Crossref | Cross-references (HTML) |
| 15 | Par | Paragraph/formatting marker (HTML) |
| 16 | Space | Spacing |
| 17 | begQ | Opening quotation mark |
| 18 | BSB version | English translation text |
| 19 | pnc | Punctuation |
| 20 | endQ | Closing quotation mark |
| 21 | footnotes | Footnote text (HTML) |
| 22 | End text | Additional end-of-verse text |
Columns 0-11 (original-language data) are left blank by usj2tsv.py and filled in by merge_tsv.py.
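The division of labor between the two scripts can be pictured as follows. This is a simplified sketch that assumes the converted row and the source row have already been paired up; how merge_tsv.py actually aligns rows is not described here, and `fill_original_language` is a hypothetical helper.

```python
NUM_COLUMNS = 23
ORIGINAL_LANGUAGE_COLUMNS = range(0, 12)  # columns 0-11 per the table above


def fill_original_language(converted_row, source_row):
    """Return a copy of converted_row with columns 0-11 taken from source_row."""
    if len(converted_row) != NUM_COLUMNS or len(source_row) != NUM_COLUMNS:
        raise ValueError("expected 23 columns in both rows")
    merged = list(converted_row)  # leave the input row untouched
    for index in ORIGINAL_LANGUAGE_COLUMNS:
        merged[index] = source_row[index]
    return merged
```

Columns 12-22 are kept from the converted row, so the English text and formatting derived from the USJ files survive the merge.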
content_usj/
bereanbible/ # user-provided BSB .usj files (git-ignored)
majoritybible/ # user-provided MSB .usj files (git-ignored)
cache/
bereanbible/ # cached bsb_tables.tsv (git-ignored)
majoritybible/ # cached msb_nt_tables.tsv (git-ignored)
output/ # generated TSV files (git-ignored)
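Given this layout, the way make all decides which editions to process (those whose directory contains .usj files) can be sketched in Python. The Makefile's real detection logic may differ; `editions_with_content` is a hypothetical helper.

```python
import glob
import os

EDITIONS = ("bereanbible", "majoritybible")


def editions_with_content(root="content_usj"):
    """Return the edition names whose directory holds at least one .usj file."""
    return [
        edition
        for edition in EDITIONS
        if glob.glob(os.path.join(root, edition, "*.usj"))
    ]


if __name__ == "__main__":
    print(editions_with_content())
```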
| Target | Description |
|---|---|
| make all | Run pipeline for all editions that have content (default) |
| make help | Show available targets and usage |
| make bereanbible | Full pipeline for BSB edition |
| make majoritybible | Full pipeline for MSB edition (NT only) |
| make clean | Remove generated output files |
| make clean-cache | Remove cached source files (forces re-download) |
- Python 3.6+
- No external dependencies (standard library only)
- curl (for downloading source tables)
- make (GNU Make)
- For roundtrip verification: bsb2usfm and its usfmtc dependency
- bsb2usfm -- Forward converter (TSV to USFM/USJ/USX)
- bereanbible.com -- Berean Standard Bible
- majoritybible.com -- Majority Standard Bible
- Bible text (BSB/MSB): Public Domain
- Software tools: MIT License
See LICENSE for details.