Raze-Systems/EmbBERT

EmbBERT Bundle

This folder is a standalone EmbBERT bundle centered on the verified checkpoint-616000 pretraining artifact.

It includes:

  • checkpoints/pretraining/checkpoint-616000/
  • tokenizers/bpe_book_corpus_8192.json
  • configs/EmbBERT_config.json
  • manifest.json and loaders.py
  • lib/
  • runnable scripts such as pretrain.py, finetune.py, quantize.py, evaluate_embbert_bundle.py, and embbert_semantic_search_test.py
  • pyproject.toml and uv.lock
  • datasets/embbert_semantic_search_benchmark.json
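As a quick sanity check, the listing above can be turned into a small script that reports which of the listed paths are absent from a copy of the bundle. This is only a sketch: the names `EXPECTED_PATHS` and `find_missing` are hypothetical and not part of the bundle, and it assumes that the scripts, `manifest.json`, and `loaders.py` sit at the bundle root, which the listing does not state explicitly.

```python
from pathlib import Path

# Paths copied from the bundle listing; a trailing "/" marks a directory.
# ASSUMPTION: entries listed without a folder live at the bundle root.
EXPECTED_PATHS = [
    "checkpoints/pretraining/checkpoint-616000/",
    "tokenizers/bpe_book_corpus_8192.json",
    "configs/EmbBERT_config.json",
    "manifest.json",
    "loaders.py",
    "lib/",
    "pretrain.py",
    "finetune.py",
    "quantize.py",
    "evaluate_embbert_bundle.py",
    "embbert_semantic_search_test.py",
    "pyproject.toml",
    "uv.lock",
    "datasets/embbert_semantic_search_benchmark.json",
]


def find_missing(bundle_root="."):
    """Return the expected bundle paths that are absent under bundle_root."""
    root = Path(bundle_root)
    missing = []
    for rel in EXPECTED_PATHS:
        path = root / rel
        # Directories and files are checked differently.
        present = path.is_dir() if rel.endswith("/") else path.is_file()
        if not present:
            missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing from bundle:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All listed bundle paths are present.")
```

Run from the bundle directory, the script prints either the missing entries or a confirmation line. It checks only that paths exist, not that their contents are valid.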

Visualization utilities are intentionally excluded from this bundle: plotting helpers such as graphing.py and loss_plotter.py are not included.

Example

Run from inside this directory:

from loaders import load_latest_pretraining_checkpoint

model, tokenizer = load_latest_pretraining_checkpoint()
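If the bundle has to be used from a script that lives elsewhere, one possible approach is to switch into the bundle directory temporarily before importing `loaders`. The sketch below assumes that "run from inside this directory" means `loaders.py` must be importable and may resolve paths relative to the working directory. The README does not confirm either point, and the helpers `inside_bundle` and `load_from` are hypothetical names, not part of the bundle.

```python
import contextlib
import importlib
import os
import sys


@contextlib.contextmanager
def inside_bundle(bundle_dir):
    """Temporarily chdir into bundle_dir and put it first on sys.path."""
    bundle_dir = os.path.abspath(bundle_dir)
    previous_cwd = os.getcwd()
    os.chdir(bundle_dir)
    sys.path.insert(0, bundle_dir)
    try:
        yield bundle_dir
    finally:
        # Restore the caller's working directory and import path.
        os.chdir(previous_cwd)
        try:
            sys.path.remove(bundle_dir)
        except ValueError:
            pass


def load_from(bundle_dir):
    """Import loaders from bundle_dir and call the loader shown in the README."""
    with inside_bundle(bundle_dir):
        # ASSUMPTION: loaders.py sits at the top level of bundle_dir.
        loaders = importlib.import_module("loaders")
        return loaders.load_latest_pretraining_checkpoint()
```

Usage would then look like `model, tokenizer = load_from("path/to/bundle")`, where the path is a placeholder for wherever the bundle was unpacked.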
