STAC (Spiking Transformer Augmenting Cognition) is a research framework with two distinct approaches:
- **STAC V1**: Complete end-to-end training pipeline with learnable AdEx neurons (see `stac-v1/`)
- **STAC V2**: Experimental conversion framework that transforms pretrained transformer LLMs (DistilGPT-2, SmolLM2-1.7B-Instruct) into Spiking Neural Networks (SNNs) for potential energy savings while retaining multi-turn conversational ability in simulation
⚠️ Important: This repository currently runs software-level SNN simulations only. No metrics have been collected on physical neuromorphic hardware yet. Energy savings figures are theoretical projections based on spike-count analysis, not measured hardware data.
✔️ Proof-of-concept ANN→SNN conversion using SpikingJelly
✔️ Multi-turn context retention via a Temporal Spike Processor
✔️ Extensive software tests for position IDs, KV-cache, and spike-rate sanity
➖ Hardware power profiling — planned, not implemented
➖ Full operator coverage & optimisation — work in progress
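To make the caveat above concrete, here is a minimal sketch of how a spike-count energy projection can be derived in software. The per-operation constants are commonly cited 45 nm estimates (Horowitz, ISSCC 2014), and every name in the snippet is illustrative rather than the repository's actual telemetry code:

```python
import torch
import torch.nn as nn
from spikingjelly.activation_based import neuron

# Commonly cited 45 nm estimates (Horowitz, ISSCC 2014). These are
# projections, not measurements on Loihi-2 / Akida hardware.
E_MAC = 4.6e-12  # J per multiply-accumulate in the dense ANN baseline
E_AC = 0.9e-12   # J per spike-driven accumulate in the SNN

def count_spikes(snn: nn.Module, x: torch.Tensor, timesteps: int = 8) -> float:
    """Hook every spiking layer and total the spikes emitted over all timesteps."""
    counts = []
    hooks = [m.register_forward_hook(lambda _m, _in, out: counts.append(out.sum().item()))
             for m in snn.modules() if isinstance(m, neuron.IFNode)]
    with torch.no_grad():
        for _ in range(timesteps):
            snn(x)  # membrane state accumulates across timesteps
    for h in hooks:
        h.remove()
    return sum(counts)

# Projection: SNN energy ~ spikes * E_AC vs. ANN energy ~ dense MACs * E_MAC.
```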
```bash
# 1. Install dependencies
pip install -r requirements.txt

# 2. Convert DistilGPT-2 to SNN (fast)
python run_conversion.py --model_name distilgpt2 --timesteps 8 --simplified

# 3. Test multi-turn conversation
python snn_multi_turn_conversation_test.py --mode snn --turns 3 --timesteps 8

# 4. Run comprehensive validation
python test_conversational_snn.py --model_name distilgpt2 --test_all --timesteps 8
```

| Component | Purpose |
|---|---|
| `smollm2_converter.py` | Specialized converter with TemporalSpikeProcessor |
| `convert.py` | Generic ANN→SNN conversion pipeline |
| `run_conversion.py` | Main CLI entry point for conversions |
| `spikingjelly_compat.py` | Cross-version compatibility layer |
| `test_conversational_snn.py` | Comprehensive test suite (1K+ lines) |
| `snn_multi_turn_conversation_test.py` | Simple conversation smoke test |
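Conceptually, the TemporalSpikeProcessor in `smollm2_converter.py` addresses one problem: plain rate-coded SNN inference resets neuron state between inputs, which would discard conversational context. The sketch below illustrates that idea only; the class name matches the repo, but the body is an assumption, not the actual implementation:

```python
import torch
import torch.nn as nn
from spikingjelly.activation_based import functional

class TemporalSpikeProcessorSketch(nn.Module):
    """Illustrative stand-in: rate-decode over T timesteps, and only reset
    neuron state when a conversation ends, so membrane potentials persist
    across turns instead of being cleared after every forward pass."""

    def __init__(self, snn: nn.Module, timesteps: int = 8):
        super().__init__()
        self.snn, self.timesteps = snn, timesteps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Rate decoding: mean of the per-timestep outputs.
        return torch.stack([self.snn(x) for _ in range(self.timesteps)]).mean(0)

    def end_conversation(self) -> None:
        functional.reset_net(self.snn)  # clear state between dialogues
```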
| Component | Purpose |
|---|---|
stac-v1/stacv1.ipynb |
Complete end-to-end training pipeline with learnable AdEx neurons |
stac-v1/README.md |
V1 documentation and research contributions |
**Completed (prototype level)**
- ✅ Core conversion flow (GELU→ReLU, quantization, ann2snn; sketched after this list)
- ✅ Temporal dynamics & KV-cache handling in PyTorch
- ✅ Spike-count telemetry hooks and unit tests
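The core conversion flow in miniature: swap GELU for ReLU, then let SpikingJelly calibrate firing thresholds from sample activations. This is a sketch on a toy MLP rather than DistilGPT-2, and `Converter`'s exact signature differs across SpikingJelly releases, which is what `spikingjelly_compat.py` papers over:

```python
import torch
import torch.nn as nn
from spikingjelly.activation_based import ann2snn

def gelu_to_relu(module: nn.Module) -> nn.Module:
    """Recursively replace GELU with ReLU: non-negative activations are
    what post-conversion firing rates can approximate."""
    for name, child in module.named_children():
        if isinstance(child, nn.GELU):
            setattr(module, name, nn.ReLU())
        else:
            gelu_to_relu(child)  # recurse into nested blocks
    return module

# Toy stand-in for a transformer MLP block.
ann = gelu_to_relu(nn.Sequential(nn.Linear(16, 32), nn.GELU(), nn.Linear(32, 16)))

# Calibration data for threshold fitting ('max' records per-layer activation maxima).
data = torch.utils.data.TensorDataset(torch.randn(64, 16), torch.zeros(64))
calib = torch.utils.data.DataLoader(data, batch_size=8)

snn = ann2snn.Converter(mode='max', dataloader=calib)(ann)  # spiking copy of the network
```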
**Pending / In Progress**
- ⏳ Hardware benchmarking on Loihi-2 / Akida
- ⏳ Expanded operator support (e.g., rotary embeddings, flash-attention variants)
- ⏳ Integration with SCANUE multi-agent alignment layer
- ⏳ Robust CLI/UX and documentation polish
**Completed (research prototype)**
- ✅ End-to-end training pipeline with learnable AdEx neurons
- ✅ Hyperdimensional Memory Module (HEMM) integration
- ✅ Surrogate gradient training on WikiText-2
- ✅ L1 spike regularization for energy efficiency (sketched after this list)
- ✅ Comprehensive validation suite
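A minimal sketch of the surrogate-gradient plus L1 spike-regularization recipe. SpikingJelly's stock LIFNode stands in for V1's learnable AdEx neurons here, and all names are illustrative rather than the notebook's actual API:

```python
import torch
from spikingjelly.activation_based import neuron, surrogate

# LIF stand-in for V1's learnable AdEx neuron; ATan is one common surrogate
# that makes the non-differentiable spike step trainable by backprop.
lif = neuron.LIFNode(surrogate_function=surrogate.ATan())

def regularized_loss(task_loss: torch.Tensor,
                     spikes: torch.Tensor,
                     lam: float = 1e-4) -> torch.Tensor:
    """Spikes are non-negative, so the L1 penalty reduces to the mean
    firing rate: pressure toward sparser (cheaper) spike trains."""
    return task_loss + lam * spikes.mean()

# Usage: penalize a layer's spike output alongside the task loss.
spikes = lif(torch.randn(4, 10))
loss = regularized_loss(torch.tensor(0.5), spikes)
```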
- 🔄 Conversion Workflow - Step-by-step conversion guide
- 📚 API Reference - Function and class documentation
- 🖥️ Hardware Requirements - System specifications
- 📖 STAC V1 Documentation - End-to-end training pipeline documentation
- 🧠 STAC V1 Implementation - Complete Jupyter notebook with learnable AdEx neurons
The repository includes extensive testing for multi-turn conversational correctness:
```bash
# Test specific components
python test_conversational_snn.py --model_name distilgpt2 --test_position_boundaries
python test_conversational_snn.py --model_name distilgpt2 --test_attention_mask
python test_conversational_snn.py --model_name distilgpt2 --test_multi_turn
python test_conversational_snn.py --model_name distilgpt2 --test_energy

# Run all tests
python test_conversational_snn.py --model_name distilgpt2 --test_all
```

This project is licensed under the MIT License - see the LICENSE file for details.