I engineer retrieval systems, not demos.
My work focuses on how information is:
- ingested,
- indexed,
- retrieved,
- ranked,
- and safely exposed to downstream models.
I specialize in bridging classical Information Retrieval (IR) with modern Retrieval-Augmented Generation (RAG), designing systems that remain correct under real-world constraints.
Core belief:
Language models are only as reliable as the retrieval systems feeding them.
https://github.com/Kas-sim/MQNotebook
https://mqnotebook.streamlit.app/
A production-oriented RAG system designed for the messy reality of enterprise documents, not clean PDFs.
What it demonstrates
- OCR-first ingestion (scanned PDFs, flattened text)
- Structured parsing (PPTX speaker notes, Excel tables)
- Hybrid retrieval (Vector search + Cross-Encoder reranking)
- Local-first, BYOK security model
- OS-level robustness (Windows file-lock mitigation)
This system exists to solve retrieval correctness, not prompt cleverness.
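The hybrid retrieval step listed above can be sketched as a two-stage pipeline: a cheap vector search narrows the corpus to `k` candidates, then a more expensive pairwise scorer reorders them. This is a minimal illustration, not MQNotebook's actual code. All function names are hypothetical, the embeddings are assumed to be precomputed, and `overlap_score` is a simple token-overlap stand-in for a real cross-encoder model.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec: Vector, doc_vecs: Dict[str, Vector], k: int) -> List[str]:
    """Stage 1: return the ids of the k documents closest to the query vector."""
    ranked = sorted(doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True)
    return ranked[:k]


def overlap_score(query: str, passage: str) -> float:
    """Hypothetical stand-in for a cross-encoder: fraction of query tokens in the passage."""
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / len(q) if q else 0.0


def hybrid_search(
    query: str,
    query_vec: Vector,
    doc_vecs: Dict[str, Vector],
    texts: Dict[str, str],
    k: int = 2,
    score_pair: Callable[[str, str], float] = overlap_score,
) -> List[Tuple[str, float]]:
    """Stage 2: rescore the stage-1 candidates on (query, passage) pairs and reorder."""
    candidates = vector_search(query_vec, doc_vecs, k)
    scored = [(doc_id, score_pair(query, texts[doc_id])) for doc_id in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

In a real deployment `score_pair` would wrap a trained cross-encoder. The point of the split is that the pairwise scorer only ever sees `k` passages, so its cost stays bounded regardless of corpus size.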
https://github.com/Kas-sim/DevShelf
https://kas-sim.github.io/systems/devshelf/
A classical vertical search engine for Computer Science literature, built without Lucene or ElasticSearch.
What it demonstrates
- Positional inverted indices
- Offline indexing vs online querying
- TF-IDF and behavioral re-ranking
- Deterministic, explainable retrieval pipelines
DevShelf forms the theoretical foundation behind my RAG work.
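As a rough illustration of the ideas listed above, the sketch below builds a positional inverted index offline and then answers phrase and TF-IDF queries against it. It is not DevShelf's implementation. The function names are hypothetical, tokenization is a plain whitespace split, the weighting is one common variant (raw term frequency times `log(N / df)`), and behavioral re-ranking is left out.

```python
import math
from collections import defaultdict
from typing import Dict, List

# term -> doc_id -> sorted list of token positions
Index = Dict[str, Dict[str, List[int]]]


def tokenize(text: str) -> List[str]:
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def build_index(docs: Dict[str, str]) -> Index:
    """Offline step: record every position of every term in every document."""
    index: Index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return dict(index)


def phrase_match(index: Index, phrase: str) -> List[str]:
    """Online step: documents where the phrase terms occur at consecutive positions."""
    terms = tokenize(phrase)
    if not terms or any(t not in index for t in terms):
        return []
    hits = []
    for doc_id, starts in index[terms[0]].items():
        for start in starts:
            if all(start + i in index[t].get(doc_id, []) for i, t in enumerate(terms)):
                hits.append(doc_id)
                break
    return sorted(hits)


def tfidf_rank(index: Index, query: str, n_docs: int) -> List[str]:
    """Online step: rank documents by summed tf * log(N / df) over the query terms."""
    scores: Dict[str, float] = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue
        idf = math.log(n_docs / len(postings))
        for doc_id, positions in postings.items():
            scores[doc_id] += len(positions) * idf
    # Highest score first; ties broken by doc id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Because every score is a sum of per-term contributions read straight from the postings, a ranking can be explained by listing which terms matched and how much each one added.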
| Project | Focus | Stack |
|---|---|---|
| BabyGPT | Character-level language modeling from scratch | Python, TensorFlow, LSTM |
| Sentiment Filter | NLP edge cases (negation paradox) | Python, Scikit-Learn |
| MQ Banking Core | Low-level transactional system | C++, File I/O |
| Digital Eye | CNN-based vision system | TensorFlow, Keras |
These projects support my core specialization in retrieval and applied AI systems.




