About
I work on deep learning — writing and training models as well as researching open problems in optimisation and algorithmic fairness. That currently covers spectral clustering, fairness constraints in graph-based learning, and loss landscape geometry.
On the software side I build end-to-end applications and ML pipelines — cross-platform and web applications, fine-tuning SLMs for specific tasks, and training pipelines for computer vision and NLP. I work primarily in Python, Dart, and TypeScript, with Rust for performance-critical components.
Skills
| Languages | Python, TypeScript, Dart, Rust, SQL, Java, C, C++, Go, PowerShell, Bash |
| AI / ML | PyTorch, TensorFlow, Scikit-learn, Hugging Face, Keras, XGBoost, CatBoost, LightGBM, OpenCV, YOLO, MediaPipe, spaCy, Transformers, Ultralytics |
| GenAI / LLM | LangChain, LlamaIndex, LoRA / PEFT, Unsloth, llama.cpp, Ollama, Vector DBs, LangGraph, CrewAI, AutoGen, vLLM |
| Data | Pandas, NumPy, Polars, DuckDB, SciPy, Matplotlib, Seaborn, Plotly |
| Backend / Cloud | FastAPI, Docker, Flask, Django, Celery, Redis, AWS, MLflow, AzureML, Gunicorn, Vercel |
| Frontend / Mobile | Flutter, Next.js, Tauri, Vue.js, Angular, JavaFX |
| Databases | PostgreSQL, OracleDB, SQLite, MongoDB, MySQL, Supabase |
| Languages spoken | English (C1 · IELTS), Hindi (Native), French (Elementary), Japanese (Elementary) |
Projects
Kivixa Productivity Workspace
Flutter, Dart, Rust, llama.cpp
Architected a privacy-first, cross-platform productivity system with a Rust-native AI engine supporting multi-model SLM inference while running fully offline.
- Built the AI engine in Rust with multi-model SLM inference (Phi-4, Qwen 3.5, Gemma), automatic task-based model routing, GPU acceleration via Vulkan/Metal, and a full Model Context Protocol (MCP) implementation for sandboxed AI-driven file operations, all running fully offline.
- Built a Rust-backed audio intelligence pipeline integrating Whisper STT with real-time word-level timestamps, Kokoro neural TTS, voice activity detection, and semantic audio indexing, alongside a vector database with local embeddings for semantic search and automatic note clustering.
- Engineered version control using SHA-256 content-addressable blob storage with automatic snapshots, a Lua-scriptable plugin system with a full programmatic notes API, an interactive knowledge graph with force-directed simulation, and a split-screen editor supporting handwritten, markdown, and PDF formats.
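The content-addressable storage idea behind the version control feature can be sketched in a few lines of Python. This is an illustration only, not Kivixa's Rust/Dart code: the `BlobStore` class and its method names are hypothetical. Each blob is keyed by its SHA-256 digest, so identical content is stored once and a snapshot is just a digest.

```python
import hashlib


class BlobStore:
    """Hypothetical in-memory store: each blob is keyed by its SHA-256 digest."""

    def __init__(self):
        self._blobs = {}     # digest -> bytes
        self.snapshots = []  # ordered digests, one per saved version

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same key, so it is stored only once.
        self._blobs.setdefault(digest, data)
        return digest

    def snapshot(self, data: bytes) -> str:
        digest = self.put(data)
        self.snapshots.append(digest)
        return digest

    def get(self, digest: str) -> bytes:
        return self._blobs[digest]


store = BlobStore()
v1 = store.snapshot(b"first draft")
v2 = store.snapshot(b"second draft")
v3 = store.snapshot(b"first draft")  # reverting to earlier content reuses the blob
```

Here `v1` and `v3` are the same digest, so three snapshots occupy only two blobs.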
Unsupervised Cipher Cracking
Python, NumPy, MCMC, HMM, Genetic Algorithms
Built an unsupervised cryptanalysis toolkit with multiple solvers and a shared language model for robust ciphertext recovery across cipher families.
- Implemented three independent solvers — Metropolis-Hastings MCMC with simulated annealing, Baum-Welch HMM/EM, and a tournament-selection genetic algorithm — over a shared quadgram language model trained on 6M+ characters, with vectorised NumPy decryption in the hot loop achieving ~100k scores/second.
- Built a statistical cipher-type detector using Index of Coincidence, Kasiski examination, entropy, and bigram repeat rate, plus a phase-transition analyser that empirically maps minimum ciphertext length against decryption success rate and relates the results to the theoretical unicity distance.
- Designed a benchmarking and convergence diagnostics framework comparing all three solvers on identical test cases across five cipher families, with multi-chain consensus key reconstruction, per-letter confidence scoring, and publication-quality diagnostic plots.
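As an example of the statistics used for cipher-type detection, here is a minimal sketch of the Index of Coincidence. The function names are illustrative and not taken from the toolkit. English text scores roughly 0.066, uniformly random letters roughly 1/26 ≈ 0.038, and a monoalphabetic substitution leaves the value unchanged, which is what makes it useful for telling cipher families apart.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn without replacement are equal."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n < 2:
        return 0.0
    counts = Counter(letters)
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))


def caesar(text: str, shift: int) -> str:
    """Shift A-Z letters; used here only to show the statistic is invariant."""
    return "".join(
        chr((ord(c) - 65 + shift) % 26 + 65) if "A" <= c <= "Z" else c
        for c in text.upper()
    )


plain = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG AND THEN RESTS"
ic_plain = index_of_coincidence(plain)
ic_shifted = index_of_coincidence(caesar(plain, 7))
```

Because a Caesar shift only relabels letters, `ic_plain` and `ic_shifted` agree. A polyalphabetic cipher would instead pull the value toward 1/26.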
Hospital Operations System
Flask, Vue.js, Celery, Redis, Docker
Developed a production-ready healthcare operations platform spanning hospital workflows and blood bank management with secure role separation.
- Built a role-separated web platform across two integrated runtime modules — hospital operations and blood bank management — behind a shared Flask-Security authentication layer with four distinct RBAC roles, deployed via Docker Compose with Gunicorn, tini for correct PID handling, and a non-root production runtime.
- Engineered a blood bank allocation engine using SQL triggers, compatibility-based matching logic, and predictive shortage alerting with a full forensic audit trail; donor workflows support whole blood and component-split donations with inventory state tracked across requests.
- Implemented async task infrastructure with Celery workers and Beat scheduling for automated appointment reminders, monthly doctor activity reports, and PDF and CSV exports via PyMuPDF and Pandas, all brokered through Redis with a separate caching layer to minimise database hits.
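The compatibility-based matching mentioned above can be illustrated with the standard ABO/Rh rules for red-cell transfusion. This is a sketch under that assumption and not the platform's allocation engine, which runs through SQL triggers. The function and constant names are hypothetical.

```python
BLOOD_TYPES = ["O-", "O+", "A-", "A+", "B-", "B+", "AB-", "AB+"]


def can_donate(donor: str, recipient: str) -> bool:
    """Red-cell compatibility for types written like 'O-', 'A+', 'AB+'."""
    d_abo, d_rh = donor[:-1], donor[-1]
    r_abo, r_rh = recipient[:-1], recipient[-1]
    # Every donor antigen (A, B) must also be present in the recipient;
    # type O carries neither antigen.
    d_antigens = set(d_abo) - {"O"}
    r_antigens = set(r_abo) - {"O"}
    abo_ok = d_antigens <= r_antigens
    # Rh-negative recipients should only receive Rh-negative blood.
    rh_ok = d_rh == "-" or r_rh == "+"
    return abo_ok and rh_ok


def compatible_donors(recipient: str) -> list[str]:
    return [t for t in BLOOD_TYPES if can_donate(t, recipient)]
```

For example, `compatible_donors("A+")` yields the O and A types of either Rh factor, while an O- recipient can only be matched with O- stock.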
Phantom Local AI Overlay Assistant
Rust, Python, TypeScript, Dart, Kotlin
Implemented a lightweight cross-platform local AI assistant with desktop and Android runtimes focused on low-latency contextual help.
- Built a cross-platform AI assistant that idles at under 15 MB RAM using a multi-process architecture — a Rust watcher handling OS-level hotkeys, UIAutomation context extraction, and IPC via Named Pipes, with a Python inference engine loading GGUF models on-demand via llama-cpp and a Tauri/React floating overlay as the UI.
- Implemented an Android counterpart in Flutter and Kotlin using an Accessibility Service for active-window context traversal and a native llama.cpp bridge for on-device inference, distributed as F-Droid-compatible APKs alongside a Windows WinGet package.
- Designed a style distillation system that automatically extracts personalised writing rulebooks from outgoing message history, applied at inference time to mirror the user's tone without fine-tuning.
NovelCrafter: Fine-Tuned Literary LLM
Automated sequential LoRA fine-tuning framework
An automated framework for sequential LoRA fine-tuning on creative writing datasets. Implements a memory-efficient pipeline for training LLMs on long-form literary works via incremental learning.
- LoRA / PEFT: Injects trainable rank-decomposition matrices into self-attention layers with minimal parameter overhead.
- Contextual SFT: JSON-based instruction-response pairs carry previous chapter context to preserve narrative continuity.
- Auto Hardware Scaling: Detects available compute and scales from 1B-parameter models (CPU) to 3B (GPU).
- Fault-Tolerant Pipeline: Progress-tracking system enabling interruption recovery and sequential file processing.
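To make the "minimal parameter overhead" of LoRA concrete, here is a small pure-Python sketch of the low-rank update `h = W x + (alpha / r) * B (A x)`. It is not the framework's training code, which would normally go through a library such as PEFT. The helper names and the toy dimensions are assumptions chosen for illustration.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, alpha, rank):
    """Frozen weight W plus the scaled low-rank update B @ A."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    full = d_out * d_in           # parameters in the frozen weight W
    lora = rank * (d_in + d_out)  # A is rank x d_in, B is d_out x rank
    return full, lora


# Toy example: B starts at zero, so the adapted layer initially equals the base layer.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 1.0]]
x = [1.0, 2.0]
before = lora_forward(W, A, [[0.0], [0.0]], x, alpha=2.0, rank=1)
after = lora_forward(W, A, [[1.0], [2.0]], x, alpha=2.0, rank=1)

full, lora = lora_param_counts(2048, 2048, 8)
```

For a 2048 x 2048 projection at rank 8, the adapter trains 32,768 parameters against 4,194,304 frozen ones, which is under 1% of the layer.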
Signature Verification System with Explainable AI
Siamese Neural Network for offline authentication
Determines signature authenticity by comparing a query signature against a known reference — entirely offline via metric learning.
- Siamese Architecture: Dual-branch CNN sharing weights to map signatures into a common embedding space.
- Metric Learning: Contrastive / Triplet Loss to minimise distance between genuine pairs and maximise it for forgeries.
- Preprocessing Pipeline: Normalisation, binarisation, and noise reduction for robust feature extraction.
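The contrastive objective can be written out directly. The sketch below uses the common pairwise form with a margin: genuine pairs are penalised by squared distance, forged pairs only when they fall inside the margin. The margin value and function names are illustrative assumptions, and a real training loop would compute this on batched tensors.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(emb_a, emb_b, is_genuine: bool, margin: float = 1.0) -> float:
    """Pull genuine pairs together; push forged pairs at least `margin` apart."""
    d = euclidean(emb_a, emb_b)
    if is_genuine:
        return d ** 2
    return max(0.0, margin - d) ** 2


# A forged pair that is already far apart contributes no loss.
far_forgery = contrastive_loss([0.0, 0.0], [3.0, 4.0], is_genuine=False)
# A forged pair that sits too close to the reference is penalised.
close_forgery = contrastive_loss([0.0, 0.0], [0.5, 0.0], is_genuine=False)
```

At inference time the same distance is compared against a threshold to accept or reject the query signature.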
Vehicle Parking Management System
Full-stack production ecosystem with Docker & Redis
Production-grade full-stack parking ecosystem using Vue.js components, Flask Blueprints, and a service-layer pattern, deployed via Docker Compose multi-container orchestration.
- Auth & RBAC: JWT authentication with Redis caching to minimise database latency.
- Async Processing: Celery workers for CSV exports and PDF generation with ReportLab.
- Automated Reporting: Celery Beat for scheduled monthly reports.
- High Performance: Gunicorn + Uvicorn ASGI workers for production throughput.
FacultySync: University Schedule & Conflict Manager
Java, JavaFX, SQLite, Gradle, JUnit
Built a modern JavaFX desktop platform for university scheduling, room-conflict detection, and automated conflict resolution with native Windows integration.
- Implemented schedule import/export, conflict detection, and auto-resolution using an IntervalTree-based conflict engine and a backtracking resolver that reassigns events while ensuring no new overlaps are introduced.
- Engineered a custom undecorated JavaFX interface with native-feel window controls, weekly drag-and-drop calendar, multi-tab dashboard, analytics charts, and animated toast notifications.
- Integrated SQLite persistence (WAL mode, foreign keys, indexed overlap queries), GitHub Releases auto-update checks, background-threaded JavaFX tasks for non-blocking I/O, and comprehensive automated test coverage across model, DB, algorithms, and I/O layers.
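The core of room-conflict detection is an interval-overlap test. The sketch below is in Python for consistency with the other examples here, although FacultySync itself is Java, and it swaps the IntervalTree for a simpler sort-and-scan per room. The `Event` type and field names are hypothetical.

```python
from typing import NamedTuple


class Event(NamedTuple):
    name: str
    room: str
    start: int  # minutes since midnight
    end: int    # exclusive


def find_conflicts(events):
    """Return pairs of event names that overlap in the same room."""
    by_room = {}
    for e in events:
        by_room.setdefault(e.room, []).append(e)

    conflicts = []
    for room_events in by_room.values():
        room_events.sort(key=lambda e: e.start)
        for i, a in enumerate(room_events):
            for b in room_events[i + 1:]:
                if b.start >= a.end:
                    break  # sorted by start, so no later event can overlap a
                conflicts.append((a.name, b.name))
    return conflicts


schedule = [
    Event("Algebra", "R1", 540, 600),
    Event("Physics", "R1", 570, 630),
    Event("Chemistry", "R1", 600, 660),
    Event("History", "R2", 540, 600),
]
clashes = find_conflicts(schedule)
```

Back-to-back events (one ending at 600, the next starting at 600) are not treated as conflicts because end times are exclusive. An interval tree gives the same answers with faster queries when events are inserted and moved interactively.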
Real-Time Hand Gesture Recognition System
Temporal sequence classification using LSTM
A temporal sequence classification system recognising dynamic hand gestures in real-time by modelling time-dependent 3D landmark data.
- Sequential Modelling: Deep RNN using LSTM units to capture temporal dependencies.
- 3D Spatial Features: MediaPipe integration for 21 high-fidelity 3D hand landmarks per frame.
- Data Augmentation: Noise injection, scaling, and translation for robustness.
- Production Readiness: Modular inference engine managed with `uv`.
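The augmentation step can be sketched on raw landmark sequences. This is a minimal illustration, not the project's pipeline: the function name, default ranges, and the choice to apply one scale and one translation per sequence (so the motion pattern is preserved) with independent noise per coordinate are all assumptions.

```python
import random


def augment_sequence(frames, rng, noise_std=0.01, scale_range=(0.9, 1.1), max_shift=0.05):
    """Augment a gesture given as frames -> 21 landmarks -> (x, y, z)."""
    scale = rng.uniform(*scale_range)                                # one scale per sequence
    shift = [rng.uniform(-max_shift, max_shift) for _ in range(3)]   # one translation per sequence
    augmented = []
    for frame in frames:
        augmented.append([
            tuple(c * scale + s + rng.gauss(0.0, noise_std) for c, s in zip(point, shift))
            for point in frame
        ])
    return augmented


# Dummy sequence: 5 frames of 21 identical landmarks, just to show the shapes.
frames = [[(0.1 * i, 0.2, 0.3)] * 21 for i in range(5)]
augmented = augment_sequence(frames, random.Random(0))
```

Passing a seeded `random.Random` keeps augmentation reproducible across training runs.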
IsoFace: CPU-Optimized Face Clustering
Python, ONNX Runtime, ArcFace, DBSCAN, OpenCV
Developed a fully local face-clustering pipeline that groups photos by person using ArcFace embeddings and DBSCAN, optimized for CPU-only systems.
- Built a three-stage inference pipeline with RetinaFace detection, ArcFace 512-dimensional embeddings, and DBSCAN clustering to automatically discover identities without predefined class counts.
- Optimized runtime with ONNX Runtime for CPU inference, delivering practical throughput while preserving high clustering quality via the buffalo_l model stack.
- Implemented automatic photo organization, configurable CLI tuning (`eps`, `min_samples`), dataset and custom-folder workflows, and an API layer for programmatic clustering and statistics retrieval.
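To show why DBSCAN needs no predefined number of identities, here is a compact pure-Python version of the algorithm with the two tuning knobs named above, `eps` and `min_samples`. It is a teaching sketch, not the pipeline's implementation. Using cosine distance on the embeddings is an assumption, and the 2-D vectors stand in for 512-dimensional ArcFace embeddings.

```python
import math


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def dbscan(points, eps, min_samples):
    """Return one label per point; -1 marks noise."""
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # A point counts as its own neighbour.
        return [j for j in range(len(points))
                if cosine_distance(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
        cluster += 1
    return labels


embeddings = [
    (1.0, 0.0), (0.99, 0.1), (0.98, 0.05),  # person A
    (0.0, 1.0), (0.1, 0.99), (0.05, 0.98),  # person B
    (-1.0, -1.0),                           # a face seen only once
]
labels = dbscan(embeddings, eps=0.05, min_samples=2)
```

The two groups come out as clusters 0 and 1, and the isolated embedding is labelled -1 instead of being forced into a cluster.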
Hybrid Image Classification: SVM with Deep Feature Extraction
Fusing deep learning with classical ML
Hybrid classification pipeline that uses VGG16 as a frozen feature extractor feeding an SVM classifier, designed to keep accuracy high under varying computational constraints.
- Transfer Learning: VGG16 as feature extractor producing 512-dim vectors.
- Dimensionality Reduction: PCA compresses to 256 dims retaining 95% variance.
- SVM Classifier: RBF kernel with hyperparameters tuned via GridSearchCV.
- Custom Serialisation: Unified persistence for the full pipeline.
Interactive Customer Segmentation & Analytics Engine
Full-stack unsupervised learning analytics platform
Identifies distinct customer personas and visualises the internal logic of clustering algorithms through an interactive dashboard.
- K-Means Clustering: Elbow Method & Silhouette Analysis for optimal k selection.
- Animation Engine: Shows centroid initialisation and convergence steps live.
- Statistical Validation: ANOVA tests and cluster stability analysis.
- Full-Stack: FastAPI backend + Plotly for 3D visualisation.
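The convergence animation boils down to recording the centroids after every Lloyd iteration. Below is a minimal pure-Python sketch of that idea. The function name and the returned history structure are hypothetical, and the dashboard's own engine may store its frames differently.

```python
def kmeans_history(points, centroids, max_iter=100):
    """Run Lloyd's algorithm and return the centroids after every step."""
    history = [list(centroids)]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centroids[k]
            for k, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
        history.append(list(centroids))
    return history


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
frames = kmeans_history(points, centroids=[(0.0, 0.0), (10.0, 0.0)])
```

Each entry in `frames` is one animation frame. Here the centroids settle at the midpoints of the two point pairs after a single update.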
Advanced House Price Prediction System
Intelligent regression with ensemble learning
Predicts real estate values using an automated model selection engine that benchmarks ensemble methods against a linear baseline via cross-validation.
- Ensemble Learning: XGBoost and Random Forest compared against Linear Regression.
- Automated Model Selection: Dynamic evaluation via 5-fold Cross-Validation.
- Statistical Confidence: Prediction engine providing confidence intervals.
- System Integration: FastAPI microservice with interactive dashboard.
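Model selection by k-fold cross-validation can be sketched without any ML library. In the example below two toy regressors, a mean baseline and a one-feature least-squares line, stand in for the real candidates (XGBoost, Random Forest, Linear Regression). All function names are illustrative, and the folds are sequential rather than shuffled for simplicity.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train, validation) index lists for k sequential folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    start = 0
    for size in sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size


def fit_mean(X, y):
    mean = sum(y) / len(y)
    return lambda x: mean


def fit_line(X, y):
    mx, my = sum(X) / len(X), sum(y) / len(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(X, y)) / sum((a - mx) ** 2 for a in X)
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def cv_mean_error(fit, X, y, k=5):
    """Average validation MSE across the k folds."""
    errors = []
    for train, val in k_fold_indices(len(X), k):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        errors.append(sum((predict(X[i]) - y[i]) ** 2 for i in val) / len(val))
    return sum(errors) / len(errors)


X = [float(i) for i in range(10)]
y = [2.0 * x + 1.0 for x in X]
scores = {name: cv_mean_error(fit, X, y) for name, fit in
          {"mean": fit_mean, "line": fit_line}.items()}
best = min(scores, key=scores.get)
```

The candidate with the lowest mean validation error is selected, and the spread of the per-fold errors is a natural starting point for the confidence estimates mentioned above.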