S.G. Sakthivel
AI & ML Engineer
Advanced Fine-Tuning
Hands-on experience in fine-tuning and optimizing Large Language Models, including LLaMA-based variants, GPT-2, and GPT-Neo, using parameter-efficient methods such as LoRA via the PEFT library.
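The low-rank update behind LoRA can be sketched in plain NumPy. This is a toy illustration of the idea, not the PEFT API; all names, shapes, and hyperparameters below are illustrative.

```python
import numpy as np

# LoRA idea: instead of updating a full weight matrix W (d_out x d_in),
# train two small low-rank factors B (d_out x r) and A (r x d_in) and
# add their scaled product to the frozen pretrained weight.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 64, 64, 8, 16

W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                # trainable, zero init

def lora_forward(x, W, A, B, alpha, r):
    """Adapted forward pass: base projection plus the scaled low-rank update."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=(d_in,))
# With B zero-initialized, the adapted output equals the base output,
# so fine-tuning starts exactly from the pretrained model's behavior.
assert np.allclose(lora_forward(x, W, A, B, alpha, r), W @ x)
```

Only A and B are trained, which is why the method is parameter-efficient: here they hold 1,024 values against 4,096 in the full matrix.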
Strong understanding of tokenization behavior, context window constraints, and prompt construction.
Skilled in designing efficient input pipelines that minimize token usage while preserving semantic fidelity, enabling cost-effective inference and improved latency.
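A token-budgeted input pipeline can be reduced to a greedy selection step. The sketch below uses a whitespace word count as a stand-in for a real tokenizer, and assumes chunks arrive pre-sorted by relevance; both are simplifying assumptions.

```python
def fit_to_budget(chunks, max_tokens, count=lambda s: len(s.split())):
    """Greedily keep chunks (assumed pre-sorted by relevance) until the
    token budget is exhausted. A production pipeline would use the
    target model's own tokenizer instead of this whitespace count."""
    kept, used = [], 0
    for chunk in chunks:
        cost = count(chunk)
        if used + cost > max_tokens:
            continue  # skip chunks that would blow the budget
        kept.append(chunk)
        used += cost
    return kept

chunks = ["alpha beta gamma", "delta epsilon", "zeta eta theta iota"]
# Budget of 5 "tokens" keeps the first two chunks (3 + 2) and drops the rest.
assert fit_to_budget(chunks, 5) == ["alpha beta gamma", "delta epsilon"]
```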
Experienced in context engineering, including system prompt structuring, dynamic context injection, and managing context window utilization to balance relevance, coherence, and computational efficiency in real-time applications.
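The structure described above can be sketched as a single prompt-assembly function: a fixed system prompt, dynamically injected context trimmed to a budget, then the user query. The layout, section labels, and character limit here are assumptions for illustration, not a fixed standard.

```python
def build_prompt(system, context_snippets, user_query, max_context_chars=500):
    """Assemble a prompt with a character budget on the injected context."""
    context = ""
    for snippet in context_snippets:
        # Stop injecting once the context budget would be exceeded.
        if len(context) + len(snippet) + 1 > max_context_chars:
            break
        context += snippet + "\n"
    return f"{system}\n\n[Context]\n{context}\n[User]\n{user_query}"

prompt = build_prompt(
    "You are a concise assistant.",
    ["Doc A: refunds take 5 days.", "Doc B: support hours are 9-5."],
    "How long do refunds take?",
)
assert "[Context]" in prompt and "refunds take 5 days" in prompt
```

Keeping the system prompt static and varying only the injected context makes prompt-cache reuse and latency tuning easier in real-time serving.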
Proficient in working with locally deployed models via Ollama, including model selection, environment configuration, and inference tuning for stable performance.
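A locally deployed Ollama model is reached over its HTTP API. The sketch below targets Ollama's `/api/generate` endpoint on the default port; the model name and option values are illustrative, and the actual call is left commented out because it requires a running Ollama instance.

```python
import json
import urllib.request

# Request payload for Ollama's /api/generate endpoint; "num_ctx" caps the
# context window and "temperature" controls sampling (values are examples).
payload = {
    "model": "llama3",
    "prompt": "Summarize LoRA in one sentence.",
    "stream": False,
    "options": {"temperature": 0.2, "num_ctx": 4096},
}
request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Sending requires a local Ollama server, so the call stays guarded:
# with urllib.request.urlopen(request) as resp:
#     print(json.loads(resp.read())["response"])
```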
Familiar with model loading strategies and memory constraints, with hands-on experience applying model quantization techniques to reduce memory footprint and accelerate inference, enabling deployment on limited-resource hardware without significant degradation in output quality.
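The core of weight quantization can be shown in a few lines. This is a toy symmetric int8 scheme, not bitsandbytes or GGUF, but it captures the trade: roughly 4x less memory than float32 in exchange for a small, bounded rounding error.

```python
import numpy as np

def quantize_int8(w):
    """Symmetric int8 quantization: store int8 codes plus one float scale."""
    max_abs = np.abs(w).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return q.astype(np.float32) * scale

w = np.linspace(-1.0, 1.0, 8, dtype=np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
# Rounding error per weight is at most half a quantization step.
assert np.max(np.abs(w - w_hat)) <= scale / 2 + 1e-6
```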
RAG Architectures
Implementing complex Retrieval-Augmented Generation workflows across diverse vector databases for production-grade retrieval and question-answering systems.
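The retrieval step of a RAG workflow can be sketched over an in-memory "vector store". Real deployments use a vector database and learned embeddings; the bag-of-words vectors and three-document corpus here are purely illustrative.

```python
import numpy as np

docs = ["cats purr", "dogs bark loudly", "llamas hum"]
vocab = sorted({w for d in docs for w in d.split()})

def embed(text):
    """Toy bag-of-words embedding over the corpus vocabulary."""
    words = text.split()
    return np.array([words.count(w) for w in vocab], dtype=float)

def retrieve(query, k=1):
    """Return the k documents most cosine-similar to the query."""
    q = embed(query)
    scores = []
    for d in docs:
        v = embed(d)
        denom = float(np.linalg.norm(q) * np.linalg.norm(v)) or 1.0
        scores.append(float(q @ v) / denom)
    top = sorted(range(len(docs)), key=lambda i: -scores[i])[:k]
    return [docs[i] for i in top]

assert retrieve("do dogs bark") == ["dogs bark loudly"]
```

In a full pipeline the retrieved documents would then be injected into the prompt before generation.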
Custom MCP Servers
Building custom Model Context Protocol servers to bridge language models with proprietary data and specialized tools.
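At its core an MCP server answers JSON-RPC 2.0 requests such as `tools/call`. The hand-rolled dispatcher below shows only that message shape; the official SDKs handle transport and schema, and the `lookup_order` tool with its canned handler is hypothetical.

```python
import json

# Registry of hypothetical tools the server exposes to the model.
TOOLS = {
    "lookup_order": lambda args: {"status": "shipped", "order_id": args["order_id"]},
}

def handle_message(raw):
    """Dispatch one JSON-RPC tools/call request to a registered handler."""
    msg = json.loads(raw)
    tool = msg.get("params", {}).get("name")
    if tool not in TOOLS:
        # -32601 is the JSON-RPC "method not found" error code.
        return {"jsonrpc": "2.0", "id": msg.get("id"),
                "error": {"code": -32601, "message": f"unknown tool: {tool}"}}
    result = TOOLS[tool](msg["params"].get("arguments", {}))
    return {"jsonrpc": "2.0", "id": msg.get("id"), "result": result}

request = json.dumps({
    "jsonrpc": "2.0", "id": 1, "method": "tools/call",
    "params": {"name": "lookup_order", "arguments": {"order_id": "A42"}},
})
assert handle_message(request)["result"]["status"] == "shipped"
```

Wiring handlers like this to proprietary data sources is what lets a language model call in-house tools through a uniform protocol.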
Technical Focus
Bridging the gap between cutting-edge research and functional deployments. My work focuses on scalable, local-first AI solutions that maintain data sovereignty and high performance.