About Me

🔍 Research Journey

My journey into understanding complex systems began at the Indian Institute of Technology Roorkee, where I completed my Bachelor's in Engineering Physics, graduating with the Department Gold Medal (9.57/10) and the Best Thesis Award in 2023. What began as curiosity about signal processing in Dr. R.S. Anand's lab, where I worked on EEG-based epilepsy detection, evolved into a deeper fascination with neural systems and their computational underpinnings.

I recently completed my Master’s in Neural Information Processing at the University of Tübingen (March 2025, 1.24/4.0), supported by the Deutschlandstipendium scholarship. My thesis on “Beyond Benchmarks: A Novel Framework for Domain-Specific LLM Evaluation and Knowledge Mapping” under Prof. Matthias Bethge and Dr. Çağatay Yıldiz at the Bethge Lab has now been extended into my current research work.

Throughout my studies, I’ve been exploring the intersection of healthcare and AI as a research assistant at the Mental Health Mapping Lab under Dr. Thomas Wolfers, developing interpretable models for clinical applications. I’m particularly proud of our work on postoperative delirium prediction, where we emphasized model interpretability through SHAP values to ensure clinical relevance.

To complement my research in healthcare applications, I had the opportunity to explore two fascinating domains through my rotations. My essay rotation on “Large Language Models and Psychotherapy: Bridging the Gap with Mechanistic Interpretability” under Dr. Thomas Wolfers earned me the Best Presentation Award from the Graduate Training Centre of Neuroscience. Additionally, I pursued a lab rotation at the Dayan lab (MPI for Biological Cybernetics) under Dr. Sara Ershadmanesh, investigating metacognitive abilities in reversal learning tasks. These experiences deepened my understanding of how both biological and artificial systems learn and adapt to changing environments, while highlighting the crucial role of interpretability in complex systems.

🧪 Current Focus

Since April 2025, I have been working as a Research Assistant at the University of Tübingen under Dr. Thomas Wolfers and Dr. Çağatay Yıldiz, focusing on two exciting projects in mechanistic interpretability:

Steering Vectors for Knowledge Access: I'm developing activation engineering techniques that surface latent knowledge in language models without any additional training, analyzing how domain knowledge emerges as targetable directions across model layers and how those directions can be used for systematic control.
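The core idea can be sketched in a few lines. This is a toy illustration with random activations, not my actual pipeline: a steering vector is commonly taken as the difference of mean activations between domain-evoking and neutral prompts at a chosen layer, then added to the hidden state at inference time.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 16  # toy hidden size

# Toy residual-stream activations for two prompt sets:
# one evoking the target domain, one neutral.
domain_acts = rng.normal(loc=1.0, size=(32, d_model))
neutral_acts = rng.normal(loc=0.0, size=(32, d_model))

# Difference-of-means steering direction at this layer.
steering_vec = domain_acts.mean(axis=0) - neutral_acts.mean(axis=0)

def steer(hidden, vec, alpha=1.0):
    """Add the scaled steering vector to a hidden state."""
    return hidden + alpha * vec

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

h = rng.normal(size=d_model)
h_steered = steer(h, steering_vec, alpha=2.0)

# Steering moves the state toward the domain direction.
print(cos(h_steered, steering_vec) > cos(h, steering_vec))  # → True
```

Varying `alpha` trades off how strongly the output is pushed toward the domain against how much the original representation is preserved.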

Domain-Specific LLM Evaluation: Building on my Master’s thesis, I recently released “Beyond Benchmarks: A Novel Framework for Domain-Specific LLM Evaluation and Knowledge Mapping” (January 2025). This work presents a deterministic pipeline for creating contamination-free benchmarks from raw corpora, tested on massive datasets including arXiv (1.56M documents) and M2D2 (8.5B tokens).
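To give a flavour of what "deterministic" and "contamination-free" mean in this setting, one standard check is hashed n-gram overlap between a benchmark candidate and the training corpus. The sketch below is illustrative only; the function names and threshold are placeholders, not the framework's API.

```python
import hashlib

def ngram_hashes(text, n=8):
    """Deterministic set of hashed word n-grams for a document."""
    words = text.lower().split()
    return {
        hashlib.sha1(" ".join(words[i:i + n]).encode()).hexdigest()
        for i in range(len(words) - n + 1)
    }

def is_contaminated(candidate, corpus_hashes, n=8, threshold=0.5):
    """Flag a candidate whose n-grams overlap the training corpus."""
    cand = ngram_hashes(candidate, n)
    if not cand:
        return False
    overlap = len(cand & corpus_hashes) / len(cand)
    return overlap >= threshold

# Toy "training corpus" of a single document.
corpus = ("the quick brown fox jumps over the lazy dog "
          "near the river bank today")
corpus_hashes = ngram_hashes(corpus)

leaked = "the quick brown fox jumps over the lazy dog near the river"
fresh = ("completely unrelated words about astronomy and "
         "distant spiral galaxies tonight")

print(is_contaminated(leaked, corpus_hashes))  # → True
print(is_contaminated(fresh, corpus_hashes))   # → False
```

Because hashing and splitting are deterministic, the same corpora always yield the same benchmark, which makes the resulting evaluations reproducible.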

I’m also proud of our work on “Investigating Continual Pretraining in Large Language Models”, which was accepted at TMLR and has garnered 37+ citations. This research revealed fascinating insights about how model size affects learning and forgetting during continual pretraining, and how domain sequencing strategies impact knowledge transfer.

In parallel with my main research, I continue developing a GAMLSS-based Python package for neuroimaging applications at the Mental Health Mapping Lab. The package implements Generalized Additive Models for Location, Scale, and Shape; it is already in use by lab members, with a public release planned for 2025. I'm also co-supervising a master's student's project on ML applications in nerve disease diagnostics.
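As a toy illustration of the GAMLSS idea (not the package's API): rather than modeling only the mean, both the location and the scale of a measure are modeled as functions of a covariate such as age, and fitted jointly by maximum likelihood.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)

# Toy normative-modeling data: a measure whose mean and spread
# both change with age (heteroscedastic by construction).
age = rng.uniform(20, 80, size=500)
sigma_true = np.exp(-2.0 + 0.01 * age)
y = 2.0 + 0.05 * age + rng.normal(size=500) * sigma_true

def neg_log_lik(params):
    """Gaussian NLL with linear predictors for location and log-scale."""
    b0, b1, g0, g1 = params
    mu = b0 + b1 * age
    sigma = np.exp(g0 + g1 * age)  # log link keeps sigma positive
    return np.sum(np.log(sigma) + 0.5 * ((y - mu) / sigma) ** 2)

x0 = [y.mean(), 0.0, np.log(y.std()), 0.0]
res = minimize(neg_log_lik, x0, method="Nelder-Mead",
               options={"maxiter": 5000})
b0, b1, g0, g1 = res.x
print(f"mu slope ~ {b1:.3f}, log-sigma slope ~ {g1:.3f}")
```

In the full GAMLSS framework the linear predictors become smooth additive terms and the distribution need not be Gaussian, which is what makes it suitable for normative models of neuroimaging measures.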

Research Philosophy: Following Richard Feynman’s principle “What I cannot create, I do not understand,” my work aims to reverse-engineer the internal mechanisms of language models. By understanding how these systems process and represent knowledge, we can build more reliable, controllable, and interpretable AI systems that truly serve human needs.

👨‍🏫 Teaching & Mentoring

I believe in giving back to the academic community. As a Teaching Assistant for Neuromatch Academy's Deep Learning course (2024), I had the privilege of guiding international students through complex concepts. My experience also includes mentoring first-year students at IIT Roorkee and leading programming tutorials in the Academic Reinforcement Program.

🌱 Beyond Research

When I'm not delving into neural networks, you'll find me on the beach volleyball court or training for long-distance runs; I've completed 100 km ultramarathon events at university! I'm also passionate about community service, having led initiatives like 'Daan Petika' during my time as an Executive at NSS IIT Roorkee, organizing blood donation camps and environmental cleanup drives. I believe in maintaining a balanced life that combines intellectual pursuits with physical activity and giving back to the community.

🤝 Let’s Connect

I’m passionate about advancing the field of interpretable AI and am always excited to discuss research, potential collaborations, or simply chat about the fascinating world of mechanistic interpretability. Whether you’re interested in knowledge representation in LLMs, activation engineering, or the intersection of AI and neuroscience, I’d love to connect!

If my work intrigues you, I’d be happy to give a presentation or engage in an in-depth discussion about research synergies. And of course, I’m always up for a chat about volleyball too! Feel free to reach out to me at nitinsharma3150@gmail.com.

Let’s explore how we might work together to push the boundaries of what we understand about these remarkable AI systems!