cv | Kaustubh Sharma

Basics

Name	Kaustubh Sharma
Label	Undergraduate Student / Researcher
Email	kaustubh_s@ee.iitr.ac.in
Url	https://kaustubh202.github.io
Summary	B.Tech (Electrical Engineering) student at IIT Roorkee (CGPA 9.04/10). Research interests and experience in mechanistic interpretability, diffusion models, Gaussian process inference, and ML for science & engineering. Author of multiple papers (ICCV and ICLR workshops) and incoming quantitative research intern at Goldman Sachs.

Work

2025.05 - Present
Undergraduate Research Assistant

P-square Lab, IIT Roorkee

Research on attention mechanisms for Prior-Data Fitted Networks (PFNs) and amortized kernel/hyperparameter inference for GP-style problems.
- Developed Decoupled-Value Attention (DVA) for PFNs to accelerate GP inference for physical equation solving (code: PSquare-Lab/DVA-PFN).
- Working on foundational architectures for amortized kernel hyperparameter inference and scaling PFNs.
2025.03 - 2025.03
Educator

Edufabrica Pvt. Ltd.

Delivered a 2-day workshop on Generative AI to 200+ students across India.
Incoming Quantitative Research Intern (Quantitative Strategist)

Goldman Sachs

Selected for the quantitative strategist internship involving financial modelling, statistical analysis and quantitative research. (Upcoming)

Education

2023.07 - 2027.06
B.Tech

Indian Institute of Technology Roorkee

Electrical Engineering

Awards

2025.01.01

Micron AI Hackathon 2025 — Second Runner Up

Micron Technology

Second Runner Up at Micron AI Hackathon 2025.
2023.06.01

JEE Advanced 2023

IIT / Joint Entrance Examination

Secured All India Rank 1624 among 100k+ applicants.
2023.05.01

JEE Main 2023

NTA

Secured All India Rank 1898 among 1.2M+ applicants.
2021.01.01

NTSE Scholar 2021

NCERT

Awarded the NTSE scholarship; among the top ~2000 candidates out of 900k+ applicants.
2022.01.01

KVPY Fellow 2022

IISc Bangalore / KVPY

Secured AIR 509 among 200k+ applicants; KVPY fellowship recipient.

Publications

2026

Dissecting Attention and MLP Roles: A Study of Domain Specialization in Large Language Models

Under review

First author (Kaustubh Sharma). Study on domain specialization in LLaMA 3-3B: forward pass profiling, probing, zero-out tests and analysis of fine-tuning shifts.
2025.09

Decoupled-Value Attention for Prior-Data Fitted Networks: GP Inference for Physical Equations

arXiv / Under review

First author (Kaustubh Sharma). Proposed DVA attention to improve Gaussian process inference speed/accuracy inside PFNs for physical equation solving.
2025

Image-Alchemy: Advancing Subject Fidelity in Personalized Text-to-Image Generation

DeLTa Workshop, ICLR 2025

First author (Kaustubh Sharma). Pipeline for personalizing Stable Diffusion XL that improves subject fidelity, using LoRA and segmentation-guided Img2Img to reduce overfitting and forgetting.
2025

Explainable AI-Generated Image Forensics: A Low-Resolution Perspective with Novel Artifact Taxonomy

APAI Workshop, ICCV 2025 (Proceedings)

Solo author (Kaustubh Sharma). Developed an interpretable pipeline for AI-generated image detection at low (32×32) resolution with an artifact taxonomy and explainability framework.

Skills

	Technical Skills
	Python (PyTorch, Transformers, NumPy, Diffusion)
	C++
	Git / GitHub
	Linux

	Machine Learning
	Mechanistic Interpretability
	Diffusion Models
	Autoregressive Modelling
	ML for Science & Engineering
	Language Models

	Mathematics & Statistics
	Probability Theory
	Gaussian Processes
	Optimization
	Statistics

Languages

	Hindi
	Native speaker

	English
	Advanced

	French
	Beginner

Interests

	Music
	Pianist (Music Section, IIT Roorkee)

	Swimming
	Member — IIT Roorkee Swimming Team
	Competitive Swimming

	Data Science
	Data Science Group — IIT Roorkee
	Mechanistic interpretability projects

Projects

2025.04 - Present
Domain Circuit Discovery in LLMs - Mechanistic Interpretability

Investigating how domain-specific knowledge emerges in LLaMA 3-3B: mapping domain-specific 'rooms' across the transformer architecture and evaluating causal effects, probe separability, zero-out tests, the hydra effect and fine-tuning shifts.
- Forward pass profiling
- Probe separability
- Zero-out tests
- Hydra effect analysis
- Fine-tuning shift evaluation
2025.05 - Present
Decoupled-Value Attention (DVA) for PFNs

Designed and implemented DVA attention to speed up GP-like inference in Prior-Data Fitted Networks for physical equation solving; codebase and experiments developed with P-square Lab collaborators.
- DVA attention design and implementation
- GP inference time reduction for physical equation solving
- Scaling PFNs
Sparsity-Aware Representation Learning for Jet Image Generation via Guided Latent Diffusion

Sparsity-aware latent diffusion framework for generating high-energy physics jet images with a custom VAE and a mean-pulling mechanism to improve reconstruction quality.
- Custom VAE with sparsity reconstruction loss
- Latent diffusion mean-pulling mechanism
