Anurag Mishra
I am a researcher focused on understanding how machine learning models work internally. My primary focus is mechanistic interpretability: tracing transformer circuits, testing causal hypotheses, and studying the internal representations and computations that drive model behavior.
I completed an M.S. in Artificial Intelligence at Rochester Institute of Technology in May 2025. My work spans mechanistic interpretability, LLM evaluation and reliability, multimodal deepfake detection, explainable AI, and statistical ML.
News
ICLR 2026 — Published a sole-author paper on time-series foundation models and mechanistic interpretability at the 1st ICLR Workshop on Time Series in the Age of Large Models.
CVPR 2026 — Served as a reviewer for the AIMS 2026 workshop on AI-generated media and security.
Joined Eaton Ventures (Rochester Appliances) as an AI Engineer.
Completed an M.S. in Artificial Intelligence at Rochester Institute of Technology.
Published research across IEEE, arXiv, and other venues in mechanistic interpretability, translation explainability, time-series foundation models, and applied machine learning.
Completed a B.Tech in Computer Science & Engineering at Sikkim Manipal Institute of Technology.
Research Interests
Mechanistic interpretability and causal circuit analysis. Activation tracing, attention-head profiling, and intervention-based analysis to identify task-relevant transformer mechanisms and measure causal contribution to final behavior.
LLM evaluation, reliability, and hallucination reduction. Benchmark-driven evaluation across frontier model families with retrieval-augmented prompting, stress tests, and regression checks for reasoning and factual consistency.
Performance-aware AI systems. Latency and throughput optimization, profiling, asynchronous orchestration, and runtime regression debugging for production model-serving systems.
Multimodal detection and robustness. CNN-transformer architectures for lip-sync manipulation detection, synthetic data generation, and out-of-distribution evaluation under realistic deployment constraints.
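The intervention-based analysis above can be sketched in a few lines. The toy two-layer network and its weights below are purely illustrative stand-ins for a transformer's components; real work applies the same patching idea to residual-stream activations via framework-level forward hooks.

```python
# Minimal activation-patching sketch on a toy 2-layer linear network.
# Weights and inputs are hypothetical; the point is the intervention:
# restore one hidden unit from a clean run into a corrupted run and
# measure how much of the clean output it recovers.

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

W1 = [[1.0, 0.0], [0.0, 1.0]]   # layer 1: identity (toy choice)
W2 = [[1.0, 1.0]]               # layer 2: sums the hidden units

def forward(x, patch=None):
    """Run the toy network, optionally overwriting one hidden unit.

    patch: (unit_index, value) pair replacing that hidden activation,
    mimicking a targeted intervention on a single component.
    """
    h = matvec(W1, x)
    if patch is not None:
        i, v = patch
        h[i] = v
    return matvec(W2, h)[0]

clean, corrupted = [1.0, 1.0], [0.0, 1.0]
clean_h = matvec(W1, clean)             # cached clean activations

base = forward(corrupted)                      # corrupted output: 1.0
patched = forward(corrupted, (0, clean_h[0]))  # restore unit 0 from clean run
effect = patched - base                        # causal contribution of unit 0
```

The `effect` score is the single-component analogue of the causal-contribution measurements described above: a large recovery when one unit is restored marks that unit as task-relevant.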
Publications
Dissecting Chronos: Sparse Autoencoders Reveal Causal Feature Hierarchies in Time Series Foundation Models
1st ICLR Workshop on Time Series in the Age of Large Models, 2026
Automatic Short Answer Grading Using a LSTM Based Approach
2023 IEEE World Conference on Applied Intelligence and Computing (AIC), 332-337, 2023
Mechanistic Interpretability of GPT-like Models on Summarization Tasks
arXiv preprint arXiv:2505.17073, 2025
Advancing Explainability in Neural Machine Translation: Analytical Metrics for Attention and Alignment Consistency
arXiv preprint arXiv:2412.18669, 2024
Analyzing the Impact of Climate Change With Major Emphasis on Pollution: A Comparative Study of ML and Statistical Models in Time Series Data
arXiv preprint arXiv:2405.15835, 2024
Projects
Causal Mechanisms of Backtracking in Reasoning Models
Studied backtracking behavior in reasoning models and identified causally important components through targeted interventions, measuring how backtracking dynamics correlate with answer correctness under controlled perturbations.
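The intervention logic of this project can be illustrated with a dependency-free sketch. The additive two-component "model", its weights, and the threshold below are hypothetical stand-ins; the real study ablates components of a reasoning model and tracks answer correctness.

```python
# Toy zero-ablation study: knock out one component and measure how often
# the model's discrete answer flips. Everything here is an illustrative
# stand-in for ablating components of a real reasoning model.

def model(x, ablate_b=False):
    """Two additive 'components'; the answer is 1 if their sum exceeds 1."""
    a = 0.6 * x
    b = 0.0 if ablate_b else 0.9 * x
    return 1 if a + b > 1.0 else 0

inputs = [0.5, 0.9, 1.1, 1.6, 2.0]          # hypothetical evaluation inputs
flips = sum(model(x) != model(x, ablate_b=True) for x in inputs)
flip_rate = flips / len(inputs)             # fraction of answers the ablation changes
```

A high flip rate under ablation is the sketch-level analogue of identifying a causally important component: removing it changes the final answer on a large fraction of inputs.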
DeFake: Multimodal Lip-Sync Manipulation Detection
Built CNN + transformer detection pipelines trained on 18,000+ videos, with custom C++/CUDA tensor kernels, synthetic data expansion, and robustness testing under deployment-like conditions.
LLM Evaluation and Robustness Harness
Designed benchmark workflows to evaluate reasoning, factual consistency, tool-calling behavior, and latency-quality trade-offs, with regression checks and retrieval-augmented prompting strategies that reduced hallucination on factual QA tasks.
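The regression-check idea behind this harness can be sketched minimally. The stand-in model, tiny eval set, and baseline value below are all hypothetical; a real harness would score frontier-model outputs across much larger benchmark suites.

```python
# Sketch of a benchmark regression gate: score a model on exact-match QA
# and fail if accuracy drops more than a tolerance below a stored baseline.
# The predict function, dataset, and baseline here are hypothetical.

def exact_match_accuracy(predict, dataset):
    hits = sum(predict(q).strip().lower() == a.strip().lower()
               for q, a in dataset)
    return hits / len(dataset)

def regression_check(predict, dataset, baseline, tolerance=0.02):
    """Return (accuracy, passed); passed is False on a regression."""
    acc = exact_match_accuracy(predict, dataset)
    return acc, acc >= baseline - tolerance

# Hypothetical stand-in model (a lookup table) and a tiny eval set.
answers = {"capital of France?": "Paris", "2 + 2?": "4",
           "largest planet?": "Jupiter"}
dataset = [("capital of France?", "Paris"), ("2 + 2?", "4"),
           ("largest planet?", "Saturn")]
acc, ok = regression_check(answers.get, dataset, baseline=0.60)
```

In a real pipeline the gate runs in CI: the baseline comes from the last accepted release, and a `passed=False` result blocks the release until the quality regression is explained or fixed.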
Experience
Industry
AI Engineer
- Architected and deployed production AI services supporting 10K+ daily requests at 99.9% uptime.
- Reduced end-to-end API latency by 35% through request-path and model-serving optimization.
- Implemented CI/CD validation and regression pipelines that reduced release-cycle time by 60%.
Research
Machine Learning Research Assistant, Mechanistic Interpretability
- Built activation tracing tooling in PyTorch/C++ for layer-wise transformer analysis.
- Identified summarization-relevant circuits in layers 2, 3, and 5.
- Applied targeted LoRA interventions with 40% faster convergence and 75% fewer trainable parameters.
Machine Learning Research Assistant, LLM Evaluation and Robustness
- Designed evaluation harnesses for frontier models across reasoning, factuality, and tool calling.
- Developed retrieval-augmented prompting workflows that reduced hallucination by 35%.
- Built reproducible benchmark suites for quality and runtime regression tracking.
Machine Learning Research Assistant, DeFake Project
- Developed multimodal CNN + transformer systems on 18,000+ videos for lip-sync manipulation detection.
- Implemented custom C++/CUDA tensor kernels that improved training efficiency by 40%.
- Expanded stress-test data with 50K+ synthetic samples and improved OOD generalization by 18%.
Education
M.S. in Artificial Intelligence
CGPA: 3.89/4.00.
Research emphasis: Mechanistic Interpretability, NLP, Multimodal Imaging, and Statistical Machine Learning.
B.Tech in Computer Science & Engineering
Minor in Artificial Intelligence. AI/ML Lead at Google Developer Student Club. Research intern at DRDO and NLP intern at multiple startups.
Writing
Substack
Current public writing is published externally rather than hosted as on-site blog posts.