Anshul Singh

Experience & Education

Research Associate

IISc, Bangalore

Aug 2025 – Present

Research Intern

LT Group, UHH

Jan 2025 – May 2025

Undergrad Researcher

Dalhousie University, Halifax

Oct 2024 – April 2025

Research Intern

MITACS, Canada

June 2024 – Sep 2024

ML Research Intern

IIT, Roorkee

June 2023 – July 2023

B.E Information Technology

Panjab University

Sep 2021 – June 2025

News

Feb 2026:

Paper - Our paper M4-RAG: A Massive-Scale Multilingual Multi-Cultural Multimodal RAG is accepted at CVPR 2026! 🎉
Nov 2025:

Paper - Our paper Lost in Translation and Noise: A Deep Dive into Failure-Mode of VLLMs on Real-World Tables is accepted at EurIPS Workshop on AI for Tabular Data. (Oral 🎉)
Aug 2025:

Paper - Our paper MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual Space is accepted at EMNLP 2025 (Findings).
Aug 2025:

Position - Started working as Research Associate at IACV Lab, IISc Bangalore.
May 2025:

Paper - Our new preprint, MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual Space, is now available on ArXiv.
Jan 2025:

Position - Started a research internship at Language Technology Group, University of Hamburg, Germany.
Oct 2024:

Position - Started working as undergraduate researcher at SMART Lab, Dalhousie University.
July 2024:

Position - Selected for Mitacs Globalink Research Internship at Dalhousie University, Nova Scotia, Canada.
Nov 2023:

Paper - Our work Comparative Analysis of State-of-the-Art Attack Detection Models was accepted at 14th International Conference on Computing Communication and Networking Technologies (ICCCNT).
June 2023:

Position - Started a Machine Learning Research Intern at Virtual Labs, IIT Roorkee.
March 2022:

Position - Started working as Project Intern at Design & Innovation Centre, Panjab University.

Research

MTabVQA: Evaluating Multi-Tabular Reasoning of Language Models in Visual Space
Anshul Singh, Chris Biemann, Jan Strich
Empirical Methods of Natural Language Processing (EMNLP), 2025 Findings
Paper / Dataset / Poster

In this work, we address a critical gap in Vision-Language Model (VLM) evaluation by introducing MTabVQA, a novel benchmark for multi-tabular visual question answering. Our benchmark comprises 3,745 complex question-answer pairs that require multi-hop reasoning across several visually rendered table images, simulating real-world documents. We benchmark state-of-the-art VLMs, revealing significant limitations in their ability to reason over complex visual data. To address this, we release MTabVQA-Instruct, a large-scale instruction-tuning dataset. Our experiments demonstrate that fine-tuning with our dataset substantially improves VLM performance, bridging the gap between existing benchmarks that rely on single or non-visual tables.

Comparative Analysis of State-of-the-Art Attack Detection Models
Priyanka Kumari, Veenu Mangat, and Anshul Singh
14th International Conference on Computing Communication and Networking Technologies (ICCCNT), 2023
Paper

In this work, we address the growing security challenges in IoT networks by conducting a comprehensive comparative analysis of machine learning classifiers for intrusion detection. We evaluated five distinct models on two real-world IoT network traffic datasets to identify the most effective algorithms for detecting malicious activity. Our findings show that tree-based models, specifically Random Forest and Decision Trees, deliver outstanding performance, achieving accuracies exceeding 99%. This research provides a clear benchmark and practical guidance for developing robust and high-performance security systems to protect vulnerable IoT environments.

Reflections

My DIC Journey

thoughts

Research at IIT

thoughts