Present

Hi! I am Anupam. I am currently a Postdoctoral Researcher in the Systems Group at TU Darmstadt. My research interests lie in and around database systems. My current research focuses on the testing and benchmarking of database systems and on foundation models for data engineering.

Experience

Postdoctoral Researcher  May 2024 - Present

TU Darmstadt, Germany.
Host: Prof. Carsten Binnig

Research Scientist  Aug 2022 - Feb 2024

IBM Research, Bengaluru, India.

Project Intern  May 2021 - Aug 2021

IBM Research, India.

Technical Project Leader  Aug 2016 - Aug 2017

2012 Labs (Database Team), Huawei Technologies India Pvt. Ltd.

Project Intern  May - July 2013

CSE Dept., IIT Bombay

Teaching


Co-Instructor  Oct 2024 - Mar 2025

Extended Seminar: AI for Data Management, TU Darmstadt

Teaching Assistant  Aug - Dec 2019

Course: E0225 Design and Analysis of Algorithms, CSA Dept., IISc

Teaching Assistant  Aug - Dec 2015, Aug - Dec 2018

Course: E0261 Database Management Systems, CSA/CDS Dept., IISc

Teaching Assistant  Aug - Dec 2018

Course: UE101 Algorithms and Programming, UG Dept., IISc

Teaching Assistant  Jan 2014 - May 2014

Course: Information Systems Lab, CS/IT Dept., JIIT

Education

Ph.D. - Computer Science and Engineering  2017 - 2022

Computer Science and Automation Dept., Indian Institute of Science, Bangalore.
Advisor: Prof. Jayant Haritsa
Thesis: Hydra: A Dynamic Approach to Database Regeneration.

M.E. - Computer Science and Engineering  2014 - 2016

Computer Science and Automation Dept., Indian Institute of Science, Bangalore.

B.Tech. - Information Technology  2010 - 2014

Computer Science and IT Dept., Jaypee Institute of Information Technology, NOIDA.

Higher Secondary Education  2010

No. 1 Air Force School, Gwalior (CBSE)

Publications


Beyond Row Counts: Enhancing Workload-Aware Data Synthesis

A. Sanghi
EDBT Workshop: 27th Intl. Workshop on Design, Optimization, Languages and Analytical Processing of Big Data (DOLAP), Barcelona, Spain, March 2025.

LLMs for Enterprise Data Engineering

J. Bodensohn, L. Vogel, A. Sanghi, C. Binnig
ELLIS Workshop on Representation Learning and Generative Models for Structured Data, Amsterdam, Netherlands, February 2025.

Automating Enterprise Data Engineering with LLMs

J. Bodensohn, U. Brackmann, L. Vogel, A. Sanghi, C. Binnig
NeurIPS Workshop: Table Representation Learning Workshop (TRL), Vancouver, Canada, December 2024.

LLMs for Data Engineering on Enterprise Data

J. Bodensohn, U. Brackmann, L. Vogel, M. Urban, A. Sanghi, C. Binnig
VLDB Workshop: Tabular Data Analysis Workshop (TaDA), Guangzhou, China, September 2024.

Surprise Benchmarking: The Why, What, and How

L. Benson, C. Binnig, J. Bodensohn, F. Lorenzi, J. Luo, D. Porobic, T. Rabl, A. Sanghi, R. Sears, P. Tözün, and T. Ziegler (alphabetically sorted)
SIGMOD Workshop: 10th Intl. Workshop on Testing Database Systems (DBTest), Santiago, Chile, June 2024.

Tabular Data Synthesis with GANs for Adaptive AI Models

Sandeep Hans*, Anupam Sanghi*, and Diptikalyan Saha
Proc. of 7th Joint Intl. Conf. on Data Science & Management of Data (CODS-COMAD), Bangalore, India, January 2024.
* (equal contribution)

Synthetic Data Generation for Enterprise DBMS   (tutorial)

A. Sanghi and J. Haritsa
Proc. of 39th IEEE Intl. Conf. on Data Engineering (ICDE), Anaheim, California, USA, April 2023.

Semantic Automation for Data Discovery   (tutorial)

Rajmohan C, R. Chaudhuri, B. Ganesan, A. Sanghi, A. Agarwal and S. Mehta
Proc. of 6th Joint Intl. Conf. on Data Science & Management of Data (CODS-COMAD), January 2023.

Projection-Compliant Database Generation

A. Sanghi, S. Ahmed and J. Haritsa
PVLDB Journal, 15(5), January 2022, pgs. 998-1010.

Towards Generating HiFi Databases

A. Sanghi, Rajkumar S. and J. Haritsa
Proc. of 26th Intl. Conf. on Database Systems for Advanced Applications (DASFAA), Taipei, Taiwan ROC, April 2021.

HYDRA: A Dynamic Big Data Regenerator   (demo)

A. Sanghi, R. Sood, D. Singh, J. Haritsa and S. Tirthapura
PVLDB Journal, 11(12), August 2018, pgs. 1974-1977.

Scalable and Dynamic Regeneration of Big Data Volumes

A. Sanghi, R. Sood, J. Haritsa and S. Tirthapura
Proc. of 21st Intl. Conf. on Extending DataBase Technology (EDBT), Vienna, Austria, March 2018.

Achievements

  • Our paper received the Best Short Paper Award at VLDB Workshop on Tabular Data Analysis (TaDA), Aug. 2024.
  • Received Distinguished Reviewer Award for Applied Data Science Research Track, CODS-COMAD, 2024.
  • Awarded IBM PhD Fellowship 2019. Thanks IBM!
  • Received Microsoft Research Travel Grant to attend VLDB 2018, Rio de Janeiro, Brazil, and VLDB 2022, Sydney, Australia.
  • Received VLDB Travel Fellowship to attend VLDB 2018, Rio de Janeiro, Brazil, and VLDB 2022, Sydney, Australia.
  • Received Best Poster Award at the Young Researchers' Symposium, CODS-COMAD 2018.
  • Received the Future Star Award 2017 at Huawei Technologies India Pvt. Ltd.
  • Secured All India Rank 22 with a score of 953 in Graduate Aptitude Test in Engineering (GATE), Computer Science, 2014.
  • Awarded the CBSE merit certificate for the top 0.1% of candidates in English (Class 12) and Mathematics (Class 10).

Service

  • Program committee member for SIGMOD 2026, DBTest 2024, CODS-COMAD 2024 (Jan. edition), and CODS-COMAD 2024 (Dec. edition).
  • Participant in the Dagstuhl Seminar on Ensuring Reliability and Robustness of Database Management Systems, 2021 and 2023.
  • Member of the Diversity Council at IBM Research, India, May 2023 - Feb 2024.
  • Chair for a Session on Benchmarking, Performance Modeling, Tuning, and Testing at ICDE 2023.
  • Member of the Student Welfare Committee at the CSA Dept., IISc, 2020-2022.