Featured research
ConTextTab: A Semantics-Aware Tabular In-Context Learner
By leveraging LLM embeddings, ConTextTab integrates semantics from table features into tabular prediction, excelling on data with high semantic content such as free text or descriptive categories.
RELATE: A Schema-Agnostic Perceiver Encoder for Multimodal Relational Graphs
We introduce RELATE (Relational Encoder for Latent Aggregation of Typed Entities), a schema-agnostic, plug-and-play feature encoder that can be used with any general-purpose GNN.
Foundation Models for Tabular Data within Systemic Contexts Need Grounding
We propose Foundation Models for Semantically Linked Tables (FMSLT) to advance the understanding of structured enterprise data. Enterprise tables are interconnected through operational logic and semantic relationships that define how businesses operate. Recognising and modelling these connections is essential for capturing the true nature of enterprise data.
Open-Source Enterprise Datasets
We introduce SALT and SALT-KG, the first enterprise datasets built from real customer ERP systems. They combine rich, linked business tables with a curated knowledge graph capturing semantic context. Together, they lay the foundation for advancing foundation models that truly understand structured enterprise data.
Paper and Open Source Foundation Model on Tabular Data: SAP-RPT-1-OSS
Our ConTextTab research paper was published at NeurIPS 2025 as a spotlight paper, and we have released an open-weight version of our model as SAP-RPT-1-OSS.
Who we are
At SAP Business AI Research, we serve as the bridge between academia and industry, dedicated to advancing next-generation AI systems. Our research addresses the complexities of real-world enterprise environments by integrating cutting-edge AI techniques with domain-specific challenges. We focus on two main research tracks to ensure that our models are not only powerful but also practical, trustworthy, and scalable.
Research areas
Track A: Structure-Aware Foundation Models
We develop foundation models that reason over complex, linked business data—spanning tables, time series, and graphs. By integrating structural awareness, multimodal inputs, and causal reasoning, our models enable advanced Business AI for analysis, forecasting, and decision-making.
Table representation learning
Learning tabular data representations via table-native and language-based models, integrating business data for advanced reasoning.
Graph neural networks
Using Graph Neural Networks to model relational tabular data, enabling accurate predictions and deeper insights in enterprise AI.
Business knowledge graph
Building enterprise knowledge graphs to enable precise, context-aware queries across diverse business data.
Agentic AI
Building self-improving agents for reliable, goal-driven automation in enterprise systems.
Coding LLM (ABAP)
Empowering enterprise software development with domain-specific ABAP foundation models for intelligent coding assistance.
Track B: Trustworthy AI
Our research develops AI systems that are robust, fair, transparent, and aligned with human values—essential for real-world enterprise use. We focus on robustness, explainability, fairness, privacy, and alignment with domain-specific constraints to ensure reliable and responsible AI deployment.
Differential privacy
We develop efficient deep learning models that save resources and protect privacy.
Data confidentiality
We ensure data confidentiality by protecting structured data and validating privacy through audits and attacks.
Model protection
Analysing sentiments in text using neural embedding and attention.
Security testing
Enhancing model transparency by making predictions explainable.
Human alignment
Extracting data from documents using NLP and computer vision.
Careers
Join us and build the future of Business AI
Work with rich datasets to find machine learning-based solutions to real-world problems in close collaboration with our global network of research partners.
PhD Internship (US): Foundation Models on Structured Data
PhD Internship (DE/EU): Foundation Models on Structured Data
PhD Internship (DE/EU): Agents and Knowledge Graph
PhD Internship (SGP): Document AI
Publications
TabGemma: Text-Based Tabular ICL via LLM using Continued Pretraining and Retrieval
Günther Schindler, Maximilian Schambach, Michael Medek, Sam Thelin, AITD Workshop, EurIPS, 2025
Sahil Bansal, Sai Shruthi Sistla, Aarti Arikatala, Sebastian Schreiber, IJCNLP-AACL, 2025
RGP: A Cross-Attention based Graph Transformer for Relational Deep Learning
Divyansha Lachi, Mahmoud Mohammadi, Joe Meyer, Vinam Arora, Tom Palczewski, Eva L Dyer, LoG Conference, 2025
Expanding the Action Space of LLMs to Reason Beyond Language
Zhongqi Yue, Weishi Wang, Yundaichuan Zhan, Juncheng Li, Daniel Dahlmeier, Fredrik D. Johansson, MATH-AI Workshop, NeurIPS, 2025
Smaller Models, Smarter Rewards: A Two-Sided Approach to Process and Outcome Rewards
Jan Groeneveld, Grace(Xi) Qin, Alexander Schaefer, Yaad Oren, FoRLM Workshop, NeurIPS, 2025
RELATE: A Schema-Agnostic Perceiver Encoder for Multimodal Relational Graphs
Joe Meyer, Divyansha Lachi, Mohammadi Reza, Roshan Reddy Upendra, Eva Dyer, Mark Li, Tom Palczewski, NPGML Workshop, NeurIPS, 2025
SPRINT: Scalable Secure & Differentially Private Inference for Transformers
Francesco Capano, Jonas Böhler, Benjamin Weggenmann, PETS, 2026
Mediating Cognitive Biases in Business Decisions Using LLMs
Natalie Friedman, Marcus Krug, S. Joy Mountford, CAI, 2025
Combining Knowledge Graphs and Retrieval Augmented Generation for Enterprise Resource Planning
Amar Viswanathan and Felix Sasaki, ECIR, 2025
ConTextTab: A Semantics-Aware Tabular In-Context Learner
Marco Spinaci, Marek Polewczyk, Maximilian Schambach, Sam Thelin, NeurIPS (Spotlight), 2025
ConTextTab: A Semantics-Aware Tabular In-Context Learner
Marco Spinaci, Marek Polewczyk, Maximilian Schambach, Sam Thelin, VFMSD Workshop, ICML (Best Tabular Paper), 2025
Evaluation and Benchmarking of LLM Agents: A Survey
Mahmoud Mohammadi, Yipeng Li, Jane Lo, Ching-Wa (Wendy) Yip, KDD, 2025
Optimisation before Evaluation: Evaluation with Unoptimised Prompts Can be Misleading
Nicholas Sadjoli, Tim Siefken, Atin Ghosh, Yifan Mai, Daniel Dahlmeier, Industry Track, ACL, 2025
Table dissolution: Adding Salt To Your Data
Francesco Pugnaloni, Tassilo Klein, Felix Naumann, DEEM Workshop, SIGMOD/PODS, 2025
Tassilo Klein and Moin Nabi, ACL, 2025