Business AI Research

Creating AI breakthroughs that redefine businesses.
NEWS
Paper and Open-Source Foundation Model for Tabular Data: SAP-RPT-1-OSS

Our ConTextTab research paper has been published at NeurIPS 2025 as a spotlight paper, and we have released an open-weight version of our model as SAP-RPT-1-OSS.

Learn more

Who we are


At SAP Business AI Research, we serve as the bridge between academia and industry, dedicated to advancing next-generation AI systems. Our research addresses the complexities of real-world enterprise environments by integrating cutting-edge AI techniques with domain-specific challenges. We focus on two main research tracks to ensure that our models are not only powerful but also practical, trustworthy, and scalable.

Research areas

Track A: Structure-Aware Foundation Models

We develop foundation models that reason over complex, linked business data—spanning tables, time series, and graphs. By integrating structural awareness, multimodal inputs, and causal reasoning, our models enable advanced Business AI for analysis, forecasting, and decision-making.

Table representation learning

Learning tabular data representations via table-native and language-based models, integrating business data for advanced reasoning.

Graph neural networks

Using Graph Neural Networks to model relational tabular data, enabling accurate predictions and deeper insights in enterprise AI.

Business knowledge graph

Building enterprise knowledge graphs to enable precise, context-aware queries across diverse business data.

Agentic AI

Building self-improving agents for reliable, goal-driven automation in enterprise systems.

Coding LLM (ABAP)

Empowering enterprise software development with domain-specific ABAP foundation models for intelligent coding assistance.

Track B: Trustworthy AI

Our research develops AI systems that are robust, fair, transparent, and aligned with human values—essential for real-world enterprise use. We focus on robustness, explainability, fairness, privacy, and alignment with domain-specific constraints to ensure reliable and responsible AI deployment.

Differential privacy

We develop efficient deep learning models that provide formal privacy guarantees while conserving compute resources.

Data confidentiality

We ensure data confidentiality by protecting structured data and validating privacy through audits and attacks.

Model protection

Protecting trained models against theft, extraction, and misuse.

Security testing

Probing AI systems for vulnerabilities through systematic security testing and adversarial evaluation.

Human alignment

Aligning model behavior with human values, preferences, and domain-specific constraints.

Careers


Join us and build the future of Business AI

Work with rich datasets to find machine learning-based solutions to real-world problems in close collaboration with our global network of research partners.

Publications

TabGemma: Text-Based Tabular ICL via LLM using Continued Pretraining and Retrieval

Günther Schindler, Maximilian Schambach, Michael Medek, Sam Thelin, AITD Workshop, EurIPS, 2025

 

Planning Agents on an Ego-Trip: Leveraging Hybrid Ego-Graph Ensembles for Improved Tool Retrieval in Enterprise Task Planning

Sahil Bansal, Sai Shruthi Sistla, Aarti Arikatala, Sebastian Schreiber, IJCNLP-AACL, 2025

 

RGP: A Cross-Attention based Graph Transformer for Relational Deep Learning

Divyansha Lachi, Mahmoud Mohammadi, Joe Meyer, Vinam Arora, Tom Palczewski, Eva L. Dyer, LoG Conference, 2025

 

Expanding the Action Space of LLMs to Reason Beyond Language

Zhongqi Yue, Weishi Wang, Yundaichuan Zhan, Juncheng Li, Daniel Dahlmeier, Fredrik D. Johansson, MATH-AI Workshop, NeurIPS, 2025

 

Smaller Models, Smarter Rewards: A Two-Sided Approach to Process and Outcome Rewards

Jan Groeneveld, Grace (Xi) Qin, Alexander Schaefer, Yaad Oren, FoRLM Workshop, NeurIPS, 2025

 

RELATE: A Schema-Agnostic Perceiver Encoder for Multimodal Relational Graphs

Joe Meyer, Divyansha Lachi, Mohammadi Reza, Roshan Reddy Upendra, Eva Dyer, Mark Li, Tom Palczewski, NPGML Workshop, NeurIPS, 2025

 

SPRINT: Scalable Secure & Differentially Private Inference for Transformers

Francesco Capano, Jonas Böhler, Benjamin Weggenmann, PETS, 2026

Mediating Cognitive Biases in Business Decisions Using LLMS

Natalie Friedman, Marcus Krug, S. Joy Mountford, CAI, 2025

 

Combining Knowledge Graphs and Retrieval Augmented Generation for Enterprise Resource Planning

Amar Viswanathan, Felix Sasaki, ECIR, 2025

 

ConTextTab: A Semantics-Aware Tabular In-Context Learner

Marco Spinaci, Marek Polewczyk, Maximilian Schambach, Sam Thelin, NeurIPS (Spotlight), 2025

 

ConTextTab: A Semantics-Aware Tabular In-Context Learner

Marco Spinaci, Marek Polewczyk, Maximilian Schambach, Sam Thelin, VFMSD Workshop, ICML (Best Tabular Paper), 2025

 

Evaluation and Benchmarking of LLM Agents: A Survey

Mahmoud Mohammadi, Yipeng Li, Jane Lo, Ching-Wa (Wendy) Yip, KDD, 2025

 

Optimization before Evaluation: Evaluation with Unoptimized Prompts Can be Misleading

Nicholas Sadjoli, Tim Siefken, Atin Ghosh, Yifan Mai, Daniel Dahlmeier, Industry Track, ACL, 2025

 

Table dissolution: Adding Salt To Your Data

Francesco Pugnaloni, Tassilo Klein, Felix Naumann, DEEM Workshop, SIGMOD/PODS, 2025

 

Contrastive Perplexity for Controlled Generation: An Application in Detoxifying Large Language Models

Tassilo Klein, Moin Nabi, ACL, 2025

Hybrid Active Learning with Uncertainty-Weighted Embeddings

Yinan He, Lile Cai, Jingyi Liao, Chuan-Sheng Foo, TMLR, 2024

 

ClusterTabNet: Supervised clustering method for table detection and table structure recognition

Marek Polewczyk, Marco Spinaci, ICDAR, 2024

 

Play, Plug, and Fuse: Zero-Shot Joint Decoding via Word-Level Re-ranking Across Diverse Vocabularies

Sai Koneru, Miriam Exel, Matthias Huck, Jan Niehues, WMT, 2024

 

Post-Edits are Preferences Too

Nathaniel Berger, Stefan Riezler, Miriam Exel, Matthias Huck, WMT, 2024

 

How Effective is Synthetic Data and Instruction Fine-tuning for Translation with Markup using LLMs?

Raj Dabre, Haiyue Song, Miriam Exel, Bianka Buschbeck, Johannes Eschbach-Dymanus, Hideki Tanaka, EAMT, 2024

 

Prompting Large Language Models with Human Error Markings for Self-Correcting Machine Translation

Nathaniel Berger, Stefan Riezler, Miriam Exel, Matthias Huck, EAMT, 2024

Exploring the Effectiveness of LLM Domain Adaptation for Business IT Machine Translation

Johannes Eschbach-Dymanus, Frank Essenberger, Bianka Buschbeck, Miriam Exel, EAMT, 2024

 

Contextual Refinement of Translations: Large Language Models for Sentence and Document-Level Post-Editing

Sai Koneru, Miriam Exel, Matthias Huck, Jan Niehues, NAACL, 2024

 

Scalable Tabular Foundation Models via Content-Specific Tokenization

M. Spinaci, M. Polewczyk, J. Hoffart, M. C. Kohler, S. Thelin, T. Klein, TRL Workshop, NeurIPS, 2024

 

SALT: Sales Autocompletion Linked Business Tables Dataset

Tassilo Klein, Clemens Biehl, Margarida Costa, Andre Sres, Jonas Kolk, Johannes Hoffart, TRL Workshop, NeurIPS, 2025

 

Exploration of autoregressive models for in-context learning on tabular data

Stefan K. Baur, Sohyeong Kim, TRL Workshop, NeurIPS, 2024

 

Generalizing teacher networks for effective knowledge distillation across student architectures

Kuluhan Binici et al., BMVC, 2024

miCSE: Mutual information contrastive learning for low-shot sentence embeddings

T Klein, M Nabi, ACL, 2023

 

SCD: Self-Contrastive Decorrelation for Sentence Embeddings

T Klein, M Nabi, ACL, 2022

 

Attention is (not) all you need for commonsense reasoning

T Klein, M Nabi, ACL, 2019

 

Budget-aware adapters for multi-domain learning

R Berriel, S Lathuillere, M Nabi, T Klein, T Oliveira-Santos, N Sebe, ICCV, 2019

 

Contrastive self-supervised learning for commonsense reasoning

T Klein, M Nabi, ACL, 2020

Multimodal prototypical networks for few-shot learning

F Pahde, M Puscas, T Klein, M Nabi, WACV, 2021

 

Multimodal self-supervised learning for medical image analysis

A Taleb, C Lippert, T Klein, M Nabi, IPMI, 2021

 

DeepNAT: Deep convolutional neural network for segmenting neuroanatomy

C Wachinger, M Reuter, T Klein, NeuroImage, 2017

 

Learning to remember: A synaptic plasticity driven framework for continual learning

O Ostapenko, M Puscas, T Klein, P Jahnichen, M Nabi, CVPR, 2019

 

Differentially Private Federated Learning: A Client Level Perspective

RC Geyer, T Klein, M Nabi, On Device ML Workshop, NIPS, 2017
