Steffen Zeuch

Researcher at the German Research Center for Artificial Intelligence (DFKI) and Post-Doc at TU Berlin, working with Volker Markl, IAM, and DIMA.

During my Ph.D. I was a member of the DFG graduate school SOAMED.

Research Areas

Query processing, in-memory databases and index structures, distributed and streaming systems, modern chipset extensions, QEP parallelization and scheduling on multi/many core and heterogeneous hardware, database and compiler interaction (especially JIT), progressive optimization.

Current Research Questions

    How can streaming systems exploit the capabilities of modern hardware more efficiently?
    What changes in the system and algorithm design are necessary to enable the full potential of very fast network technologies, e.g., InfiniBand, that will be commonly available in the near future?
    How can we schedule query execution in the topology of the future, e.g., in the IoT scenario, where we face highly dynamic execution environments?



Address

DFKI Project Office Berlin
Alt-Moabit 91c
10559 Berlin, Germany

e-mail/phone

steffen.zeuch@dfki.de
+49 30 23895 1812

Biography

During my studies, I specialized in database technology and Business Intelligence. I am familiar with Business Intelligence concepts, from modeling business and ETL processes, through transforming these concepts into data layouts, to storing and querying the data in a distributed system. Through my work and internships in the SAP database department, I am also familiar with modern in-memory databases.
For my master's thesis at SAP, I developed a data structure for data items with infrequent access patterns that affects the entire database stack, ranging from SQL query transformation to bit-level organization.
My research as a Ph.D. student focused on multi-core and main-memory challenges for data access in query optimization. My work on accelerating tree structures using SIMD instructions appeared at EDBT 2014. Furthermore, I proposed a Query Task Model (QTM) that opens a design space for database schedules. With QTM, I generalized the modeling of parallel query execution such that different approaches become comparable. This model could be extended to schedule database queries across heterogeneous hardware. After that, I characterized the performance properties of relational selection operators on modern hardware in my publication at IMDM 2015. Finally, I built on these performance insights into individual operators to produce more efficient query execution plans. To this end, I proposed a progressive optimization algorithm in my last publication, at PVLDB 2016. I defended my Ph.D. thesis on April 27th, 2018, and it was graded summa cum laude.
After my Ph.D., I worked as a professional software engineer at SAP. Since July 2017, I have been a Senior Researcher at DFKI Berlin. Since July 2019, I have additionally been a Senior Researcher in the DIMA research group of Prof. Volker Markl. Currently, I explore the use of GPUs for database algorithms, the interaction between compilers and databases, and the optimization of Big Data algorithms on different hardware platforms.

Work Experience

  • Since July 2019 - DIMA @ TU Berlin, Senior Researcher
  • Since July 2017 - DFKI Berlin, Senior Researcher
  • Sep 2016 - June 2017 - SAP AG, Berlin Software Development (Security Team)
    • Development for SAP HANA - Design and development of authentication and cryptographic algorithms.
  • May 2013 - Aug 2013 - IBM Research Almaden, San Jose
    • Development for InfoScout team - Design and development of the InfoScout prototype.
  • Feb 2011 - Sep 2011 - Internship SAP AG, Berlin
    • Development for SAP HANA - Design and development of a data structure that is optimized for data with infrequent access patterns.
  • May 2009 - Jan 2011 - Working student at SAP, Berlin
    • Performance testing for HANA DB persistence layer - Development of tests and test evaluation. Design, implementation and administration of a benchmark framework.
  • Sep 2008 - Feb 2009 - Internship SAP AG, Berlin
    • Accelerate infrastructure deployment process - Automation of the deployment process for a test environment, including virtualization of the OS and installation and configuration of an SAP system.

Publications

International Conferences


12
"Analyzing Efficient Stream Processing on Modern Hardware": Steffen Zeuch, Sebastian Breß, Tilmann Rabl, Bonaventura Del Monte, Jeyhun Karimov, Clemens Lutz, Manuel Renz, Jonas Traub, Volker Markl. PVLDB 12(5): 516-530 (2019).
11
"An Overview of Hawk: A Hardware-Tailored Code Generator for the Heterogeneous Many Core Age": Sebastian Breß, Henning Funke, Steffen Zeuch, Tilmann Rabl, Volker Markl. BTW (Workshops) 2019.
10
"Performance Analysis and Automatic Tuning of Hash Aggregation on GPUs": Viktor Rosenfeld, Sebastian Breß, Steffen Zeuch, Tilmann Rabl, Volker Markl. DaMoN 2019.
9
"Efficient and Scalable k‑Means on GPUs": Clemens Lutz, Sebastian Breß, Tilmann Rabl, Steffen Zeuch, Volker Markl. Datenbank-Spektrum 18(3): 157-169 (2018).
8
"Generating custom code for efficient query execution on heterogeneous processors": Sebastian Breß, Bastian Köcher, Henning Funke, Steffen Zeuch, Tilmann Rabl, Volker Markl. VLDB J. 27(6): 797-822 (2018).
7
"Efficient k-means on GPUs": Clemens Lutz, Sebastian Breß, Tilmann Rabl, Steffen Zeuch, Volker Markl. DaMoN 2018.
6
"Exploiting Automatic Vectorization to Employ SPMD on SIMD Registers": Stefan Sprenger, Steffen Zeuch, Ulf Leser. CDE Workshops 2018.
5
"Cache-Sensitive Skip List: Efficient Range Queries on Modern CPUs": Stefan Sprenger, Steffen Zeuch, Ulf Leser. ADMS/IMDM@VLDB 2016.
4
"Non-Invasive Progressive Optimization for In-Memory Databases": S. Zeuch, H. Pirk, J.C. Freytag. PVLDB 9(14) (2016).
3
"Selection on Modern CPUs": S. Zeuch, J.C. Freytag. IMDM 2015.
2
"QTM: Modelling Query Execution with Tasks": S. Zeuch, J.C. Freytag. ADMS 2014.
1
"Adapting Tree Structures for Processing with SIMD Instructions": S. Zeuch, F. Huber, J.C. Freytag. EDBT 2014.

Misc (Posters, Demos, Theses)


PhD 2017
"Query Execution on Modern CPUs": Steffen Zeuch. Dissertation, Humboldt University Berlin, submitted June 2017.
Master
"Design and development of a data structure for an in-memory database that is optimized for data with infrequent access patterns": Steffen Zeuch. Master thesis, HTW Berlin and SAP AG, 2011.
Bachelor
"Implementation of an automated deployment process for a test environment": Steffen Zeuch. Bachelor thesis, HTW Berlin, 2009.
Poster
"Intelligent Conversation to Transform Data into Insight": E. Kandogan, M. Roth, S. Zeuch, et al. IBM Workshop on Big Data Analytics, 2013.

Theses & Teaching

Theses Supervision

  • Lorenzo Julian Toso: Efficient join operators on heterogeneous systems using RDMA and coprocessors. Master Thesis, TU Berlin, Germany, 2019.
  • Tobias Behrens: Energy Efficient Analytical Data Processing on ARM Architecture. Master Thesis, TU Berlin, Germany, 2019.
  • Haralampos Gavriilidis: Efficient Data Exchange between Data Processing Frameworks. Master Thesis, TU Berlin, Germany, 2019.
  • Phillip Grulich: Compiler Optimizations for Stream Processing Systems on Managed Languages. Master Thesis, TU Berlin, Germany, 2019.
  • Adrian Michalke: Out-of-Core GPU-accelerated Query Processing with Unified Memory. Bachelor Thesis, TU Berlin, Germany, 2018.
  • Janis von Bleichert: Code Generation for Stream Processing Systems. Master Thesis, TU Berlin, Germany, 2018.
  • Daniel Lunow: The Effect of Prefetching in Modern CPUs. Seminar Paper, Humboldt University Berlin, Germany, 2016.
  • Taras Iks: Konzept und Implementierung eines QTM-DLB auf Basis von PostgreSQL. Seminar Paper, Humboldt University Berlin, Germany, 2016.
  • Dennis Schneider: Kostenschätzung in verteilten Datenverarbeitungssystemen. Master Thesis, Humboldt University Berlin, Germany, 2015.
  • Tino Schernickau: Ein adaptiver Index für verteilte, strukturierte Laufzeitstatistiken. Master Thesis, Humboldt University Berlin, Germany, 2015.
  • Robert Przewozny: Erweiterung von Datenflussprogrammen um Operatoren zur Statistiksammlung. Diploma Thesis, Humboldt University Berlin, Germany, 2014.

Teaching

Summer term 2019:
  • BDAPRO: Big Data Analytics Project, Semester project.
Winter term 2018/19:
  • DBSEM: Foundation of Database Systems, Seminar.
Summer term 2018:
  • BDAPRO: Big Data Analytics Project, Semester project.
Winter term 2017/18:
  • DBSEM: Foundation of Database Systems, Seminar.
Summer term 2016:
  • Grundlagen von Datenbanksystemen (DBS I), Tutorial.
Summer term 2015:
  • Kompaktvorlesung: Einführung in C, Tutorial.
Summer term 2014:
  • Neue Konzepte und Techniken für Datenbanksysteme, Tutorial.
Winter term 2013/2014:
  • Implementierung von Datenbanken (DBS II), Tutorial.

© Steffen Zeuch   |   Last Update: 11 Jul 2019