Lauritz Thamsen | Postdoc @ TU Berlin

I am a research associate at TU Berlin, where I just finished my PhD in the distributed systems group of Odej Kao, working on resource management for distributed dataflows in the national research projects Stratosphere and the Berlin Big Data Center. Prior to that, I got my Bachelor's and Master's degrees in Software Engineering from Hasso Plattner Institute, University of Potsdam, where I was part of the Software Architecture Group of Robert Hirschfeld, of the HPI Research School, and of the HPI-Stanford Design Thinking Research Program. During my studies I was also at SAP Labs in Palo Alto, working on programming tools in the Technology Infrastructure Practice under Dan Ingalls, and at Signavio in Berlin, as a backend engineer.

Lauritz Thamsen

I am broadly interested in distributed systems and programming tools, with a particular focus on methods and technology for building and operating reliable data-intensive applications in the context of critical public infrastructures.

contact

lauritz.thamsen at acm.org
and at tu-berlin.de
+49 (30) 314 - 24539

office

TEL 1210
Ernst-Reuter-Platz 7
10587 Berlin

web

Twitter: @lauritzthamsen
GitHub: lauritzthamsen
LinkedIn: lauritzthamsen

news

  • December 2018: We're going to present some of our work on scheduling distributed dataflows at IEEE CloudCom 2018 and IEEE Big Data 2018.
  • November 2018: The research project WaterGridSense, in which we will develop a scalable analytics platform for continuously processing data from distributed sensors within water networks, was kicked off on November 5 at TU Berlin.
  • October 2018: In the winter semester I am lecturing on the Methods of Cloud Computing. I am also co-supervising a distributed systems bachelor's project on IoT data pipelines and a distributed systems master's project on applying machine learning for a more sustainable usage of water.
  • July 2018: Multiple new research projects will start this year in our group. If you are interested in scalable data analytics, distributed stream processing, and machine learning, please send me an email and come talk to us!
  • May 2018: I successfully defended my PhD thesis on May 4 in front of Odej Kao, Cesar de Rose, Andreas Polze, and Tilmann Rabl. [slides]
  • March 2018: I submitted my PhD thesis on the topic of Dynamic Resource Allocation for Distributed Dataflows to the faculty of Electrical Engineering and Computer Science at TU Berlin. [pdf]
  • December 2017: We are going to present some of our work on runtime prediction and dynamic resource management for batch processing jobs of distributed dataflow systems at IEEE CloudCom 2017 and PDCAT '17.
  • June 2017: I am going to present some of our work on scheduling recurring batch jobs in shared analytics clusters at IEEE BigData Congress 2017.
  • December 2016: We are presenting our take on runtime prediction for recurring distributed dataflow jobs at IEEE IPCCC 2016 and some work on visual dataflow programming at IEEE Big Data 2016.

availabilities

  • Collaboration: If you are interested in collaborating or working with me on anything related to building and operating scalable and reliable data-intensive applications in the context of physical infrastructures, please reach out!
  • Community service: I am available for community service in the areas of data-intensive applications and cluster/cloud computing. Please find an up-to-date list of my previous services below.
  • Theses: We are looking for motivated students for bachelor's and master's theses. Have a look at the proposed topics and the publications below.

publications

2018:
  • CoBell: Runtime Prediction for Distributed Dataflow Jobs in Shared Clusters. Ilya Verbitskiy, Lauritz Thamsen, Thomas Renner, and Odej Kao. To appear in the Proceedings of the 10th IEEE International Conference on Cloud Computing Technology and Science (CloudCom). IEEE. 2018. Acceptance rate 20%.
  • Scheduling Stream Processing Tasks on Geo-Distributed Heterogeneous Resources. Gerrit Janßen, Ilya Verbitskiy, Thomas Renner, and Lauritz Thamsen. To appear in the Proceedings of the 2018 IEEE International Conference on Big Data (IEEE BigData). To be presented at the First International Workshop on the Internet of Things Data Analytics (IoTDA). IEEE. 2018.
  • Learning Efficient Co-locations for Scheduling Distributed Dataflows in Shared Clusters. Lauritz Thamsen, Ilya Verbitskiy, Benjamin Rabier, and Odej Kao. To appear in Services Transactions on Big Data (Vol. 5, No. 1). Services Society. 2018.
  • Adaptive Resource Management for Distributed Data Analytics. Lauritz Thamsen, Thomas Renner, Ilya Verbitskiy, and Odej Kao. In Lucio Grandinetti, Seyedeh Leili Mirtaheri, Reza Shahbazian, Thomas Sterling, Vladimir Voevodin (eds.), Advances in Parallel Computing – Big Data and HPC: Ecosystem and Convergence. IOS Press. 2018. [pdf]
2017:
  • Ellis: Dynamically Scaling Distributed Dataflows to Meet Runtime Targets. Lauritz Thamsen, Ilya Verbitskiy, Jossekin Beilharz, Thomas Renner, Andreas Polze, and Odej Kao. In the Proceedings of the 9th IEEE International Conference on Cloud Computing Technology and Science (CloudCom). IEEE. 2017. Acceptance rate 29%. [pdf]
    • https://doi.org/10.1109/CloudCom.2017.37, © IEEE, 2017. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists or to reuse any copyrighted component of this work in other works must be obtained from IEEE.
  • SMiPE: Estimating the Progress of Recurring Iterative Distributed Dataflows. Jannis Koch, Lauritz Thamsen, Florian Schmidt, and Odej Kao. In the Proceedings of the 18th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT). IEEE. 2017. Acceptance rate 32%. [pdf]
    • https://doi.org/10.1109/PDCAT.2017.00034, © IEEE, 2017. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists or to reuse any copyrighted component of this work in other works must be obtained from IEEE.
  • Scheduling Recurring Distributed Dataflow Jobs Based on Resource Utilization and Interference. Lauritz Thamsen, Benjamin Rabier, Florian Schmidt, Thomas Renner, and Odej Kao. In the Proceedings of the 6th IEEE BigData Congress. IEEE. 2017. Acceptance rate 23%. [pdf]
    • https://doi.org/10.1109/BigDataCongress.2017.28, © IEEE, 2017. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists or to reuse any copyrighted component of this work in other works must be obtained from IEEE.
  • Adaptive Resource Management for Distributed Data Analytics Based on Container-level Cluster Monitoring. Thomas Renner, Lauritz Thamsen, and Odej Kao. In the Proceedings of the 6th International Conference on Data Science, Technology and Applications (DATA). SCITEPRESS. 2017. [pdf]
    • © SCITEPRESS, 2017. This contribution was presented at DATA 17. This is the authors' version of the work.
  • Addressing Hadoop’s Small File Problem With an Appendable Archive File Format. Thomas Renner, Johannes Müller, Lauritz Thamsen, and Odej Kao. In the Proceedings of the Big Data Analytics Workshop (BigDAW), co-located with the ACM International Conference on Computing Frontiers. ACM. 2017. [pdf]
    • © ACM, 2017. This is the authors' version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version will be published in the proceedings of the Big Data Analytics Workshop (BigDAW17).
2016:
  • Selecting Resources for Distributed Dataflow Systems According to Runtime Targets. Lauritz Thamsen, Ilya Verbitskiy, Florian Schmidt, Thomas Renner, and Odej Kao. In the Proceedings of the 35th IEEE International Performance Computing and Communications Conference (IPCCC). IEEE. 2016. Acceptance rate 26%. [pdf]
    • https://doi.org/10.1109/PCCC.2016.7820629, © IEEE, 2016. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists or to reuse any copyrighted component of this work in other works must be obtained from IEEE.
  • CoLoc: Distributed Data and Container Colocation for Data-Intensive Applications. Thomas Renner, Lauritz Thamsen, and Odej Kao. In the Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData). Presented at the 4th International Workshop on Distributed Storage Systems and Coding for Big Data. IEEE. 2016. [pdf]
    • https://doi.org/10.1109/BigData.2016.7840954, © IEEE, 2016. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists or to reuse any copyrighted component of this work in other works must be obtained from IEEE.
  • Visually Programming Dataflows for Distributed Data Analytics. Lauritz Thamsen, Thomas Renner, Marvin Byfeld, Markus Paeschke, Daniel Schröder, and Felix Böhm. In the Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData). Presented at the 3rd Workshop on Advances in Software and Hardware for Big Data to Knowledge Discovery (ASH). IEEE. 2016. [pdf]
    • https://doi.org/10.1109/BigData.2016.7840860, © IEEE, 2016. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists or to reuse any copyrighted component of this work in other works must be obtained from IEEE.
  • When to Use a Distributed Dataflow Engine: Evaluating the Performance of Apache Flink. Ilya Verbitskiy, Lauritz Thamsen, and Odej Kao. In the Proceedings of the IEEE International Conference on Cloud and Big Data Computing (CBDCom). IEEE. 2016. [pdf]
    • https://doi.org/10.1109/UIC-ATC-ScalCom-CBDCom-IoP-SmartWorld.2016.0114, © IEEE, 2016. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists or to reuse any copyrighted component of this work in other works must be obtained from IEEE.
  • Continuously Improving the Resource Utilization of Iterative Parallel Dataflows. Lauritz Thamsen, Thomas Renner, and Odej Kao. In the Proceedings of the IEEE International Conference on Distributed Computing Systems Workshops (ICDCSW). Presented at the International Workshop on Big Data and Cloud Performance (DCPerf). IEEE. 2016. [pdf]
    • http://dx.doi.org/10.1109/ICDCSW.2016.20, © IEEE, 2016. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists or to reuse any copyrighted component of this work in other works must be obtained from IEEE.
  • Aura: A Flexible Dataflow Engine for Scalable Data Processing. Tobias Herb, Lauritz Thamsen, Thomas Renner, and Odej Kao. In Andreas Knüpfer, Tobias Hilbrich, Christoph Niethammer, José Gracia, Wolfgang E. Nagel, Michael M. Resch (eds.), Tools for High Performance Computing 2015. Springer. 2016. [pdf]
2015:
  • Exploratory Authoring of Interactive Content in a Live Environment. Philipp Otto, Jaqueline Pollak, Daniel Werner, Felix Wolff, Bastian Steinert, Lauritz Thamsen, Marcel Taeumel, Jens Lincke, Robert Krahn, Daniel H. H. Ingalls, and Robert Hirschfeld. HPI Technical Reports, vol. 101. Hasso Plattner Institute. 2016. [pdf]
  • Lively Groups: Shared Behavior in a World of Objects without Classes or Prototypes. Tim Felgentreff, Jens Lincke, Robert Hirschfeld, and Lauritz Thamsen. In Proceedings of the Future Programming Workshop (FPW) 2015, co-located with the Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA). ACM. 2015. [pdf]
    • © ACM, 2015. This is the authors' version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version will be published in the proceedings of the Future Programming Workshop (FPW) 2015.
  • Network-Aware Resource Management for Scalable Data Analytics Frameworks. Thomas Renner, Lauritz Thamsen, and Odej Kao. In Proceedings of the First Workshop on Data-Centric Infrastructure for Big Data Science (DIBS) 2015, co-located with the 2015 IEEE International Conference on BigData (BigData). IEEE. 2015. [pdf]
    • http://dx.doi.org/10.1109/BigData.2015.7364083, © IEEE, 2015. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists or to reuse any copyrighted component of this work in other works must be obtained from IEEE.
  • Preserving Access to Previous System States in the Lively Kernel. Lauritz Thamsen, Bastian Steinert, and Robert Hirschfeld. In Hasso Plattner, Christoph Meinel, and Larry Leifer (eds.), Design Thinking Research: Making Design Thinking Foundational. Springer. 2015. [pdf]
  • Implicit Parallelism through Deep Language Embedding. Alexander Alexandrov, Andreas Kunft, Asterios Katsifodimos, Felix Schüler, Lauritz Thamsen, Odej Kao, Tobias Herb, and Volker Markl. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD). ACM. 2015. Acceptance rate 26%. [pdf]
    • © ACM, 2015. This is the authors' version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version will be published in the proceedings of the ACM SIGMOD international conference.
Previously:
  • Object Versioning to Support Recovery Needs: Using Proxies to Preserve Previous Development States in Lively. Bastian Steinert, Lauritz Thamsen, Tim Felgentreff, and Robert Hirschfeld. In Proceedings of the Dynamic Languages Symposium (DLS) 2014, co-located with the Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA). ACM. 2014. Acceptance rate 35%. [pdf]
    • © ACM, 2014. This is the authors' version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version will be published in the proceedings of the Dynamic Languages Symposium.
  • Orca: A Single-language Web Framework for Collaborative Development. Lauritz Thamsen, Anton Gulenko, Michael Perscheid, Robert Krahn, Robert Hirschfeld, and David A. Thomas. In Proceedings of the Conference on Creating, Connecting and Collaborating through Computing (C5) 2012. IEEE. 2012. [pdf]
    • http://dx.doi.org/10.1109/C5.2012.9, © IEEE, 2012. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists or to reuse any copyrighted component of this work in other works must be obtained from IEEE.

theses

  • Dynamic Resource Allocation for Distributed Dataflows. Lauritz Thamsen. PhD thesis submitted at TU Berlin in March 2018 and successfully defended on May 4, 2018. [pdf] [slides]
  • Object Versioning for the Lively Kernel: Preserving Access to Previous System States in an Object-oriented Programming System. Lauritz Thamsen. Master's thesis submitted at Hasso-Plattner-Institut, University of Potsdam, in May 2014. [pdf]
  • Object Collaboration in the Orca Web Framework. Lauritz Thamsen. Bachelor's thesis submitted at Hasso-Plattner-Institut, University of Potsdam, in June 2011. [pdf]

community service

  • Program committee: SDNCC 2016
  • External reviewer: IEEE Transactions on Parallel and Distributed Systems, IEEE Transactions on Services Computing, Springer's Cluster Computing, Euro-Par 2018, IEEE eScience 2018, IEEE ICCAC 2017
  • Volunteer: AOSD 2012, ESUG 2012


© Lauritz Thamsen   |   Last Update: 14 Nov 2018