I am looking for PhD/Postdoc researchers to start in Fall 2024 for my newly funded projects "AthenaRL: Scalable and Flexible Distributed Reinforcement Learning Systems" (~550k EUR, link) and "LARA: Large Language Models for Quantum Machine Learning Algorithms" (~350k EUR). Please find details in the vacancies section.
I am a tenure-track Assistant Professor in the Department of Computer Science at Aalto University, where I lead the Aalto Data-Intensive Systems group (ADIS). I am also affiliated with the Finnish Center for Artificial Intelligence (FCAI), the Helsinki Institute for Information Technology (HIIT), and the Quantum Doctoral Pilot Programme (QDOC). My research focuses on efficient data-intensive systems that translate data into value for decision making. It spans multiple layers of data-intensive systems, from scalable machine learning systems to distributed data management systems, as well as code optimization techniques. In short, I aim to answer the question: how can we co-design multiple layers of the software stack to improve the scalability, performance, and energy efficiency of machine learning systems? My long-term goal is to explore and understand the fundamental connections between data management and modern machine learning systems to make decision-making more transparent, robust, and efficient. Please find more details in my research statement and the press coverage.
Before joining Aalto, I was an Assistant Professor at Queen Mary University of London, after working as a post-doctoral researcher at Imperial College London in the Large-Scale Data & Systems (LSDS) group with Prof. Peter Pietzuch. Before Imperial, I was a research assistant and obtained my PhD in the Databases and Information Systems Group at Humboldt-Universität zu Berlin (HU), supervised by Prof. Matthias Weidlich. Before HU, I worked as a student assistant in the Parallel Programming group of Prof. Felix Wolf at RWTH Aachen University and the Technical University of Darmstadt.
Assistant Professor
Aalto University, Finland
Assistant Professor
Queen Mary University of London, UK
Post-doctoral researcher
Imperial College London, UK
Software Development Engineer
Amazon Web Services, Redshift Team
PhD in Computer Science
Humboldt-Universität zu Berlin, Germany
Visiting PhD student in Computer Science
University of Queensland, Australia
Visiting master student in Computer Science
RWTH Aachen University, Germany
Master of Science in Computer Science
Xi'an Jiaotong University, China
Bachelor of Science in Computer Science
Wuhan Institute of Technology, China
Funding agency: Research Council of Finland — Academy Project Funding
Duration: 2024-2028
Role: Principal investigator (PI)
Amount: 546,079 EUR (total cost 780,114 EUR)
Funding agency: Business Finland — Quantum Computing Research Call
Duration: 2024-2026
Role: Co-Principal investigator (Co-PI)
Amount: 349,915 EUR (out of 700,000 EUR)
Funding agency: The Finnish Doctoral Program Network in Artificial Intelligence
Duration: 2024-2027
Role: Principal investigator (PI)
Amount: 112,011 EUR
Funding agency: Research Council of Finland — the Finnish Quantum Flagship's Quantum Doctoral Pilot Programme
Duration: 2024-2027
Role: Principal investigator (PI)
Amount: 112,011 EUR
Marcel Wagenländer, Guo Li, Bo Zhao, Luo Mai, Peter Pietzuch
TENPLEX: Changing Resources of Deep Learning Jobs using Parallelizable Tensor Collections
In Proc. of the Symposium on Operating Systems Principles (SOSP), Austin, TX, USA, 2024.
Song Liu, Jie Ma, Zhenyuan Zhang, Xinhe Wan, Bo Zhao, Weiguo Wu
Scalpel: High Performance Contention-Aware Task Co-Scheduling for Shared Cache Hierarchy
In IEEE Transactions on Computers, 2024 (to appear)
Alessandro Fogli, Bo Zhao, Peter Pietzuch, Maximilian Bandle, Jana Giceva
OLAP on Modern Chiplet-Based Processors
In Proc. of the International Conference on Very Large Data Bases (VLDB), Guangzhou, China, 2024.
Lei You, Lele Cao, Mattias Nilsson, Bo Zhao, Lei Lei
Distributional Counterfactual Explanation With Optimal Transport
Preprint on arXiv, 2024.
Huanzhou Zhu*, Bo Zhao*, Gang Chen, Weifeng Chen, Yijie Chen, Liang Shi, Yaodong Yang, Peter Pietzuch, Lei Chen (*equal contribution)
MSRL: Distributed Reinforcement Learning with Dataflow Fragments
In Proc. of the USENIX Annual Technical Conference (USENIX ATC), Boston, MA, USA, 2023.
Song Liu, Xinhe Wan, Zengyuan Zhang, Bo Zhao, Weiguo Wu
TurboStencil: You Only Compute Once for Stencil Computation
In Future Generation Computer Systems, Volume 146, 2023.
Gururaghav Raman, Bo Zhao, Jimmy Chih-Hsien Peng, Matthias Weidlich
Adaptive incentive-based demand response with distributed non-compliance assessment
In Applied Energy, Volume 326, 2022.
Bo Zhao
State Management for Efficient Event Pattern Detection
Dissertation, Humboldt-Universität zu Berlin, 2022.
Bo Zhao, Han van der Aa, Nguyen Thanh Tam, Nguyen Quoc Viet Hung, Matthias Weidlich
EIRES: Efficient Integration of Remote Data in Event Stream Processing
In Proc. of the 47th ACM SIGMOD International Conference on Management of Data (SIGMOD), Xi'an, China, ACM, June 2021.
Bo Zhao, Nguyen Quoc Viet Hung, Matthias Weidlich
Load Shedding for Complex Event Processing: Input-based and State-based Techniques
In Proc. of the 36th IEEE International Conference on Data Engineering (ICDE), Dallas, TX, USA, IEEE, April 2020.
Gururaghav Raman, Jimmy Chih-Hsien Peng, Bo Zhao, Matthias Weidlich
Dynamic Decision Making for Demand Response through Adaptive Event Stream Monitoring
In Proc. of 2019 IEEE Power & Energy Society General Meeting (PESGM), Atlanta, GA, USA. IEEE, August 2019.
Bo Zhao
Complex Event Processing under Constrained Resources by State-based Load Shedding
In Proc. of the 34th IEEE International Conference on Data Engineering (ICDE), Paris, France, IEEE, April 2018.
Song Liu, Bo Zhao, Qing Jiang, Weiguo Wu
A Semi-Automatic Coarse-Grained Parallelization Approach for Loop Optimization And Irregular Code Sections
In Chinese Journal of Computers, 2017, 40(9): 2127-2147.
Bo Zhao, Zhen Li, Ali Jannesari, Felix Wolf, Weiguo Wu
Dependence-Based Code Transformation for Coarse-Grained Parallelism
In Proc. of the International Workshop on Code Optimisation for Multi and Many Cores (COSMIC) held in conjunction with CGO, San Francisco, CA, USA, ACM, February 2015.
Zhen Li, Bo Zhao, Ali Jannesari, Felix Wolf
Beyond Data Parallelism: Identifying Parallel Tasks in Sequential Programs
In Proc. of the 15th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP), Zhangjiajie, China, Lecture Notes in Computer Science, Springer, November 2015.
Song Liu, Weiguo Wu, Bo Zhao, Qing Jiang
Loop Tiling for Optimization of Locality and Parallelism
In Journal of Computer Research and Development, 2015, 52(5): 1160-1176.
Email: bo.zhao@aalto.fi
Tel.: +358 503227953
Mail Address:
Tietotekniikan laitos, P.O.Box 15400, FI-00076 Aalto
Visiting Address:
A108, Department of Computer Science, Aalto University
Konemiehentie 2, 02150 Espoo, Finland
Our group is funded by the Research Council of Finland, Business Finland, the Finnish Doctoral Program Network in Artificial Intelligence (AI-DOC), and the Quantum Doctoral Pilot Programme (QDOC).
Dr. Tuo Shi (Fall 2024)
Cong Yu (Dec. 2023-)
Mustapha Abdullahi (Apr. 2024-)
Jiaxin Guo (Jun. 2024-)
Songlin Jiang (Sept. 2024-)
Alireza Samar (Oct. 2024-)
Zheyue Tan (Jan. 2025-)
Zhongxuan Xie (Feb. 2024-)
Linus Jern (Fall 2024-)
Wenxuan Li (Summer 2024. Home university: University of Cambridge, now intern at Microsoft Research Asia)
Shilong Deng (Spring 2024. Home university: University of Liverpool)
CV Download (updated Aug. 2024)
Organizing Committee: UbiComp'25 Local Arrangement Chair
Programme Committee: CIKM'21, CIKM'22, CIKM'23, CoNEXT'24, VLDB'25, ICDE'25
Availability Committee: SIGMOD'22, SIGMOD'23
Demonstration Track Committee: ICDE'23, ICDE'24
Journal Reviewer: TPDS'23, JMLR'24
Journal Editor: Proceedings of the ACM on Networking (PACMNET)
Co-Chair: The great talk series, CS Department, Aalto University
The Aalto Data-Intensive Systems group is seeking full-time doctoral and postdoctoral researchers in scalable machine learning systems and distributed data-intensive systems. Due to the large volume of applications, I am not able to reply to all candidates, but I promise to carefully go over each application. Thanks!
[About the group] We conduct research on efficient data-intensive systems that translate data into value for decision making. Our research spans multiple subfields, from scalable data-centric machine learning systems to distributed data stream management systems, as well as code optimization techniques. In short, we aim to answer the question: how can we co-design multiple layers of the software stack to improve the scalability, performance, and energy efficiency of ML systems? Our long-term goal is to explore and understand the fundamental connections between data management and modern ML systems to make decision-making more transparent, robust, and efficient. Please find more details in our research statement.
We have Research Fellow and Postdoctoral Fellow Positions funded by The Helsinki Institute for Information Technology (HIIT). Please find details here.
Successful applicants will conduct impactful research in the field of data-intensive systems and their applications, publish research results in top-tier conferences, and collaborate with other researchers and visit world-leading research groups and industry labs within our international network (e.g., Imperial College, TUM, MPI-SWS, HU Berlin, NUS, Uni Edinburgh, AWS, Huawei). Previously, we have had the pleasure of working with the following students: Marcel Wagenländer, Alessandro Fogli, Jinnan Guo, and Mustapha Abdullahi.
The Department of Computer Science is the largest department at Aalto and one of the leading computer science research units in northern Europe. It is routinely ranked among the top 10 in Europe and the top 100 globally (Shanghai ranking 51-75, USNews 71, Times 73, QS 84). The CS Department is located on the Otaniemi campus in Espoo, one of the most important northern European technology hubs, with a high concentration of companies, startups, and research institutes from the high-tech sector, and a thriving culture of entrepreneurship. It is less than a 15-minute metro ride from the center of Helsinki, the capital of Finland. The campus was designed by the renowned architect and designer Alvar Aalto. Please check out our virtual campus experience.
Aalto University is located in Finland, which is among the best countries in the world according to many quality-of-life indicators. For the sixth year in a row (including 2023), Finland has been ranked the world's happiest country in the World Happiness Report. Please find more information about living in Finland and the Aalto information package. Want to know more about us and your future colleagues? You can watch these videos: Aalto University – Towards a better world, Aalto People, Shaping a Sustainable Future. Read more about working at Aalto.
The ADIS Seminars provide a forum to discuss the state of the art in systems research with experts from leading academic institutes and industry labs. All events are free to join; please reach out to our group members for access.
Liang Liang, Imperial College London, 20.Jun.2024, 14:00 EEST, Online
[Abstract] Learned indices have shown substantial improvements in lookup performance. With the advancement of updatable learned indices, they are particularly effective in dynamic environments like stream processing. Data stream processing systems, which allow querying over sliding windows of data streams, require efficient index structures for operations such as aggregation and joins. We introduce FLIRT, a learned index specifically designed for managing incoming streams of sequential data, such as time-series data, over a window. FLIRT utilizes a piece-wise linear model to reduce lookup times and employs a queue-assembly structure to efficiently support updates. While FLIRT is optimized for scenarios where the search keys arrive in a specific order, this requirement can be restrictive in diverse streaming environments. The need for sequential key arrivals limits FLIRT's applicability, for instance, in settings where the order of data arrival (processing time) does not reflect the inherent order of the data events (event time). To overcome these limitations and broaden the use cases, we developed SWIX, a versatile learned index designed for more generic window processing. Unlike FLIRT, SWIX accommodates out-of-order key insertions, thereby enhancing its utility in environments where data arrival order is unpredictable. To support its versatility, SWIX employs sophisticated buffering techniques and lightweight structures that enable efficient query execution at a low cost and make SWIX memory efficient. Our experiments demonstrate that both FLIRT and SWIX achieve excellent performance in their respective applications and pave the way for incorporating streaming learned indices into other operations, such as streaming joins.
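To give a concrete flavour of the piece-wise linear idea behind such indices, here is a minimal, self-contained sketch (our own illustration, not FLIRT's actual implementation): sorted keys are covered by linear segments, a lookup predicts a position from a segment's slope and intercept, and a bounded local scan (with a binary-search fallback) corrects the prediction error.

```python
import bisect

class PiecewiseLinearIndex:
    """Toy piece-wise linear learned index over a sorted key array.

    Illustrative only: FLIRT additionally handles windowed updates with a
    queue-assembly structure, which this sketch omits.
    """

    SEGMENT = 64  # fixed segment length; real systems fit error-bounded segments

    def __init__(self, keys, max_error=8):
        self.keys = list(keys)          # must be sorted in ascending order
        self.max_error = max_error
        self.bounds, self.models = [], []
        for start in range(0, len(self.keys), self.SEGMENT):
            chunk = self.keys[start:start + self.SEGMENT]
            k0, k1 = chunk[0], chunk[-1]
            slope = (len(chunk) - 1) / (k1 - k0) if k1 > k0 else 0.0
            self.bounds.append(k0)                       # first key of the segment
            self.models.append((slope, start - slope * k0))

    def lookup(self, key):
        i = max(bisect.bisect_right(self.bounds, key) - 1, 0)
        slope, intercept = self.models[i]
        pos = int(slope * key + intercept)               # predicted position
        lo = max(pos - self.max_error, 0)
        hi = min(pos + self.max_error + 1, len(self.keys))
        for j in range(lo, hi):                          # bounded local correction
            if self.keys[j] == key:
                return j
        j = bisect.bisect_left(self.keys, key)           # fallback: binary search
        return j if j < len(self.keys) and self.keys[j] == key else None

index = PiecewiseLinearIndex(range(0, 70000, 7))
print(index.lookup(700))   # -> 100
```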
[About the speaker] Liang is currently a PhD student at Imperial College London, where he is supervised by Prof. Thomas Heinis and specializes in database research. Before embarking on his PhD, Liang earned a Master's degree in High Performance Computing with Data Science from the University of Edinburgh with distinction in 2020 and a Master's degree in Data Science from Monash University with high distinction in 2019. He completed his undergraduate studies in Law at TianGong University in 2016. Liang's research focuses on data-oriented streaming workflow systems and streaming learned indices. He has published in prestigious journal and conference venues and continues to explore the integration of learned indices into streaming operations, such as joins and aggregations. Additionally, Liang is very interested in applying machine learning methods to enhance streaming tasks, including join cardinality estimation, predictive state representations, and hybrid auto-scaling approaches.
Valter Uotila, University of Helsinki, 26.Mar.2024, 14:00 EET, CS Building B322
[Abstract] Quantum computing has developed significantly in recent years. Developing algorithms to estimate various metrics for SQL queries has been an important question in database research, since these estimates affect query optimization and database performance. This work presents a quantum natural language processing (QNLP)-inspired approach for constructing a quantum machine learning model that can classify SQL queries with respect to their execution times, costs, and cardinalities. From the quantum machine learning perspective, we compare our model and results to previous research in QNLP and conclude that our model reaches similar accuracy to the QNLP model in the classification tasks. This indicates that the QNLP model is a promising method even when applied to problems outside QNLP. We study the developed quantum machine learning model by calculating its expressibility and entangling-capability histograms. The results show that the model is expressible yet not too complex to be executed on quantum hardware, for example, on the current 20-qubit quantum computer in Finland.
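As a rough, self-contained illustration of the variational-circuit flavour of such models (a toy sketch of our own, not the model from the talk), the following two-qubit classifier angle-encodes two query features, entangles the qubits, applies trainable rotations, and thresholds a Pauli-Z expectation to produce a binary label (e.g. "fast" vs. "slow" query):

```python
import numpy as np

# State ordering |q0 q1>: q0 is the most significant qubit.
I2 = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)     # flips q1 when q0 = 1
Z0 = np.kron(np.diag([1.0, -1.0]), I2)           # Pauli-Z observable on q0

def ry(theta):
    """Single-qubit Y-rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def predict(features, params):
    state = np.zeros(4)
    state[0] = 1.0                                              # |00>
    state = np.kron(ry(features[0]), ry(features[1])) @ state   # angle encoding
    state = CNOT @ state                                        # entangling layer
    state = np.kron(ry(params[0]), ry(params[1])) @ state       # trainable layer
    expectation = state @ Z0 @ state                            # <Z> on q0, in [-1, 1]
    return 1 if expectation > 0 else 0

# Hypothetical inputs (e.g. normalized query depth and table count).
print(predict(features=[0.3, 1.2], params=[0.5, -0.4]))
```

In practice the parameters would be trained against labelled queries, and expressibility and entangling capability would be estimated by sampling many random parameter settings of the circuit.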
[About the speaker] Valter Uotila is a second-year doctoral researcher at the University of Helsinki researching quantum computing applications for databases and data management. His research interests are in the intersection of quantum computing, databases and category theory.
Madelon Hulsebos, UC Berkeley, 22.Mar.2024, 17:00 EET, Online
[Abstract] The impressive capabilities of transformers have been explored for applications over language, code, and images, but the millions of tables that dominate the organizational data landscape have long been overlooked, even though they give rise to peculiar challenges. Unlike natural language, tables come with structure, heterogeneous and messy data, relations across tables, contextual interactions, and metadata. Accurately and robustly embedding tables is, however, key to many real-world applications, from data exploration and preparation to question answering and tabular ML. In this talk, I will discuss the general approaches taken towards adapting the transformer architecture to tables and give an overview of the tasks already explored in this space. I will zoom in on some of the shortcomings of these approaches and close with the open challenges and opportunities, and some ongoing work.
[About the speaker] Madelon Hulsebos is a postdoctoral fellow at UC Berkeley. She obtained her PhD from the Informatics Institute at the University of Amsterdam, during which she did research at the MIT Media Lab and Sigma Computing. She was awarded the BIDS-Accenture fellowship for her postdoctoral research on retrieval systems for structured data. Madelon's general research interest lies at the intersection of data management and machine learning, with recent contributions in methods, tools, and resources for Table Representation Learning.
Wenqi Jiang, ETH Zürich, 7.Mar.2024, 14:00 EET, Online
[Abstract] The recent advances in generative large language models (LLMs) are attributable to the surging number of model parameters trained on massive datasets. However, improving LLM quality by scaling up models leads to several major problems, including high computational costs. Instead of scaling up the models, a promising direction, recently adopted by OpenAI, is the Retrieval-Augmented Language Model (RALM), which augments a large language model (LLM) by retrieving context-specific knowledge from an external database via vector search. This strategy facilitates impressive text generation quality even with smaller models, thus reducing computational demands by orders of magnitude.
However, RALMs introduce unique system design challenges due to (a) the diverse workload characteristics between LLM inference and retrieval and (b) the varying system requirements and bottlenecks of different RALM configurations, including model sizes, database sizes, and retrieval frequencies. In this talk, I will present Chameleon, a heterogeneous accelerator system integrating both LLM and retrieval accelerators in a disaggregated architecture. The heterogeneity ensures efficient serving for both LLM inference and retrieval, while the disaggregation allows independent scaling of the LLM and retrieval accelerators to fulfill diverse RALM requirements. Our Chameleon prototype implements retrieval accelerators on FPGAs and assigns LLM inference to GPUs, with a CPU server orchestrating these accelerators over the network. Evaluated on various RALMs, Chameleon exhibits up to a 2.16× reduction in latency and a 3.18× speedup in throughput compared to the hybrid CPU-GPU architecture.
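The RALM dataflow itself is simple to sketch: a vector search over an external database retrieves context, which is prepended to the LLM prompt. The toy Python below shows only that dataflow; embed and generate are hypothetical stand-ins for the retrieval encoder and the LLM (which Chameleon maps to FPGA and GPU accelerators, respectively):

```python
import numpy as np
import zlib

DIM = 64

def embed(text):
    # Toy deterministic embedding: seed an RNG from the text's checksum.
    g = np.random.default_rng(zlib.crc32(text.encode()))
    v = g.standard_normal(DIM)
    return v / np.linalg.norm(v)

docs = ["vector search basics", "disaggregated accelerator design",
        "retrieval-augmented language models"]
index = np.stack([embed(d) for d in docs])       # the external vector database

def retrieve(query, k=2):
    scores = index @ embed(query)                # inner-product vector search
    return [docs[i] for i in np.argsort(-scores)[:k]]

def generate(prompt):
    # Placeholder for LLM inference; runs on GPUs in the real system.
    return f"<completion conditioned on: {prompt!r}>"

query = "How do RALMs cut compute costs?"
context = " | ".join(retrieve(query))
print(generate(context + " || " + query))
```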
[About the speaker] Wenqi Jiang is a fourth-year Ph.D. student at ETH Zurich, where he is affiliated with the systems group advised by Gustavo Alonso and Torsten Hoefler. Wenqi's research interests span data management, computer architecture, and computer systems. His work primarily focuses on designing post-Moore data systems, which involve cross-stack solutions including algorithm, system, and architecture innovations. Some examples of his work include large language models, vector search, recommender systems, and spatial data processing.
Xiaozhe Yao, ETH Zürich, 26.Feb.2024, 14:00 EET, Online
[Abstract] Fine-tuning large language models (LLMs) for downstream tasks can greatly improve model quality; however, serving many different fine-tuned LLMs concurrently for users in multi-tenant environments is challenging. Dedicating GPU memory to each model is prohibitively expensive, and naively swapping large model weights in and out of GPU memory is slow. Our key insight is that fine-tuned models can be quickly swapped in and out of GPU memory by extracting and compressing the delta between each model and its pre-trained base model. We propose DeltaZip, an LLM serving system that efficiently serves multiple full-parameter fine-tuned models concurrently by aggressively compressing model deltas by a factor of 6× to 8× while maintaining high model quality. DeltaZip increases serving throughput by 1.5× to 3× and improves SLO attainment compared to a vanilla HuggingFace serving system.
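The core idea is easy to illustrate: keep the pre-trained base weights resident once, and store only a compressed delta per fine-tuned variant. A minimal sketch with a naive uniform quantizer (our own toy scheme; DeltaZip's actual compression pipeline is considerably more sophisticated):

```python
import numpy as np

def compress_delta(base, finetuned, bits=4):
    """Quantize (finetuned - base) to signed `bits`-bit integers plus a scale."""
    delta = finetuned - base
    scale = np.abs(delta).max() / (2 ** (bits - 1) - 1)
    scale = scale if scale > 0 else 1.0
    q = np.round(delta / scale).astype(np.int8)   # small integers compress well
    return q, scale

def restore(base, q, scale):
    """Reconstruct an approximate fine-tuned weight matrix on demand."""
    return base + q.astype(np.float32) * scale

rng = np.random.default_rng(0)
base = rng.standard_normal((256, 256)).astype(np.float32)
finetuned = base + 0.01 * rng.standard_normal((256, 256)).astype(np.float32)

q, scale = compress_delta(base, finetuned)
approx = restore(base, q, scale)
print("max reconstruction error:", float(np.abs(approx - finetuned).max()))
```

Because the deltas are small and low-entropy, many fine-tuned variants can be swapped quickly while the base model is stored only once.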
[About the speaker] Xiaozhe Yao is a second-year doctoral student in the Systems Group, Department of Computer Science, ETH Zürich, advised by Prof. Dr. Ana Klimović. Working on a wide spectrum of machine learning and systems topics, he aims to build systems that support large-scale machine learning and democratize machine learning. Prior to ETH, Xiaozhe Yao earned his Master's degree in Data Science at the University of Zurich, advised by Prof. Dr. Michael Böhlen and Qing Chen. Before that, he completed his Bachelor's degree in Computer Science at Shenzhen University, advised by Prof. Dr. Shiqi Yu. He interned at the Shenzhen Institute of Advanced Technology in 2016 as a data scientist.
Pedro Silvestre, Imperial College London, 19.Feb.2024, 13:00 EET, Online
[Abstract] Reinforcement Learning (RL) is an increasingly relevant area of algorithmic research. Though RL differs substantially from Supervised Learning (SL), today's RL frameworks are often simple wrappers over SL systems. In this talk, we first analyse the differences between SL and RL from the system designer's point-of-view, then discuss the issues and inefficiencies of RL frameworks arising from those differences. In particular, we discuss how the existence of cyclic and dynamic data dependencies in RL forces the decomposition of algorithms into disjoint dataflow graphs, preventing holistic analysis and optimisation.
We then propose TempoRL, a system designed to efficiently capture these cyclic and dynamic data dependencies in a single graph by instead viewing RL algorithms as Systems of Recurrence Equations (SREs). TempoRL is then able to holistically analyse and optimise this graph, applying both classic and novel transformations such as automatic vectorisation (when memory allows) or incrementalisation (when memory is scarce). Because SREs impose no control flow, TempoRL is free to choose any execution schedule that respects the data dependencies. Luckily, by designing around SREs, we are able to leverage the powerful polyhedral analysis framework to find efficient, parallel execution schedules, as well as to compute a memory management plan through dataflow analysis. The remainder of the talk discusses the surprising advantages that this novel computational model brings, and the applications it may have outside of RL.
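As a toy example of the recurrence-equation view, consider the discounted return G_t = r_t + γ·G_{t+1}: the equation fixes the data dependencies but not the schedule, so a backward sequential loop and a vectorised evaluation are both valid (the sketch below is our own illustration, not TempoRL code):

```python
import numpy as np

gamma = 0.99
r = np.random.default_rng(0).standard_normal(1000)   # per-step rewards

def returns_sequential(r, gamma):
    # Backward schedule that follows the recurrence directly.
    G = np.empty_like(r)
    G[-1] = r[-1]
    for t in range(len(r) - 2, -1, -1):
        G[t] = r[t] + gamma * G[t + 1]
    return G

def returns_vectorised(r, gamma):
    # Closed form G[t] = sum_{k>=t} gamma^(k-t) * r[k], via a reversed cumsum.
    w = gamma ** np.arange(len(r))
    return np.cumsum((r * w)[::-1])[::-1] / w

# Both schedules satisfy the same recurrence equation.
print(np.allclose(returns_sequential(r, gamma), returns_vectorised(r, gamma)))
```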
[About the speaker] Pedro Silvestre is a PhD student in the Large-Scale Data & Systems Group at Imperial College London, under the supervision of Prof. Peter Pietzuch, working on dataflow systems for deep reinforcement learning. Before Imperial, Pedro was a Research Engineer in TU Delft's Web Information Systems Group, working on consistent fault tolerance for distributed stream processing. Pedro completed both his MSc and BSc at the NOVA School of Science and Technology.