Harnessing DPUs for distributed data storage systems

Big data requires efficient data access solutions. Distributed data storage systems are therefore widely used for their capacity and scalability. Data processing units (DPUs) are a new technology that is particularly well suited to distributed data storage. This master project compares DPU-supported RDMA operations with PCIe transactions, in preparation for future data storage systems.

In a distributed data storage system, where the data is scattered over the memories of many machines (for example in a data centre), one of the latest strategies is to access the data through RDMA (remote direct memory access). RDMA enables access to the memory of remote machines over the network, bypassing the remote operating system and CPU and thereby significantly reducing access latency. The emergence of programmable network devices (such as DPUs) allows the computing capability of network devices to be used not only for forwarding data but also for processing it along its path. A DPU-supported network interface card can offload processing tasks that the system CPU would normally handle.

This master thesis aims to investigate the details of DPU-supported RDMA operations. In particular, when a node in the network sends a request to access the main memory of a remote node, the request first arrives at the DPU of the remote node. The DPU then forwards the request to the main memory. There are two approaches to accessing the main memory from the DPU: running RDMA operations and executing PCIe transactions. To choose the better approach and thereby accelerate data access from the DPU to the main memory, a performance evaluation will be carried out in this project. The research work involves (1) the design of simple yet representative experiments and (2) a systematic comparison of the two approaches based on different parameters.
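As a rough illustration of what such an experiment could look like, the following Python sketch times two access paths over a range of transfer sizes and reports the median and 99th-percentile latency. It is only a sketch under stated assumptions: `placeholder_rdma_read` and `placeholder_pcie_read` are hypothetical stand-ins that merely copy local memory, and the chosen parameters (transfer size, repeat count) are illustrative rather than part of the project specification. In the actual project, the placeholders would be replaced by real DPU-side RDMA and PCIe operations.

```python
import statistics
import time
from typing import Callable, Dict, List, Tuple

# Local buffer used by the placeholder operations (assumption: 1 MiB is enough).
_SRC = bytes(1 << 20)


def placeholder_rdma_read(size: int) -> None:
    """Hypothetical stand-in for an RDMA read issued from the DPU."""
    bytearray(_SRC[:size])  # only a local copy, no real RDMA involved


def placeholder_pcie_read(size: int) -> None:
    """Hypothetical stand-in for a PCIe transaction issued from the DPU."""
    memoryview(_SRC)[:size].tobytes()  # only a local copy, no real PCIe access


def measure_latency(op: Callable[[int], None], size: int,
                    repeats: int = 100) -> Dict[str, float]:
    """Time op(size) repeatedly and summarise the latencies in microseconds."""
    op(size)  # warm-up call, excluded from the samples
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        op(size)
        samples.append((time.perf_counter_ns() - start) / 1000.0)
    samples.sort()
    p99_index = min(len(samples) - 1, int(0.99 * len(samples)))
    return {"median_us": statistics.median(samples),
            "p99_us": samples[p99_index]}


def compare(paths: Dict[str, Callable[[int], None]], sizes: List[int],
            repeats: int = 100) -> Dict[Tuple[str, int], Dict[str, float]]:
    """Run every access path at every transfer size."""
    return {(name, size): measure_latency(op, size, repeats)
            for name, op in paths.items()
            for size in sizes}


if __name__ == "__main__":
    results = compare({"rdma": placeholder_rdma_read,
                       "pcie": placeholder_pcie_read},
                      sizes=[64, 4096, 65536])
    for (name, size), stats in sorted(results.items()):
        print(f"{name:5s} {size:6d} B  median={stats['median_us']:.2f} us  "
              f"p99={stats['p99_us']:.2f} us")
```

Keeping the measurement loop separate from the access operations means the same harness can be reused unchanged once real RDMA and PCIe calls are available, and further parameters (for example queue depth or access pattern) can be added as extra loop dimensions.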

The eX3 infrastructure, which has several smart network systems (including PCIe-supported and DPU-supported solutions), will be used as the hardware testbed. 

Through the master project work, the student will acquire advanced knowledge of RDMA, PCIe, and in-network computing, as well as hands-on experience with the latest networking technologies. These are essential ingredients in, for example, large-scale data centres.

Good programming skills are required before starting the master project. The master student is also expected to be keen to learn new material in close interaction with the supervisors.

Keywords: Distributed data storage, DPUs, RDMA, PCIe
Published Nov. 2, 2021 01:09 - Last modified Nov. 2, 2021 01:09

Scope (credits)