In-network Acceleration of Big Data Processing

This master's project will investigate novel uses of emerging programmable network adaptors for accelerating Big data processing.

Companies use Big data to provide better customer service, create personalised marketing, and take other actions that, ultimately, increase revenue and profits through fast and informed decisions.

Big data processing is a set of techniques, sometimes with specialised programming models, for accessing large-scale data and extracting useful information to support decision-making. For example, Hadoop is an open-source software framework for Big data processing, and it is also available through cloud providers such as Amazon. Spark, an alternative to Hadoop developed at the University of California, Berkeley, is designed to overcome Hadoop's disk I/O limitations and improve on its performance. Spark's distinguishing feature is its ability to perform in-memory computations. However, one of the main overheads of distributing data over the memories of many machines comes from network communication.
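To make the network overhead concrete, the following is a minimal toy sketch in plain Python. It is not Spark code and not the method this project will implement. It assumes integer keys, a hypothetical `owner` rule that assigns each key to a machine, and hypothetical helpers `records_sent` and `combine_locally`. It counts how many key-value records must cross the network when records are regrouped by key, with and without summing values per key on each machine first.

```python
from collections import defaultdict


def owner(key: int, num_workers: int) -> int:
    """Hypothetical placement rule: the worker responsible for a key."""
    return key % num_workers


def records_sent(partitions, num_workers):
    """Count records that must leave the worker that currently holds them."""
    return sum(
        1
        for worker, records in enumerate(partitions)
        for key, _ in records
        if owner(key, num_workers) != worker
    )


def combine_locally(records):
    """Sum values per key on one worker before anything is sent."""
    totals = defaultdict(int)
    for key, value in records:
        totals[key] += value
    return list(totals.items())


# Illustrative data: the in-memory records held by two workers.
partitions = [
    [(0, 1), (1, 1), (1, 1), (1, 1)],  # worker 0
    [(0, 1), (0, 1), (1, 1), (0, 1)],  # worker 1
]

raw = records_sent(partitions, 2)
combined = records_sent([combine_locally(p) for p in partitions], 2)
print(raw, combined)
```

In this toy model, reducing the data before it is sent shrinks the number of records on the network. Where such a reduction step runs (on the main CPU or elsewhere) is a design choice that the sketch leaves open.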

Goal

In this thesis, we will use a special type of network card equipped with an Arm processor to optimise network communication and offload computation from the main CPU, and thereby enhance the performance of Spark operations. The results of this master's project have the potential to benefit systems that require Big data processing.

Learning outcome

In this thesis, you will learn state-of-the-art technologies used in large-scale datacenters. Specifically, you will learn Big data processing and high-performance programming, as well as current techniques for distributing data over the memories of many machines.

Qualifications

Some knowledge of networking and operating systems, as well as C/C++ programming, is beneficial but not mandatory. You will get support in organising the work and in implementing new systems on programmable network cards.

Workplace

Simula will provide the appropriate equipment and the workplace for the student. Here is the link to the project description at Simula.

Published 27 Sep. 2022 09:29 - Last modified 27 Sep. 2022 09:39

Supervisor(s)

Scope (credits)

60