In-network processing with SmartNICs

This thesis uses the Netronome Agilio SmartNIC to fix a performance problem in the network that exists mostly because a performance problem in operating systems was solved somewhat carelessly.


The Netronome Agilio CX 2x10 SmartNIC.

In-network processing is the area of performing computations in the network on data traveling through the network. The opportunities are endless. Maybe the data from temperature sensors are combined on wireless base stations to show the origin of a fire. Maybe the join operations of a database search are performed on routers to offload the database servers. This thesis is about an implementation of TCP Segmentation Offload (TSO) with controlled timing.

TSO is a way of reducing the load on computers' CPUs: instead of handing one TCP packet at a time from the kernel to the network card, the kernel passes a huge chunk of data for a single TCP connection to the network card and lets the card deal with splitting that chunk into packets small enough for the Internet. That's really good for CPUs, especially when they are busy servers in a cloud or cluster. And because of this, Linux will soon make it impossible to switch TSO off.
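The splitting step can be illustrated with a minimal sketch in plain C. It is not the Agilio firmware or the Linux implementation; the `struct segment` layout and the names `tso_segment_count` and `tso_split` are hypothetical and chosen only for this illustration. It assumes the chunk is cut into pieces of at most `mss` payload bytes, each tagged with the TCP sequence number of its first byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One TCP segment cut out of a larger TSO chunk (hypothetical layout). */
struct segment {
    uint32_t seq;    /* TCP sequence number of the first payload byte */
    size_t   offset; /* where the payload starts inside the chunk */
    size_t   len;    /* payload length in bytes, at most mss */
};

/* Number of segments needed to carry chunk_len bytes, mss bytes at most each. */
size_t tso_segment_count(size_t chunk_len, size_t mss)
{
    assert(mss > 0);
    return (chunk_len + mss - 1) / mss; /* integer ceiling division */
}

/* Describe the segments of one chunk in out[]; writes at most max_out
 * entries and returns how many were written. */
size_t tso_split(uint32_t first_seq, size_t chunk_len, size_t mss,
                 struct segment *out, size_t max_out)
{
    size_t n = 0;
    size_t offset = 0;

    assert(mss > 0);
    while (offset < chunk_len && n < max_out) {
        size_t len = chunk_len - offset;
        if (len > mss)
            len = mss; /* every segment except the last one is full-sized */

        /* Sequence numbers wrap modulo 2^32, as unsigned arithmetic does. */
        out[n].seq = first_seq + (uint32_t)offset;
        out[n].offset = offset;
        out[n].len = len;

        offset += len;
        n++;
    }
    return n;
}
```

With an assumed MSS of 1448 bytes, a 64 KiB chunk becomes 45 full segments plus one final segment of 376 bytes.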

But there's trouble: all of those packets will be sent back-to-back from the card into the Internet as quickly as possible, and eventually (due to TCP's normal behaviour), too quickly for one of the routers in the Internet. Before TSO, this would mean that many TCP flows lose one packet, and all of them slow down (congestion control). But with TSO, only one TCP flow loses lots of packets, and only this one slows down. To return to a better behaviour, we must wait a little bit between packets, but do that on the network card. That card must be a bit smarter than usual.
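How long "a little bit" is depends on the rate at which the flow should be sent. The following is a minimal sketch in plain C, not the method of the thesis: it assumes that a target pacing rate in bits per second is already known and that all packets have the same size on the wire. The names `pacing_gap_ns` and `departure_ns` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Time in nanoseconds that one packet of wire_bytes occupies when the flow
 * is paced at rate_bps bits per second. The intermediate product fits in
 * 64 bits for any packet size below 2 GiB. */
uint64_t pacing_gap_ns(uint32_t wire_bytes, uint64_t rate_bps)
{
    assert(rate_bps > 0);
    return ((uint64_t)wire_bytes * 8u * 1000000000ull) / rate_bps;
}

/* Earliest departure time of the packet with the given index (0-based)
 * when equally sized packets are spaced one pacing gap apart. */
uint64_t departure_ns(uint64_t start_ns, uint32_t index,
                      uint32_t wire_bytes, uint64_t rate_bps)
{
    return start_ns + (uint64_t)index * pacing_gap_ns(wire_bytes, rate_bps);
}
```

For example, a 1500-byte packet paced at 1 Gbit/s gets a slot of 12000 ns, so the fourth packet of a burst (index 3) would leave 36000 ns after the first instead of immediately behind it.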

Now, SmartNICs are network interface cards that are capable of performing computations without the help of a computer's CPU. Most of these SmartNICs have very limited capabilities. Most are really good at handling tables for switching, routing and filtering. These were programmed using domain-specific languages like eBPF and XDP, but the current hype for programming them is the P4 language. Unfortunately, P4, eBPF and XDP are all useless for our goal. They cannot express timing. It's always as-quick-as-possible for them.

Fortunately, our SmartNIC, the Netronome Agilio CX 2x10, can be programmed using a dialect of C. That's where this thesis needs to go.

These cards have no operating system. They have no libraries, APIs, templates, interfaces and such. There is nothing that happens behind curtains or under the hood. These cards are not simple - nothing that can process 20 Gbit/s of network traffic in real time is simple - but nothing is hidden from you. And what you program is what happens.

Learning outcome

  • You have deep insight into in-network processing on SmartNICs.
  • You have experience in kernel programming and close-to-metal firmware programming.
  • You have a deep knowledge of TCP, its slow-start and congestion-control mechanisms, and methods such as packet pacing.
  • You know how to analyse, interpret and present measurement data.

Conditions

We expect that you:

  • have been admitted to a master's program in MatNat@UiO - primarily PROSA
  • take this as a long thesis
  • will participate actively in the weekly SINLab meetings
  • are present in the lab and collaborate with other students and staff
  • are interested in and have some knowledge of C programming and are not afraid of a new assembler language
  • are interested in programming very close-to-metal: no virtualization, paging or MMU, no APIs, no libraries, but 5 different types of memory, hardware-supported locks, thread switching without scheduling overhead, ...
  • have taken IN3230 or include IN4230 in the study plan
  • include the course IN5060 in the study plan, unless you have already completed a course on classical (non-ML) data analysis
  • know something about operating systems, preferably by having taken IN3000
Keywords: Networks, TCP, SmartNIC, In-network processing
Published Oct. 3, 2023 09:28 - Last modified Oct. 4, 2023 13:58