This thesis topic is no longer available

Edge computing with Kubernetes: Robust, decentralized storage

What is Kubernetes?

Kubernetes, often called K8s, is a system for automating the deployment of applications in the cloud, as well as the distribution, configuration, coordination and administration of these applications (see kubernetes.io). Kubernetes can be used to ensure uptime and scalability for services and end systems, features that are essential for both civilian and military systems. K8s is often used as a service platform in modern communication and information systems. Kubernetes was originally developed by Google and is now an open-source project hosted by the Cloud Native Computing Foundation (CNCF).

The main benefit of using Kubernetes is the flexibility and scalability the system provides, including automated software updates, service discovery and load balancing, service orchestration, security and configuration management, and self-healing. K8s therefore fits in very well with modern agile system development based on DevOps and microservices.

K8s also scales across a set of computers (or nodes), be they physical hardware, virtual machines, or a combination of the two. When K8s is used across multiple nodes, the setup is referred to as a Kubernetes cluster. In cloud computing, Kubernetes typically runs in a data center. It can also be used for edge computing, but there unstable communication and unreliable node connectivity cause challenges.

Specific question for this thesis

We offer several thesis topics that study improvements to K8s that make it more suitable for edge computing, for example for use in disaster relief and rescue operations.

Kubernetes orchestrates services, which in most cases require access to data that is shared between the different nodes of the services. The storage may contain data that is collected by the service, intermediate results of processing, final results that must be presented to an end user, and so on. The actual demand depends on the application, and there is an endless number of those.

In our scenario, this happens at the edge, where services can only be deployed on weak nodes that have to communicate to solve the tasks at hand. Their communication is not guaranteed, and they may go down when they leave communication range or their battery runs out.

All this means that persistent storage is important: data cannot be kept only in memory. It is also important to investigate the trade-off between keeping redundant copies and performing redundant computations when data is lost.
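
The trade-off between redundant copies and redundant computation can be illustrated with a toy cost model. The sketch below is only an illustration under strong assumptions that are not part of the thesis description: nodes fail independently with the same probability, each stored copy has a fixed cost, and lost data can always be recomputed at a fixed cost. The function names and parameters are hypothetical.

```python
def expected_cost(replicas, node_failure_prob, storage_cost_per_replica, recompute_cost):
    """Expected cost of keeping `replicas` copies (assumed toy model).

    Assumption: nodes fail independently, so the data is lost only if
    every node holding a copy fails. With zero replicas the data is
    always lost and therefore always recomputed.
    """
    loss_prob = node_failure_prob ** replicas
    return replicas * storage_cost_per_replica + loss_prob * recompute_cost


def best_replica_count(node_failure_prob, storage_cost_per_replica,
                       recompute_cost, max_replicas=5):
    """Return the replica count in 0..max_replicas with the lowest expected cost."""
    return min(
        range(max_replicas + 1),
        key=lambda k: expected_cost(
            k, node_failure_prob, storage_cost_per_replica, recompute_cost
        ),
    )
```

In this model, a result that is expensive to recompute favors several copies, while a result that is cheap to recompute favors storing nothing and recomputing on demand. Real edge deployments would also have to account for correlated failures, bandwidth, and battery use, which this sketch ignores.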

In this thesis, you will explore the options for distributed, redundant storage of data that can be managed by Kubernetes and made available to the services it orchestrates. You will also answer the question of how the operator of a service can formulate suitable properties that enable your improved Kubernetes scheduler to choose the right kind and the right level of redundancy.
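
As a purely hypothetical illustration of what an operator-formulated property might look like, the sketch below lets an operator state a required survival probability for the data and derives a replica count from it. `StorageRequirement` and `replicas_needed` are invented names, not part of any Kubernetes API, and the independent-failure model is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageRequirement:
    """Hypothetical operator-facing declaration; not a Kubernetes object."""
    min_survival_probability: float  # required probability that the data survives
    max_replicas: int                # upper bound the weak nodes can afford


def replicas_needed(req, node_failure_prob):
    """Smallest replica count that meets the requirement, or None.

    Assumption: nodes fail independently with probability
    `node_failure_prob`, so k copies survive with probability 1 - p**k.
    """
    for k in range(1, req.max_replicas + 1):
        if 1.0 - node_failure_prob ** k >= req.min_survival_probability:
            return k
    return None  # requirement cannot be met within max_replicas
```

A scheduler could use such a declaration to decide how many nodes must hold a copy, and to report when the requirement cannot be met with the nodes available. Which properties are actually suitable is the open question of the thesis.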

Learning outcome

Experience

in formulating, investigating and answering research questions
in handling Kubernetes, the widely used container orchestration system
in conducting research and evaluating and interpreting its results

Conditions

We expect that you:

have been admitted to a master's program in MatNat@UiO - primarily PROSA
take this as a long thesis
will participate actively in the weekly SINLab meetings
are interested in and have some knowledge of C/C++ programming
include the course IN5060 in the study plan, unless you have already completed a course on classical (non-ML) data analysis
include IN5700 or IN5020 in the study plan, unless you have already completed a course on distributed systems

Published Aug. 29, 2023 10:06 - Last modified Oct. 16, 2023 14:59

Supervisor(s)

Carsten Griwodz Universitetet i Oslo
Frank Trethan Johnsen