Distributed lifetime management and monitoring system for an embedded distributed computing platform
Platform: Computing nodes deployed across Norway. Wired or wireless (WiFi and 2G/3G) connection to servers. Static and mobile nodes.
Requirements: Knowledge of computer networks and distributed systems. Knowledge of Internet protocols and OS/Linux programming. Interest in embedded systems. Autonomy and enthusiasm for research.
How can you improve a system’s robustness and reliability in hostile environments?
Computer networks, including the Internet, present a wide range of challenges that can interfere with the functioning of the distributed systems that depend on them. Distributed computing systems therefore place several critical requirements on the resources available to them. A resource can be just about anything, for example computation power, memory, bandwidth, storage facilities, network printers or a flight-ticket booking website.
All of these resources have different requirements (availability, security, etc.) with different priorities. In this master’s thesis, the student will implement, evaluate and/or extend the monitoring system, primarily to increase the system’s availability, robustness and reliability. The system in question is composed of a small set of servers and some tens or hundreds of computing nodes deployed across Norway. The computing nodes should always be available, so the tasks of the monitoring system include answering questions such as: Are the nodes connected and available? Are the nodes experiencing any problems? How often do problems occur, and of which type?
The student will also investigate more conceptual research questions relevant to distributed monitoring systems:
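The monitoring questions above can be illustrated with a minimal sketch. It is not the platform's actual monitoring system: the class name NodeMonitor, the heartbeat model, the 60-second timeout and the problem labels are all assumptions made for illustration. Nodes are assumed to send periodic heartbeats to a server, and a node counts as unavailable once it has been silent for longer than the timeout.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class NodeMonitor:
    """Hypothetical server-side tracker for node heartbeats and problem reports."""

    timeout_s: float = 60.0  # assumed: silence longer than this means "unavailable"
    last_seen: dict = field(default_factory=dict)      # node id -> last heartbeat time
    problems: Counter = field(default_factory=Counter)  # problem type -> occurrences

    def heartbeat(self, node_id, now):
        """Record that a node reported in at time `now` (seconds)."""
        self.last_seen[node_id] = now

    def report_problem(self, node_id, kind):
        """Count one occurrence of a problem type (e.g. "link_down")."""
        self.problems[kind] += 1

    def is_available(self, node_id, now):
        """A node is available if it is known and was heard from recently."""
        seen = self.last_seen.get(node_id)
        return seen is not None and now - seen <= self.timeout_s

    def unavailable_nodes(self, now):
        """Return the known nodes that have been silent for too long."""
        return sorted(n for n in self.last_seen if not self.is_available(n, now))


monitor = NodeMonitor(timeout_s=60.0)
monitor.heartbeat("node-a", now=0.0)
monitor.heartbeat("node-b", now=50.0)
monitor.report_problem("node-a", "link_down")
print(monitor.unavailable_nodes(now=100.0))
print(monitor.problems.most_common())
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the sketch easy to test. A real deployment would also have to cope with mobile nodes on 2G/3G links, where short silences are normal and a fixed timeout may be too strict.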
- How to develop system notifications and statistics on the platform’s current state
- How to scale a monitoring system with an increasing number of computing nodes
- How to design monitoring mechanisms that do not interfere with the platform’s functionality
- How to reduce communication complexity in large monitoring systems
- How to improve load balance in large monitoring systems
- What is achievable in terms of a real-time monitoring system
Simula will provide the appropriate equipment and a workplace for the student.
It is desirable that the student works closely with the supervisor, and the student should be interested in working with measurements in real systems.
NB: This project can be conducted as a long or a short master’s thesis project.
Best starting date: ASAP!