Computing and Software at the Exascale

Computing and software are crucial parts of the LHC physics experiments. The NorduGrid Advanced Resource Connector (ARC) middleware is growing in popularity due to its simple design and ease of deployment, making it the preferred middleware for new and many existing sites, particularly in Europe and Asia. ARC and its Control Tower allow seamless access to heterogeneous resources: Grid, High Performance Computers and Clouds. Moreover, ATLAS@home, based on BOINC and ARC, provides access to opportunistic resources made up of personal computers.

The requirements imposed on software during the coming LHC runs will be as stringent as those on the computing resources. The data throughput that will have to be achieved exceeds anything that our community has managed to date. Such performance can only be attained by combining a number of techniques, such as multi-threading and parallel processing of events, as well as novel algorithms and optimisation of existing software.

The importance of multi-variate analysis or “Machine Learning” in High Energy Physics continues to increase, for applications as diverse as reconstruction, physics analysis, data quality monitoring and distributed computing.

 

We propose master thesis topics on the development of new computing software tools, distributed data management systems, and data models that will address the challenges of the extreme conditions of future LHC runs. This involves various aspects of software and algorithm development, including modern techniques that make use of machine learning and anomaly detection.

Computational Science & Physics Master Projects

Do you have a passion for coding, big data and machine learning? If so, consider a master thesis project in computing and software for experimental particle physics. The Large Hadron Collider (LHC) experiments require huge computing resources to transfer, store, process and catalog data collected by the detectors, to produce matching simulated data, and to provide the wherewithal for physicists around the world to do data analysis. Our research in this area is now focused on the “High Luminosity” phase of the LHC (HL-LHC), due to begin in 2027-2028, which will see data being produced at a rate ten times higher than at present. The data throughput that will have to be achieved exceeds anything that our community has managed to date and will stress all areas of computing and software.

We can propose topics in all of these areas; they are all related and can be combined with a different emphasis depending on your interests. Most of these topics are related to the ATLAS experiment and will involve interaction with LHC computing experts outside Norway. Here are some more details about the various aspects of our research in which we can propose thesis topics:

  1. Physics analysis software. Most particle physicists interact on a daily basis with software that enables them to read, analyse and produce results from the files produced by the data processing chain. In the HL-LHC era these data products will have to be much smaller and more compact than those currently used, and they will need to be accessed more efficiently, using columnar data access techniques that are common in the data science industry. There are a number of possible projects here, ranging from the physics content of the data formats, to the internal organisation and structure of the data, to the technologies that will be used to allow fast access to the data. The exact nature of the project will depend on your interests in computing and physics.
  2. Distributed computing. Computing experts in Norway have made major contributions to the network of compute and storage resources that all LHC physicists have access to (collectively known as “The Grid”). In particular, we developed the NorduGrid Advanced Resource Connector (ARC), which allows seamless access to a wide variety of resources: Grid, High Performance Computers (HPCs) and Clouds. Another project, ATLAS@home, allows volunteers to temporarily donate time on their personal computers to simulating collision events. The HL-LHC era will require much greater use of non-traditional resources such as HPCs and Clouds, and this is our particular area of research.
  3. Software for data processing. The software that reconstructs collision events recorded by the detector, and produces matching simulated events, will require a thorough upgrade for HL-LHC. This upgrade calls for a number of techniques, including writing highly performant multi-threaded C++, adapting some code to use Graphics Processing Units (GPUs), and applying deep machine learning techniques to some problems. The development will span all aspects of the data processing chain, so projects might be related to fast simulation of the detector calorimeters using generative networks, or to adapting parts of the ATLAS reconstruction to use modern C++ designs. The exact nature of the project will depend on your interests in computing and physics.
  4. Deep learning. Below are two suggestions for topics related to the use of deep neural networks in searches for new physics in the data recorded by the ATLAS detector at the LHC in Run 2 (and possibly some initial studies of data from Run 3). Common to both are the challenges related to the data structure. In regular analyses the data are stored in so-called ROOT nTuples, but to use the data in machine learning libraries (e.g. scikit-learn, TensorFlow, Keras, etc.) they need to be in a columnar form (Python data frames, Pandas, ROOT’s own RDataFrame class, etc.). The transition from nTuples to columnar format must deal with memory consumption (e.g. by reading in batches) and often goes through several steps. There are many opportunities to improve the data handling needed to prepare the inputs for ML libraries; a minimal sketch of such a batched conversion is given after this list.
    1. Unsupervised. Make use of unsupervised ML on real data to detect possible anomalies. Test the whole procedure by injecting several new physics theory models (in terms of Monte Carlo simulated data) and iterate until the algorithm/network is “trained” to be able to discover new physics in the actual real data. If new physics is not found in the current very large dataset (140/fb) already available, the resulting ML will be ready to run on the new data to be taken by the LHC from 2022, possibly just in time to be ready by the end of the Master program (at least a decent data sample is expected in 2022). To be as unbiased as possible when constructing the training sample, one would try to perform as little “cleaning” of the data as possible. This introduces some challenges with respect to memory management and how to feed the network with input data; a minimal autoencoder sketch is given after this list. The final validation of the trained network would be to test it on data/simulations with events from new physics scenarios baked into the samples. References: 1) Internal ATLAS presentation: ATLAS exotics workshop Sept 2021.pdf; 2) Dark Machines, for a challenge focused on outlier detection: https://arxiv.org/abs/2105.14027 ; 3) LHC Olympics 2020, for a challenge that includes both outlier and group anomaly detection approaches: https://arxiv.org/abs/2101.08320 ; 4) https://cerncourier.com/a/hunting-anomalies-with-an-ai-trigger/
    2. Supervised. The various scenarios of new physics theories discussed in the projects document may show up in a given process at the LHC. Model-independent searches for new physics are proposed in final states with leptons recorded with the ATLAS detector. New methods and tools involving neural networks are to be used and optimised, using real data as well as simulated data based on various theoretical implementations, in order to correctly interpret any signal of new physics and distinguish it from regular SM electroweak and strong processes. Let us concentrate on final states with two leptons and missing transverse energy (MET). The main SM background comes from pp → W+W- + X → l+l- + ννbar (MET) + X and pp → Z(→ l+l-) Z(→ ννbar) + X → l+l- + MET + X, as well as pp → Z(→ l+l-) H(→ invisible) + X. New physics, on the other hand, may come from several of the processes discussed above: production of pairs of direct sleptons or lightest charginos (1a, 1b), mono-Z and mono-Z’ (3a and 3b), and more. The goal is to study various variables, such as M(ll), MET, MT2, decay angles and more, and feed them into a neural network in order to characterise any signal beyond the SM and possibly identify the new physics at work; a sketch of such a classifier is given after this list. Reference example: Searching for exotic particles in high energy collisions with deep learning.
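As a concrete illustration of the columnar data handling mentioned in topics 1 and 4, below is a minimal sketch (not project code) of reading ROOT nTuples in batches with the uproot library and collecting them into a pandas DataFrame. The file, tree and branch names are hypothetical placeholders.

    import uproot
    import pandas as pd

    # Hypothetical file, tree and branch names, used only for illustration.
    branches = ["lep_pt", "lep_eta", "met_et", "mll"]
    frames = []

    # uproot.iterate reads the tree in chunks, keeping memory use under control
    # when the nTuples are too large to load in one go.
    for batch in uproot.iterate("ntuple.root:nominal", branches,
                                step_size="100 MB", library="pd"):
        # 'batch' is a pandas DataFrame holding one chunk of events in columnar
        # form; any cleaning or derived-variable computation would go here.
        frames.append(batch)

    df = pd.concat(frames, ignore_index=True)
    print(df.head())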
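For the unsupervised project (4a), one common approach is an autoencoder trained to reconstruct ordinary events, with poorly reconstructed events flagged as anomaly candidates. The following is a minimal sketch in Keras; the random input array stands in for the real columnar data, and the architecture and threshold are purely illustrative.

    import numpy as np
    from tensorflow import keras

    n_features = 10                                            # hypothetical number of input variables
    X = np.random.rand(10000, n_features).astype("float32")    # placeholder for real event data

    # Small symmetric autoencoder with a low-dimensional bottleneck.
    inputs = keras.Input(shape=(n_features,))
    h = keras.layers.Dense(32, activation="relu")(inputs)
    code = keras.layers.Dense(4, activation="relu")(h)
    h = keras.layers.Dense(32, activation="relu")(code)
    outputs = keras.layers.Dense(n_features, activation="linear")(h)

    autoencoder = keras.Model(inputs, outputs)
    autoencoder.compile(optimizer="adam", loss="mse")

    # Train the network to reproduce "ordinary" events.
    autoencoder.fit(X, X, epochs=5, batch_size=256, validation_split=0.1, verbose=0)

    # Events with a large reconstruction error are anomaly candidates.
    errors = np.mean((autoencoder.predict(X) - X) ** 2, axis=1)
    threshold = np.percentile(errors, 99)                      # illustrative cut
    print("flagged", int(np.sum(errors > threshold)), "candidate events")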
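For the supervised project (4b), the kinematic variables (e.g. M(ll), MET, MT2, decay angles) would be fed into a classifier that separates signal from SM background. The sketch below assumes those variables have already been assembled into columnar arrays and shows a small feed-forward network in Keras; the arrays and labels are placeholders, not real samples.

    import numpy as np
    from tensorflow import keras

    # Placeholder inputs: columns could be [mll, met, mt2, cos_theta];
    # labels 1 = signal, 0 = SM background.
    X = np.random.rand(20000, 4).astype("float32")
    y = np.random.randint(0, 2, size=20000)

    model = keras.Sequential([
        keras.Input(shape=(4,)),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(64, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),   # output approximates P(signal | event)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(X, y, epochs=5, batch_size=256, validation_split=0.2, verbose=0)

    # The network output can then be used as a single discriminating variable
    # in the statistical interpretation.
    print(model.predict(X[:5]).ravel())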

Finally, the two proposed projects will compare their performance, i.e. unsupervised versus supervised learning on some given new physics models. Hopefully there will also be a chance to look at the first data from LHC Run 3.

  5. Multithreading and parallel processing. The requirements imposed on software during the coming LHC runs (Run 3 starting in 2022 and the High Luminosity run somewhat later) will be as stringent as those on the computing resources. The data throughput that will have to be achieved exceeds anything that our community has managed to date. Such performance can only be attained by combining a number of techniques, such as multi-threading and parallel processing of events, as well as novel algorithms and optimisation of existing software. One task would consist of implementing multi-threading and parallel processing in the study of some classes of theories of new physics. A large sample of data (140/fb at 13 TeV) and corresponding MC simulation samples of Standard Model and new physics processes exist. The new physics theories are inspired by Grand Unified Theories (spin-1 Z’ models) or by extra space dimensions (spin-2 graviton resonances), to give two examples. The detailed MC sampling of new physics parameter space, as required for example when deriving Bayesian posterior densities and corresponding credibility intervals, is computationally expensive. It does, however, lend itself to parallelisation by combining statistically independent samplings, as sketched below. One way to achieve this may be to update the code package to use a newer version of the Bayesian Analysis Toolkit, which implements such functionality. This could be the latest C++-based version, or even the new BAT.jl based on the modern Julia programming language. It is possible to split this task up if more than one student is interested. For example, one project could focus more on the preparation of input data for the statistical analysis, which requires work with real and simulated ATLAS data, applying dedicated matrix element reweighting to some of the MC samples. Another project could then, in parallel, focus more on the statistical analysis, dealing with the implementation in the Bayesian Analysis Toolkit, etc.
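The parallelisation pattern described above can be illustrated with a toy example: several statistically independent Markov chains are run in separate processes and their samples merged at the end. This sketch does not use the Bayesian Analysis Toolkit; the one-parameter log-posterior is a hypothetical stand-in for the real likelihood of the 140/fb data under a given new physics model.

    import numpy as np
    from multiprocessing import Pool

    def log_posterior(theta):
        # Hypothetical log-posterior; in the real project this would involve the
        # likelihood of the data given a Z' or graviton model parameter.
        return -0.5 * (theta - 1.0) ** 2

    def run_chain(seed, n_steps=50000, step=0.5):
        # A toy Metropolis sampler; each seed gives a statistically independent chain.
        rng = np.random.default_rng(seed)
        theta, samples = 0.0, []
        for _ in range(n_steps):
            proposal = theta + rng.normal(scale=step)
            if np.log(rng.uniform()) < log_posterior(proposal) - log_posterior(theta):
                theta = proposal
            samples.append(theta)
        return np.array(samples)

    if __name__ == "__main__":
        seeds = [1, 2, 3, 4]                    # one independent chain per process
        with Pool(len(seeds)) as pool:
            chains = pool.map(run_chain, seeds)
        combined = np.concatenate(chains)
        # The combined samples approximate the posterior; credibility intervals
        # follow from its quantiles.
        print(np.percentile(combined, [16, 50, 84]))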

References:

https://atlas.cern/updates/news/live-talk-computing  


 

Tags: Computational science, distributed computing, deep learning, supervised and unsupervised learning, software for data processing, software for physics analysis, multithreading and parallel computing

Scope (credits): 60