Originally featured in TOP500 on June 26, 2018
The International Supercomputing Conference (ISC18) kicked off on Monday, June 25 in Frankfurt, Germany, with Maria Girone, Chief Technology Officer of CERN openlab, delivering the opening keynote address. She explained how CERN’s needs will drive exascale computation and data science innovation in the future.
Founded in 1954, CERN straddles the Franco-Swiss border and has an annual operating budget of one billion Swiss francs. With 22 member states, its resources support 15,000 scientists around the world, and it employs 2,500 at Swiss and French sites. CERN is home to the Large Hadron Collider (LHC), the world’s largest and most powerful particle accelerator.
In addition to probing the fundamental structure of our universe, understanding its very first moments after the Big Bang, and searching for dark matter, CERN technicians and scientists are always developing new technologies for accelerators and detectors. Its instrumentation advances medical diagnoses and therapies, trains the scientists and engineers of tomorrow, and unites people from different countries and cultures. “CERN is every bit as much of a feat of social engineering as it is a technical challenge,” said Girone. Each participating lab builds its own parts, which must work with everything else. There are 170 collaborative computing centers in 42 countries on most continents. She added with a smile, “we’re even working on Antarctica!”
Two general-purpose detectors, ATLAS and CMS, cross-confirm major discoveries, such as that of the Higgs boson, announced on July 4, 2012. ALICE and LHCb (the “b” stands for beauty) are detectors that specialize in the study of specific high energy physics (HEP) phenomena. The LHCb experiment investigates the “slight differences between matter and antimatter by studying a type of particle called the beauty quark, or b quark.” In the LHCb experiment alone, 650 scientists from 48 institutions in 13 countries participate.
CERN instrumentation has the capacity to generate 1 petabyte of data per second, and hundreds of petabytes per year. The Meyrin data center in Geneva is the heart of CERN’s computing infrastructure, with 300,000 processor cores, 180 petabytes of disk and 230 petabytes of tape storage. A second data center at the Wigner Research Centre for Physics in Budapest features HPC systems with 100,000 cores and 100 petabytes of disk storage. The Worldwide LHC Computing Grid (WLCG) gives thousands of HEP scientists around the world near real-time access to LHC data.
CERN is a leader in global data distribution and management. Massive amounts of data are moved between hemispheres each day via a 340 Gbps transatlantic link, and identity management is handled by the eduGAIN federation. From a local management standpoint, eduGAIN saves managers time and effort because home credentials provide authentication and access to resources, instrumentation and data that are physically located at institutions in the 48 member countries that comprise an interfederated trust fabric. It’s more secure and takes less time to manage, since researchers need to remember only one user name and password.
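To put the 340 Gbps figure in perspective, the following minimal sketch estimates how long a bulk transfer would take over such a link. It is illustrative arithmetic only: the function name `transfer_days`, the use of decimal petabytes (10^15 bytes), and the `efficiency` parameter are assumptions made here, not details from the keynote.

```python
def transfer_days(petabytes, link_gbps=340.0, efficiency=1.0):
    """Estimate the days needed to move `petabytes` over a link.

    Assumptions (not from the article): 1 PB = 10**15 bytes, and the
    link sustains `efficiency` (0..1) of its nominal rate end to end.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = petabytes * 1e15 * 8                      # payload size in bits
    seconds = bits / (link_gbps * 1e9 * efficiency)  # time at sustained rate
    return seconds / 86_400                          # seconds per day


if __name__ == "__main__":
    for pb in (1, 30, 50):
        print(f"{pb:>3} PB: {transfer_days(pb):.2f} days at full line rate")
```

Under these assumptions a single petabyte occupies the link for roughly six and a half hours, so real transfers, which share the link and rarely reach full line rate, would take longer.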
Physicists must sift through the 30 to 50 petabytes produced annually by the LHC experiments. “Searching for a single event is compared to finding a specific grain of sand in 20 volleyball courts,” said Girone. There is so much data that scientists could not possibly access or manage it in raw form, so CERN exploits co-processors for software-based filtering and real-time reconstruction, which prunes, packs and optimizes data for transfer and analysis.
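The idea of pruning and packing data in software can be sketched in a few lines. The example below is a toy illustration, not CERN's filtering software: the `Event` record, its `energy_gev` field, and the threshold rule are all hypothetical stand-ins for whatever selection criteria the experiments actually apply.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical detector event; the field names are illustrative only."""
    event_id: int
    energy_gev: float   # assumed summary quantity used for selection
    raw_payload: bytes  # bulky raw readout that is dropped after filtering


def select_events(events: Iterable[Event],
                  min_energy_gev: float) -> Iterator[Tuple[int, float]]:
    """Stream over events, keep those passing the cut, and pack a compact
    (event_id, energy_gev) summary instead of the raw payload."""
    for ev in events:
        if ev.energy_gev >= min_energy_gev:   # prune: discard the rest
            yield (ev.event_id, ev.energy_gev)  # pack: keep summary only


if __name__ == "__main__":
    stream = [Event(1, 5.0, b"\x00" * 1024), Event(2, 50.0, b"\x00" * 1024)]
    print(list(select_events(stream, min_energy_gev=20.0)))
```

Writing the filter as a generator means events are examined one at a time as they arrive, so the full raw stream never has to be held in memory.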
CERN’s road map for the future includes the construction of the High Luminosity (HL) LHC, which began a few weeks ago and is expected to be completed in 2035. Meeting the resulting exascale demands will be a challenge. Based on current technologies, CERN will need 50 to 100 times its current CPU computing capacity by 2028, as well as further advances in software development, advanced networks and storage solutions. The technology evolution expected to take place between now and 2028 will help meet this goal, along with close collaborations with industry partners.
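A quick calculation shows what a 50- to 100-fold increase implies per year. The sketch below assumes a 10-year horizon (2018, the date of this article, to 2028) and steady compound growth. Both are simplifying assumptions made here, and `annual_growth` is a hypothetical helper name.

```python
def annual_growth(total_factor, years):
    """Return the constant yearly multiplier that compounds to
    `total_factor` over `years` years (assumes steady growth)."""
    if total_factor <= 0 or years <= 0:
        raise ValueError("total_factor and years must be positive")
    return total_factor ** (1.0 / years)


if __name__ == "__main__":
    years = 2028 - 2018  # assumed horizon, counted from the article date
    for factor in (50, 100):
        rate = annual_growth(factor, years)
        print(f"{factor}x in {years} years: about {100 * (rate - 1):.0f}% per year")
```

Under these assumptions, capacity would have to grow by roughly 48 to 58 percent every year, which helps explain why hardware evolution alone is expected to be only part of the answer.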
CERN openlab is a broad science-industry partnership that fosters research to drive the innovation of ICT solutions for CERN and its stakeholders. To tackle the resource gap, openlab plans to fully exploit available hardware, expand dynamically to new computing environments, and introduce layered, virtualized services to provide flexibility and efficiency. CERN expects that as much as 90 percent of resources will be delivered via the OpenStack private cloud platform, which will allow flexibility and dynamic deployment. Additionally, it has the option of elastically and dynamically expanding production to commercial clouds, and it currently has a joint procurement of R&D cloud services with several providers. While it employs accelerators such as GPUs and FPGAs, it is also exploring lower-performance, low-power alternatives such as ARM. It is focused on optimizing software for greater performance, and plans to explore the entire panorama of emerging innovations as they unfold.
CERN is well-suited for machine learning, and is expected to become a leader in the field of artificial intelligence (AI), specifically in the areas of monitoring automation and anomaly detection. Reconstruction and simulation—the most important applications for CERN—are data-intensive processes that will also benefit from machine learning and AI.
CERN has a relationship with the Square Kilometer Array (SKA) project, which has instrumentation in the Karoo region of South Africa, and in Australia. Phase One of that project is expected to start generating data in the mid-2020s and to operate for the next 50 years. CERN and SKA face a joint exascale data-storage and processing challenge across HL-LHC and SKA. Additionally, CERN is accelerating innovation and knowledge transfer to medical applications via the MEDICIS-PROMED program.
Girone closed her presentation with a quote from actor Tom Hanks, who had visited CERN in 2009 with actor-director Ron Howard. He said, “Magic is not happening here; magic is being explained here.”
For more information about CERN, the LHC and related projects, visit their website. Follow @ISChpc and #ISC18 for more conference news.