In December 2019, ECMWF’s Council of Member States gave ECMWF authorisation to sign a contract with Atos for the supply of the BullSequana XH2000 supercomputer.

The supercomputer will be hosted in the new ECMWF data centre currently being developed by the Italian Government and the Regione Emilia-Romagna in Bologna, Italy. It is expected to be fitted in 2020 and to become fully operational at the end of 2021.

The Cray XC40 HPCF

ECMWF’s current Cray XC40 high-performance computing facility (HPCF) continued to provide a good and stable service, processing more than half a million jobs per day on average. At the end of 2019, the data archive held 310 petabytes of primary data and a further 123 petabytes of backups. On average, around 290 terabytes of new data are added to the archive daily, and 220 terabytes are retrieved. Despite the amount of data added to the archive daily, the overall size of the primary archive was little changed over the year, showing the benefit of data stewardship efforts.
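To give a sense of scale, the archive figures quoted above can be combined in a short back-of-the-envelope calculation. This is only an illustrative sketch: it assumes decimal units (1 petabyte = 1,000 terabytes) and a constant daily ingest rate over 365 days, and the function names are hypothetical rather than part of any ECMWF tool.

```python
# Figures quoted in the text (end of 2019).
PRIMARY_PB = 310        # primary data held in the archive, in petabytes
BACKUP_PB = 123         # backup copies, in petabytes
ADDED_TB_PER_DAY = 290  # average new data archived per day, in terabytes

# Assumptions made for this sketch only.
TB_PER_PB = 1000        # decimal units
DAYS_PER_YEAR = 365     # constant ingest rate all year


def total_holdings_pb(primary_pb, backup_pb):
    """Total archive holdings (primary plus backups) in petabytes."""
    return primary_pb + backup_pb


def yearly_ingest_pb(added_tb_per_day, days=DAYS_PER_YEAR, tb_per_pb=TB_PER_PB):
    """Gross volume added in one year, in petabytes, at a constant daily rate."""
    return added_tb_per_day * days / tb_per_pb


total = total_holdings_pb(PRIMARY_PB, BACKUP_PB)
ingest = yearly_ingest_pb(ADDED_TB_PER_DAY)
share = ingest / PRIMARY_PB

print(f"Total holdings: {total} PB")
print(f"Gross yearly ingest: {ingest:.2f} PB")
print(f"Ingest relative to primary archive: {share:.0%}")
```

Under these assumptions, the total holdings come to 433 petabytes and the gross ingest to roughly 106 petabytes per year, about a third of the primary archive. Since the primary archive nevertheless stayed at a similar size, a comparable volume must have been removed or consolidated over the year.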

Monthly data archive growth. The increase in the growth of ECMWF’s data archive has been driven by three areas: research, the ERA-Interim and ERA5 weather and climate reanalyses, and operations.

New IBM tape libraries were installed at Shinfield Park and put into operational use. Old, but still needed, data held in the Oracle tape libraries is being migrated to the new libraries in a process that will take several years to complete.

The High-Performance Storage System (HPSS), which provides the backbone of ECMWF’s Data Handling System, was upgraded in May 2019, in turn allowing the core platform to be moved from an AIX-based system to a more powerful Linux-based machine. This will help support the load increase anticipated from the new supercomputer and is a prerequisite for supporting the new tape libraries and drives and for the migration to Bologna.

In addition to running the existing data centre and managing the network, considerable work went into designing the ICT service for the new data centre and for the European Weather Cloud pilot phase. Member States have a new web-based tool for managing user access to ECMWF services, the network infrastructure at Shinfield Park has been prepared for the site-to-site connection to Bologna, and many web services have moved to a new Single Sign-On platform. A major effort to wind down legacy infrastructure and services got under way with a view to simplifying the migration to Bologna.

Work also went into building and testing new systems for data centre infrastructure management, service configuration management, and server and application deployment automation.

New operating model for end‐user compute. The operating model developed by the Centre’s Technical Design Authority is being used to validate user workflows and ensure all service components are migrated to or created for the Bologna environment.

New supercomputer contract

A comprehensive tender process was launched in 2018 for the new ECMWF HPCF. The tenders submitted were assessed against criteria including committed performance, implementation plan, flexibility and risks, quality of technical solution, environmental impact, quality of service provision and support, and price.

At its December session, ECMWF’s Council authorised the Director-General to sign a contract with the successful tenderer, Atos UK Ltd. The new facility will be provided under a four-year service agreement and will deliver a performance increase of about a factor of five over the current system, based on the time-critical capability and capacity benchmarks. It will be hosted in the new data centre in Bologna and will initially run in parallel with the existing Cray HPCF.

The high-performance computing facility serves not only to produce forecasts but also to run research experiments designed to push the boundaries of predictability, including ground-breaking work on the assimilation of cloud observations from satellite radar and lidar into ECMWF’s Integrated Forecasting System (IFS), and progress towards assimilating satellite radiances in the visible part of the spectrum.

Of the available computing resources, 25% is allocated for workload from ECMWF Member States.

System specifications. The Cray XC40 system and the new Atos BullSequana XH2000 system.

Scalability Programme

The first phase of ECMWF’s Scalability Programme, a major programme to prepare all ECMWF’s systems for future supercomputer architectures, is complete. The second, implementation phase (2020–2024) will bring the results into operation.

The cutting-edge research of the first phase was achieved through close work with ECMWF’s Member States, participation in several European research projects funded by the European Commission, and the support of actors in the public–private partnership ETP4HPC and PRACE. This participation provides key contributions to European infrastructure investments and to the planning of funding programmes.

Examples include the EU-funded NEXTGenIO and ESCAPE-2 projects. ECMWF was one of eight partners in NEXTGenIO, which ended in October following a final workshop and hackathon hosted by ECMWF. The project designed and built a prototype hardware platform that promises massive gains in input/output (I/O) capabilities in supercomputing. Some of the developments have been implemented at ECMWF: FDB5 (Fields Database 5) and MultIO are used in ECMWF’s time-critical operational workflows.

ESCAPE-2, led by ECMWF, is preparing components of leading European Earth system models for heterogeneous processor architectures through advanced numerical methods and novel programming approaches.

Such individual developments are being tested and disseminated through the EU-funded ESiWACE centre of excellence, which aims to coordinate research and its transfer to operations across weather and climate prediction centres in Europe, and for which ECMWF coordinates the weather prediction efforts.

With machine learning and artificial intelligence (AI) expected to play an important role in ECMWF’s long-term Strategy, the Centre appointed a Coordinator of machine learning and AI activities in 2019.

Potential areas of use for AI techniques include data quality control; bias correction in data assimilation; emulating model components; and quantifying uncertainty. For example, experiments carried out at ECMWF illustrate that properly trained neural networks can already make short-range predictions of surprising accuracy.

Neural network experiments. Geopotential at 500 hPa (in m²/s²) between 00 UTC on 1 March and 00 UTC on 2 March 2017 according to the analysis (left) and according to a 24-hour neural network forecast starting from the analysis at 00 UTC on 1 March (right).
