Diamond’s dedicated Scientific Software, Controls and Computation (SSCC) department plays a pivotal role in delivering science for the user community. It covers a vast array of hardware and software activities that enable the machine to operate and support the full life cycle of science experiments conducted at Diamond.
The existing software architecture at Diamond has evolved over the last 20 years and has successfully facilitated world-leading science across the facility by balancing the competing demands for flexibility and high throughput with automated experiments. The upcoming Diamond-II upgrade programme provides an exciting opportunity to build on Diamond’s current SSCC capabilities and enhance our software functionality, so that we can fully exploit the data and the new types of experiments the upgrade will allow.
Software and Computing is one of the five main programme pillars of the Diamond-II upgrade. This pillar aims to deliver the core software, computing and controls developments necessary to enable handling of increased data rates and support the development of common instrumentation, control, acquisition and detector readout systems. It also includes the development of information management and post-visit analysis services. The work covered by the Software and Computing pillar will provide the underlying capability needed by the Machine and Beamline pillars of work, where specific instances for a given machine or beamline function need to be applied.
To ensure the smooth delivery of this programme, Sky French was appointed Head of Integrated Software Programme for Diamond-II in April. In this role she is responsible for leading the design, development and implementation of integrated Diamond-II software, in close coordination with beamline scientists and the Data Analysis, Data Acquisition and Controls groups.
“My role was created to facilitate the successful delivery of the ambitious, but key, integrated software and computing programme for Diamond-II. Whilst we can invest in upgrading the physical components of our machine, without innovation in the software and expenditure on hardware we will not be able to extract the most out of the data, and so will not fully exploit the science that can be realised with Diamond-II.
“The main challenge I expect to face in my role is facilitating successful and clear communication to make sure the programme delivers in a way which enables new science to happen. This is an integrated software programme that will require collaboration both across the SSCC groups and with the science divisions, to ensure we match delivered capabilities to the needs of the scientific community. As the project progresses, a key focus will be on supporting open meetings with key groups across Diamond, and collaborating at all opportunities to ensure we deliver enhanced software and computing that will enable our flagship and upgraded beamlines to be the best they can be, delivering fully on the science capabilities they aspire to. Throughout the process, I am very aware that this is a unique and exciting opportunity to modernise the full suite of beamline software, making it much more maintainable and extensible, with improved usability and promise.”
Diamond-II will deliver substantially increased spectral brightness on the beamlines, which will in turn enable more complex experiments to be conducted at higher spatial and temporal resolutions. To facilitate this, Diamond is ensuring that we can provide our users and science community with greater automation of experiments, faster detectors, rapid data processing and reduction, and the introduction and development of new data processing techniques. The upgrade programme is also committed to investing in our data storage capabilities to cope with the scale of the change in data volumes expected with Diamond-II.
All these required changes amount to a vast package of works that will cover six separate areas of development – Hardware Infrastructure, Software Infrastructure, Data to Information, Real-time Data, Experiment Management and Information Management.
Supporting the new experiment techniques anticipated on Diamond-II will require the development of the next generation of high-performance data acquisition systems, including readout and transfer to storage or processing facilities, data streaming and real-time feedback for visualisation.
Diamond’s current data archiving solution hosts over 20 PB of data (for illustration, 1 PB of data is enough to store 13 years of HD-TV video). For Diamond-II, however, the total volume of data will increase significantly, into the hundreds of petabytes. To support this larger and more complex data, the usability of the archive will be enhanced to provide greater functionality in how data is selected, moved out of the archive and processed.
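To put these archive figures in perspective, the HD-TV comparison can be scaled up with a little arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the 100 PB figure is an assumed lower bound standing in for "hundreds of petabytes", not a published Diamond-II number.

```python
# Rule of thumb quoted in the article: 1 PB holds about 13 years of HD-TV video.
YEARS_OF_HD_VIDEO_PER_PB = 13


def hd_video_years(petabytes: float) -> float:
    """Convert an archive size in PB into equivalent years of HD-TV video."""
    return petabytes * YEARS_OF_HD_VIDEO_PER_PB


current_archive_pb = 20    # "over 20 PB" today, per the article
assumed_future_pb = 100    # ASSUMPTION: lower bound for "hundreds of petabytes"

for label, size_pb in [("Current archive", current_archive_pb),
                       ("Assumed Diamond-II archive", assumed_future_pb)]:
    years = hd_video_years(size_pb)
    print(f"{label}: {size_pb} PB is roughly {years:,.0f} years of HD video")
```

On these assumptions, today's archive already corresponds to a few centuries of continuous HD video, and even the lower-bound Diamond-II figure is five times larger.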
Transferring data from Diamond to users’ home institutions for processing currently presents significant challenges due to the size of the data and network limitations. Even when the transfer succeeds, the computing resources and software needed to process the data may not be readily available at the home institution. The development of our post-visit data analysis provision will therefore be crucial if we are to continue to meet user demands and enable users to process and evaluate the data collected on Diamond-II fully remotely.
Automated remote sessions will increasingly become the norm with Diamond-II. The metadata from the proposal process will need to be integrated with session allocation, sample registration and logistics, processing pipelines and the resulting visualisation and analysis of experimental results. Users will then be able to perform data mining and data analysis, and gain access to enhanced search capabilities, to fully exploit the value stored within metadata repositories. These changes will in part be realised by greater integration of the Laboratory Information Management System and User Administration into the Data Catalogue.
Diamond currently operates two data centres with a combined capacity of 500 kW. These house the IT resource needed today for first-level capture and processing of data from the photon beamlines and electron microscopes: approximately 18,000 compute cores and 20 PB of high-performance storage. Our first data centre, constructed in 2011, is now 100% full and its cooling infrastructure is nearing end-of-life; our second data centre is 60% full. As a result, there is limited capacity for the future IT resource needed to support the anticipated growth in data volumes and processing needs. Our Diamond-II plans therefore incorporate a new data centre to cope with the increased demand for capacity and the changing technology landscape. This additional capacity will be delivered in partnership with STFC as a data centre serving campus-wide needs, to be called the Research Computing Centre.
Diamond Light Source is the UK's national synchrotron science facility, located at the Harwell Science and Innovation Campus in Oxfordshire.
Copyright © 2022 Diamond Light Source