Every year, Diamond produces an Annual Review, covering the scientific, technical, computing and business updates from the facility. The feature that follows has been prepared for our latest review, and looks at work conducted between April 2023 and April 2024.
The Scientific Software, Controls and Computation (SSCC) department manages all software, computing and control systems that facilitate and support Diamond’s science programme. The department functions as nine groups: Scientific Computing, Data Analysis, Data Acquisition, Beamline Controls, Accelerator Controls, Electronic Systems, Scientific Information Management Systems, Diamond-II Integrated Software and Cyber Security. The structure and function of these areas is optimised to provide the best possible delivery of, and support for, software, computing and control systems.
During the past year the Diamond-II project was funded. One of the five pillars of the Diamond-II project, Pillar 1.3, will deliver and uplift the underlying capabilities of software and computing. It was recognised early in the planning of Diamond-II that significant developments in underlying software and computing capabilities would be needed to prepare for the substantial increase in data rates that will come with Diamond-II. This has been addressed through the design of a new software architecture for photon beamlines and the definition of an extensive core software and computing programme for Diamond-II to deliver new enabling capabilities. In addition, SSCC will deliver new software, control systems and computing as part of the machine upgrade and beamline developments for Diamond-II. The following section, Update on Diamond-II software and computing, provides an overview of developments in Pillar 1.3 and highlights key deliverables over the last year.
As experiments conducted at Diamond produce increasingly large and complex data sets, it becomes more difficult for users to transport their data back to their home institute for processing. To address this, Diamond increasingly provides users with data processing services that enable information to be extracted from their data, developing and maintaining a suite of data analysis applications to support the photon science and electron microscopy programmes. These same tools can also provide near real-time feedback to the user as their experiment progresses. However, some tools are computationally demanding, and there is a programme to accelerate the processing using faster computing technologies, such as GPUs, so that near real-time feedback can be provided during experiments.
While the end objective of a new architecture is to enable new science capabilities for Diamond-II, the development process includes a series of intermediate deployments on Diamond’s existing beamlines. These will not only deliver new capabilities early for Diamond, but will also provide a mechanism to debug the new software and ensure it is fit for purpose before major deployment as part of Diamond-II. The year saw the first user experiment conducted with the new software architecture.
Diamond has systematically maintained an archive of all data collected since operations began. To date that data has been “owned” by the scientists who conducted the experiment. It is recognised that there are added opportunities from the experiments if the data is made open and so accessible to others. To achieve this, Diamond will move to making its data repository open after an embargo period, so that access is no longer limited to the scientists who conducted the experiment. For this to work the data will need to be FAIR (Findable, Accessible, Interoperable and Reusable), and additional metadata will need to be recorded about the experiment. While this has been delivered very successfully for some science techniques, it was not possible across all techniques due to structural limitations in the available information management system. To address this, a new Universal Laboratory Information Management System is being developed to provide sufficient flexibility to capture metadata across all physical science beamlines. This is increasingly important as such metadata is a key part of using machine learning techniques to mine data for new information.
Diamond produced more than 10PB (a PB of data is equivalent to 213,000 DVDs) of data last year from photon beamlines and electron microscopes. To support the capture and processing of this data, Diamond operates an extensive online computing resource, designed to manage the high-throughput capture and processing of data as experiments are conducted. It is recognised that provisioning all computing services within the Diamond infrastructure is not a sustainable model, so work is ongoing to decouple applications and services from the existing computing infrastructure by using containerisation technologies. The containers can be deployed on both private and public cloud infrastructure.
The Diamond-II Core Software, Controls and Computing project (the third pillar of the Diamond-II project) seeks to deliver core developments across software, controls and computing to fully exploit the opportunities afforded by the Diamond-II machine upgrade and its new beamlines. It is a significant body of work underpinned by the development, deployment and exploitation of a modern beamline software architecture. This will enable science that is impossible today by closely integrating data analysis, high performance computing, the control system, data acquisition (DAQ) and beamline instrumentation. Improved data analysis throughput will be realised by reducing latency in time-critical steps and providing substantially better management of data, including facility-wide access to rich metadata catalogues.
First reviewed externally in Summer 2022, the beamline software architecture has matured substantially this year: a new Architecture Report has been written, and the project is looking forward to external review at the inaugural meeting of the Scientific Software Advisory Committee in June. The architecture encapsulates units of functionality into discrete services. The core supporting infrastructure for this is Kubernetes, a Cloud Native Computing Foundation project, which will run both in-house developed software and upstream, community-supplied projects. Substantial progress has been made over the course of the year in establishing and provisioning this infrastructure, and in the required containerisation of software and services.
Whilst the ultimate ambition of the project is to harness the brightness of the new Diamond machine and enable flagship capabilities, the project will gradually realise a continuous stream of incremental benefits, steadily reducing technical debt and addressing critical obsolescence whilst unlocking new capabilities deployed with greater flexibility and extensibility. This supports Diamond’s objective to implement, based on science opportunity, exemplar services within the new software architecture to enhance Diamond and de-risk Diamond-II. The project is driven by an extensive plan: a time-bound waterfall plan that provides a clear strategy for delivering interconnected developments across disciplines with finite time and resource, exploiting agile loops to meet key milestones (see Figure 1) and to allow for smaller increments of work, early deployments (portfolio opportunities) and regular feedback.
At the centre, Diamond’s DAQ Software Framework is the primary interface for Diamond’s users, responsible for orchestrating experiments and managing data collection. GDA, Diamond’s current platform, will be replaced by the service-based Athena DAQ framework. Closely coupled to this is the replacement of the existing middle-layer framework, Malcolm, and the provision of support for more extensive fly-scanning. These intertwined developments, which span two substantial project work streams, will leverage NSLS-II’s Bluesky library for data collection and their Ophyd library for device abstraction. Members of Diamond’s SSCC software groups have enjoyed close, effective collaboration throughout the year with their counterparts at NSLS-II. In November, the first releases of core Athena services and Ophyd-async were deployed using a new Kubernetes infrastructure to the I22 beamline alongside the current GDA platform. These were successfully employed in a user experiment to synchronise hardware control and data collection. Within MX, Bluesky has been leveraged with great success to support Unattended Data Collection.
The development of Bluesky plans and Ophyd devices will be at the heart of delivering new science capabilities; brought together with all the considerable developments across the project, they sit at the epicentre of a new integrated architecture focused on effective collaboration between software groups and science (see Figure 2) to deliver flexible and extensible software solutions.
The project has also seen advances across the data analysis and information management domains. This year has seen notable success in the exploitation of novel hardware and architectures to bring about performance gains – e.g., realising spot finding with GPUs in MX. It has been possible to use Diamond-II funding to recruit key positions in the second half of the project. This has enabled the maturation of the beamline software architecture across the full stack. A new Universal Laboratory Information Management System team has been formed to drive forward the much-needed provision of information management systems for the Physical Sciences and to tackle the challenge of providing access to rich metadata catalogues. A new Shipping Service has been delivered and Diamond’s SciCat has been deployed to B24. A new Data Analysis Platform team has also been established. The project teams have been able to explore several key cross-cutting themes: streaming, e.g. via prototyping ptychography developments on I08; web-technology based user interfaces; and, crucially, authorisation and authentication to ensure a robust, secure, scalable and extensible architecture. Progress notwithstanding, the project teams are very mindful of the challenges that await as they seek to support new deployments and capabilities of the new architecture alongside the existing software.
A critical part of automated macromolecular crystallography data collection is the alignment of the sample with the X-ray beam, which is performed with an automated optical step followed by a raster scan of the sample with the X-ray beam. Basic data analysis, or spot finding, is performed and the position with the strongest diffraction (at orthogonal angles) is identified as the optimal location for data collection. Until recently, this involved capturing the data to HDF5 (Hierarchical Data Format version 5, an open-source file format that supports large, complex, heterogeneous data) files, then performing the analysis from these files once the acquisition was complete. This led to a latency of a few seconds between the end of the scan and the availability of results, or longer if there was significant contention for the resources shared between beamlines for running this system. With the state-of-the-art Eiger 2XE 16M detectors on beamlines I03 and I04, which routinely operate at 500 Hz, this latency is longer than the data collection that follows, representing an unacceptable delay; addressing it will improve the throughput of automated data collection services. The new approach is as follows: data from the Eiger detectors is sent over a ZeroMQ (asynchronous messaging library) network stream to the processing computer. Previously the data was captured and saved to HDF5 files in the global file system by ODIN [1]. Now, the data is captured to a RAMDisk [2] to allow in-flight analysis, before being saved to disk, reducing the latency on the availability of data to well under one second.
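The shape of that capture path can be sketched in a few lines. This is a minimal, purely illustrative model: a queue stands in for the ZeroMQ stream, a temporary directory stands in for the RAMDisk, and all names are invented rather than taken from Diamond's actual ODIN or capture code. The point it shows is the ordering: frames become available to analysis in memory before any disk I/O happens.

```python
# Hypothetical sketch of the new capture path: frames arrive on a stream
# (ZeroMQ in production; a queue stands in here), are held in memory for
# in-flight analysis, and are persisted to disk afterwards.
import os
import queue
import tempfile

def capture(stream, n_frames, ramdisk_dir):
    """Buffer frames in memory so analysis can start immediately."""
    in_flight = []
    for _ in range(n_frames):
        frame = stream.get()       # e.g. one Eiger image off the stream
        in_flight.append(frame)    # available to analysis at once,
                                   # before any file has been written
    for i, frame in enumerate(in_flight):
        path = os.path.join(ramdisk_dir, f"frame_{i:06d}.bin")
        with open(path, "wb") as f:
            f.write(frame)         # persisted after the fact
    return in_flight

# Feed three fake frames through the pipeline.
stream = queue.Queue()
for i in range(3):
    stream.put(bytes([i]) * 16)

with tempfile.TemporaryDirectory() as d:
    frames = capture(stream, 3, d)
    saved = sorted(os.listdir(d))

print(len(frames), len(saved))  # 3 3
```

In the real system the in-memory copy is what the GPU spot-finding service reads, which is why the sub-second latency no longer depends on the global file system.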
Images are analysed on a shared bank of compute clusters running the full DIALS [3] software suite. To accelerate processing, a subset of the DIALS analysis algorithms has been ported to run on a GPU, replicating a subset of the original functionality. As data capture begins, a stream of image data is written by the detector. The GPU spot-finding service is notified within seconds of the start of the data collection via the internal Zocalo [4] workflow system, which immediately launches the GPU processing task. The GPU is then dedicated to this data processing activity while new images are being received.
Each image is uploaded to the GPU, and every area of the image is examined to identify peaks where signal pixels appear higher than their local background. Previously this was a serial process, with one CPU required to analyse the whole image; it is now done in parallel on the GPU by splitting the image into separate areas. Once every pixel has been examined, signal pixels are grouped into reflections, and the results are returned to the service, which forwards them to the processing algorithms running on the compute cluster; these decide, based on the results, which specific areas of the sample to measure to obtain the best data (fig. 3).
This is now being run on beamline I03 [5] alongside the existing system. When its behaviour has been verified as equivalent, the GPU system will be used full time.
The analysis for X-ray centring is a critical step in the acquisition workflow for automated data collection, but it is not the only application of this technology. The serial data collection use case on beamline I24, where thousands of microscopic samples are streamed through the X-ray beam, will also benefit immediately. Firstly, it gives feedback on the hit rate, i.e. the frequency of successful shots; secondly, it offers the possibility of performing data decimation at the edge, i.e. not saving or transmitting blank images back to the data centre, reducing the computational cost of low-hit-rate experiments such as sample injector experiments. Extending the analysis to include indexing will allow more sophisticated sample identification, as well as more useful feedback for other modes of experiment, and feeds into Diamond-II data analysis work in which users are given faster feedback with greater scientific value at lower energy cost.
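The decimation idea above amounts to a filter applied as frames stream past. The sketch below is hypothetical: the scoring function, threshold and names are invented, and real hit finding would use the spot-finding results rather than a simple pixel count. It shows the two outputs the text describes: a live hit rate, and a reduced set of frames worth saving.

```python
# Hypothetical sketch of "decimation at the edge" for serial collection:
# score each frame as it streams past, report the hit rate as feedback,
# and keep only hit frames for transfer back to the data centre.

def n_strong_pixels(frame, background=1, sigma=3):
    """Invented hit score: count pixels well above a nominal background."""
    return sum(1 for row in frame for px in row if px > sigma * background)

def decimate(frames, min_spots=2):
    """Return the frames worth keeping, plus the hit rate for feedback."""
    kept = [f for f in frames if n_strong_pixels(f) >= min_spots]
    hit_rate = len(kept) / len(frames)
    return kept, hit_rate

blank = [[1, 1], [1, 1]]   # injector shot that missed the sample
hit = [[1, 9], [9, 1]]     # shot with diffraction signal

kept, rate = decimate([blank, hit, blank, hit, blank])
print(len(kept), rate)  # 2 0.4
```

The saving comes from the blank frames never leaving the edge: at low hit rates most of the stream is discarded before any storage or network cost is paid.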
Diamond is replacing its Generic Data Acquisition (GDA) software with Athena, a service-based experiment orchestration and data collection platform. At the heart of Athena is the scanning service BlueAPI, which uses NSLS-II’s Python libraries: Bluesky [7], a library for experiment control and collection of scientific data and metadata, and Ophyd [8], which allows the representation of hardware as hierarchical objects grouping together related values from the underlying control system (fig.4). BlueAPI wraps Bluesky plans and Ophyd devices inside a server and exposes endpoints to send commands and receive data. This makes it useful for installation at labs where multiple people may control equipment, possibly from remote locations.
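The plan/device/engine split that BlueAPI builds on can be illustrated with a much-simplified, pure-Python model. This is not the Bluesky or Ophyd API: real plans yield `Msg` objects to a full `RunEngine`, and real Ophyd devices wrap control-system signals. Every name below is invented; the sketch only shows the division of labour the text describes, with the plan owning experiment logic, the device owning hardware state, and the engine executing messages.

```python
# Toy model of the Bluesky pattern: a plan is a generator of messages,
# a device groups related values behind trigger/read, and an engine
# consumes the messages. All names here are illustrative stand-ins.

class Detector:
    """Stand-in for an Ophyd-style device: related values on one object."""
    def __init__(self, name):
        self.name = name
        self.exposures = 0
    def trigger(self):
        self.exposures += 1          # pretend to take an exposure
    def read(self):
        return {self.name: {"value": self.exposures}}

def count_plan(det, num):
    """Stand-in for a plan: yields messages, performs no I/O itself."""
    yield ("open_run", None)
    for _ in range(num):
        yield ("trigger", det)
        yield ("read", det)
    yield ("close_run", None)

def run_engine(plan):
    """Toy engine: executes each message and collects the readings."""
    events = []
    for command, obj in plan:
        if command == "trigger":
            obj.trigger()
        elif command == "read":
            events.append(obj.read())
    return events

det = Detector("det")
events = run_engine(count_plan(det, 3))
print(len(events))  # 3
```

Because the plan is pure data-flow with no direct hardware access, a server such as BlueAPI can expose "run this plan" as a remote endpoint, which is what makes multi-user and remote operation natural in this architecture.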
As part of the Diamond-II upgrade, Diamond will deliver software solutions for three new flagship beamlines using Athena as the data acquisition platform. In order to extend and test the collection of services comprising Athena, Diamond is addressing challenging use cases across Diamond’s existing beamlines by applying the new architecture.
Early applications of Bluesky/Ophyd have been trialled on MX beamlines such as I03 and these developments have reduced the complexity and improved the performance of the data acquisition software. The work also significantly de-risks the software development for the K04 Diamond-II flagship beamline.
The first baseline Athena release has been deployed to beamline I22 in order to close out a long-running project to implement Time Frame Generator (TFG) functionality using PandA [9]. One year ago, real-space, hardware-triggered mapping scanning had been delivered with GDA and Malcolm, but more advanced modes, requiring synchronised sample environment hardware triggering and timed scanning with uniform and non-uniform time frames throughout the acquisition, had not been achieved. The advanced modes would have been difficult to realise using Malcolm, which only allows time- or spatially-resolved detector triggering. Completion of the I22 project was therefore seen as an opportunity to progress the replacement of Malcolm with Athena and an updated Ophyd library, Ophyd Async, which facilitates the asynchronous control of devices.
Key to the success of this software architecture application opportunity, and other opportunities already planned and scheduled, is that the new Athena services can be deployed alongside GDA on a beamline controls server. This approach enables client requests to be directed to either GDA or Athena, allowing some beamline operations to be delivered using the new software architecture. The scheme also facilitates the phased migration of beamline experiment orchestration capabilities from GDA to Athena over time. One of the aims of the I22 project was to test this Athena deployment approach and this was successfully achieved.
In November of 2023, a first user experiment was conducted using a baseline Athena release comprising BlueAPI, the Bluesky NeXus Writer and RabbitMQ [10], along with a first version of Ophyd Async. This experiment focused on the collection of timed frame data while interacting with a sample environment, observing a polymer melting and re-solidifying in a thermally controllable capillary. A Bluesky plan for the experiment was written to trigger detectors at variable multiples of a base rate while ramping the temperature, and the sample was then imaged while it cooled. The plan needed to control a Linkam temperature controller, Tetramms for incident beam flux (I0) and transmitted flux (IT), and detectors for WAXS [11] and SAXS [12] data.
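"Variable multiples of a base rate" describes a non-uniform time-frame schedule, and it is worth seeing what that means concretely. The sketch below is purely illustrative: the base period, group sizes and function names are invented, not taken from the I22 configuration. Each group is a number of frames at some multiple of the base trigger period, and expanding the groups gives the absolute trigger times a hardware triggering system such as PandA would be programmed to emit.

```python
# Illustrative layout of a non-uniform time-frame schedule: each group
# is (n_frames, multiple_of_base_period). Numbers are invented.

BASE_PERIOD_S = 0.1  # hypothetical base trigger period

def frame_times(groups, base=BASE_PERIOD_S):
    """Expand (n_frames, multiple) groups into absolute trigger times."""
    times, t = [], 0.0
    for n_frames, multiple in groups:
        for _ in range(n_frames):
            times.append(round(t, 6))
            t += multiple * base
    return times

# e.g. fast frames through the melt, then sparser frames while cooling
schedule = frame_times([(4, 1), (3, 5)])
print(schedule)  # [0.0, 0.1, 0.2, 0.3, 0.4, 0.9, 1.4]
```

In the real experiment a plan hands a schedule like this to the hardware so that detector triggering stays deterministic even while the software is also ramping the temperature controller.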
The capability was very favourably received by beamline staff and users. The success of this work was a testament to the hard work of the Beamline Controls and DAQ Core teams, the SSCC support staff for I22 and I22’s beamline scientists.
Since November, the core software development teams have been refining the core services and components so that the I22 software support team can address further hardware-synchronised user experiments. The first of these, planned for June, is to collect a timed frame dataset in which the sample environment triggers data acquisition: results are acquired from a multi-syringe stopped-flow system in which multiple components are mixed to react, with acquisition commencing after the dispensation of a particular volume of liquid. The second, in November, addresses the collection of a timed frame dataset in which an external hardware trigger commences either an entire measurement sequence, a single frame in a measurement sequence (in an overall timed frame series), or a group of frames from within a measurement sequence. These measurements will be acquired from a high-pressure cell loaded with proteinaceous samples or self-assembled systems, such as lipids, with triggers sent out at the beginning of a pressure sequence, at certain pressure thresholds and after a sudden pressure jump.
In November 2023, a team was formed to deliver a new Laboratory Information Management System (LIMS) for Diamond. Currently such a system exists at Diamond for the structural biology beamlines, comprising the ISPyB database and the SynchWeb web front end. This system has been successful in providing a complete user workflow from sample shipping to viewing analysed data, as well as supporting unattended data collection, but it is closely tied to the data structures and workflows of structural biology, limiting its flexibility. A new LIMS will be delivered as part of the Diamond-II modernised beamline software architecture to support those beamlines not using ISPyB and SynchWeb. It will be called Universal LIMS, as it will be designed as a flexible system that can be easily adapted to the needs of different beamlines and technique areas.
Universal LIMS (fig.5) will comprise several individual software services with corresponding web front ends: a shipping service; a container logistics service; a sample service; an experiment definition service; a beamline information search and view service; and Electronic Lab Notebooks (ELNs). The first five of these are where the main effort is currently focused, with ELNs in scope for a later phase. Together these will provide a complete workflow to facilitate user experiments at Diamond.
Workflows for structural biology are well defined and consistent across the beamlines that use them. The workflows are therefore tightly aligned with the database structure, making the existing system more difficult to adapt to other beamlines. The architecture for Universal LIMS is driven by the need to create a flexible system that can easily be adapted to the needs of different beamlines and technique areas, whilst minimising the code changes required of the development team.
To provide the required flexibility, the system will use templates to define what metadata is stored. Templates can be created by a defined set of super-users, and metadata will then be stored against specific versions of these templates. Similarly, users can update the templates to new versions, and downstream applications will be able to interpret stored data according to the template version it was recorded against.
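The versioned-template idea can be sketched in miniature. This is a hypothetical illustration, not Universal LIMS code: the template names, fields and validation rule are all invented. It shows the two properties the text relies on: records are validated against one specific template version, and the version travels with the stored record so downstream readers know how to interpret it.

```python
# Minimal sketch of template-versioned metadata: a (name, version) pair
# selects a schema, records are validated against it, and the version is
# stored alongside the metadata. Field names here are invented.

templates = {
    ("tomography", 1): {"pixel_size_um": float, "exposure_s": float},
    ("tomography", 2): {"pixel_size_um": float, "exposure_s": float,
                        "binning": int},   # v2 adds a field
}

def store(name, version, metadata):
    """Validate metadata against one template version, then wrap it."""
    schema = templates[(name, version)]
    for field, ftype in schema.items():
        if not isinstance(metadata.get(field), ftype):
            raise ValueError(f"{field} missing or not {ftype.__name__}")
    return {"template": name, "version": version, "metadata": metadata}

record = store("tomography", 2,
               {"pixel_size_um": 3.25, "exposure_s": 0.05, "binning": 2})
print(record["version"])  # downstream readers branch on this value
```

Because super-users evolve the schemas rather than developers changing code, adding a beamline or technique area becomes a data change instead of a software release, which is the flexibility the design is after.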
The templating approach will also be used to define how the stored metadata is displayed to the user. Users will be able to create customisable screens for their scientific metadata, which can be selected based on the underlying data template. Re-usable components will be created too, for example for displaying a series of images in a slideshow.
Currently two services are deployed as part of Universal LIMS. A prototype instance of SciCat, used widely as a metadata catalogue by the ESS, MAX IV, PSI and others, is running on the B24 beamline. A number of changes have been made to SciCat to support Diamond's needs, most notably to support the use of templates for both the stored datasets and the UI display of the scientific metadata. This is currently being rolled out to more beamlines, and serves both to test the templating approach and to refine the requirements for a beamline information search and view service.
The second deployed service is the Shipping Service. Functionality that had existed in SynchWeb has been broken out into a standalone service, which serves as an exemplar of how further Universal LIMS services can be built and work together. The Shipping Service has recently been made externally accessible and to date has shipped over 1,000 Dewars. While it is currently used within workflows for structural biology, it has been created in such a way that it will work seamlessly with other services at Diamond, such as Universal LIMS.
Over the next year the team will roll out more of the Universal LIMS services, starting with a sample service. During the roll outs, opportunities will be identified to both test out the new Diamond-II software architecture and deliver new functionality to beamlines.
Diamond is transitioning to a cloud native architecture across SSCC, utilising key Cloud Native enabling technologies such as containerisation and microservices design patterns. These software encapsulation and development methods drive higher speed and agility in software development, deliver higher application reliability, and provide more portable code. These methods allow Diamond to develop and deploy code more quickly, decrease time to science, and increase ability to respond to scientific drivers, opportunities, and collaborations.
The focus of the past year in this area has been around further developing the Diamond Kubernetes cloud platform, which runs thousands of containers that make up cloud native applications developed by all groups within SSCC. The platform also runs a large suite of community supplied open-source tools.
Recent developments on the Diamond Kubernetes cloud platform include:
The deployment of six Kubernetes clusters for beamlines. These beamline clusters have run parts of the controls and acquisition stack in containers for a live experiment on beamline I22. Each beamline cluster has its own Kubernetes control plane, ensuring it has a separate performance, failure, and administrative domain. Operationally efficient practices and tooling have been developed for multi-cluster lifecycle management.
Deployment of web applications hosted on Kubernetes and accessible directly from the public internet. These applications are protected by a web application firewall, as well as active container security monitoring using Red Hat's StackRox platform.
Additional GPU capacity has been added to the large central Kubernetes cluster for EM processing of Relion-based pipelines. This marks an interesting transition of traditional HTC/HPC-based computing to Kubernetes. This trend is likely to continue as workflow-based tooling such as ArgoCD is deployed to perform data processing on Kubernetes.
Observability tooling has been further developed with upgraded centralised logging (Graylog) and architectural plans for federated metric collection via the Cloud Native Computing Foundation’s Thanos and Prometheus projects.
In addition to the on-premises Kubernetes platform, further enhancement of Diamond’s off-premises cloud processing has also proceeded at pace. Here the IRIS computing collaboration provides computing and storage resources that Diamond has built upon to offer an off-premises Slurm cluster. Tools for fast data transfer have been developed, facilitating non-real-time critical parts of the MX computing load to be transferred to IRIS for processing.
Future developments in the journey to cloud native will include the deployment of Kubernetes clusters to support the accelerator controls systems. The unified approach of standardising application deployment on Kubernetes for all SSCC groups is already yielding operational efficiencies. These efficiencies are expected to grow further as adoption becomes even more widespread.
Diamond Light Source is the UK's national synchrotron science facility, located at the Harwell Science and Innovation Campus in Oxfordshire.