Functionally improved and substantially accelerated codes will be required in the scientific domains of diffraction data processing, tomography and ptychography, as will further development of Diamond’s data processing workflow framework. These developments will reduce the demands that large datasets place on computation, storage and network infrastructure.
- Diamond-II will work with the respective communities to identify new needs and opportunities, prioritise them, and devise strategies (including partnerships) to deliver improved tools. The adoption of a new real-time in-memory processing architecture, in which raw diffraction data are processed as they come off the detector, will overcome current limitations in detector performance stemming from the store-to-disk-and-process model in the ptychography, tomography and macromolecular crystallography domains. These capabilities will be embedded into the automated processing pipelines.
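The store-to-disk-and-process model described above can be contrasted with in-memory streaming in a minimal sketch. The example below is illustrative only (the function and variable names are assumptions, not Diamond APIs): a simulated detector thread pushes raw frames onto an in-memory queue, and a consumer reduces each frame as it arrives, with no intermediate write to disk.

```python
import queue
import threading

def acquire_frames(n_frames, frame_queue):
    """Simulated detector: pushes raw frames onto an in-memory queue."""
    for i in range(n_frames):
        frame = [i] * 4  # stand-in for raw pixel data
        frame_queue.put(frame)
    frame_queue.put(None)  # sentinel: acquisition finished

def process_stream(frame_queue):
    """Consumes frames directly from memory; here each frame is
    reduced to its pixel sum as soon as it arrives."""
    results = []
    while True:
        frame = frame_queue.get()
        if frame is None:
            break
        results.append(sum(frame))
    return results

frame_queue = queue.Queue()
producer = threading.Thread(target=acquire_frames, args=(8, frame_queue))
producer.start()
reduced = process_stream(frame_queue)
producer.join()
```

In a real deployment the reduction step would be the domain-specific processing (spot finding, reconstruction, etc.) and the queue would be backed by the detector's data stream, but the architectural point is the same: processing keeps pace with acquisition instead of waiting on disk.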
- Increased coherence, especially at higher photon energies, will broaden the application of ptychography and tomo-ptychography imaging techniques, which have the potential to produce multi-petabyte data volumes. These activities will include an assessment of new and distributed technologies to understand how they may be used to alleviate the substantial computing demand.
- Delivering a framework for data processing workflows will be a key deliverable, aiming to address the needs of all beamlines and to improve upon existing capabilities. Embedding visualisation of results at key points in the analysis pipeline will support real-time data analysis on beamlines. Complex "dynamic" workflows will exploit sample and pipeline management, triggering, monitoring and reprocessing services.
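The idea of embedding visualisation at key points in a pipeline can be sketched as follows. This is a hypothetical illustration, not the Diamond framework: `Workflow`, `add_step` and the step names are invented for the example. Each step may optionally register a visualisation hook that fires on that step's output as the pipeline runs.

```python
class Workflow:
    """Minimal pipeline runner with optional per-step visualisation hooks."""

    def __init__(self):
        self.steps = []        # ordered (name, function) pairs
        self.visualisers = {}  # step name -> visualisation callback

    def add_step(self, name, fn, visualise=None):
        self.steps.append((name, fn))
        if visualise is not None:
            self.visualisers[name] = visualise

    def run(self, data):
        views = []
        for name, fn in self.steps:
            data = fn(data)
            # Visualise intermediate results at key points in the pipeline.
            if name in self.visualisers:
                views.append(self.visualisers[name](data))
        return data, views

wf = Workflow()
wf.add_step("normalise", lambda d: [x / max(d) for x in d])
wf.add_step("threshold", lambda d: [x for x in d if x > 0.5],
            visualise=lambda d: f"threshold kept {len(d)} points")
result, views = wf.run([1, 2, 3, 4])
```

A "dynamic" workflow would extend this pattern so that hooks can also trigger reprocessing or alter subsequent steps based on intermediate results, rather than only rendering them.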
- Current data processing and analysis paradigms are starting to show their limitations: Diamond produces significantly more data containing potentially interesting scientific insight than is ever published. Many different approaches have been taken to address this mismatch. One key theme is the use of Machine Learning (ML) algorithms to analyse, classify or segment datasets. Work to exploit Artificial Intelligence and ML for science at Diamond-II sits within an overall Diamond-wide strategy for the scientific exploitation of developments in these fields, recognising their implications in a wide range of areas from simulations, through data capture, to data analysis.
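To make the classification theme above concrete, here is a deliberately tiny sketch of an ML-style classifier, using only a nearest-centroid rule on one-dimensional feature summaries. All names and data are invented for illustration; production work at a facility would use real feature extraction and established libraries rather than this toy.

```python
def centroid(samples):
    """Mean of a list of feature values."""
    return sum(samples) / len(samples)

def train(labelled):
    """labelled: dict mapping class label -> list of feature values.
    Returns one centroid per class."""
    return {label: centroid(values) for label, values in labelled.items()}

def classify(model, value):
    """Assign the label whose centroid is nearest to the feature value."""
    return min(model, key=lambda label: abs(model[label] - value))

# Toy training data: a per-frame summary statistic for two classes.
model = train({"signal": [0.9, 1.1, 1.0],
               "background": [0.1, 0.2, 0.0]})
```

Even this minimal rule captures the intended use case: automatically sorting the incoming data stream so that frames likely to contain interesting science are prioritised for full analysis.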