The volume of data produced at Diamond continues to grow at pace, reflecting investment and developments in detector technology, beamline stability, automation and software. In 2013/2014, the total amount of data archived from Diamond was reported to be 1 petabyte (PB). By March 2015 this had grown to 3 PB, equating to 800 million files, catalogued onto tape.
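As a rough illustration of what these figures imply, the short calculation below derives the mean archived file size and the growth factor. It assumes decimal units (1 PB = 10^15 bytes), which the text does not specify, and the variable names are chosen only for this sketch.

```python
# Figures quoted in the text; decimal units (1 PB = 10**15 bytes) are an assumption.
archived_bytes = 3 * 10**15    # 3 PB archived by March 2015
archived_files = 800 * 10**6   # 800 million files

# Mean size of an archived file, in megabytes (1 MB = 10**6 bytes).
mean_file_size_mb = archived_bytes / archived_files / 10**6

# Growth relative to the 1 PB reported for 2013/2014.
growth_factor = 3 / 1

print(f"mean file size: {mean_file_size_mb} MB, growth: x{growth_factor}")
```

Under that assumption the archive averages a few megabytes per file, so it is made up of a very large number of relatively small files.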
Move to NeXus data file format
Diamond is adopting the NeXus file format as its preferred way of recording data, and the data acquisition staff have been continuing its roll-out. NeXus files are self-describing binary files with a tree structure agreed by the NeXus community. These files include everything needed to understand the data collection they describe, for example: the name and structure of the sample; the geometry of the beamline, including the sample stages and the orientation and description of detectors; the exposure time; the images and other information gathered during the beamline’s calibration; a description of the scan performed; and, of course, the data itself.
Fundamentally, this level of completeness will preserve the significance of data collected at Diamond and will mean that Diamond data can be easily interpreted by the researchers who collected it and by others. More immediately, it provides a tidy interface through which to trigger the data reduction pipelines of Diamond’s Scientific Software group, or to analyse and visualise the data after an experiment using Diamond’s DAWN software or third-party tools.
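The idea of a self-describing tree can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses nested dictionaries in place of a real NeXus file, the group names, values and helper functions (find_groups, get_item) are hypothetical, and only the NX_class labels (NXentry, NXsample, NXinstrument, NXdetector, NXdata) are taken from the NeXus base classes.

```python
# Illustrative stand-in for a NeXus tree: nested dicts, not a real NeXus file.
# Group names and values are hypothetical; NX_class labels are NeXus base classes.
nexus_like = {
    "entry": {
        "NX_class": "NXentry",
        "sample": {"NX_class": "NXsample", "name": "example sample"},
        "instrument": {
            "NX_class": "NXinstrument",
            "detector": {"NX_class": "NXdetector", "count_time": 0.1},
        },
        "data": {"NX_class": "NXdata", "data": [[0, 1], [2, 3]]},
    }
}


def find_groups(node, nx_class, path=""):
    """Return the paths of all groups whose NX_class label matches nx_class."""
    matches = []
    for key, child in node.items():
        if isinstance(child, dict):  # only dicts are groups; other values are fields
            child_path = f"{path}/{key}"
            if child.get("NX_class") == nx_class:
                matches.append(child_path)
            matches.extend(find_groups(child, nx_class, child_path))
    return matches


def get_item(tree, path):
    """Follow a slash-separated path from the root of the tree."""
    node = tree
    for part in path.strip("/").split("/"):
        node = node[part]
    return node


# A tool that knows nothing about this beamline can still locate the detector
# by its class label and read the exposure time stored alongside it.
detector_paths = find_groups(nexus_like, "NXdetector")
exposure = get_item(nexus_like, detector_paths[0] + "/count_time")
print(detector_paths, exposure)
```

Because each group carries a label saying what it is, generic analysis tools can find the sample, detector and data without beamline-specific knowledge, which is the property the format relies on.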
Data Analysis WorkbeNch (DAWN)
Development of the DAWN software has accelerated this year, with new scientific staff joining the team and contributing important functionality for specific scientific areas. The modular nature of DAWN and the technology it depends on has allowed components of the package to be used in new ways. As well as bringing new features to GDA, Diamond's data acquisition software, DAWN components have been reused by the controls group at Diamond as part of their long-term interfaces to EPICS. This has also led central facilities, including neutron spallation sources as well as synchrotron sources, to incorporate new functionality and find new uses for the software [1].
Since 2013, Diamond software and beamline groups have focused on creating a clean, robust and simple user experience, from tomography data collection through to the processing of raw data into volumetric data. The achievements of the groups involved can be seen in the dramatic rise in the volume of tomography data being collected and in a threefold increase in usage of the Diamond computing cluster used for tomographic reconstruction.
As well as helping to coordinate these activities, the data analysis group have been working closely with other Diamond staff to develop and enable new techniques that make the best use of the large cluster resources available and improve the experiment flow. The resulting processing pipeline is compatible with similar work being undertaken inside the DAWN data processing framework, and is being considered for adoption by the wider community, including CCPi (Computational Collaboration Project for Imaging) and the IMAT station at the ISIS neutron source.
[1] Basham, M. et al. Data Analysis WorkbeNch (DAWN). J. Synchrotron Rad. 22, 853-858 (2015). DOI: 10.1107/S1600577515002283.
Tomogram showing a cross-section of a bee's eye. Data collected on the Diamond Manchester Imaging Branchline (I13-2). Courtesy of Gavin Taylor, Emily Baird, and Andrew J Bodey.
Diamond Light Source is the UK's national synchrotron science facility, located at the Harwell Science and Innovation Campus in Oxfordshire.
Copyright © 2017 Diamond Light Source