A new DAWN: next-generation data processing

Significant upgrades to DAWN bring a new era of data processing to Diamond

The Data Analysis WorkbeNch (DAWN) was created at Diamond Light Source to allow users to process and visualise experimental data at their home institution. A recent paper in the Journal of Applied Crystallography details the wealth of new functionality added for the calibration and processing of powder diffraction and small-angle scattering data from 2D detectors. The new tools provide accurate, reliable calibration, with a generic, flexible processing system that maintains the provenance of the data. The open and modular nature of DAWN allows it to be used on all of Diamond’s beamlines, and at other facilities. The paper contains case studies that describe how some of the new features have been applied to a broad set of experiments across different beamlines at Diamond.

Figure 1: View of the new DAWN software package.

Generic Data Acquisition

The Generic Data Acquisition (GDA) software package used at Diamond was initially developed at the Synchrotron Radiation Source (SRS) at Daresbury Laboratory. In 2003 it was adopted by Diamond, which took over as its principal developer. GDA is designed to be universal and flexible, so that it can be customised for use on any beamline, and for any underlying hardware control system. DAWN originally grew out of GDA, with the aim of allowing beamline users to process and visualise their data once they had returned to their home institutions. Whilst other software packages are available, they are often designed to support specific experiments or to work with certain data formats, and cannot read all the data formats that Diamond collects. DAWN’s open source nature makes it an attractive option for Diamond’s users and, because it is built on Java, it runs on Windows, Mac and Linux.

DAWN

Beamline experiments are becoming increasingly automated, and produce ever larger amounts of data that need to be processed and visualised into meaningful results. The new functionality in DAWN makes it an easy-to-use, flexible software package that can be employed both during and after data collection. The most recent additions to DAWN’s functionality are tools for the automated calibration and batch processing of two-dimensional Powder X-ray Diffraction (PXRD) and Small-Angle X-ray Scattering (SAXS) datasets.

Recent versions of DAWN contain a new user interface for building custom data processing pipelines, allowing both step-by-step analysis and batch processing of raw datasets. A command-line version and scripting interface are available for calibration and reduction, with a calibration routine that provides accurate values for energy and the detector geometry, even for experiments with large detector tilts or at very high energy. DAWN provides standardised and versatile data input and output, and automatically saves all processing steps and parameters needed to recreate the same output, maintaining the provenance of the data.
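To illustrate what maintaining provenance means in practice, here is a minimal sketch in Python. It is not DAWN's implementation or API: the names Step, run_pipeline, replay, subtract_background and scale are hypothetical, and the data is a toy list of numbers. The idea is simply that every processing step is recorded along with its parameters, so that the saved record is enough to regenerate the same output from the same raw data.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    """One named processing step plus the parameters it was run with (hypothetical)."""
    name: str
    func: Callable[..., List[float]]
    params: Dict[str, Any] = field(default_factory=dict)


def subtract_background(data: List[float], level: float) -> List[float]:
    """Toy step: subtract a constant background level."""
    return [x - level for x in data]


def scale(data: List[float], factor: float) -> List[float]:
    """Toy step: multiply every value by a constant factor."""
    return [x * factor for x in data]


def run_pipeline(data: List[float], steps: List[Step]):
    """Apply the steps in order and record what was done, and with which parameters."""
    provenance = []
    for step in steps:
        data = step.func(data, **step.params)
        provenance.append({"step": step.name, "params": step.params})
    return data, provenance


def replay(data: List[float], provenance, registry) -> List[float]:
    """Recreate an output from raw data using only the saved provenance record."""
    for record in provenance:
        data = registry[record["step"]](data, **record["params"])
    return data


# Lookup table from recorded step names back to functions (an assumption of this sketch).
REGISTRY = {"subtract_background": subtract_background, "scale": scale}

raw = [3.0, 5.0, 9.0]
steps = [
    Step("subtract_background", subtract_background, {"level": 1.0}),
    Step("scale", scale, {"factor": 2.0}),
]
result, provenance = run_pipeline(raw, steps)

# The record survives a round trip through JSON and still reproduces the result.
saved = json.dumps(provenance)
reproduced = replay(raw, json.loads(saved), REGISTRY)
```

Storing the record as plain JSON is a choice made for this sketch only; the point is that the steps and parameters travel with the processed data.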

With DAWN, results that would previously have taken months of data processing can now be available within minutes. For example, going from a series of raw diffraction images to fully processed data used to be a manual, multi-stage process that required the use of several software packages. The process can now be fully automated within DAWN, with users able to queue up datasets for processing and to run multiple pipelines at the same time. Dr Andy Smith, the Beamline Scientist for the Small Angle Scattering and Diffraction beamline (I22), recalls one recent experiment where this last feature was important: “Comparing injured and healthy bones should lead to better treatments. Bone is made from collagen and plates of calcified material, which results in two different signals, and two different processing pipelines. The correlation between the two signals tells us about the structural integrity of the bone, and processing those two pipelines simultaneously is a real improvement.”

DAWN also deals elegantly with very large datasets, by loading the data in manageable chunks. Running standard pipelines after each run gives researchers almost real-time feedback on how an experiment is progressing.
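The general idea of working through a large dataset in manageable chunks can be sketched as follows. This is an illustration only, not how DAWN itself is written: iter_chunks, mean_frame and fake_detector are hypothetical names, and the frames are tiny lists standing in for detector images. Only one chunk of frames is held in memory at a time, while a running total is kept to compute the average frame.

```python
from typing import Iterable, Iterator, List

Frame = List[float]


def iter_chunks(frames: Iterable[Frame], chunk_size: int) -> Iterator[List[Frame]]:
    """Yield successive lists of at most chunk_size frames."""
    chunk: List[Frame] = []
    for frame in frames:
        chunk.append(frame)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # final, possibly shorter, chunk
        yield chunk


def mean_frame(frames: Iterable[Frame], chunk_size: int) -> Frame:
    """Average all frames pixel by pixel, loading one chunk at a time."""
    total = None
    count = 0
    for chunk in iter_chunks(frames, chunk_size):
        for frame in chunk:
            if total is None:
                total = [0.0] * len(frame)
            total = [t + v for t, v in zip(total, frame)]
            count += 1
    if count == 0:
        raise ValueError("no frames supplied")
    return [t / count for t in total]


def fake_detector(n_frames: int, n_pixels: int) -> Iterator[Frame]:
    """Stand-in data source: frame i has every pixel equal to i."""
    for i in range(n_frames):
        yield [float(i)] * n_pixels


chunk_sizes = [len(c) for c in iter_chunks(fake_detector(5, 3), 2)]
average = mean_frame(fake_detector(5, 3), chunk_size=2)
```

Because the data source is a generator, frames are produced only when a chunk asks for them, which is what keeps memory use bounded however long the run is.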

Open source

The modular nature of DAWN makes it flexible and customisable, allowing users to create their own data processing pipelines. Because it is open source, users can write their own modules in Python, for which DAWN will create a user interface. There is a marketplace for user-created modules, which could be incorporated into DAWN in the future if they prove to be widely applicable. This flexibility means that DAWN can be, and has been, used at other facilities. Dr Dominik Daisenberger, Senior Support Scientist on Diamond’s Extreme Conditions beamline (I15), finds that “Two thirds of new user groups on I15 each month are converting to DAWN – users are happy to adopt it.”
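As a rough illustration of how a user interface might be derived from a user-written Python module, the sketch below inspects a function's parameters and turns them into a list of form fields. This is an assumption about the general approach, not a description of DAWN's actual mechanism or module interface; threshold and describe_parameters are hypothetical names.

```python
import inspect
from typing import Any, Dict, List


def threshold(data: List[float], cutoff: float = 0.5, invert: bool = False) -> List[float]:
    """Hypothetical user module: zero every value on the wrong side of a cutoff."""
    if invert:
        return [v if v < cutoff else 0.0 for v in data]
    return [v if v >= cutoff else 0.0 for v in data]


def describe_parameters(func) -> List[Dict[str, Any]]:
    """List the tunable parameters of func as name, type and default value.

    The first argument is assumed to be the data itself and is skipped.
    """
    parameters = list(inspect.signature(func).parameters.values())
    return [
        {"name": p.name, "type": p.annotation.__name__, "default": p.default}
        for p in parameters[1:]
    ]


# A host application could render one input widget per entry in this list.
fields = describe_parameters(threshold)
filtered = threshold([0.2, 0.7])
```

Here the function signature is the single source of truth: adding a parameter to the module would automatically add a field to the generated form.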

Future development of DAWN will allow it to meet the changing needs of Diamond’s users and evolving beamline functionality. In the meantime, Senior Software Scientist Dr Jacob Filik says, “the next steps are to take the new tools and incorporate them back into the GDA, allowing (for example) automated calibration directly in the beamline software.”


To find out more about DAWN please contact Dr Jacob Filik and Dr Mark Basham: jacob.filik@diamond.ac.uk, mark.basham@diamond.ac.uk.
 

Related publications:

Filik J et al. Processing two-dimensional X-ray diffraction and small-angle scattering data in DAWN 2. Journal of Applied Crystallography 50 (2017). DOI:10.1107/S1600576717004708.

Basham M et al. Data Analysis WorkbeNch (DAWN). Journal of Synchrotron Radiation 22:3 (2015). DOI:10.1107/S1600577515002283.