- The dataset that brought us past the 3PB mark: a coloured reconstruction of a bee’s compound eye, produced from studies on I13. Courtesy of Gavin Taylor, Emily Baird, Andrew J Bodey & Andreas Enstrom (University of Lund)
Around 3,000 scientists visit Diamond each year to study, scan, and scrutinise samples. They could be investigating anything from a virus or bacterium to smart materials like graphene or samples from dinosaur skeletons. All of this scientific research produces some pretty big data: up to 20 terabytes a day, equivalent to all the storage on about 40 standard laptops, and Diamond’s data is only getting bigger.
Diamond recently passed a big data milestone: a massive three petabytes have now been processed at the facility. For those of us who don’t know our gigabytes from our terabytes, three petabytes is roughly the equivalent of 187,500 modern iPhones*, or 6,000 years of MP3 music; it’s more data than a human being could ever process in their lifetime.
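For readers who want to check the comparisons, the arithmetic can be sketched in a few lines of Python. This is only a rough check under stated assumptions: decimal units (1 PB = 1,000 TB = 1,000,000 GB) and a constant MP3 bitrate of 128 kbit/s, neither of which is specified in the article. The variable names are purely illustrative.

```python
# Rough sanity check of the storage comparisons quoted in the article.
# Assumption: decimal (SI) units, i.e. 1 PB = 1000 TB = 1,000,000 GB.
GB = 10**9
TB = 10**12
PB = 10**15

total_bytes = 3 * PB

# Capacity per phone implied by "187,500 modern iPhones".
phones = 187_500
gb_per_phone = total_bytes / phones / GB

# Capacity per laptop implied by "20 terabytes a day ... about 40 standard laptops".
tb_per_day = 20
laptops = 40
gb_per_laptop = tb_per_day * TB / laptops / GB

# Years of MP3 audio, assuming a constant 128 kbit/s bitrate (our assumption).
bytes_per_second = 128_000 / 8
seconds_per_year = 365.25 * 24 * 3600
mp3_years = total_bytes / bytes_per_second / seconds_per_year

print(f"{gb_per_phone:.0f} GB per phone")
print(f"{gb_per_laptop:.0f} GB per laptop")
print(f"{mp3_years:.0f} years of MP3 audio")
```

Under these assumptions the figures hang together: the phone comparison implies 16 GB handsets, the laptop comparison implies 500 GB drives, and the music comes to just under 6,000 years.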
And the amount of data being processed at Diamond is increasing rapidly; in fact, the volume has gone up by 35% in the last 12 months alone. New beamlines, improved capabilities, and growing numbers of users mean that Diamond is running at a higher level than ever before. Diamond’s data is expected to increase even more rapidly in the coming years; eight new beamlines will become operational by 2018, and new integrated facilities like the eBIC centre for biological research and the new materials characterisation facility will mean that Diamond will soon be producing more data than almost any other facility in the UK.
So how do you cope with that much information? Well, Diamond has an entire team of computing specialists dedicated to processing the raw information generated by scientific experiments. Their job is to transform this data into a format, such as a graph or an image, that the scientists carrying out the experiment can interpret. These people are the unsung heroes of science – with advanced software and techniques, the computing team ensure that scientists’ experiments generate clear and meaningful information.