From terabytes to exabytes: research data management for the digital age
Wednesday, December 14th, 2011
This month Cardiff played host to a roadshow about research data management, one in a series organised by the JISC-funded Digital Curation Centre. The aim of the events is to “allow every institution in the UK to prepare for effective research data management and understand more about how the DCC can help.” The event was attended by several Welsh universities as well as staff from further afield. Perhaps appropriately for a topic that includes preservation, we met in the historic surroundings of the Royal Welsh College of Music and Drama’s Anthony Hopkins Centre. It soon became clear from the presentations that good research data management is about much more than simply preservation of data. It also has much to do with management and communication of research, and so needs to engage with the whole research lifecycle.
Most of the day focussed on case studies from Cardiff, Bristol and Swansea Universities, showing how they were tackling various challenges relating to research data management, often making use of free resources developed by the DCC. While the DCC has been around for some time, its role has shifted away from tools development to focus more on capacity building and advocacy, working closely with the research community. The leadership and support that it offers must be invaluable to support staff in institutions who can often find themselves working in isolation and have to combine research liaison with other responsibilities.
The most useful presentations for me were both about data curation in the humanities. Sarah Phillips, Records Manager at Cardiff University, gave a really interesting talk about how she has been working with Dr Steve Mills (archaeology) to ensure the longevity of his important data relating to historic sites in Southern Romania. As one of the projects was a community engagement project, it was important that the data should be available for public access as well as for more purely academic purposes. Better curation could also help minimise the many risks that arise when academics are working on data in different locations. Making access to research data easier also saves academics time in the long run.
Later, after a session from Caroline Gardiner of Bristol University on storing data securely, Stephen Gray talked about data.bris, a JISC project. Stephen has an exciting-sounding role as a digital support officer, working with academics in the humanities to navigate the technology needed to support their research, not least so they can demonstrate maximum research impact in the REF (Research Excellence Framework) come 2014. Such was the original role of subject librarians a generation ago, and although their role in many universities has since become much more focussed on teaching, there seems to be a need for staff who can offer a ‘translation service’, combining digital or information fluency with research capability.
Martyn Guest of ARCCA (Advanced Research Computing @ Cardiff) focussed on the topic of High Performance Computing, and whilst I did not grasp all the technical detail I did learn the meaning of a terabyte (1000 gigabytes), petabyte (1000 terabytes) and exabyte (one quintillion bytes). I also gained a better understanding of High Performance Computing Wales. This involves £40m worth of supercomputing capacity, a research institute and a skills academy that aim to bring academia and industry together. In simple terms, as I understood it, it will be possible to move data around and work on it wherever there is most capacity. This will help the current “data tsunami” situation where apparently it is easier to send large quantities of data via FedEx than over networks!
For me, some of the most interesting discussion points during the day were around the need for better communication across departments (e.g. between subject researchers and IT staff, or records managers and archives) and the need for advocacy with senior management. IT services are critical to universities’ business engagement and the public dissemination of research that is required for economic prosperity as well as research quality. Yet research data are often still managed within faculties.
Apologies to Liz Lyon and Alex Roberts for not including reference to their presentations, which unfortunately I missed! It was a pity that train problems made it difficult to attend the whole day. Not for the first time, I did wonder whether it would be helpful for event organisers to consider running ‘taster’ webinars alongside the face-to-face event programme. These could act as a ‘trailer’ for the big event and could widen the audience, especially at times of year when travel is hard. Face-to-face contact is important though: the value of bonding over a spot of food and drink can be underrated, and the social element is also key to building cross-disciplinary relationships, especially in relatively new fields of endeavour.
The DCC website provides links to a wide range of resources and tools on research data management. For those wanting help with advocacy or just an overview, I can recommend their short briefing paper Making the case for RDM (research data management). You can also follow @digitalcuration on Twitter. Thanks to Janet Peters at Cardiff University and the staff of the DCC for making the effort to bring the roadshow to Wales!
All the presentations from the event are now available here.

