According to an article published on iSGTW, iSGTW attended EUDAT's second conference in Rome, Italy. EUDAT, which is funded under the European Commission's FP7 scheme, seeks to support a collaborative data infrastructure that will allow researchers to share data within and between communities and enable them to carry out their research effectively. Now two years into the project, the EUDAT team is ready to provide solutions that will be affordable, trustworthy, robust, persistent, and easy to use. These solutions are B2SAFE to replicate research data safely, B2SHARE to store and share long-tail research data, B2FIND to find research data, and B2STAGE to get data to computation. They are to be followed by new services, such as Dynamic Data and Semantic Annotation, which were discussed intensively during the conference.

Kimmo Koski, project coordinator of EUDAT, spoke during the opening plenary session at the event: "Science is global and so should be the e-infrastructures and the related services," he says. "What we do at a European level we need to link tightly to national and global activities." Koski also stressed the importance of trust between communities: "Services need to be user-driven; we need to build trust between researchers and e-infrastructure providers."

Data entropy

While giving an overview of the DataONE project in the US, Bill Michener of the University of New Mexico neatly summed up the challenges faced by EUDAT and the greater community. "We're working across disciplines and in large teams to deal with the grand challenges in science. However, the data that we need in order to address these challenges has many problems."

During his presentation, Michener cited research which shows that the information content of databases typically decreases over time: "data has entropy," he argues. This may be due to poor metadata that makes the data difficult to interpret, poor archiving strategies, or even data archives simply being difficult to discover. "Over five million repositories exist worldwide, but you need to know which repository is holding the data you're looking for before you can go and search through it."

Michener also cited a paper by Carol Tenopir and colleagues which shows that scientists are generally interested in sharing their data, but that they often don’t know how to go about doing so. “They’re particularly confused about the issue of metadata,” says Michener, who laments the lack of standardized approaches among scientists for documenting their datasets. He reported on a working group within the Research Data Alliance which seeks to develop a collaborative, open directory of metadata standards. “This is a very focused goal, which should easily be accomplished within the 18-month-or-so targeted timeline,” says Michener.

