Aim

To identify and clarify issues in the automatic migration of datasets and their links to associated metadata. The prototype will be deployed against physical files and databases as they exist today, so all identifiers will be effectively static in time. The eventual success of any such activity, however, will depend on provision for migrating both the metadata and the information objects to which the metadata refer. The logical and physical separation of data from its associated metadata, and the need to migrate both, is a problem that exists in all data management contexts, not just CLADDIER. The problem is hardest when planning for migration to future technologies that are not yet fully developed, and/or when large volumes are involved and automatic processing is required.

We will review relevant recent developments and technologies in data and metadata archive migration, and identify the common themes and problems inherent in them. We will also carry out a case study of the automatic migration processes from an appropriate current technology (such as the Storage Resource Broker, SRB), in order to identify the key issues involved in such migrations. At the same time, we will examine how the institutional and data repositories at Southampton, CCLRC, and BADC handle their metadata/data linkages, identify critical issues that require further work, and suggest guidelines to help data storage centres modify or better align their practices.

Partners

CCLRC e-Science

Deliverables

  • Overall Document II, discussing issues associated with migration of linkages.