Last modified: 2014-11-16
Abstract
Dublin Core Description Set Profiles (DSPs) are a component of DCMI Application Profiles. A DSP describes the structure and constraints of metadata in an application (e.g. resource classes, property cardinality, value schemes). Metadata schema registries, which collect and provide metadata schemas, have great potential for helping metadata schema designers find, compare and adopt existing schemas. However, most Linked Open Data (LOD) datasets are published without their DSPs. As a result, metadata schema designers have to inspect each dataset and guess its DSP.
This paper proposes a method for automatically extracting the structural constraints of metadata records from metadata instances, using existing metadata schemas. The goal of this study is to reduce the cost of metadata schema extraction and to increase the number of metadata schemas registered in metadata schema registries. We experimentally extracted constraints from LOD datasets using SPARQL. To evaluate the approach, we applied it to 10 datasets in the DataHub. Comparing the structural constraints extracted by our approach with those extracted manually, we found that our approach extracted more constraints.
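To make the idea of extracting structural constraints from instances concrete, the following is a minimal sketch, not the method used in this paper. It assumes the metadata instances are already available as in-memory (subject, predicate, object) triples rather than being queried with SPARQL, and it infers only one kind of constraint: the observed minimum and maximum occurrence of each property per resource class. The function name `infer_cardinalities`, the `SAMPLE_TRIPLES` data and the `ex:` and `dc:` identifiers are hypothetical and chosen for illustration.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical sample instances: two books, one of them with two creators.
SAMPLE_TRIPLES = [
    ("ex:b1", RDF_TYPE, "ex:Book"),
    ("ex:b1", "dc:title", "A"),
    ("ex:b1", "dc:creator", "X"),
    ("ex:b1", "dc:creator", "Y"),
    ("ex:b2", RDF_TYPE, "ex:Book"),
    ("ex:b2", "dc:title", "B"),
]


def infer_cardinalities(triples):
    """Return {class: {property: (min, max)}} as observed in the instances.

    Assumption: a property's cardinality for a class is approximated by the
    smallest and largest number of times any instance of that class uses it.
    """
    triples = list(triples)

    # Map each subject to the classes it is declared to belong to.
    classes_of = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            classes_of[s].add(o)

    # Count how many times each subject uses each non-type property.
    counts = defaultdict(lambda: defaultdict(int))
    for s, p, o in triples:
        if p != RDF_TYPE:
            counts[s][p] += 1

    # Group subjects by class.
    members = defaultdict(list)
    for s, classes in classes_of.items():
        for c in classes:
            members[c].append(s)

    # For each class, compute the observed (min, max) usage of each property.
    result = {}
    for c, subjects in members.items():
        props = {p for s in subjects for p in counts[s]}
        result[c] = {
            p: (
                min(counts[s].get(p, 0) for s in subjects),
                max(counts[s].get(p, 0) for s in subjects),
            )
            for p in props
        }
    return result


if __name__ == "__main__":
    print(infer_cardinalities(SAMPLE_TRIPLES))
```

In this sketch an observed minimum of 0 suggests that a property is optional for the class, and an observed maximum above 1 suggests that it is repeatable. Such values describe only what the sampled instances contain, so they are candidates for a DSP rather than confirmed constraints.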