Author |
Title & Abstract |
Steven Folsom & Jason Kovari |
OCS: 433 TITLE: Ontology Assessment and Extension: A Case Study on LD4L and BIBFRAME ABSTRACT: Representatives of the Andrew W. Mellon-funded Linked Data for Libraries - Labs and Linked Data for Production teams will discuss their assessment strategy and their progress in aligning the BIBFRAME and LD4L ontologies, including semantic patterns and ontology reuse. Further, the talk will discuss the ontology extension work underway within the LD4P program, focusing on the extensions directed by Cornell and Harvard Universities. |
Rebecca Green |
OCS: 435 TITLE: Data-Driven Development of the Dewey Decimal Classification ABSTRACT: Changes made in maintaining the Dewey Decimal Classification (DDC), a general classification system, have historically derived from many distinct sources. These include (but are not limited to) questions/ideas/complaints from end users, classifiers, translators, or members of the Decimal Classification Editorial Policy Committee (EPC); mappings of other knowledge organization systems to the DDC; and personal awareness of events, emerging issues, and trends. On the one hand, these phenomena may bring to light ambiguity or redundancy in the current system. On the other hand, they may bring to the attention of the editorial team new topics needing provision within the system.
Without disregarding these sources, the DDC editorial team is also considering data-driven methods of (1) identifying existing areas of the DDC warranting further development or (2) identifying topics with sufficient literary warrant to justify explicit inclusion in the DDC. Two sources of data are under investigation. The topics and schedule areas identified through these means require investigation to ascertain whether they are viable candidates for further development. Preliminary work with these data sources reveals that the strategies hold promise. |
Joseph A. Busch, Branka Kosovac, Katie Konrad, & Martin Svensson |
OCS: 436 TITLE: Save the Children Resource Libraries: Aligning Internal Technical Resource Libraries with a Public Distribution Website ABSTRACT: Save the Children (STC) is an international NGO that promotes children's rights, provides relief and helps support children across the globe. With international headquarters in London, STC has 30 national members and supports local partners in over 100 countries worldwide. STC International maintains technical infrastructures that are available to members and local partners, including SharePoint, Drupal and other information management applications. An effort to specify and implement a common resource library for curating and sharing internal technical resources has been underway since November 2015. This has included an inventory of existing (but heterogeneous) resource libraries on STC's work in the thematic area of Health and Nutrition, and agreement on a common metadata specification and some controlled vocabularies to be used going forward. This internal resource library has been aligned with STC's Resource Centre (resourcecentre.savethechildren.se), a public web-accessible library that hosts comprehensive, reliable and up-to-date information on STC's work in the thematic areas of Child Protection, Child Rights Governance and Child Poverty. The goal is to make it easy for content curators to identify items in the internal technical resource library and to publish them to the public Resource Centre with a minimum of metadata transformation required. This presentation discusses how the project reached consensus on accommodating and balancing internal research and external communications requirements by developing a lightweight application profile. |
Saho Yasumatsu, Akiko Hashizume, & Julie Fukuyama |
OCS: 437 TITLE: Japanese Metadata Standards "National Diet Library Dublin Core Metadata Description (DC-NDL)": Describing Japanese Metadata and Connecting Pieces of Data ABSTRACT: The National Diet Library (NDL) is the sole national library in Japan. This poster mainly presents the National Diet Library Dublin Core Metadata Description (DC-NDL), a descriptive metadata standard utilized primarily for converting catalog records of publications held by the NDL into metadata based on the Dublin Core Metadata Element Set (DCMES) and the DCMI Metadata Terms. The key functions of the DC-NDL are as follows: (1) representing yomi (pronunciation), a characteristic feature of the Japanese language; (2) connectivity with Linked Data, especially through URIs; and (3) compatibility with digitized materials. Furthermore, we describe an example of implementing DC-NDL for use with NDL Search. We conclude by pointing out issues for future research on the DC-NDL. |
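To make the yomi function concrete, the following is a minimal sketch in Python with rdflib of a record carrying both a title and its reading. The dcndl:transcription property and the record URI are assumptions made here for illustration; element names should be checked against the published DC-NDL specification.

```python
# A minimal sketch, assuming rdflib is installed; the dcndl:transcription
# property for the yomi reading and the record URI are assumptions, to be
# checked against the published DC-NDL specification.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS

DCNDL = Namespace("http://ndl.go.jp/dcndl/terms/")  # DC-NDL terms namespace

g = Graph()
g.bind("dcterms", DCTERMS)
g.bind("dcndl", DCNDL)

record = URIRef("http://example.org/record/1")  # hypothetical record URI
g.add((record, DCTERMS.title, Literal("吾輩は猫である", lang="ja")))
# the yomi (pronunciation) of the title, carried as a parallel field
g.add((record, DCNDL.transcription, Literal("ワガハイ ワ ネコ デ アル", lang="ja")))

print(g.serialize(format="turtle"))
```

Carrying the reading alongside the written form in this way is what allows a system such as NDL Search to sort and match Japanese strings by pronunciation.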
Mariana Curado Malta, Elena Gonzalez-Blanco, & Paloma Centenera |
OCS: 440 TITLE: POSTDATA -- Towards publishing European Poetry as Linked Open Data ABSTRACT: POSTDATA is a five-year European Research Council (ERC) Starting Grant project that started in May 2016 and is hosted by the Universidad Nacional de Educación a Distancia (UNED), Madrid, Spain. The context of the project is the corpora of European Poetry (EP), with a special focus on poetic materials from different languages and literary traditions. POSTDATA aims to offer a standardized model in the philological field and a metadata application profile (MAP) for EP in order to build a common classification of all these poetic materials. The information of the Spanish, Italian and French repertoires will be published in the Linked Open Data (LOD) ecosystem. Later we expect to extend the model to include additional corpora.
The final goals of the POSTDATA project are: (1) to publish all the data locked in these web information systems (WIS) as LOD, where any interested agent will be able to build applications over the data in order to serve final users; and (2) to build a Web platform where: (a) researchers, students and other final users interested in EP will be able to access poems (and their analyses) from all the databases; and (b) researchers, students and other final users will be able to upload poems and digitized images of manuscripts, and fill in the information concerning the analysis of each poem, collaboratively contributing to a LOD dataset of poetry. |
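As a rough illustration of what a poem published under such a profile could look like as LOD, here is a sketch using Python and rdflib; the postdata: namespace, the metricalScheme property, and the URIs are invented placeholders, not the project's actual MAP.

```python
# A speculative sketch only: the postdata: namespace and property names
# are invented placeholders, not the project's actual application profile.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS

POSTDATA = Namespace("http://example.org/postdata/terms/")  # hypothetical

g = Graph()
g.bind("dcterms", DCTERMS)
g.bind("postdata", POSTDATA)

poem = URIRef("http://example.org/postdata/poem/42")  # hypothetical URI
g.add((poem, DCTERMS.title, Literal("Soneto XXIII", lang="es")))
g.add((poem, DCTERMS.creator, Literal("Garcilaso de la Vega")))
# one analytical field of the kind the MAP would standardize (invented)
g.add((poem, POSTDATA.metricalScheme, Literal("hendecasyllabic sonnet", lang="en")))

print(g.serialize(format="turtle"))
```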
Maria Esteva & Ramona Walls |
OCS: 442 TITLE: Identifier Services: Tracking Objects and Metadata Across Time and Distributed Storage Systems ABSTRACT: This presentation describes research around Identifier Services (IDS). IDS is designed to bind dispersed data objects and verify aspects of their identity and integrity, independent of where the data are located and whether they are duplicate, partial, private, published, active, or static. IDS will allow individuals and repositories to manage, track, and preserve different types of identifiers and their associated data and metadata. The IDS data model, which focuses on research processes and the relationships between their data inputs and outputs, will significantly improve provenance metadata for distributed collections at any point in their lifecycle. |
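The abstract does not specify how IDS verifies integrity; as a generic illustration of how dispersed copies of a data object are commonly checked, here is a checksum-based fixity sketch in Python. It is not the IDS implementation.

```python
# A generic fixity-check sketch, not the IDS implementation itself.
import hashlib
from pathlib import Path

def fixity(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_object(copy_a: Path, copy_b: Path) -> bool:
    """Treat two dispersed copies as the same object if their digests match."""
    return fixity(copy_a) == fixity(copy_b)
```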
Richard Wallis |
OCS: 456 TITLE: Cognitive and Contextual Computing—Laying A Global Data Foundation ABSTRACT: A search of the current computing and technology zeitgeist will not have to look far before stumbling upon references to Cognitive Computing, Contextual Computing, Conversational Search, the Internet of Things, and other such buzzwords and phrases. The marketeers are having a great time coming up with futuristic visions supporting the view of computing becoming all-pervasive and 'intelligent'. Examples range from IBM's Watson beating human quiz-show contestants to the arms race between the leading voice-controlled virtual assistants: Siri, Cortana, Google Now, and Amazon Alexa.
All exciting and interesting, but what relevance has this for DCMI, metadata standards, and the resources we describe using them? In a word, "context". No matter how intelligent and human-like a computer is, its capabilities are only as good as the information it has to work with. If that information is constrained by domain, by industry-specialised vocabularies, or by a lack of references to external sources, it is unlikely the results will be generally useful. In the DCMI community we have expertise in sharing information within our organisations and on the web; Dublin Core was one of the first widely adopted generic vocabularies, a path that Schema.org is following and, in breadth of adoption, now exceeding.
From his wide experience working with Google, OCLC, European and national libraries, the banking industry and others, Richard will explore new initiatives and the processes being undertaken to prepare and widely share data in a generally consumable way on the web. Schema.org has been a significant success: it is used by over 12 million domains, on over a quarter of sampled pages, and it is enabling a quiet revolution of preparing and sharing data to be harvested into search engine Knowledge Graphs, the graphs that power Rich Snippets, Knowledge Panels, Answer Boxes, and other search engine enhancements. Whilst delivering on one revolution, it is helping to lay the foundations of another: a global web of interconnected entities for intelligent agents to navigate. These Knowledge Graphs, fed by the information we are starting to share generically, are providing the context that will enable Cognitive, Contextual and associated technologies to scale globally, ushering in yet another new technology era. |
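To ground the preparing-and-sharing step, here is a minimal sketch of Schema.org markup expressed as JSON-LD, the embedding format search engines harvest into Knowledge Graphs; the title, author, and identifier values are invented examples.

```python
# A minimal sketch of Schema.org JSON-LD; all values are invented examples.
import json

work = {
    "@context": "https://schema.org",
    "@type": "Book",
    "name": "An Example Title",
    "author": {"@type": "Person", "name": "Jane Author"},
    # an external reference that links this entity into the wider graph
    "sameAs": "http://viaf.org/viaf/00000000",  # hypothetical VIAF identifier
}

# Publishers embed this in a page inside <script type="application/ld+json">.
print(json.dumps(work, indent=2))
```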
Hannah Tarver & Mark Phillips |
OCS: 458 TITLE: Identifier Usage and Maintenance in the UNT Libraries' Digital Collections ABSTRACT: At the University of North Texas (UNT) Libraries we work with a large number of identifiers in relation to our Digital Collections (The Portal to Texas History, the UNT Digital Library, and the Gateway to Oklahoma History). Since our Digital Collections comprise items from other library and campus departments, as well as a large number of cultural heritage institutions across Texas and Oklahoma, many of the materials have assigned identifiers that are important to the group that owns the physical materials. We document any relevant identifiers in each item's metadata record, whether they belong to an international or established standard (e.g., ISSNs or call numbers) or have a specific context (e.g., agency-assigned report numbers). Most discrete collections have partner-assigned identifiers that range from established accession numbers to sequentially-assigned numbers; these identifiers allow for a connection between a digital item in the public interface, copies of the associated digital files, and the physical object. To ensure that identifiers are unique within the Digital Collections, we routinely add codes that identify the partner institution at the front of each identifier, separated with an underscore (e.g., GEPFP_62-1); the convention is sketched in code after this abstract. This makes it relatively easy to distinguish the original identifier from the code that we have added, and it prevents a search for an identifier such as "0005" from returning several hundred matching items when a user is looking for a particular object.
Internally, our digital infrastructure uses ARK (Archival Resource Key) identifiers to track and connect archival copies of files stored in our Coda repository with web-derivative copies in our Aubrey access system. We also currently use PURLs (Persistent Uniform Resource Locators) to identify and manage controlled vocabulary terms. For name authority, we create local authority records that act similarly to item records in terms of identifiers: each record has a system-unique identifier that generates a stable URL, but contains a field to include alternate established identifiers (e.g., ISNIs, VIAF record numbers, ORCIDs, etc.) that also refer to the entity, when applicable.
This presentation discusses some of the complexities inherent in managing both locally-created and externally-assigned identifiers, why we use different types of identifiers throughout our infrastructure, and the implementation of various identifiers in our Digital Collections. |
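A minimal sketch of the underscore convention described above, in Python; the partner code and identifier come from the abstract's own example (GEPFP_62-1), and the helper names are ours.

```python
# Sketch of the partner-code prefixing convention (e.g., GEPFP_62-1);
# function names are illustrative, not UNT's actual tooling.
def qualify(partner_code: str, local_id: str) -> str:
    """Prefix a partner-assigned identifier with the partner's code."""
    return f"{partner_code}_{local_id}"

def split_qualified(qualified_id: str) -> tuple[str, str]:
    """Recover the partner code and the original identifier.

    Only the first underscore counts as the separator, so original
    identifiers may themselves contain underscores.
    """
    partner_code, _, local_id = qualified_id.partition("_")
    return partner_code, local_id

assert qualify("GEPFP", "62-1") == "GEPFP_62-1"  # example from the abstract
assert split_qualified("GEPFP_62-1") == ("GEPFP", "62-1")
```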
Thomas Baker, Caterina Caracciolo, Anton Doroszenko, Lori Finch, Osma Suominen, & Sujata Suri |
OCS: 459 TITLE: The Global Agricultural Concept Scheme and Agrisemantics ABSTRACT: This presentation discusses the development of the Global Agricultural Concept Scheme (GACS), in which key concepts from three thesauri about agriculture and nutrition—AGROVOC, CAB Thesaurus, and NAL Thesaurus—have been merged. The respective partner organizations—the Food and Agriculture Organization of the UN (FAO), CAB International (CABI), and the USDA National Agricultural Library (NAL)—undertook this initiative in 2013 with the goals of facilitating search across databases, improving the semantic reach of their databases by supporting queries that freely draw on terms from any mapped thesaurus, and achieving economies of scale from joint maintenance. The GACS beta release of May 2016 has 15,000 concepts and over 350,000 terms in 28 languages.
GACS is seen as a first step for Agrisemantics, an emerging community network of semantic assets relevant to agriculture and food security. Within Agrisemantics, the general-purpose, search-oriented concepts of GACS are intended to serve as a hub for concepts defined, with more precision, in a multitude of ontologies modeled for specific domains. Ontologies, in turn, are intended to provide global identity to concepts used in a vast diversity of quantitative datasets, such as sensor readings and crop yields, defined for a multitude of software applications and serialization formats.
Agrisemantics aims to improve the discoverability and semantic interoperability of agricultural information and data for the benefit of researchers, policy-makers, and farmers, with the ultimate goal of enabling innovative responses to the challenges of food security under conditions of climate change. Achieving these goals will require innovation in processes for the cooperative maintenance of linked semantic assets in the modern Web environment. |
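A merged concept scheme of this kind is typically expressed in SKOS; the sketch below shows, using Python and rdflib, how a GACS hub concept could link back to the three source thesauri. The URIs and the choice of skos:exactMatch are illustrative assumptions, not the published GACS data.

```python
# An illustrative SKOS sketch; URIs and mappings are assumptions, not
# the published GACS data.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import SKOS

g = Graph()
g.bind("skos", SKOS)

concept = URIRef("http://example.org/gacs/C1234")  # hypothetical GACS concept
g.add((concept, SKOS.prefLabel, Literal("maize", lang="en")))

# mappings back to the source thesauri (identifiers invented for illustration)
for source_uri in (
    "http://example.org/agrovoc/c_00000",  # AGROVOC
    "http://example.org/cabt/00000",       # CAB Thesaurus
    "http://example.org/nalt/00000",       # NAL Thesaurus
):
    g.add((concept, SKOS.exactMatch, URIRef(source_uri)))

print(g.serialize(format="turtle"))
```

Hub-and-spoke mappings like these are what let a query phrased in any one thesaurus's terms retrieve records indexed with the others.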