- Applying Taxonomies in Publishing
- Learning Resource Metadata (LRMI): Current Work: Building on Schema.org to Describe Learning Resources
- University Metadata and Retrieval: The Death of the Library Catalog?
- Competency-Based Discovery for Learning Linked Data
- Program for Cooperative Cataloging (PCC) Report: Task Group on URIs in MARC
- DCMI Roundtable: Stable, Persistent Metadata Terms
Everyone seems to be creating a taxonomy these days. But what are we doing with them? How are we making access to the literature we publish better or easier for our readers? This session will focus on publishers who have created a taxonomy for their publication(s), what public-facing applications they have built or used to highlight their content, and what internal uses they may have made of the data generated.
Facilitators/Presenters
Monica Bradford, American Association for the Advancement of Science, United States
Charlotte McNaughton, American Society for Civil Engineers, United States
Jay Ven Eman, Access Innovations, United States
- Marjorie Hlava, Moderator, President & Chairman of Access Innovations/Data Harmony, United States
This session in two parts will present the current state of development of LRMI metadata, including how it builds on and contributes to schema.org, how it is being used in practice, current development work, and how it relates to the wider context of metadata in education. The LRMI specification is a collection of classes and properties for markup and description of educational resources. The specification builds on the extensive vocabulary provided by Schema.org and other standards. Initiated in 2011 as a project which was co-led by Creative Commons and the Association of Educational Publishers, since 2014 LRMI has been a community specification of DCMI [2], the maintenance and development of which is overseen by the DCMI LRMI task group.
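As a rough illustration of how LRMI properties sit on top of schema.org types, the sketch below builds a JSON-LD description of a learning resource. The property names (`learningResourceType`, `educationalUse`, `typicalAgeRange`, `educationalAlignment`, and the `AlignmentObject` type) are schema.org terms that originated with LRMI; the function name, the resource, the framework name, and all URLs are hypothetical placeholders chosen for this example.

```python
import json


def describe_learning_resource(name, url, framework, target_name, target_url):
    """Build a schema.org/LRMI description as a JSON-LD-style dictionary.

    Every argument value used below is a made-up placeholder, not real data.
    """
    return {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": name,
        "url": url,
        # Properties contributed to schema.org by LRMI:
        "learningResourceType": "lesson plan",
        "educationalUse": "instruction",
        "typicalAgeRange": "14-16",
        "educationalAlignment": {
            "@type": "AlignmentObject",
            "alignmentType": "teaches",
            "educationalFramework": framework,
            "targetName": target_name,
            "targetUrl": target_url,
        },
    }


record = describe_learning_resource(
    name="Introduction to Metadata",
    url="https://resources.example.org/intro-to-metadata",
    framework="Example Curriculum Framework",
    target_name="Describe resources using a metadata vocabulary",
    target_url="https://frameworks.example.org/competency/1",
)

print(json.dumps(record, indent=2))
```

Embedded in a Web page as JSON-LD, a description of this kind is one way a publisher might expose the educational characteristics of a resource to discovery systems.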
Part 1 will provide: (1) an introduction to LRMI, explaining its background in the context of schema.org and as a specification in its own right; (2) examples of how LRMI is being used by publishers of educational materials, in development projects, and in metadata-based discovery systems; and (3) how LRMI is currently developing in the context of other educational metadata specifications and technological developments around education.
Part 2 will be handed over to implementers of LRMI who will explain how they have made use of the specification, both in presenting information about learning resources on the Web (i.e. within the original context set by schema.org) and beyond, for example for internal content management or as a stand-alone metadata specification. This may also include the use of LRMI and schema.org as a base specification for data models in applications that go beyond content management and discovery. Finally we will look to the future, with a panel drawn from the LRMI Technical Working Group and current users of LRMI who will discuss issues they see as important for the future development of LRMI.
Facilitators/Presenters
- Phil Barker, Research Fellow, Heriot-Watt University, United Kingdom
Universities have been cataloging and adding subject headings as well as other metadata to give access to their collections for well over 100 years. The standard metadata offering is generally small and highly constrained by the many rules surrounding MARC and related cataloging standards. Title, author and keywords are the main search avenues allowed in the standard library OPAC (Online Public Access Catalog). The results are often confusing and limited, and do not give a fair view of the breadth or depth of the collection holdings of a serious library. With the advent of the internet, users' expectations and the ease of search have changed significantly. Users are no longer satisfied with the traditional catalog search. An entirely new paradigm has developed using the post-coordinate headings from thesauri and other knowledge organization systems, supplanting the traditional pre-coordinated and complex subject headings used since the advent of Dewey, Cutter, and the Library of Congress Subject Headings. The search provided by the library OPAC is limited and falls far short of the user experience in Google and other search systems.
Does this foreshadow the death of the library catalog? What are cutting edge libraries doing to bridge the gap between traditional cataloging and the new methodologies used in current metadata schemas? How are they expanding the accessibility to collections via digital means to support this new paradigm?
Facilitators/Presenters & Topics
Judy Russell, Dean of Libraries at the University of Florida in Gainesville, will describe the Florida Thesis Portal Project and how, through scanning, OCR, and semantic enrichment of the data, the information in this and other special collections is being made fully accessible for research and for business.
Dr. Ying-Hsang Liu, School of Information Studies, Charles Sturt University, Australia, will describe recent work, from an information retrieval perspective, on the search effectiveness of using thesaurus terms to supplement the sparse metadata records of library catalogs. Another recent user study of domain experts' eye gaze behavior in interacting with novel search interfaces, in which MeSH terms were displayed in different ways, will also be presented. This research demonstrates an interdisciplinary approach to user experience study, including organization of information, information retrieval and search user interface design.
Dr. Koraljka Golub, Associate Professor, School of Cultural Sciences, Faculty of Arts and Humanities, Linnaeus University, will speak on the integration of social tagging and automatic indexing of the existing system of knowledge systems in libraries and provide an outline of a comprehensive evaluation.
Marjorie Hlava will wrap up with a discussion of how simplifying a full library OPAC dataset by mapping it to the corresponding Dublin Core fields can significantly improve retrieval of library resources, especially when combined with a taxonomy rather than traditional library subject headings. A case study of the process and tools used will also be presented.
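The kind of simplification described in the last talk can be pictured as a crosswalk from MARC tags and subfields to Dublin Core elements. The sketch below is only an illustration and is not the process or tooling used in the case study: the handful of mappings loosely follows the commonly used MARC to Dublin Core crosswalk, and the record, the `CROSSWALK` table, and the function name are hypothetical.

```python
from collections import defaultdict

# A deliberately tiny crosswalk: (MARC tag, subfield code) -> Dublin Core element.
# Real crosswalks cover many more fields; this subset is an assumption for illustration.
CROSSWALK = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("650", "a"): "subject",
    ("260", "b"): "publisher",
    ("260", "c"): "date",
}


def to_dublin_core(marc_subfields):
    """Map (tag, code, value) triples to a dict of Dublin Core element -> values.

    Unmapped subfields are dropped; leading and trailing cataloging
    punctuation is stripped from each value.
    """
    dc = defaultdict(list)
    for tag, code, value in marc_subfields:
        element = CROSSWALK.get((tag, code))
        if element:
            dc[element].append(value.strip(" /:,.;"))
    return dict(dc)


# A made-up record, flattened to (tag, subfield code, value) triples.
marc_record = [
    ("245", "a", "An example title /"),
    ("100", "a", "Example, Author."),
    ("650", "a", "Metadata."),
    ("650", "a", "Cataloging."),
    ("260", "b", "Example Press,"),
    ("260", "c", "1999."),
    ("300", "a", "200 p."),  # physical description: no mapping here, so it is dropped
]

dc_record = to_dublin_core(marc_record)
print(dc_record)
```

In a taxonomy-based workflow, the `subject` values produced by such a mapping would typically be replaced or supplemented by terms from the taxonomy.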
Linked data is becoming an important tool for access to, and sharing of, resources. However, the knowledge necessary to use Linked Data effectively is broadly scattered and difficult for novices to learn. This Special Session will use early experiences from one effort—Linked Data for Professional Education (LD4PE)—to systematize the discovery and learning processes as a starting point for discussion and exploration.
The LD4PE project, in partnership with DCMI, is developing a Web-based referatory, Exploring Metadata [3], to help alleviate this struggle by providing learners with access to hundreds of learning resources aligned to the Linked Data concepts and terminologies, learnable skills, and competencies that are necessary to the effective practice of Linked Data principles. At the heart of Exploring Metadata is a competency framework for Linked Data practice that supports indexing the learning resources that have been gathered from across the Web by the LD4PE project to the specific competencies those resources address. Future developments of the Exploring Metadata referatory will be managed by DCMI and will extend its scope to areas of metadata design and practice beyond Linked Data using additional competency frameworks developed by crowd-sourcing community expertise.
This session will begin with a tutorial on the use of the Exploring Metadata referatory and the underlying rationale for the Competency Index for Linked Data, from the standpoint of a potential learner (or educator-learner) in the Linked Data world. It will introduce delegates to the community feedback mechanism to be used in future referatory development. The intent of the first part of the session is to show how an educator or learner can: (1) use competency-based discovery of learning resources; and (2) create trajectory maps for personalized learning or for the development of learning modules. Using this demonstration as a starting point, the session will further discuss the issues in the Linked Data learning environment as a whole. Attendees will be encouraged to contribute their own insights, lessons learned, and future requirements, using the competency framework as a working document to flesh out gaps and areas for further development in the teaching and learning of Linked Data concepts and skills. Attendees will come away with a sense of what kinds of learning materials are available and how Exploring Metadata can help find them, and will be encouraged to join in the further community development of Exploring Metadata and its enabling toolkit.
Facilitators/Presenters
Thomas Baker, DCMI Chief Information Officer (Communications, Research and Development), Germany
Michael Crandall, Principal Research Scientist, Information School, University of Washington, United States
Marjorie M.K. Hlava, President & Chairman of Access Innovations/Data Harmony, United States
Stuart A. Sutton, Associate Professor Emeritus, Information School, University of Washington & DCMI Managing Director, United States
David Talley, MLIS Graduate, Information School, University of Washington, United States
Marcia Lei Zeng, Professor, School of Library and Information Science, Kent State University, United States
The Program for Cooperative Cataloging (PCC) Task Group on URIs in MARC was established by the PCC Steering Committee in September 2015 and was charged: (1) to identify and investigate issues surrounding the use of identifiers in MARC records; (2) to develop guidelines and policies; (3) to formulate work plans to implement identifiers in $0 and other fields and/or subfields in ILSs and PCC-affiliated utilities; and (4) in consultation with the MARC Advisory Committee, technologists, and other stakeholders, to take appropriate measures ensuring library data will transition smoothly from the MARC environment to a Web-based and Linked Open Data platform or service. [1]
To that end, the Task Group conducted pilot tests throughout 2016, and compiled findings and comments that served as the foundation of several MARC proposals at the 2016 American Library Association Annual Meeting. The papers covered the functionality of URIs/IRIs in MARC subfields such as $0, $4, $o, etc. The goal of the proposed refinements and expanded usage of these MARC subfields is to prepare library data, and extend its richness, for the wider information world with little or no programmatic intervention. The TG's efforts benefit information professionals and researchers beyond library borders.
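To make the role of subfield $0 concrete, the following sketch pulls identifier URIs out of a MARC field written in a human-readable display form. It is an illustration only: the field text, the placeholder URI, the regular expression, and the function names are assumptions made for this example and do not reflect the Task Group's pilot tooling.

```python
import re

# Hypothetical field in a display form: tag, indicators, then $-prefixed subfields.
# The URI is a placeholder, not a real authority identifier.
FIELD = "100 1_ $a Example, Author, $d 1900-1980 $0 http://id.example.org/names/n0000001"

# A subfield is "$" plus a one-character code, followed by text up to the next "$".
SUBFIELD = re.compile(r"\$(\w)\s*([^$]*)")


def subfields(field_text):
    """Return (code, value) pairs for each subfield in a display-form field."""
    return [(code, value.strip()) for code, value in SUBFIELD.findall(field_text)]


def identifier_uris(field_text):
    """Return the values of subfield $0 that look like HTTP(S) URIs."""
    return [
        value
        for code, value in subfields(field_text)
        if code == "0" and value.startswith(("http://", "https://"))
    ]


print(subfields(FIELD))
print(identifier_uris(FIELD))
```

With the identifier carried in the record itself, a downstream system can link the heading to an external entity without having to match on the text of the name.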
In this Special Session, the PCC Task Group will report on its work from its inception to date. This will include a technical discussion around entification of MARC library data and comments on the status of existing infrastructure within libraries to support this work.
Facilitators/Presenters
Terry Reese, Head, Digital Initiatives, Ohio State University Libraries, United States
Jackie Shieh, Coordinator, Resource Description Group, George Washington University Libraries, United States
Speaker Biographies:
Carol Jean Godby is a Senior Research Scientist at OCLC. She currently manages a team of researchers conducting projects on linked data and data science. Jean has a Ph.D. in linguistics from Ohio State University. She is the chair of the PCC-URI subcommittee on RWOs (real world objects in linked data).
Jackie Shieh is the Resource Description Coordinator for George Washington University Libraries. She currently chairs the PCC URI in MARC Task Group. She previously worked as the Original Cataloger for Electronic Resources for the University of Virginia Library and as the Team Leader for Special Collections and Projects, University of Michigan Library.
Steven Folsom is Metadata Technologies Program Manager for Harvard Library where he focuses on metadata maintenance and strategic interoperability activities, allowing for integration of metadata between systems. Steven participates in research and development efforts (e.g. as a member of the Linked Data for Libraries Mellon funded grants) and advocates for related standards developments.
Tiziana Possemato holds a degree in Philosophy (La Sapienza Rome), diplomas in Archival Science and Library Sciences (Vatican Schools) and a master's degree in Library Sciences (University of Florence). She has led numerous projects for library automation, analysis, mapping, and conversion, and for the transformation into Linked Open Data and publication of catalogue data from numerous institutions. She is the Chief Information Officer of Casalini Libri, and founding managing partner of @Cult.
Terry Reese is Head of Digital Initiatives at the Ohio State University, where he focuses on the development and implementation of the Libraries' preservation and data management systems. Over the past 17 years as a librarian, his research has focused on metadata and metadata interoperability. He is the creator of MarcEdit, a popular metadata management application, which focuses on improving real-world cataloging workflows and lowering barriers to implementing emerging metadata standards.
Abstracts:
TG One-yr activities sum-up (Jackie Shieh)
Overview of the PCC Task Group's work from the inception of the TG to date. Includes motivation, environmental scan of bibliographic and authority data, operating assumptions related to identifiers in MARC data, etc.
Where we see this work fitting in with other transition efforts such as LD4P, BIBFLOW, linked data authorities, and the PCC strategic plan. Where MARC stands and what a transition strategy might look like.
URI Pilot Test Tech Challenges (Terry Reese)
One of the primary challenges associated with utilizing linked data within the library community is the significant amount of legacy data that exists, and the need to entify the existing data. A wide range of services currently exist providing access to linked data end points, but these come with a lot of baggage. Most of these services were not created to support this type of work, and the need to understand their data modelling decisions and limitations can make this process challenging and sometimes brittle. This part of the presentation will talk about the technical challenges in developing a scalable pilot process in MarcEdit, and how this process has been developed to be flexible and support real-world authority control applications.
Real World Object (Jean Godby)
To advance the library community's goal of making the transition from MARC to linked data, the PCC-URI RWO subcommittee would like to recommend that URIs in MARC bibliographic records resolve to real world objects–i.e., to people, places, organizations, and things, and not to Web pages or descriptive records for the object. But RWOs, as currently defined and implemented, present many conceptual, technical, and logistical barriers that must be overcome. This presentation reviews these problems and offers practical solutions that support our recommendation.
Creation, Management and Reconciliation of URIs: a Use Case: the Share Catalogue (Tiziana Possemato)
The presentation offers an overview of how a supplier manages the URIs to propose enriched bibliographic exploitation. The author will describe the different steps in the process, from URI management in our cataloguing environment, the enrichment of the data with URIs from different sources, the reconciliation of entities through the creation of RWOs (clustering process) and the conversion into Linked Open Data, to publication in accordance with the BIBFRAME model, with the addition of other ontologies supported by the URIs. Following this vision, Casalini Libri seeks to provide a Knowledge base of clustered names, works and subjects in order to create a new model for co-operative cataloguing and an improved search experience for end users.
Documents in support of implementation (Steven Folsom)
Use Cases for URIs in MARC for Systems Developers/Vendors/Authority is a document developed within the context of the PCC Task Group on URIs in MARC. The intent is to provide library platform developers/vendors and metadata service providers general guidance on areas of need/desired functionality we anticipate as URIs become more ubiquitous in library data. Recognizing that URIs in MARC is part of a larger discussion about libraries transitioning to linked data technologies, desired native linked data tools/functionality are included.
Formulating and Obtaining URIs has been established as a guide for metadata practitioners interested in capturing URI identifiers from data sources on the web in their bibliographic data; it is not a policy document. Inclusion of a particular data source in this document is not necessarily an endorsement, but rather an acknowledgment that they are commonly used by the library, archives and museum communities for the purposes of bibliographic description. Implementation patterns for use and provisioning of URIs by data publishers vary immensely. No judgment is made about which patterns are preferred, but rather the document is designed to help consumers of these data sources to navigate the patterns to select the appropriate URI to meet specific use cases.
[1] https://www.loc.gov/aba/pcc/bibframe/TaskGroups/URI-TaskGroup.html
Moderator: Thomas Baker, DCMI Directorate, Germany
The long-term usefulness of vocabularies—whether of metadata terms or of terms in authorities for people, places, or topic categories—depends on the reliability, stability, and persistence of their documentation and identifiers. Small organizations or projects may not be in a position to make long-term commitments on their own. Third-party redirection services, such as purl.org, facilitate the migration of documentation and identifiers to future owners, though the unavailability of the purl.org service over the past year, recently resolved with its migration to Internet Archive, reminds us of the inherent fragility of persistence architectures built around single points of failure. Arrangements for DNS inheritance such as the DCMI-FOAF agreement could be generalized to larger consortia of stakeholders, but how could we bootstrap such a development?
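The idea behind a redirection service can be sketched in a few lines: the published identifier stays fixed while the location it resolves to can be changed by whoever currently maintains the vocabulary. The class below is a toy model under that assumption; it is not how purl.org is implemented, and every path and URL in it is a hypothetical placeholder.

```python
class RedirectTable:
    """Toy model of a persistent-identifier resolver.

    Maps a stable identifier path to whatever URL currently hosts the
    documentation for that term.
    """

    def __init__(self):
        self._targets = {}

    def register(self, persistent_path, target_url):
        """Create or update the target for a persistent identifier."""
        self._targets[persistent_path] = target_url

    def resolve(self, persistent_path):
        """Return the current target URL, or None if the path is unknown."""
        return self._targets.get(persistent_path)


table = RedirectTable()

# The original owner publishes the term documentation on its own host.
table.register("/vocab/terms/title", "https://old-host.example.org/terms#title")
before_migration = table.resolve("/vocab/terms/title")

# Ownership changes: only the target is updated, the identifier is untouched.
table.register("/vocab/terms/title", "https://new-host.example.org/terms#title")
after_migration = table.resolve("/vocab/terms/title")

print(before_migration)
print(after_migration)
```

The sketch also makes the fragility visible: if the table itself becomes unavailable, every identifier that depends on it stops resolving, which is the single point of failure described above.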
This year's DCMI Roundtable will discuss questions of vocabulary stability and persistence in light of several ongoing developments:
Consultant Richard Wallis will describe the newly announced purl.org service at the Internet Archive in the context of the archive's own strategy for persistence.
Leif Andresen, Advisor to the Director of the Royal Library of Denmark, will report on the ongoing process to publish the remaining properties and classes of DCMI Metadata Terms as ISO 15836 Part 2 and comment on the evolving practice of the International Organization for Standardization (ISO) with regard to vocabulary identification and persistence.
Paul Walk, Head of Technology Strategy and Planning for EDINA at the University of Edinburgh, Scotland, will report on recent progress on replacing DCMI's 25-year-old processes for generating terms documentation with more generic and modern technology, and comment on the redundancy and versioning functionality of social coding platforms such as GitHub.
After these lightning talks, the speakers will be joined by Eva Mendez (Universidad Carlos III, Spain), Ana Alice Baptista (University of Minho, Portugal), and others for a panel discussion, with audience input, on the future of publishing and identifying metadata terms.