Data Science & Metadata Research

To be discoverable by today’s online users, traditional library data must be transformed. OCLC Research analyzes bibliographic data to derive new meaning, insights, and services for use by libraries and information seekers. This work includes special projects in metadata enrichment, authorities & identities, linked data, subjects & classification, and data analysis.

Highlighted Data Science & Metadata Research projects, publications, and presentations

  Application

ArchiveGrid

ArchiveGrid is a collection of millions of archival material descriptions, including MARC records from WorldCat and finding aids harvested from the web.  

ArchiveGrid provides access to detailed descriptions of archival collections, including documents, personal papers, family histories, and other archival materials held by thousands of libraries, museums, historical societies, and archives. It also provides contact information for the institutions where these collections are kept.

Learn more about ArchiveGrid

  Application

FAST (Faceted Application of Subject Terminology)

FAST is a vocabulary of controlled terms that can be used to describe the subject content of any kind of intellectual or creative work. The terms used by FAST are derived from the Library of Congress Subject Headings system.

FAST has several exploratory interfaces:

  • searchFAST: A full-featured search interface to the FAST database.
  • FAST Converter: A web interface for converting LCSH headings to FAST headings.
  • assignFAST: A web service that uses autosuggest technology to automate the manual selection of FAST subjects.
  • FAST Linked Data: FAST as a Linked Data service for interacting with the Semantic Web.
  • importFAST [beta]: Imports Library of Congress personal or corporate names into the FAST Authorities and immediately assigns each a FAST number. Topical heading and subdivision combinations can also be assigned.

Learn more about FAST

  Project

Stewarding the Collective Collection: Exploratory Data Analysis

A collaborative project between OCLC and The Partnership for Shared Book Collections has examined the nature of retention commitments currently registered in WorldCat. This builds on the previous work of OCLC and the Center for Research Libraries, funded by the Andrew W. Mellon Foundation, to support shared print.

Learn more

  Publication

Responsible Operations: Data Science, Machine Learning, and AI in Libraries

by Thomas Padilla

Responsible Operations is intended to help chart library community engagement with data science, machine learning, and artificial intelligence (AI). It was developed in partnership with an advisory group and a landscape group comprising more than 70 librarians and professionals from universities, libraries, museums, archives, and other organizations.

Read the Report

Hanging Together: the OCLC Research blog

For information and insights on the topics and challenges faced by the library, archive, and museum communities, check out the OCLC Research blog, Hanging Together.

Go to Hanging Together