Data Science & Metadata Research

To be discoverable by today’s online users, traditional library data must be transformed. OCLC Research analyzes bibliographic data to derive new meaning, insights, and services for use by libraries and information seekers. This work includes special projects, data science research, engagement with metadata communities, publications and presentations, and the creation of illustrative experimental applications.

Completed Projects

Measuring Up: Assessing Accuracy of Reported Use and Impact of Digital Repositories

This project aims to improve data collection and information sharing for institutional repositories and digitized collections. "Measuring Up" is led by Montana State University and includes partnerships with OCLC Research, the Association of Research Libraries, and the University of New Mexico.
Learn more »

MARC Usage in WorldCat

This project will study the use of MARC tags and subfields in WorldCat and produce reports to inform decisions about where we go from here.
Learn more »
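
As a rough illustration of what a tag and subfield usage count involves, here is a minimal sketch. It assumes each record has already been parsed into a list of (tag, subfields) pairs; that layout, the function name count_tag_usage, and the sample records are all hypothetical and are not taken from the project's reports.

```python
from collections import Counter

def count_tag_usage(records):
    """Count how many records use each tag, and each tag$subfield, at least once.

    Each record is assumed (for illustration only) to be a list of
    (tag, [(subfield_code, value), ...]) pairs.
    """
    tag_counts = Counter()
    subfield_counts = Counter()
    for record in records:
        tags_seen = set()
        subfields_seen = set()
        for tag, subfields in record:
            tags_seen.add(tag)
            for code, _value in subfields:
                subfields_seen.add(f"{tag}${code}")
        # Update once per record so repeated fields are not double counted.
        tag_counts.update(tags_seen)
        subfield_counts.update(subfields_seen)
    return tag_counts, subfield_counts

# Hypothetical sample data, not real WorldCat records.
sample_records = [
    [("245", [("a", "Title one"), ("c", "Author A")]), ("650", [("a", "Cooking")])],
    [("245", [("a", "Title two")]), ("650", [("a", "Libraries")]), ("650", [("a", "Metadata")])],
]
tag_counts, subfield_counts = count_tag_usage(sample_records)
```

Counting per record rather than per field occurrence is a design choice made here: it answers "how many records use this tag" instead of "how often does this tag appear".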

Scholars' Contributions to VIAF

This activity explores the potential benefits of collaborating with scholars to enrich the Virtual International Authority File (VIAF) with new names and additional script forms for names already represented. The experience and knowledge gained from working with diverse files may inform third parties’ development of authority tools used by scholars.
Learn more »

Work Records in WorldCat

View rich descriptions for books and other library materials.
Learn more »

Getting Found: SEO for Digital Repositories

This activity is part of an IMLS-funded project to develop strategies for improving the visibility of library digital repositories in Internet search engines by developing an RDF model for the people, places, organizations, and objects associated with an institutional repository and its contents.
Learn more »

Europeana Innovation Pilots

This collaborative initiative aims to pilot the use of existing and newly developed OCLC Research methods and techniques for cleansing and enriching large aggregations of metadata to identify and create semantic links between heterogeneous objects that are connected.
Learn more »

Sharing and Aggregating Social Metadata

This activity identifies the user contributions that would enrich the descriptive metadata created by libraries, archives, and museums, and the issues that need to be resolved to communicate and share those contributions at the network level.
Learn more »

WorldCat Genres

Genre profiles allow users to browse genre terms for hundreds of titles, authors, subjects, characters, places, and more, ranked by popularity in WorldCat.
Learn more »

Name Extraction

This project attempts to develop tools that advance the state of the art in extracting names from unstructured text and disambiguating them using authority files developed in the library community.
Learn more »

Terminology Services

This project provides Web-based services for controlled vocabularies.
Learn more »

Metadata Schema Transformation Services

The goal of the Metadata Schema Transformation project is to develop a simple, web-accessible service that translates metadata records from one publicly defined format into another.
Learn more »


OAICat

OAICat is a Java Servlet implementation of the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) v2.0.
Learn more »
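
To show what a harvester sends to an OAI-PMH v2.0 repository, here is a minimal sketch that builds a request URL. The verb names and the oai_dc metadata prefix come from the protocol itself; the base URL https://example.org/oai and the helper build_oai_request are placeholders for illustration and do not describe OAICat's own configuration.

```python
from urllib.parse import urlencode

def build_oai_request(base_url, verb, **arguments):
    """Build an OAI-PMH request URL from a verb and its optional arguments."""
    params = {"verb": verb}
    # Skip arguments that were passed as None so they do not appear in the URL.
    params.update({key: value for key, value in arguments.items() if value is not None})
    return f"{base_url}?{urlencode(params)}"

# Hypothetical repository address; substitute a real OAI-PMH base URL.
identify_url = build_oai_request("https://example.org/oai", "Identify")
list_url = build_oai_request("https://example.org/oai", "ListRecords", metadataPrefix="oai_dc")
```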


SRW/U

The SRW/U Open Source project offers software that implements both the SRW Web Service and the SRU REST model interface to databases. Included are interfaces that support DSpace and Lucene implementations and OCLC's Pears database.
Learn more »
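
For readers unfamiliar with the SRU REST model, here is a minimal sketch of a searchRetrieve request expressed as a URL. The parameter names (operation, version, query, maximumRecords) follow the SRU specification; the base URL, the helper name build_sru_search_url, and the sample CQL query are assumptions for illustration and say nothing about how the SRW/U software itself is deployed.

```python
from urllib.parse import urlencode, quote

def build_sru_search_url(base_url, cql_query, maximum_records=10, version="1.2"):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    return f"{base_url}?{urlencode(params, quote_via=quote)}"

# Hypothetical server address and query.
sru_url = build_sru_search_url("https://example.org/sru", 'dc.title = "metadata"')
```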


VIAF

The VIAF (Virtual International Authority File) service provides libraries and library users with convenient access to the world’s major name authority files.
Visit the VIAF service page »
Learn more about the OCLC Research Project »

Cookbook Finder

Cookbook Finder is a works-based application that provides access to thousands of cookbooks and other works about food and nutrition described in library records. You can search by person, place, topic (e.g., course, ingredient, method, and more) and browse related works by author and topic (supplied by the Kindred Works/Recommender API). Results include links to full-text when available from HathiTrust and Project Gutenberg.
Learn more »

Kindred Works

Kindred Works is a demonstration interface built upon an experimental content-based recommender service. Various characteristics associated with a sample resource, such as classification numbers, subject headings, and genre terms, are matched to WorldCat to provide a list of recommendations.
Learn more »
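
The idea of matching a sample resource's characteristics against other works can be sketched as follows. This is not the Kindred Works algorithm, which is not described here; it swaps in plain Jaccard overlap between feature sets as a stand-in, and the function names, feature labels, and catalog entries are all hypothetical.

```python
def jaccard(a, b):
    """Return the Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(sample_features, catalog, top_n=3):
    """Rank catalog works by feature overlap with the sample resource."""
    scored = [(jaccard(sample_features, features), title)
              for title, features in catalog.items()]
    # Drop works with no shared characteristics at all.
    scored = [(score, title) for score, title in scored if score > 0]
    # Highest score first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _score, title in scored[:top_n]]

# Hypothetical features: classification numbers, subject headings, genre terms.
sample = {"class:641.5", "subject:Cooking", "genre:Cookbooks"}
catalog = {
    "Work A": {"class:641.5", "subject:Cooking", "genre:Cookbooks"},
    "Work B": {"class:641.5", "subject:Baking"},
    "Work C": {"class:020", "subject:Libraries"},
}
recommendations = recommend(sample, catalog)
```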

WorldCat Identities Network

The WorldCat Identities Network gives users the opportunity to visually explore the interconnectivity and relationships between WorldCat Identities.
Learn more »


OCLC Research Archive

OCLC Research continually evolves what we investigate, research, and report on as the field's needs change. For historical project information, explore the OCLC Research Archive, which holds a wealth of information about the work OCLC Research has produced over the decades.

Access the OCLC Research Archive >