Using OpenRefine and the WorldCat Metadata API to get current OCLC numbers

One issue that a lot of catalogers and metadata librarians struggle with is getting the current OCLC number for a given record into their local system. The difficulty arises because records, and therefore OCLC numbers, are merged over time. When a library pulls a record, it gets the record and OCLC number as they existed at that moment. However, most libraries want to keep their OCLC numbers current. So, what’s the best way to do this?

One could write a script in Python, Ruby, Perl, or another programming language to look for merged OCLC numbers, but that requires coding skills. Non-coders can use OpenRefine in combination with the WorldCat Metadata API to accomplish this task.

Using OCLC APIs that are protected with Access Tokens

Most OCLC APIs are protected with a stronger form of authentication and authorization called Access Tokens. You might think that this means you need to know how to write Jython in order to interact with them via OpenRefine. However, that isn’t true. A valid Access Token can be inserted into a “Fetch by URL” operation.

How do you obtain a valid Access Token if you don’t know how to write code? The easiest way is by using OCLC’s API Explorer.

  1. Go to the DevNet page for OAuth, ClientCredentialGrant GetToken.
  2. Choose “Use my credentials.”
  3. Enter your WSKey, secret, principal ID, and principal ID namespace (IDNS).
    (You can find out more about how to get a principal ID and principal IDNS by reading the User Level Authentication and Authorization page.)
  4. For the prompted URL, fill in the following:
    1. authenticatingInstitutionID: your library’s registry ID
    2. contextInstitutionID: your library’s registry ID
    3. scopes: WorldCatMetadataAPI (the ID(s) for the service you want to access)
  5. Click the “Send the Request” button.
  6. Look at the JSON response, and find the value labeled “access_token.” This will look something like: tk_ruQqgWU2PZsBdkMTq2Po73PCvYxWybJgzlIR.
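For readers who do script this step, the last item above can be sketched in Python. This is a minimal sketch, not OCLC's own code: the function name extract_access_token is hypothetical, and the sample body contains only the "access_token" field mentioned above, whereas a real response will carry additional fields.

```python
import json


def extract_access_token(response_text):
    """Return the access_token value from a token response body."""
    data = json.loads(response_text)
    token = data.get("access_token")
    if not token:
        raise ValueError("response does not contain an access_token")
    return token


# Hypothetical response body, reduced to the one field this post relies on.
sample_response = '{"access_token": "tk_ruQqgWU2PZsBdkMTq2Po73PCvYxWybJgzlIR"}'
token = extract_access_token(sample_response)
```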

Now you’ve got a valid Access Token that is good for 20 minutes. You can use this token in OpenRefine by putting it in the “Authorization” field of your “Fetch by URL” operation.

Note the field labeled “Accept” with the value */* in it. If the API you are using supports JSON, change this to “application/json” to get JSON back.
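The two header fields described above can be expressed as a small Python sketch. The helper name build_headers is hypothetical, and the "Bearer " prefix is an assumption based on common OAuth 2.0 practice; this post only says to put the token in the “Authorization” field, so check the API documentation for the exact format.

```python
def build_headers(access_token, want_json=True):
    """Build the request headers for an Access Token protected API call."""
    return {
        # Assumption: the token is sent using the OAuth 2.0 Bearer scheme.
        "Authorization": "Bearer " + access_token,
        # Ask for JSON when the API supports it; otherwise accept anything.
        "Accept": "application/json" if want_json else "*/*",
    }


headers = build_headers("tk_ruQqgWU2PZsBdkMTq2Po73PCvYxWybJgzlIR")
```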


Making the request to the WorldCat Metadata API

Now that you have an Access Token, you can make a request to the WorldCat Metadata API to get the current OCLC numbers. In OpenRefine, add a column by fetching URLs, and use the following expression to build each request URL from the OCLC number in the cell:

"https://worldcat.org/bib/checkcontrolnumbers?oclcNumbers=" + value
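For comparison, the same URL construction can be sketched in Python. The base URL is the one used in the expression above; the function name build_request_url, the whitespace trimming, and the percent-encoding are illustrative additions rather than anything OpenRefine does for you.

```python
from urllib.parse import quote

BASE_URL = "https://worldcat.org/bib/checkcontrolnumbers?oclcNumbers="


def build_request_url(oclc_number):
    """Append one OCLC number to the checkcontrolnumbers URL."""
    # Trim stray whitespace and percent-encode anything unexpected.
    return BASE_URL + quote(str(oclc_number).strip(), safe="")


# One URL per row, as OpenRefine would build one per cell.
urls = [build_request_url(n) for n in ["2416076", " 24991049 "]]
```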


This will return a response that looks like the following:

{
  "entry": [
    {
      "requestedOclcNumber": "2416076",
      "currentOclcNumber": "24991049",
      "httpStatusCode": "HTTP 200 OK",
      "id": "http://worldcat.org/oclc/24991049",
      "institution": "OCPSB"
    }
  ],
  "startIndex": 1,
  "totalResults": 1,
  "itemsPerPage": 1
}

Now we want to add a new column named "Current OCLC Number". The contents of this column will be based on parsing the data in the fetched response. The following snippet parses the data.

value.parseJson().entry[0].currentOclcNumber
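The same parsing step can be sketched in Python, using the sample response shown above as input. The function names are hypothetical, and the was_merged check (comparing the requested and current numbers) is an illustrative addition that is not part of the OpenRefine steps.

```python
import json


def first_entry(response_text):
    """Return the first item of the "entry" list, or None if there is none."""
    entries = json.loads(response_text).get("entry") or []
    return entries[0] if entries else None


def current_oclc_number(response_text):
    """Mirror value.parseJson().entry[0].currentOclcNumber."""
    entry = first_entry(response_text)
    return entry.get("currentOclcNumber") if entry else None


def was_merged(response_text):
    """True when the requested number differs from the current one."""
    entry = first_entry(response_text)
    if entry is None:
        return False
    return entry.get("requestedOclcNumber") != entry.get("currentOclcNumber")


sample = """{"entry": [{"requestedOclcNumber": "2416076",
  "currentOclcNumber": "24991049", "httpStatusCode": "HTTP 200 OK",
  "id": "http://worldcat.org/oclc/24991049", "institution": "OCPSB"}],
  "startIndex": 1, "totalResults": 1, "itemsPerPage": 1}"""
current = current_oclc_number(sample)
merged = was_merged(sample)
```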

Checking if holdings are set

This same technique can be used to make an API call to see if the library has holdings set on a particular record. To do this, choose “Add a column by fetching URLs” and use the following snippet to create the requests:

"https://worldcat.org/ih/checkholdings?oclcNumber=" + value

Add a new column, named "Holding set?", based on parsing the data in the fetched response, which looks like the following:

{
  "requestedOclcNumber": "2416076",
  "currentOclcNumber": "24991049",
  "isHoldingSet": false,
  "id": "http://worldcat.org/oclc/24991049",
  "institution": "OCPSB"
}

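The following snippet parses the data. Because the response shown above is a single object rather than a list of entries, the field is read directly from the parsed object.

value.parseJson().isHoldingSet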
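A Python sketch of this parsing step follows. As a hedge, it accepts either a single object like the sample response above or one wrapped in an "entry" list like the checkcontrolnumbers response; the function names are hypothetical, and the wrapped form is an assumption rather than something confirmed for this endpoint.

```python
import json


def holding_record(response_text):
    """Return the object holding isHoldingSet, unwrapping "entry" if present."""
    data = json.loads(response_text)
    entries = data.get("entry")
    if isinstance(entries, list) and entries:
        return entries[0]
    return data


def is_holding_set(response_text):
    """Return True only when the response reports that holdings are set."""
    return bool(holding_record(response_text).get("isHoldingSet"))


sample = """{"requestedOclcNumber": "2416076", "currentOclcNumber": "24991049",
  "isHoldingSet": false, "id": "http://worldcat.org/oclc/24991049",
  "institution": "OCPSB"}"""
holding = is_holding_set(sample)
```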

This single API request actually returns both

  • the current OCLC number in the field currentOclcNumber, and
  • whether or not the library has holdings set in the field isHoldingSet.

Based on this data, we can create a spreadsheet of current OCLC numbers and whether or not holdings are set.
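Outside OpenRefine, the same spreadsheet could be produced as a CSV file. The sketch below uses the sample response from this post as its only row; the function name write_report and the "Original OCLC Number" column heading are hypothetical.

```python
import csv
import io


def write_report(rows, out):
    """Write one CSV line per parsed checkholdings response."""
    writer = csv.writer(out)
    writer.writerow(
        ["Original OCLC Number", "Current OCLC Number", "Holding set?"]
    )
    for row in rows:
        writer.writerow(
            [
                row["requestedOclcNumber"],
                row["currentOclcNumber"],
                row["isHoldingSet"],
            ]
        )


# The single row here is the parsed sample response from this post.
rows = [
    {
        "requestedOclcNumber": "2416076",
        "currentOclcNumber": "24991049",
        "isHoldingSet": False,
    }
]
buffer = io.StringIO()
write_report(rows, buffer)
report = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an open file object instead would save the report to disk.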


Next steps

These examples demonstrate how to make authenticated API calls using an Access Token. However, they don’t help with API calls that require a different HTTP method, such as POST, PUT, or DELETE. We’ll cover how to use custom Jython code to make those types of requests within OpenRefine in the next post.

  • Karen Coombs

    Senior Product Analyst