OCLC Developer Network

MARCView/MARConvert Software Description

In early 2010 Systems Planning donated the MARCView and MARConvert source code to OCLC. OCLC is making the source code for these tools available under an Apache 2.0 open source license. We have created a Subversion repository for the source code and have provided a download of an executable of MARCView for installation purposes. We know this tool has a strong user group and are hoping that someone from the community will step forward and act as maintainer for the project. Community members interested in contributing to the project should send inquiries to Karen Coombs, Product Manager for the OCLC Developer Network.

NOTE: This software is provided AS IS and no support for it is provided by OCLC.

Download the software

MARConvert

MARConvert handles special problems or unusual requirements in converting records into or out of MARC21, UNIMARC, or MARCXML bibliographic or authority formats. It will also convert MARC records from one character set to another.

MARConvert is available only in custom versions, which can be set up to work with your specialized or non-standard data. Custom versions can be created from the source code.

Database Integration

MARConvert can write to databases such as SQL Server. There are three ways the data can be loaded:

  1. MARConvert converts MARC records to delimited-text files for you to load.
  2. MARConvert produces delimited-text files and immediately loads them into the database using a third-party utility, such as BCP for SQL Server. One million MARC records can be converted and loaded in only 20 minutes.
  3. MARConvert inserts directly into the database using ODBC without generating output files.
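
The first loading path above (MARC records to delimited text) can be sketched in Python. This is not MARConvert's code: the function names, the column layout, and the sample record are assumptions made for illustration. It relies only on the standard ISO 2709 record layout (24-byte Leader, 12-byte directory entries, and the 0x1E, 0x1D, and 0x1F delimiters).

```python
import csv
import io

FT, RT, SF = b"\x1e", b"\x1d", b"\x1f"  # field, record, subfield delimiters

def parse_record(raw):
    """Split one ISO 2709 record into (tag, field_bytes) pairs."""
    base = int(raw[12:17].decode("ascii"))   # Leader 12-16: base address of data
    directory = raw[24:base - 1]             # 12-byte entries, terminator dropped
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        begin = base + start
        fields.append((tag, raw[begin:begin + length - 1]))  # drop terminator
    return fields

def record_rows(record_id, raw, encoding="utf-8"):
    """Yield (record_id, tag, indicators, subfield_code, value) rows."""
    for tag, data in parse_record(raw):
        if tag.startswith("00"):             # control fields have no subfields
            yield (record_id, tag, "", "", data.decode(encoding))
            continue
        indicators = data[:2].decode(encoding)
        for chunk in data[2:].split(SF)[1:]:
            code = chunk[:1].decode(encoding)
            yield (record_id, tag, indicators, code, chunk[1:].decode(encoding))

def build_record(fields):
    """Assemble a minimal record from (tag, bytes) pairs; demonstration only."""
    directory, body = b"", b""
    for tag, data in fields:
        data += FT
        directory += f"{tag}{len(data):04d}{len(body):05d}".encode("ascii")
        body += data
    base = 24 + len(directory) + 1
    total = base + len(body) + 1
    leader = f"{total:05d}nam a22{base:05d} a 4500".encode("ascii")
    return leader + directory + FT + body + RT

# Hypothetical sample record with a control number and a title field.
sample = build_record([
    ("001", b"12345"),
    ("245", b"10" + SF + b"aTitle :" + SF + b"bsubtitle"),
])

out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerows(record_rows(1, sample))
print(out.getvalue())
```

Writing one row per subfield keeps the output loadable by a bulk loader without knowing in advance which fields a record contains. A real conversion would normally choose its columns to match the target tables.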

Formats

Since MARConvert can be customized to your requirements, it can convert MARC records to or from any other format:

  • Delimited ASCII files (using whatever delimiter you require)
  • Tagged-text formats (in which each field is on a separate line preceded by the fieldname)
  • Markup formats such as XML or HTML
  • Relational tables or databases
  • From MARC21 to UNIMARC or the reverse
  • From MARC21 to MARCXML or the reverse

There is no limit to the number of records, whether converting into or out of MARC. Each MARC file is limited to 2 gigabytes (2,147,483,647 bytes, about 2 million average MARC records); if this is too small, multiple files can be converted in a single operation. Input files are unaffected by the conversion.
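
The figures above can be checked with a little arithmetic. The byte limit equals 2^31 - 1, and dividing it by 2 million records gives the average record size the estimate implies. The helper files_needed is a hypothetical name, and its result is only a lower bound, because a real split has to fall on record boundaries.

```python
import math

MAX_FILE_BYTES = 2**31 - 1      # 2,147,483,647 bytes, as quoted in the text
APPROX_RECORDS = 2_000_000      # "about 2 million average MARC records"

# Average record size implied by the two figures above.
implied_average = MAX_FILE_BYTES / APPROX_RECORDS

def files_needed(total_bytes, limit=MAX_FILE_BYTES):
    """Lower bound on the number of files needed to hold total_bytes."""
    return math.ceil(total_bytes / limit)

print(round(implied_average))   # implied bytes per record
print(files_needed(5 * 10**9))  # files needed for a 5,000,000,000-byte data set
```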

Character Encodings

Character encodings can be converted from any supported encoding to any other. Conversion can be based on Leader byte 09 in each record (which indicates MARC-8 or UTF-8 encoding) or on another data element. MARConvert currently handles MARC-8, Latin-1, UTF-8, UNIMARC, and the character set used in the Library of Congress' MakrBrkr programs. Other conversions can be added to meet your requirements.
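
As an illustration of the Leader byte 09 test, the sketch below reads that byte from a raw MARC21 record. In MARC21 a blank in this position means MARC-8 and "a" means UCS/Unicode (UTF-8). The function name is an assumption, and the returned strings are labels only, since Python has no built-in MARC-8 codec.

```python
def record_encoding(record: bytes) -> str:
    """Return the encoding named by Leader/09 of a raw MARC21 record."""
    if len(record) < 24:
        raise ValueError("record is shorter than the 24-byte Leader")
    code = record[9:10]
    if code == b"a":
        return "utf-8"
    if code == b" ":
        return "marc-8"
    raise ValueError(f"unexpected Leader/09 value: {code!r}")

# Two hypothetical Leaders that differ only in position 09.
print(record_encoding(b"00044nam a2200037 a 4500"))
print(record_encoding(b"00044nam  2200037 a 4500"))
```

Because the test is made per record, a file that mixes MARC-8 and UTF-8 records can be handled one record at a time.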

For both MARC-8 and UTF-8, conversions can include the entire MARC-8 repertoire of almost 17,000 characters, including Chinese, Japanese, Korean, Hebrew, Arabic, Greek, and Cyrillic. Both pre-composed and decomposed UTF-8 characters are converted.
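
The difference between pre-composed and decomposed UTF-8 can be shown with Python's standard unicodedata module. This only illustrates the two forms; it says nothing about how MARConvert maps them to MARC-8.

```python
import unicodedata

precomposed = "\u00e9"   # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# The two strings display the same but are different code point sequences.
print(precomposed == decomposed)

# Normalization converts between the two forms.
print(unicodedata.normalize("NFC", decomposed) == precomposed)
print(unicodedata.normalize("NFD", precomposed) == decomposed)

# The encoded byte lengths differ as well.
print(len(precomposed.encode("utf-8")), len(decomposed.encode("utf-8")))
```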

Modes

MARConvert operates in two modes:

  1. Interactive mode: An interactive Windows program in which you specify the file to convert and other options using a Windows interface. In this mode you also have all the features of MARCView™, so you can view, search, and print MARC21, UNIMARC, and MARCXML files.
  2. Batch mode: MARConvert can be called from a batch program so you can integrate it with other programs. Or you can write a command file so numerous files can be converted quickly without having to select each one.

These two modes are combined into a single product.

Flexibility

For simpler conversions, MARConvert uses a plain-text Translation Table that specifies how records are converted. You can edit the table with Notepad to modify the conversion if your needs change. The Translation Tables provide flexibility to describe:

  • How fields are combined and separated
  • The separator used between the field label and the data, as well as the separator between records (separators may be more than one character and can include spaces, tabs, and line-feeds)
  • How constant character strings are combined with record variables (data may be preceded, followed, or surrounded by any text)
  • Subfields added to the MARC record either by default or only if a specified field is present in the source file

Complex conversions must be coded into the program and cannot be altered using the Translation Table.
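
The kinds of options a Translation Table describes (label separators, record separators, and constant text around the data) can be sketched as function parameters. This is not the Translation Table syntax, which this description does not show; the function and parameter names are hypothetical.

```python
def to_tagged_text(records, label_sep=": ", field_sep="\n",
                   record_sep="\n\n", prefix="", suffix=""):
    """Render records as tagged text with configurable separators.

    Each record is a list of (label, value) pairs. prefix and suffix are
    constant strings placed around every value.
    """
    blocks = []
    for record in records:
        lines = [f"{label}{label_sep}{prefix}{value}{suffix}"
                 for label, value in record]
        blocks.append(field_sep.join(lines))
    return record_sep.join(blocks)

# Hypothetical input: two records with made-up field labels.
records = [
    [("TI", "Title one"), ("AU", "Smith")],
    [("TI", "Title two")],
]
print(to_tagged_text(records, label_sep=" - ", record_sep="\n***\n"))
```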

Data Validation

Validate or modify your data to your requirements, such as:

  • Records combined or split during conversion
  • Split output into multiple files
  • Conversion that depends on relationships between records
  • Values replaced or modified from authority files
  • Code lookup and replacement
  • Checking for invalid MARC record structure
  • Checking for illegal characters in records
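
A check for invalid MARC record structure might look like the sketch below. It tests only what the ISO 2709 layout requires (Leader length, base address, 12-byte directory entries, and terminators). The function name and messages are assumptions, and MARConvert's own checks may differ.

```python
FT, RT = 0x1E, 0x1D  # field terminator, record terminator

def structure_errors(raw: bytes) -> list:
    """Return a list of structural problems found in one raw MARC record."""
    if len(raw) < 25:
        return ["record is too short to hold a Leader"]
    leader = raw[:24]
    if not leader[:5].isdigit() or not leader[12:17].isdigit():
        return ["Leader record length or base address is not numeric"]
    errors = []
    length, base = int(leader[:5]), int(leader[12:17])
    if length != len(raw):
        errors.append(f"Leader length {length} does not match actual {len(raw)}")
    if raw[-1] != RT:
        errors.append("missing record terminator")
    if not 25 <= base <= len(raw) or raw[base - 1] != FT:
        return errors + ["directory is not terminated at the base address"]
    directory = raw[24:base - 1]
    if len(directory) % 12:
        return errors + ["directory length is not a multiple of 12"]
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[:3].decode("ascii", "replace")
        if not entry[3:].isdigit():
            errors.append(f"field {tag}: directory entry is not numeric")
            continue
        field_len, start = int(entry[3:7]), int(entry[7:12])
        end = base + start + field_len
        if field_len < 1 or end > len(raw) - 1 or raw[end - 1] != FT:
            errors.append(f"field {tag}: bad length, offset, or terminator")
    return errors

# Hypothetical one-field record: Leader, one directory entry, field 001.
good = b"00044nam a2200037 a 4500" + b"001000600000\x1e" + b"12345\x1e\x1d"
print(structure_errors(good))
print(structure_errors(good[:-1]))
```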

Conversions

MARConvert has been used to convert many records to a variety of formats. Some example conversions include:

  • Conversion of MARC21 records into a specialized SQL Server database. MARConvert produces delimited-text files for the client to load in a separate process, as well as the option to load directly using ODBC. Character encoding is converted from MARC-8 to UTF-8.
  • Daily conversion of one million MARC21 records into a specialized SQL Server database. MARConvert produces delimited-text files and immediately loads them into the database using BCP (Microsoft's bulk-loading utility). MARConvert performs the conversion and loading of one million records in under 20 minutes. Data is loaded using the MARC-8 encoding.
  • Conversion of MARC21 records to a text format suitable for importing into a mainframe application.
  • Lossless round-tripping of MARC21 records from UTF-8 to MARC-8 and back again. The conversions handle the entire MARC repertoire of almost 17,000 characters, including Chinese, Japanese, Korean, Hebrew, Arabic, Greek, and Cyrillic. Both pre-composed and decomposed UTF-8 characters are converted.
  • Conversion of MARC21 records to UNIMARC.
  • Conversion of MARC-8 records to UTF-8, and UTF-8 records to MARC-8, with customized error reporting. In both directions, the entire MARC-8 repertoire of almost 17,000 characters is handled, including Chinese, Japanese, Korean, Hebrew, Arabic, Greek, and Cyrillic. Both pre-composed and decomposed UTF-8 characters are converted.
  • Conversion of Medline OVID records into MARC21 bibliographic records.
  • Conversion of BASIS TechLib records into MARC21 bibliographic and holdings records.
  • Conversion of MARC21 records from the UTF-8 character set to MARC-8.
  • Conversion of MARC21 records to three tab-delimited files, suitable for loading into a relational database. Some data modifications based on values in the MARC records.
  • Conversion of MARC21 records to blocked files of segmented records.
  • Conversion of Amico records to MARC21.
  • Conversion of MARC21 records from MARC-8 to UTF-8 character set.
  • Export of MARC21 records to a unique format used by the Canadian Heritage Information Network. In addition to the format conversion, numerous fields have data transformations, such as date reformatting, adding new fields based on certain values, and rearranging data from ISBD form to a different order. Extensive data validation is performed to check for mandatory fields, to check for correct values, and to check for correct relationships between values in different fields.


MARCView

MARCView provides a user-friendly way to view, search, and print ANSI/ISO standard MARC records, UNIMARC records, and MARCXML records. It will also handle many files that are not strictly MARC. (It cannot be used to edit MARC records.)

There is no limit to the number of records in a file. However, the file size is limited to 2 gigabytes (2,147,483,647 bytes).

MARCView handles records up to the MARC limit of 99,999 bytes and fields up to the MARC limit of 9,999 bytes.
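
These limits follow from the fixed-width decimal fields of the MARC record: the record length occupies five digits of the Leader, and each directory entry stores the field length in four digits. The sketch below shows the arithmetic; directory_entry is a hypothetical helper name.

```python
MAX_RECORD_BYTES = 10**5 - 1   # largest value of a 5-digit record length
MAX_FIELD_BYTES = 10**4 - 1    # largest value of a 4-digit field length

def directory_entry(tag: str, field_length: int, start: int) -> str:
    """Format a 12-character directory entry: tag(3) + length(4) + start(5)."""
    if len(tag) != 3:
        raise ValueError("tag must be exactly 3 characters")
    if not 0 < field_length <= MAX_FIELD_BYTES:
        raise ValueError("field length does not fit in 4 digits")
    if not 0 <= start <= MAX_RECORD_BYTES:
        raise ValueError("starting position does not fit in 5 digits")
    return f"{tag}{field_length:04d}{start:05d}"

print(MAX_RECORD_BYTES, MAX_FIELD_BYTES)
print(directory_entry("245", 22, 6))
```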

MARCView is designed to not modify your files in any way.

Display and Navigation Features

  • Reads MARC21 records with MARC-8, UTF-8, or Latin-1 encoding.
  • Reads MARCXML records with UTF-8 encoding.
  • Reads UNIMARC records.
  • Displays special characters and diacriticals correctly if there are corresponding Latin-1 characters.
  • For MARC21 files, reads Leader byte 09, which indicates whether the record is in MARC-8 or UTF-8. (This allows for files of mixed MARC-8 and UTF-8 records.)
  • Scrollable Navigation Grid in left pane. Selected record is shown in right pane. You decide what fields and subfields to show in the Navigation Grid.
  • Panes and columns are resizable by dragging.
  • Prints any record the same way it appears on the screen. Prints a single record, a range of records, or the entire file.
  • Displayed record (or portion) may be copied and pasted into another application.
  • Display font is user-selectable.

Searching

  • Search for any term, including terms with diacriticals.
  • Search for data in specific fields and subfields.
  • Search for the presence of specific tags and subfield codes.
  • Search on indicator values.
  • Tags may include the "x" wildcard character.
  • Search Leader or specific Leader bytes.
  • Terms found in search are highlighted.
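
Tag matching with the "x" wildcard can be sketched as follows. The sketch assumes that "x" stands for any single character in its position, so "6xx" matches 600 through 699. The exact matching rules are not spelled out here, and the function name is hypothetical.

```python
def tag_matches(pattern: str, tag: str) -> bool:
    """Return True if tag matches pattern, where "x" matches any character."""
    if len(pattern) != len(tag):
        return False
    return all(p in ("x", "X") or p == t for p, t in zip(pattern, tag))

# Hypothetical list of tags from one record.
tags = ["001", "245", "600", "650", "700"]
print([t for t in tags if tag_matches("6xx", t)])
```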

Display the MARC Record in Native Format

  • By clicking a button, you can view and print the original MARC record in hexadecimal and text view (works for all formats: MARC21, MARCXML, UNIMARC).
  • The display is color-coded to make it easier to read. Double-clicking a MARC21 or UNIMARC directory entry highlights the corresponding field, and double-clicking a field highlights its directory entry.
  • By changing the width of the Hexview window, you can cause the record to wrap according to your needs.
  • MARCXML records are displayed in color-coded XML format.
  • A separate Hexview window is opened for each record, so you can compare several records in their native format.
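
A combined hexadecimal and text view of a record can be approximated in a few lines. This sketch is unrelated to MARCView's own Hexview code; the function name and line layout are assumptions, and the width parameter plays the role of the window width that controls wrapping.

```python
def hex_lines(data: bytes, width: int = 16):
    """Yield lines showing offset, hex bytes, and printable text."""
    pad = width * 3 - 1  # width of the hex column for a full line
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show MARC delimiters and other non-printable bytes as dots.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        yield f"{offset:05d}  {hex_part:<{pad}}  {text}"

# Hypothetical one-field record.
record = b"00044nam a2200037 a 4500" + b"001000600000\x1e" + b"12345\x1e\x1d"
for line in hex_lines(record, width=12):
    print(line)
```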

Statistics

  • Record- and field-level statistics on the MARC file (MARC21, UNIMARC, and MARCXML).
  • Statistics can be printed.
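
Simple record- and field-level statistics can be gathered from the directory alone, without decoding any field data. The sketch below counts the records and tag occurrences in a stream of concatenated ISO 2709 records. The function names and sample records are hypothetical, and which statistics MARCView actually reports is not listed here.

```python
import io
from collections import Counter

def iter_records(stream):
    """Yield raw records, using the 5-digit length at the start of each."""
    while True:
        head = stream.read(5)
        if len(head) < 5:
            return
        yield head + stream.read(int(head) - 5)

def tag_counts(stream):
    """Return (record_count, Counter of tag occurrences)."""
    records, tags = 0, Counter()
    for raw in iter_records(stream):
        records += 1
        base = int(raw[12:17])            # Leader 12-16: base address of data
        directory = raw[24:base - 1]      # 12-byte entries, terminator dropped
        for i in range(0, len(directory), 12):
            tags[directory[i:i + 3].decode("ascii")] += 1
    return records, tags

# Two hypothetical records: one with field 001 only, one with 001 and 245.
rec1 = b"00044nam a2200037 a 4500" + b"001000600000\x1e" + b"12345\x1e\x1d"
rec2 = (b"00078nam a2200049 a 4500" + b"001000600000" + b"245002200006\x1e"
        + b"12345\x1e" + b"10\x1faTitle :\x1fbsubtitle\x1e" + b"\x1d")

count, tags = tag_counts(io.BytesIO(rec1 + rec2))
print(count, dict(tags))
```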

Customization

  • Settings let you customize MARCView.
  • Records can include non-numeric tags (e.g. RLIN acquisitions fields).
  • Separate settings for bibliographic and authority files, and for UNIMARC files.

Other Features

  • MARCView is designed to read any file of MARC21, UNIMARC, or MARCXML bibliographic or authority records, in MARC-8, UTF-8, or Latin-1.
  • Shows complete fields with no truncation.
  • Currently runs under Windows 95/98/NT/2000/XP/Vista.

The OCLC Developer Network supports the use of OCLC Web Services, a set of tools and APIs that expose data and services for WorldCat and our member libraries and partner institutions or companies.

© 2010 OCLC Domestic and international trademarks and/or service marks of OCLC Online Computer Library Center, Inc. and its affiliates

