OAI-PMH2 XMLFile File-based Data Provider

Download : OAI-XMLFile-2.1.tar.gz
Download : OAI-XMLFile-2.0.tar.gz


This is a data provider module that operates over a set of XML files 
that contain the metadata. It is meant to require minimal effort 
while retaining all the flexibility of the OAI protocol.

- OAI v2.0 protocol support

- no installation or compilation - Perl scripts need only be copied
- code layout for separate components or libraries of components
- one installation can easily be used for multiple archives
- supports almost all general features of the protocol by default.
- clean separation between engine, configuration and data 
  (class/instance model)
- hierarchical sets mapped from directory structure
- multiple metadata formats generated on the fly
- harvesting by date based on the file modification dates
- does not couple tightly with the web server and requires no special
- designed for easy migration to accelerators such as FastCGI
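
As an illustration of the "hierarchical sets mapped from directory
structure" feature, here is a minimal sketch in Python (not part of this
distribution, which is written in Perl). It assumes that each directory
level below the data directory becomes one component of a colon-separated
setSpec, which is the hierarchy notation defined by OAI-PMH 2.0. The
function name and the example paths are hypothetical; the actual mapping
is done by the XMLFile scripts and may differ in detail.

```python
from pathlib import PurePosixPath


def set_specs_for(record_path, data_root):
    """Return the setSpecs implied by a record's directory, outermost first.

    Assumption: one directory level == one setSpec component.
    """
    relative = PurePosixPath(record_path).relative_to(PurePosixPath(data_root))
    directories = relative.parts[:-1]  # drop the file name, keep directories
    # Build "a", "a:b", "a:b:c", ... so enclosing sets are listed too.
    return [":".join(directories[: i + 1]) for i in range(len(directories))]


if __name__ == "__main__":
    # Hypothetical layout: data/theses/2002/etd1.xml
    print(set_specs_for("data/theses/2002/etd1.xml", "data"))
```

A file stored directly in the data directory yields an empty list, that
is, a record that belongs to no set.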

- source files may be in multiple formats
- full namespace support for XML data (to enable
  multi-format source data)
- seconds granularity for harvesting
- uses standard VTOAI Perl module (http://dlrl.cc.vt.edu/projects/OAI)
- symlinked files with same name correspond to records in multiple sets
- resumptionToken support
- identification description container support
- support for sets of filenames expressed in regular expressions
- arbitrary data file location
- six sample data sets are provided
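
To make the date-based harvesting concrete, the sketch below (Python,
for illustration only) turns a file modification time into an OAI-PMH
datestamp with seconds granularity and checks it against optional
'from' and 'until' bounds, which OAI-PMH treats as inclusive. The
function names are hypothetical and this is not the code used by the
Perl scripts; it only shows the idea of deriving a record's datestamp
from the file system.

```python
import os
from datetime import datetime, timezone


def datestamp_from_mtime(mtime):
    """Format a POSIX timestamp as a UTC datestamp: YYYY-MM-DDThh:mm:ssZ."""
    moment = datetime.fromtimestamp(mtime, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def file_datestamp(path):
    """Datestamp of a record file, taken from its modification time."""
    return datestamp_from_mtime(os.stat(path).st_mtime)


def in_harvest_range(stamp, from_=None, until=None):
    """True if stamp lies within the inclusive [from_, until] range.

    Datestamps in the same UTC format compare correctly as strings.
    """
    return (from_ is None or stamp >= from_) and (until is None or stamp <= until)


if __name__ == "__main__":
    stamp = datestamp_from_mtime(1039651200)
    print(stamp, in_harvest_range(stamp, from_="2002-12-01T00:00:00Z"))
```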

- *NEW* mappings from setnames to setspecs


Requirements

- Perl
- Web server with ability to run CGI scripts

Installation Instructions

1. Copy all files with default directory structure into a directory
   from which CGI scripts may be run

2. Use the repository explorer (http://purl.org/net/oai_explorer)
   to test the sample interface at
   'OAI-XMLFile/XMLFile/test1/oai.pl'. Prefix this path with the
   URL under which your web server exposes the installation directory.

3. Create new data providers by changing to the 'OAI-XMLFile/XMLFile'
   directory and running './configure.pl' with the parameter being
   the name of the archive. For example,
     ./configure.pl etdlibrary
4. Create translation scripts and stylesheets as necessary to transform
   your metadata into the formats you need. Some of the sample archives
   use "/usr/bin/xsltproc" to do this translation - make sure you have it
   and its location is correct if you want to use this.
5. Create ('identity.xml', 'identity2.xml', ...) files to contain 
   optional descriptions for the Identify service request.
6. Create setname values either using a mapping file in
   the configuration directory or using "_name_" files in each
   set directory (see test6 for examples of both)

7. Test the OAI-PMH2 interface
   - use the Repository Explorer at http://purl.org/net/oai_explorer
     and point it to the 'oai.pl' script in the archive directory
8. Create additional archives as necessary             
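
When testing by hand rather than through the Repository Explorer, it
helps to know what the request URLs look like. The following Python
sketch builds OAI-PMH 2.0 request URLs for an 'oai.pl' script. The host
name and CGI path are placeholders, and the helper name is hypothetical;
only the verb and argument names come from the OAI-PMH specification.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH GET request URL for the given verb and arguments."""
    # In OAI-PMH, resumptionToken is an exclusive argument: when it is
    # present, no other argument besides the verb may be sent.
    if arguments.get("resumptionToken") is not None:
        arguments = {"resumptionToken": arguments["resumptionToken"]}
    query = {"verb": verb}
    query.update({k: v for k, v in arguments.items() if v is not None})
    return base_url + "?" + urlencode(query)


if __name__ == "__main__":
    # Placeholder URL: substitute the prefix of your own web server.
    base = "http://www.example.org/cgi-bin/OAI-XMLFile/XMLFile/test1/oai.pl"
    print(oai_request_url(base, "Identify"))
    print(oai_request_url(base, "ListRecords", metadataPrefix="oai_dc"))
```

Pasting the printed URLs into a browser should return an XML response
if the archive is installed where the web server can run CGI scripts.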

What are the samples ?

test1 : plain vanilla data provider with data in some kind of XML and XSLT
 used to transform into DC and VRA-Core
test2 : test1, with additional identity containers
test3 : plain dc as native format - no XSLT
test4 : multi-format source and multiple metadata format archive
test5 : lots and lots of files
test6 : setspec -> name mappings

Upgrading ?

To upgrade from a previous version, I recommend that you install these 
scripts in a new directory and point to your data source. Here are a 
few things to keep in mind:

- all configuration is now done through the ./configure.pl script
- your XSLT (if you used that) must now produce fully-qualified XML records
  with namespaces and schema information (see test1 for how this is done)
- additional identity containers must now be individual files that you
  store in the archive/instance directory (e.g., OAI-XMLFile/XMLFile/test1/)
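
One way to check XSLT output against the requirement above is sketched
below in Python. It reads "fully-qualified" as: the root element is in
a namespace and carries an xsi:schemaLocation attribute. That reading is
an assumption on my part; test1 remains the authoritative example. The
function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard XML Schema instance namespace, used for xsi:schemaLocation.
XSI = "http://www.w3.org/2001/XMLSchema-instance"


def is_fully_qualified(record_xml):
    """True if the root element has a namespace and an xsi:schemaLocation."""
    root = ET.fromstring(record_xml)
    # ElementTree writes namespaced tags as "{namespace-uri}localname".
    has_namespace = root.tag.startswith("{")
    has_schema = root.get("{%s}schemaLocation" % XSI) is not None
    return has_namespace and has_schema


if __name__ == "__main__":
    sample = (
        '<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" '
        'xmlns:dc="http://purl.org/dc/elements/1.1/" '
        'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
        'xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ '
        'http://www.openarchives.org/OAI/2.0/oai_dc.xsd">'
        "<dc:title>Example</dc:title></oai_dc:dc>"
    )
    print(is_fully_qualified(sample))
```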

Module Layout

 - utility modules (in pure-perl)

 - OAI2DP = generic data provider
 - XMLFileDP = data provider for XMLFile components


This software is part of the larger project to build componentized
Digital Libraries based on the work of the Open Archives Initiative.
See http://oai.dlib.vt.edu/odl, http://dlrl.cc.vt.edu/projects/OAI,
and http://www.openarchives.org for more information.

This software is produced in part for the AmericanSouth.org
project (http://americansouth.org)

This is a research project, and we are always interested in 
feedback - questions, comments, and suggestions for improvement.
Please contact hussein@vt.edu as appropriate.

12 December 2002
