Welcome to the second post on our blog! I'm the Web & Digitization Application Developer (a.k.a. junior programmer), which means that my posts will be the most important. Before I begin, a shout-out to all my dead homies. Now that that's out of the way, I can proceed.
If you work with metadata in a digital repository, the time may come when you need to export it for other purposes. Your repository software may or may not make that easy for you, but as long as it supports OAI-PMH (the Open Archives Initiative Protocol for Metadata Harvesting), exporting is both possible and simple to do in a standards-compliant way. To do the job, we will use PHP's DOM extension, which is one of several ways to work with XML in PHP. It can, of course, also be done in any other common scripting language.
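<?php
// Build the OAI-PMH request: list every record in the "hughes" set
// as unqualified Dublin Core (oai_dc).
$qs = array(
    'verb' => 'ListRecords',
    'metadataPrefix' => 'oai_dc',
    'set' => 'hughes'
);
$url = 'http://digital.library.unlv.edu/cgi-bin/oai.exe?'
    . http_build_query($qs);

// Fetch the response and parse it into a DOM tree.
$xml = file_get_contents($url);
$doc = new DOMDocument();
if ($xml === false || !$doc->loadXML($xml)) {
    // abort: the request failed or the response was not well-formed XML
    exit('Could not fetch or parse the OAI-PMH response.');
}

// Register the Dublin Core namespace so XPath queries can use the dc: prefix.
$xpath = new DOMXPath($doc);
$xpath->registerNamespace("dc",
    "http://purl.org/dc/elements/1.1/");
?>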
<dl>
<!-- The query() method returns a DOMNodeList object
(an array-like list of DOMNodes). The first parameter is
an XPath query. -->
<?php foreach ($xpath->query('//dc:*', $doc) as $node): ?>
<dt><?= htmlspecialchars($node->nodeName) ?></dt>
<dd><?= htmlspecialchars($node->nodeValue) ?></dd>
<?php endforeach ?>
</dl>
From here, you can do anything with the metadata: print it to the screen, save it to a database, export it to a spreadsheet, etc. Handy. You can also fetch just a single record by using the GetRecord verb in place of ListRecords and passing the record's identifier instead of a set:
$qs = array(
    'verb' => 'GetRecord',
    'metadataPrefix' => 'oai_dc',
    'identifier' => 'oai:digital.library.unlv.edu:hughes/87'
);
[Image: a mobot - who wouldn't want one?]
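To make the "export it to a spreadsheet" idea a little more concrete, here is a minimal sketch that writes each Dublin Core field as a CSV row. It is only one way to go about it. The sample XML is hand-written and stands in for the fetched $xml so the snippet runs without a network connection, and the 'field' / 'value' column names are my own choice, not anything the protocol requires.

```php
<?php
// Hand-written sample standing in for the fetched $xml; not real repository data.
$xml = <<<XML
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Sample title</dc:title>
  <dc:creator>Doe, Jane</dc:creator>
</oai_dc:dc>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('dc', 'http://purl.org/dc/elements/1.1/');

// Write to an in-memory stream here; open a real file (for example a
// hypothetical 'export.csv') instead to save the output to disk.
$out = fopen('php://memory', 'w+');

// Delimiter, enclosure and escape characters are spelled out explicitly
// rather than left to their defaults.
fputcsv($out, array('field', 'value'), ',', '"', '\\');
foreach ($xpath->query('//dc:*') as $node) {
    fputcsv($out, array($node->nodeName, $node->nodeValue), ',', '"', '\\');
}

// Read the finished CSV back out of the stream.
rewind($out);
$csv = stream_get_contents($out);
fclose($out);

echo $csv;
```

Each matched element becomes one row, so a record with repeated fields (several dc:subject elements, say) simply produces several rows with the same field name.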
There is more information in the OAI record than just the DC fields; to access it, just register the necessary namespace(s) and modify the XPath query. You can always take a look at the contents of $xml to see the raw XML string.
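As a rough sketch of what that looks like, the snippet below registers the OAI-PMH namespace and pulls the fields out of a record's header. The XML is a hand-written, abbreviated response with a made-up identifier, used only so the example runs on its own. The "oai" prefix is an arbitrary name I picked; only the namespace URI has to match the one declared in the response.

```php
<?php
// Hand-written, abbreviated OAI-PMH response for illustration only;
// a real response from the repository will contain more elements.
$xml = <<<XML
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord>
    <record>
      <header>
        <identifier>oai:example.org:demo/1</identifier>
        <datestamp>2010-01-01</datestamp>
        <setSpec>demo</setSpec>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </GetRecord>
</OAI-PMH>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
// The header elements live in the default OAI-PMH namespace, so XPath
// needs a prefix bound to that URI before it can match them.
$xpath->registerNamespace('oai', 'http://www.openarchives.org/OAI/2.0/');

// Collect every child element of the record header, keyed by element name.
$headerFields = array();
foreach ($xpath->query('//oai:header/oai:*') as $node) {
    $headerFields[$node->localName] = $node->nodeValue;
}

print_r($headerFields);
```

The same pattern should carry over to any other namespace that shows up in the raw XML: register a prefix for its URI, then use that prefix in the query.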