doms:ContentModel_File

Extends [:DataModel/ContentModel_DOMS: doms:ContentModel_DOMS]

In DOMS, we have found it beneficial to separate the abstract concept of "Image" or "Audio" from the concrete implementations such as "jpeg" and "mp3". The metadata about the image will be relevant no matter the manifestation of the image, and as such should not reside along with the technical metadata about the manifestation. To support this separation, we have introduced the concept of File objects.

A File object is an object, that contain a link to the file (in Bitstorage), and the technical metadata about this file. Only File objects are allowed to reference a file in Bitstorage. File objects must all have a Content Model that extends ContentModel_File.

It defines that objects of ContentModel_File can have "doms-relations:hasOriginal" relations to other objects of "doms:ContentModel_File". If a file A is the result of a migration of file B and both are in Doms, the File A data object will have a "doms-relations:hasOriginal" relation to the data object for File B.

Data objects of doms:ContentModel_File must have the datastream "CHARACTERISATION", "CONTENTS", "ORIGIN" and "PRONOMID". Each of these deserve some description.

CONTENTS

"CONTENTS" is where the actual file is. It will always have the the "E" controlgroup, meaning that the file is externally referenced. The reference must allways be to a File in Bitstorage, and only this datastream is ever allowed to reference files.

If you get the datastream through the standard API, you get the contents of the file, not the link.

PRONOMID

In order to perform proper digital preservation, we need to store the exact format of each file somehow. The National Archives, Great Britain have developed the PRONOM scheme, http://www.nationalarchives.gov.uk/pronom For each file format, or version thereof, in the registry, they have a signature file, that enables their tool (DROID) to identify files of this type.

We selected PRONOM because they are currently able to identify all the relevant preservation file formats used in DOMS, and was the closest we could find to an uniformly accepted standard.

The schema from the PRONOMID_SCHEMA datastream.

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance#"
            targetNamespace="http://doms.statsbiblioteket.dk/properties/pronomID/0/1/#"
            xmlns="http://doms.statsbiblioteket.dk/properties/pronomID/0/1/#"
            elementFormDefault="qualified"
            attributeFormDefault="unqualified">

    <xsd:element name="pronomID" type="extpropertiesType"/>

    <xsd:complexType name="extpropertiesType">
        <xsd:attribute name="value" type="xsd:string"/>
    </xsd:complexType>
</xsd:schema>

ORIGIN

CHARACTERISATION

Requirements for objects described by ContentModel_File

The characterisation datastream could look like this.

<?xml version="1.0" encoding="UTF-8"?>
<char:characterisation xsi:schemaLocation="http://doms.statsbiblioteket.dk/types/characterisation/0/1/# http://developer.statsbiblioteket.dk/DOMS/types/characterisation/0/1/characterisation/characterisation-0-1.xsd"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:char="http://developer.statsbiblioteket.dk/DOMS/types/characterisation/0/1/#"
  xmlns:jhove="">
  <char:characterisationRun>
    <char:tool>JHove</char:tool>
    <char:output>
      <jhove:...>
      </jhove:...>
    </char:output>
  </char:characterisationRun>
</char:characterisation>