Differences between revisions 1 and 2
Revision 1 as of 2008-06-26 12:26:05
Size: 5555
Editor: kfc
Comment: Created by the PackagePages action.
Revision 2 as of 2010-03-17 13:12:45
Size: 5741
Editor: localhost
Comment: converted to 1.6 markup

Alternative DOMS Data Model

The DOMS is a common system for storage and processing of digital material and metadata. The metadata is stored as digital metadata objects in the Fedora metadata storage. The material is stored in an external bitstorage and referenced by the metadata objects.

Digital Material and File Formats

Based on existing collections at SB, we have identified four kinds of digital material:

  1. Images (TIFF, PNG, JPEG2000, etc.)
  2. Audio (Broadcast WAV, WAV, etc.)
  3. Video (MPEG1, MPEG2, etc.)
  4. Text (PDF, XML, etc.)

For each kind of material, we have chosen a number of recommended formats. The DOMS will ensure access to content of files in these formats.

  1. Images: TIFF
  2. Audio: WAV, BWF
  3. Video: MPEG1, MPEG2
  4. Text: UTF8, PDF, OfficeOpenXML

When new collections are ingested, files in other formats will be converted to one of the recommended formats above, and the new files will be saved along with the originals. For text we will always extract and save a UTF8 version, and if the original text was formatted, we will also convert it to either PDF or OOXML. The 'compulsory' file formats for the different kinds of material will be specified by the type objects (see below), and the required metadata for the different formats will be specified by the file format objects (see below).
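
The ingest rule above can be sketched in a few lines of Python. This is only an illustration, not DOMS code: the function name plan_ingest, the dictionary layout, and the choice of which recommended format to convert to (the first one listed, and PDF rather than OOXML for formatted text) are assumptions made for the example.

```python
# Recommended formats per kind of material, as listed above.
RECOMMENDED = {
    "image": ["TIFF"],
    "audio": ["WAV", "BWF"],
    "video": ["MPEG1", "MPEG2"],
    "text": ["UTF8", "PDF", "OfficeOpenXML"],
}


def plan_ingest(kind, original_format, formatted=False):
    """Return the formats to store for one ingested file (hypothetical helper)."""
    recommended = RECOMMENDED[kind]
    stored = [original_format]  # the original is always kept
    if kind == "text":
        # A UTF8 version is always extracted for text.
        if original_format != "UTF8":
            stored.append("UTF8")
        # Formatted text also gets a PDF or OOXML version.
        # Assumption: PDF is chosen when the original is neither.
        if formatted and original_format not in ("PDF", "OfficeOpenXML"):
            stored.append("PDF")
    elif original_format not in recommended:
        # Assumption: convert to the first recommended format listed.
        stored.append(recommended[0])
    return stored


print(plan_ingest("image", "PNG"))
print(plan_ingest("text", "XML", formatted=True))
```

For a PNG image this plans to store the original plus a TIFF conversion; for a formatted XML text it plans the original, a UTF8 extract and a PDF.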

Metadata Objects

The metadata objects in the Fedora metadata storage are described using FoxML (http://www.fedora.info/download/2.0/userdocs/digitalobjects/introFOXML.html). All metadata objects in the Fedora storage must have an object type.

Object Types

All objects in the DOMS metadata storage must have a type. The type is also an object. The type describes the content model, i.e. the compulsory and legal content of an object of this type. An object has a relation to the type it claims to be. It should be possible to validate that the object is indeed of this type. A number of base types (base type objects) are predefined, and it is possible to define new type objects when needed. The 'object type' is type SB and all other types extend type SB (directly or indirectly).

Predefined Object Type Objects

The predefined types in DOMS are drawn as a hierarchy in fig. 1. The specifications of a type are inherited by its subtypes, and an object of a given type must comply with the specifications of this type as well as those of all its supertypes.

http://merkur/viewvc/trunk/docs/datamodel/fig/alternativeTypeHierarchy.png?root=doms&view=co

Figure 1. DOMS object type hierarchy. Dia source: http://merkur/viewvc/trunk/docs/datamodel/fig/alternativeTypeHierarchy.dia?root=doms&view=co

The types are described on the following sub pages. The DOMS type is the base type of all objects, and all the other types define additions to the DOMS type. Some technical content descriptions can be found where the content is introduced.
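
The inheritance rule for types can be illustrated with a small Python sketch. It is not the DOMS implementation: the class TypeObject, the helper names, and the content labels "DC" and "hasFile" are placeholders chosen for the example, and the two-level hierarchy shown is only a fragment, not the full hierarchy of fig. 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TypeObject:
    """A type object: a name, an optional supertype, and its own required content."""
    name: str
    parent: Optional["TypeObject"] = None
    required: frozenset = frozenset()


def effective_requirements(type_obj):
    """Collect the requirements of a type and of all its supertypes."""
    requirements = set()
    while type_obj is not None:
        requirements |= type_obj.required
        type_obj = type_obj.parent
    return requirements


def missing_content(object_content, type_obj):
    """Return the required content that an object of this type lacks."""
    return effective_requirements(type_obj) - set(object_content)


# Hypothetical fragment of the hierarchy: image extends the base type.
base_type = TypeObject("DOMS", required=frozenset({"DC"}))
image_type = TypeObject("image", parent=base_type, required=frozenset({"hasFile"}))

print(missing_content({"DC"}, image_type))
```

An object claiming to be of type image validates only when nothing is missing, i.e. when it satisfies both the image requirements and those inherited from the base type.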

File Format Objects

File format objects are referenced by objects with object type file. File objects also reference content, i.e. a file in bitstorage, and they include a technical datastream (see Type File). The file format objects describe the file format of the content in bitstorage and the required metadata in the technical datastream. There will be predefined file format objects for the recommended formats: TIFF, WAV, BWF, MPEG1, MPEG2, UTF8, PDF, OfficeOpenXML, and possibly also for other common formats.
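
As a rough sketch of the relationship between file objects and file format objects, the following Python models a file format as a name plus the technical metadata it requires, and checks a file object's technical datastream against it. The class names, the field names "width" and "height", and the bitstorage URL are invented for the example; the actual required metadata is defined by the file format objects themselves.

```python
from dataclasses import dataclass, field


@dataclass
class FileFormat:
    """A file format object: format name and required technical metadata."""
    name: str
    required_technical: frozenset


@dataclass
class FileObject:
    """A file object: content reference, file format, technical datastream."""
    content_url: str
    file_format: FileFormat
    technical: dict = field(default_factory=dict)


def missing_technical_metadata(file_obj):
    """Return the required technical fields absent from the datastream."""
    return file_obj.file_format.required_technical - set(file_obj.technical)


# Hypothetical example: a TIFF format that requires width and height.
tiff = FileFormat("TIFF", frozenset({"width", "height"}))
page = FileObject("bitstorage://example/page1.tif", tiff, {"width": 2000})

print(missing_technical_metadata(page))
```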

Levels of Metadata

In summary, we have three levels of metadata:

  1. Common Core: The metadata that must be present in all objects. This includes the core properties defined by the FoxML format, an SB Dublin Core description, and an index representation disseminator, and is formalised as the SB type below.
  2. Base Object Type Specified: The metadata defined by the base object types described above. For example, the image object type specifies that an object of type image must have a hasFile relation to an object of type file with reference to the TIFF file format.
  3. Collection Object Type Specified: New object types can be introduced along with a new collection.
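
The level 2 example (an image object must have a hasFile relation to a file object in TIFF format) could be checked roughly as follows. The dictionary representation of objects, the identifiers such as "image:1", and the function name are assumptions for illustration only.

```python
def image_has_tiff_file(image_obj, objects):
    """True if image_obj has a hasFile relation to a file object in TIFF format."""
    for target_id in image_obj.get("hasFile", []):
        target = objects.get(target_id)
        if (
            target is not None
            and target.get("type") == "file"
            and target.get("format") == "TIFF"
        ):
            return True
    return False


# Hypothetical object store keyed by identifier.
objects = {
    "file:1": {"type": "file", "format": "TIFF"},
    "file:2": {"type": "file", "format": "PNG"},
    "image:1": {"type": "image", "hasFile": ["file:2", "file:1"]},
    "image:2": {"type": "image", "hasFile": ["file:2"]},
}

print(image_has_tiff_file(objects["image:1"], objects))
print(image_has_tiff_file(objects["image:2"], objects))
```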

SB Collection

All the predefined types and file formats have been created as objects as part of a 'base SB collection'. The base collection also includes at least one rights object, a collection object and some file format objects. This collection is meant to be ingested as the first collection in the DOMS metadata storage.

Example Collections

http://merkur/viewvc/trunk/docs/datamodel/fig/GentofteCollection.png?root=doms&view=co

Figure 2. Dia source: http://merkur/viewvc/trunk/docs/datamodel/fig/GentofteCollection.dia?root=doms&view=co

DataModel/AlternativeDataModel (last edited 2010-03-17 13:12:45 by localhost)