This page describes the data of De Danske Aviser and its representation and data model in MiniDOMS.

== Data model objects and inter-object relations ==

Below is the fourth draft of the data model.

{{attachment:DDADataModelV4.png}}

The data model loosely follows the structure of the original publication.
As such it is divided into a number of volumes (based on period).

Each volume is subdivided into the following levels:
 * Region
 * Town
 * Paper
 * Page
 * .png

These divisions loosely follow those of the original publication.

 * Each '''Page''' in the original publication has been digitized as a .png file, which is found in the bit storage and associated with a page object.
 Each page object must be linked to both the previous and the next page.

 * Each '''Paper''' has an entry covering one or more pages and refers to the first page of the entry.

 * The '''Region''' object references a few pages containing information on the specific area during the relevant period. The end-users should see the word "Gruppering" rather than "Region", as this level of information in the structure of the physical books is used for geographical regions as well as other ways of grouping.
 * Similarly, each '''Volume''' contains a few pages dealing with subjects relevant to newspaper publication at the national level.
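
The page linking described above can be sketched as follows. This is only an illustration, not the MiniDOMS implementation: the {{{Page}}} class, its field names, and the {{{link_pages}}} helper are hypothetical, and the file names for pages 122 and 124 merely follow the pattern of the {{{dda3_123.png}}} example used on this page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)  # compare pages by identity; the links are self-referential
class Page:
    """Hypothetical stand-in for a page object and its .png file."""
    number: int
    png_file: str
    previous: Optional["Page"] = None
    next: Optional["Page"] = None


def link_pages(pages: List[Page]) -> None:
    """Connect each page to its neighbours, assuming the list is in reading order."""
    for earlier, later in zip(pages, pages[1:]):
        earlier.next = later
        later.previous = earlier


# Illustrative data only.
pages = [Page(n, f"dda3_{n}.png") for n in (122, 123, 124)]
link_pages(pages)
```

The first page keeps {{{previous}}} unset and the last page keeps {{{next}}} unset, so a viewer can tell where a volume starts and ends.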


== On-object metadata ==
Below we summarize the different metadata that will be present on the objects in the data model.

'''Object: supervolume'''<<BR>>
|| PID || Example PID ||
|| {{{doms:dda_supervolume}}} || {{{doms:dda_supervolume}}} ||
<<BR>>
|| element-type || name || example || minOccurs || maxOccurs || notes ||
|| DC || title || De Danske Aviser 1634-1989 || 1 || 1 || ||
|| DC || creator || Sølling, Jette D. || 1 || * || ||
|| DC || description || Håndbog over ... || 1 || 1 || ||
|| DC || publisher || Statsbiblioteket || 1 || * || ||
|| DC || type || Text || 1 || * || ||
|| DC || source || 3-binds værk udg. af ... || 1 || * || ||
|| DC || language || dan || 0 || * || ||
<<BR>>
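
The minOccurs and maxOccurs columns can be read as a simple cardinality check. The sketch below encodes the supervolume table and tests a metadata record against it. The {{{check_cardinality}}} function and the dictionary layout are assumptions made for illustration; the page does not say how MiniDOMS enforces these limits.

```python
from typing import Dict, List, Optional, Tuple

# (minOccurs, maxOccurs) per DC element of the supervolume object.
# None stands for "*" (unbounded).
SUPERVOLUME_RULES: Dict[str, Tuple[int, Optional[int]]] = {
    "title": (1, 1),
    "creator": (1, None),
    "description": (1, 1),
    "publisher": (1, None),
    "type": (1, None),
    "source": (1, None),
    "language": (0, None),
}


def check_cardinality(
    metadata: Dict[str, List[str]],
    rules: Dict[str, Tuple[int, Optional[int]]],
) -> List[str]:
    """Return one message per element whose number of values breaks the rules."""
    problems = []
    for name, (min_occurs, max_occurs) in rules.items():
        count = len(metadata.get(name, []))
        if count < min_occurs:
            problems.append(f"{name}: expected at least {min_occurs}, found {count}")
        elif max_occurs is not None and count > max_occurs:
            problems.append(f"{name}: expected at most {max_occurs}, found {count}")
    return problems
```

An empty list means the record satisfies the table; a record with no {{{title}}}, for instance, yields a single message about that element.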


'''Object: volume'''<<BR>>
|| PID || Example PID ||
|| {{{doms:dda_volume_vol<VOLUME>}}} || {{{doms:dda_volume_vol3}}} ||
<<BR>>
|| element-type || name || example || minOccurs || maxOccurs || notes ||
|| DC || title || De Danske Aviser 1634-1989 || 0 || * || For volumes 1 and 2 this is the same as the title of the supervolume. The exception is vol. 3, whose title is 'De Danske Aviser 1634 - 1991' due to a late extension of the project. ||
|| DC || alternative || Bd. 1: 1634 - 1847 || 0 || * || ||
|| DC || extent || 356 p. || 1 || 1 || ||
<<BR>>


'''Object: region'''<<BR>>
|| PID || Example PID ||
|| {{{doms:dda_region_vol<VOLUME>_<REGIONNAME>}}} || {{{doms:dda_region_vol3_Koebenhavn}}} ||
<<BR>>
|| element-type || name || example || minOccurs || maxOccurs || notes ||
|| DC || title || Fyn || 1 || * || ||
<<BR>>


'''Object: town'''<<BR>>
|| PID || Example PID ||
|| {{{doms:dda_town_vol<VOLUME>_number<NUMBER>}}} || {{{doms:dda_town_vol3_number1}}} ||
<<BR>>
|| element-type || name || example || minOccurs || maxOccurs || notes ||
|| DC || title || Odense || 1 || * || The town name is filled in by the full preingester. For a given paper object, the preingester looks in the datastream XMLForPaperFromOCR to determine the name of its town(s). The town name is obtained by translating a number like "1-19" into a town name like "Ribe", using the table found in e.g. "BIND_1.xml". ||
<<BR>>
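
The number-to-town translation mentioned in the notes above amounts to a table lookup. The sketch below assumes the table has already been read into a dictionary. The XML layout of files such as "BIND_1.xml" is not described on this page, so no parsing is shown, and {{{TOWN_TABLE}}} and {{{town_name}}} are hypothetical names.

```python
from typing import Dict

# Hypothetical excerpt of the translation table; the single entry is the
# example given in the notes. Real entries would come from e.g. BIND_1.xml.
TOWN_TABLE: Dict[str, str] = {"1-19": "Ribe"}


def town_name(number: str, table: Dict[str, str]) -> str:
    """Translate a number such as "1-19" into a town name."""
    try:
        return table[number]
    except KeyError:
        # Fail loudly rather than ingest a town object without a title.
        raise KeyError(f"no town registered for number {number!r}") from None
```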


'''Object: paper'''<<BR>>
|| PID || Example PID ||
|| {{{doms:dda_paper_vol<VOLUME>_number<NUMBER>}}} || {{{doms:dda_paper_vol3_number1-19}}} ||
<<BR>>
|| element-type || name || example || minOccurs || maxOccurs || notes ||
|| DC || title || Vejle Amts Folkeblad || 1 || * || ||
|| Data Stream || XMLForPaperFromOCR || paper_0354.xml || N/A || N/A || ||
<<BR>>



'''Object: page'''<<BR>>
|| PID || Example PID ||
|| {{{doms:dda_page_vol<VOLUME>_page<PAGE>}}} || {{{doms:dda_page_vol3_page123}}} ||
<<BR>>
|| element-type || name || example || minOccurs || maxOccurs || notes ||
|| DC || title || bind1_354 || 1 || 1 || ||
<<BR>>


'''Object: pngfile'''<<BR>>
|| PID || Example PID ||
|| {{{doms:dda_pngfile_vol<VOLUME>_page<PAGE>}}} || {{{doms:dda_pngfile_vol3_page123}}} ||

The filename (including path) in bit storage, referenced by a pngfile object, should be: {{{DeDanskeAviser/<original filename>}}}<<BR>>
that is, an example would be: {{{DeDanskeAviser/dda3_123.png}}}<<BR>>

The URL should be: {{{http://bitfinder.statsbiblioteket.dk/DeDanskeAviser/<original filename>}}}<<BR>>
that is, an example would be: {{{http://bitfinder.statsbiblioteket.dk/DeDanskeAviser/dda3_123.png}}}<<BR>>
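
The two naming rules above can be expressed as small helpers. This is a sketch: the function and constant names are made up for illustration, while the directory and the host are taken from the examples above.

```python
BIT_STORAGE_DIR = "DeDanskeAviser"
BITFINDER_BASE = "http://bitfinder.statsbiblioteket.dk"


def storage_path(original_filename: str) -> str:
    """Filename (including path) in bit storage for a pngfile object."""
    return f"{BIT_STORAGE_DIR}/{original_filename}"


def bitfinder_url(original_filename: str) -> str:
    """URL for the same file: the base address followed by the storage path."""
    return f"{BITFINDER_BASE}/{storage_path(original_filename)}"
```

Building the URL from the storage path keeps the two values from drifting apart if the directory name ever changes.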


<<BR>>
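
The PID patterns listed in the tables above can be collected in one place. The sketch below turns each pattern into a format string. The {{{make_pid}}} helper and the lower-case placeholder names are assumptions for illustration, not part of MiniDOMS.

```python
# One template per object type, mirroring the PID column of the tables above.
PID_TEMPLATES = {
    "supervolume": "doms:dda_supervolume",
    "volume": "doms:dda_volume_vol{volume}",
    "region": "doms:dda_region_vol{volume}_{regionname}",
    "town": "doms:dda_town_vol{volume}_number{number}",
    "paper": "doms:dda_paper_vol{volume}_number{number}",
    "page": "doms:dda_page_vol{volume}_page{page}",
    "pngfile": "doms:dda_pngfile_vol{volume}_page{page}",
}


def make_pid(object_type: str, **parts) -> str:
    """Fill in the template for object_type; a missing part raises KeyError."""
    return PID_TEMPLATES[object_type].format(**parts)
```

For example, {{{make_pid("paper", volume=3, number="1-19")}}} gives the example paper PID shown above.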

== Regarding bundling of objects ==
A bundle contains a number of objects, one of which has special status: the object that is ''central'' to the bundle.

Special note: We also need bundles for page objects. Such a bundle should comprise volume, region, town, and paper objects, since they all reference pages. This makes it possible to "click back via a link" in the final product. (A Summa XSLT will be set up so that pages are not searchable in Summa, but they ''will'' be shown when reached via a link from a paper.)
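
A page bundle could be modelled roughly as below. This is a sketch under two assumptions that the text does not state outright: that the page itself is the central object of a page bundle, and that objects are identified by their PIDs. The {{{Bundle}}} class and {{{page_bundle}}} function are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Bundle:
    """A set of objects (by PID), one of which is central to the bundle."""
    central: str
    objects: Tuple[str, ...]

    def __post_init__(self) -> None:
        # The central object is one of the bundle's own objects.
        if self.central not in self.objects:
            raise ValueError("central object must be part of the bundle")


def page_bundle(page: str, volume: str, region: str, town: str, paper: str) -> Bundle:
    """Bundle a page with the volume, region, town, and paper objects around it."""
    return Bundle(central=page, objects=(page, volume, region, town, paper))
```

With the referencing objects in the same bundle as the page, a "click back" link from a page to its paper, town, region, or volume can be resolved from the bundle alone.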


DataModelForDDAIntoMiniDOMS (last edited 2010-03-17 13:08:49 by localhost)