Fedora View Blobs

DOMS employs an overall atomistic data model. Atomistic data models are much more flexible than traditional compound data models, but they have one big (and largely unmet) challenge. When working with data objects, you will frequently need to operate on a number of objects as if they were a common whole. The simplest use case for this is the public dissemination of data. If the data that should go into one Dissemination Information Package is distributed over several objects, the system needs to understand this.

The DOMS team have laboured long and hard to find a nice way to model this in a Fedora context. This is their product.

Views

A view is a way of combining objects in the DOMS into a domain-relevant group. It is a way of seeing a number of objects as related - as a whole.

Each view contains an object that the view is centered around. We call this the main object, and the ID of the View is the ID of the main object. All the other objects in the View are related to the main object by some chain of relations. Therein lies a crucial feature of this View system: rather than adding special view relations from the main object to every object in the view, we list the existing structural relations that should be followed to find the objects in the view.

The view necessary for a proper public dissemination of the objects might not be the same as the one required for useful GUI access, though. The way around this is to define multiple views on the same objects. Each named view has its own main objects and its own set of annotated relations to follow from these main objects. The views do not interact in any way, so we can have radically different ways of viewing the same data.

The VIEW datastream

Now we come to another crucial feature of this view system: views are defined at the content model level. A data object does not identify itself as a main object. Instead, the content model for the object states that all objects of this class are main objects. Everything is defined in the classes of objects, never in the actual data objects. As such, it is easy to change and add views on a collection-wide basis.

To facilitate this, the "VIEW" datastream in content models has been designated as Reserved and Required. The "VIEW" datastream is, basically, a sequence of named views, each with its designated relations.

The schema for the VIEW datastream is as follows:

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"
            xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            targetNamespace="http://doms.statsbiblioteket.dk/types/view/0/1/#"
            xmlns="http://doms.statsbiblioteket.dk/types/view/0/1/#"
            elementFormDefault="qualified"
            attributeFormDefault="unqualified">

    <xsd:element name="views" type="viewsType"/>

    <xsd:complexType name="viewsType">
        <xsd:sequence>
            <xsd:element name="view" type="viewType" minOccurs="0" maxOccurs="unbounded"/>
        </xsd:sequence>
    </xsd:complexType>

    <xsd:complexType name="viewType">
        <xsd:sequence>
            <xsd:element name="relations" type="relationsType" minOccurs="0" maxOccurs="1"/>
            <xsd:element name="inverse-relations" type="inverse-relationsType" minOccurs="0" maxOccurs="1"/>
        </xsd:sequence>
        <xsd:attribute name="name" type="xsd:string" use="required"/>
        <xsd:attribute name="mainobject" type="xsd:boolean" default="false"/>
    </xsd:complexType>

    <xsd:complexType name="relationsType">
        <xsd:sequence>
            <xsd:any namespace="##any" processContents="skip" maxOccurs="unbounded"/>
        </xsd:sequence>
    </xsd:complexType>

    <xsd:complexType name="inverse-relationsType">
        <xsd:sequence>
            <xsd:any namespace="##any" processContents="skip" maxOccurs="unbounded"/>
        </xsd:sequence>
    </xsd:complexType>

</xsd:schema>
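
As an illustration of how a client might read such a datastream, here is a minimal sketch using the standard Java DOM parser. The class and method names (ViewDatastreamReader, relationNames) are hypothetical and not part of DOMS. Because the schema's target namespace and the namespace used in the example documents on this page differ by one letter ("view" versus "views"), the sketch matches elements by local name only; that is an assumption made for the example, not a recommendation.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

/** Hypothetical helper, not part of DOMS: reads one named view from a VIEW datastream. */
class ViewDatastreamReader {

    /**
     * Returns "{namespace}localName" for every element listed under the given
     * list element ("relations" or "inverse-relations") of the named view.
     */
    static List<String> relationNames(String viewXml, String viewName, String listElement) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(viewXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse VIEW datastream", e);
        }

        List<String> names = new ArrayList<>();
        // "*" matches any namespace, so only the local name "view" is compared.
        NodeList views = doc.getElementsByTagNameNS("*", "view");
        for (int i = 0; i < views.getLength(); i++) {
            Element view = (Element) views.item(i);
            if (!viewName.equals(view.getAttribute("name"))) {
                continue;
            }
            NodeList lists = view.getElementsByTagNameNS("*", listElement);
            for (int j = 0; j < lists.getLength(); j++) {
                NodeList children = lists.item(j).getChildNodes();
                for (int k = 0; k < children.getLength(); k++) {
                    Node child = children.item(k);
                    if (child.getNodeType() == Node.ELEMENT_NODE) {
                        names.add("{" + child.getNamespaceURI() + "}" + child.getLocalName());
                    }
                }
            }
        }
        return names;
    }
}
```

Calling relationNames(xml, "GUI", "relations") would return the relations to follow outwards for the GUI view; passing "inverse-relations" returns the ones to follow backwards.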

Multilevel Views

The system described above works as follows.

  1. Start with a main object.
  2. Read the list of view relations from its content model.
  3. Follow these relations to other objects.
  4. Keep following these relations until no new objects are found.
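
The four steps above can be sketched as a simple graph traversal. This is only an illustration under assumed names: the class SingleLevelView, the use of plain string IDs, and the nested map standing in for the relations stored in the repository are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the single-level procedure; not DOMS code. */
class SingleLevelView {

    /**
     * relations maps an object ID to its outgoing relations
     * (relation name -> IDs of the related objects).
     * viewRelations is the list read from the main object's content model.
     */
    static Set<String> objectsInView(String mainObject,
                                     List<String> viewRelations,
                                     Map<String, Map<String, List<String>>> relations) {
        Set<String> found = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        found.add(mainObject);          // step 1: start with the main object
        pending.add(mainObject);
        while (!pending.isEmpty()) {    // step 4: stop when nothing new turns up
            String current = pending.remove();
            Map<String, List<String>> outgoing = relations.getOrDefault(current, Map.of());
            for (String relation : viewRelations) {   // step 2: the same list every time
                for (String target : outgoing.getOrDefault(relation, List.of())) {
                    if (found.add(target)) {          // step 3: follow to new objects only
                        pending.add(target);
                    }
                }
            }
        }
        return found;
    }
}
```

Note that the same viewRelations list is applied to every object reached, whatever content model that object subscribes to.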

The view system as described above has one shortcoming, which the clever reader might have spotted: it is not local. The relations listed in the main object's content model are followed from every object reached, including objects that do not subscribe to that content model. One of the fundamental design requirements for extensions to Fedora is that data objects should only be described by content models they subscribe to, and content models should only describe the objects that subscribe to them.

For that reason, the meaning of the relations mentioned in the "VIEW" datastream is changed somewhat: each data object has a view, encompassing the object itself and the views of other directly related data objects. So, if the VIEW datastream in the content model of a main object was

<view:views  xmlns:view="http://doms.statsbiblioteket.dk/types/views/0/1/#">
  <view:view name="GUI" mainobject="true">
    <view:relations>
      <doms:hasFile xmlns:doms="http://doms.statsbiblioteket.dk/relations/default/0/1/#"/>
    </view:relations>
    <view:inverse-relations>
      <doms:isPartOfCollection xmlns:doms="http://doms.statsbiblioteket.dk/relations/default/0/1/#"/>
    </view:inverse-relations>
  </view:view>
</view:views>

then the View of this main object encompasses the main object itself, the View of any object that the main object has a "doms:hasFile" relation to, and the View of any object that has a "doms:isPartOfCollection" relation to this object.

The procedure for calculating the total view of a main object is detailed in this bit of pseudocode. It basically performs a depth-first search of the objects. The order of the objects in the View carries no meaning and should be treated as unspecified.

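// Objects already visited; prevents endless recursion when relations form a cycle.
Set<DomsObject> visitedObjects = new HashSet<DomsObject>();

List<DomsObject> calculateView(DomsObject o) {
   List<DomsObject> view = new ArrayList<DomsObject>();

   if (visitedObjects.contains(o)) {
      return view;
   }

   visitedObjects.add(o);
   view.add(o); // the view of an object includes the object itself

   // The content model of o, not that of the main object, decides what to follow.
   ContentModel c = o.getContentModel();

   // Relations from o to other objects: include the view of each target.
   List<Relation> viewRels = c.getViewRelations();
   for (Relation r : viewRels) {
      view.addAll(calculateView(r.getObject()));
   }

   // Relations from other objects to o: include the view of each source.
   List<Relation> viewInvRels = c.getInverseViewRelations();
   for (Relation r : viewInvRels) {
      view.addAll(calculateView(r.getSubject()));
   }

   return view;
}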
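
To try the procedure out without a repository, the following self-contained sketch runs the same depth-first search over a small in-memory graph. Everything in it is an assumption made for illustration: the class name MultiLevelView, the representation of relations as {subject, relation, object} string triples, and the maps from content model IDs to relation names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model; none of these names come from DOMS or Fedora. */
class MultiLevelView {

    /**
     * triples: each entry is {subject, relation, object}.
     * contentModelOf: object ID -> content model ID.
     * viewRelations / inverseViewRelations: content model ID -> relation names to follow.
     */
    static Set<String> calculateView(String start,
                                     List<String[]> triples,
                                     Map<String, String> contentModelOf,
                                     Map<String, List<String>> viewRelations,
                                     Map<String, List<String>> inverseViewRelations) {
        Set<String> visited = new LinkedHashSet<>();
        visit(start, triples, contentModelOf, viewRelations, inverseViewRelations, visited);
        return visited;
    }

    private static void visit(String id,
                              List<String[]> triples,
                              Map<String, String> contentModelOf,
                              Map<String, List<String>> viewRelations,
                              Map<String, List<String>> inverseViewRelations,
                              Set<String> visited) {
        if (!visited.add(id)) {
            return; // already in the view; also breaks cycles
        }
        String cm = contentModelOf.get(id);
        if (cm == null) {
            return; // no content model known, so nothing to follow
        }
        // Only the object's own content model decides which relations to follow.
        List<String> forward = viewRelations.getOrDefault(cm, List.of());
        List<String> inverse = inverseViewRelations.getOrDefault(cm, List.of());
        for (String[] t : triples) {
            if (t[0].equals(id) && forward.contains(t[1])) {
                visit(t[2], triples, contentModelOf, viewRelations, inverseViewRelations, visited);
            }
            if (t[2].equals(id) && inverse.contains(t[1])) {
                visit(t[0], triples, contentModelOf, viewRelations, inverseViewRelations, visited);
            }
        }
    }
}
```

With this model, an object reached through "hasFile" contributes further objects only if its own content model lists relations to follow, which is the locality property described above.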

Each content model has a reserved datastream called "VIEW". This datastream is subdivided into named views.

A view for an object O is represented by the VIEW datastream on the content model object for O. This datastream also marks the object as a main object, if this is the case. Please note that the view is defined at the content model level, so the same rules are used to generate the view for all objects using that content model. When totally new objects are created in the GUI, they should subscribe to main-view content models from the current collection.

The datastream just contains a list of relation names and inverse relation names. Following these relations will give you the view.

Inheritance rules

Views are inherited when content models extend each other. Keep three separate lists: one for datastreams, one for relations and one for inverse relations. Concatenate the entries from all parent content models onto these lists and remove duplicates. Then use these three lists to generate the list of objects in the view.
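
The concatenate-and-deduplicate step can be sketched as follows. The class and method names are hypothetical, and the sketch assumes the entries of each list are plain strings; the same method would be applied once for each of the three lists.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of view inheritance; not DOMS code. */
class ViewInheritance {

    /**
     * Merges the entries a content model declares itself with the entries
     * declared by all of its parent content models, dropping duplicates.
     * A LinkedHashSet keeps the first occurrence of each entry in order.
     */
    static List<String> mergeInherited(List<String> own, List<List<String>> fromParents) {
        Set<String> merged = new LinkedHashSet<>(own);
        for (List<String> parentEntries : fromParents) {
            merged.addAll(parentEntries);
        }
        return new ArrayList<>(merged);
    }
}
```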

Definitions:

In addition, we suggest augmenting the 1-step approach with the idea of "includes": when object O has a view defined by following relations from O once, and an object P is in the view of O, then the view of P is included in the view of O.

The VIEW datastream contains XML of the form

<?xml version="1.0" encoding="UTF-8"?>
<view:views  xmlns:view="http://doms.statsbiblioteket.dk/types/views/0/1/#">
  <view:view name="GUI" mainobject="true">
    <view:relations>
      <doms:hasFile xmlns:doms="http://doms.statsbiblioteket.dk/relations/default/0/1/#"/>
    </view:relations>
    <view:inverse-relations>
      <doms:isPartOfCollection xmlns:doms="http://doms.statsbiblioteket.dk/relations/default/0/1/#"/>
    </view:inverse-relations>
    <view:datastreams>
      <view:datastream>DC</view:datastream>
    </view:datastreams>
  </view:view>
</view:views>

As can be seen, it lists all relations to be followed outwards, both direct and inverse. When an object is included, only the datastreams named in the datastreams element should be used. A VIEW datastream can contain several views, each with its own name. The GUI should use the view named "GUI".
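
Restricting an included object to the datastreams named in the view could look like the sketch below. The names are hypothetical, and representing an object's datastreams as a map from datastream ID to content is an assumption made only for this example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of datastream filtering; not DOMS code. */
class DatastreamFilter {

    /**
     * Returns only those datastreams of an object whose IDs are named in the view.
     * IDs named in the view but absent from the object are silently skipped.
     */
    static Map<String, String> visibleDatastreams(Map<String, String> datastreams,
                                                  List<String> namedInView) {
        Map<String, String> visible = new LinkedHashMap<>();
        for (String id : namedInView) {
            if (datastreams.containsKey(id)) {
                visible.put(id, datastreams.get(id));
            }
        }
        return visible;
    }
}
```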