Action View Datastream

Assigned
ABR+KFC

Prev assigned
JRG (description review)

Tasks addressed

TaskB.2,TaskB.1

Time estimated
2md

Time used
0md

Priority
6

Status
Described, lacks review

Iteration
13

Notes

The Problem

There is one major outstanding issue with the technological foundation of the DOMS data model. For many purposes, it is useful to regard a number of objects as a combined whole, with one of them being the main object. There are a number of ways to specify this.

In the following, the combination of data from a number of objects will be called the blob.

A system to associate objects and blobs must be defined. The system must adhere to the following restrictions

Current system

The current system is described here: FedoraViewBlobs

In the content models, we have a reserved datastream called VIEW. VIEW lists the datastreams from this object to include in the blob, and the relations to be followed from this object to find other objects to include in the blob. Some objects are main objects. To make a blob, you begin from a main object and follow the listed relations until there are no more to follow.
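The traversal described above can be sketched as a breadth-first walk. This is only an illustration under stated assumptions: the dictionaries stand in for the repository, and every name in them (`cm:Program`, `hasFile`, `doms:program1`, `compute_blob` and so on) is hypothetical rather than taken from DOMS or Fedora.

```python
from collections import deque

# Hypothetical VIEW contents per content model: which datastreams to include
# in the blob and which outgoing relations to follow.
VIEWS = {
    "cm:Program": {"datastreams": ["DC", "PBCORE"], "relations": ["hasFile"]},
    "cm:File": {"datastreams": ["CHARACTERISATION"], "relations": []},
    "cm:Collection": {"datastreams": ["DC"], "relations": []},
}

# Hypothetical objects: each has a content model and outgoing relations.
OBJECTS = {
    "doms:program1": {
        "model": "cm:Program",
        "relations": {
            "hasFile": ["doms:file1", "doms:file2"],
            "isPartOf": ["doms:collection1"],  # not listed in VIEW, so not followed
        },
    },
    "doms:file1": {"model": "cm:File", "relations": {}},
    "doms:file2": {"model": "cm:File", "relations": {}},
    "doms:collection1": {"model": "cm:Collection", "relations": {}},
}


def compute_blob(main_pid, objects=OBJECTS, views=VIEWS):
    """Return {pid: [datastream names]} for all objects reachable from main_pid."""
    blob = {}
    queue = deque([main_pid])
    while queue:
        pid = queue.popleft()
        if pid in blob:
            continue  # already visited; also guards against relation cycles
        obj = objects[pid]
        view = views[obj["model"]]
        blob[pid] = list(view["datastreams"])
        # Only relations named in the content model's VIEW are followed.
        for relation in view["relations"]:
            queue.extend(obj["relations"].get(relation, []))
    return blob
```

Calling `compute_blob("doms:program1")` collects the program and its two files, but not the collection, because `isPartOf` is not listed in the program's VIEW.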

This has the nice feature of having the combined whole defined in the content models. No individual object can redefine how its view looks.

The blobs visible from different angles (interfaces to the system) might not be identical. For example, the blobs used as the basis for the GUI will probably not be the blobs harvested by a search engine.

The problem arises when objects are changed. When an object is changed by some generic means, the contents of every blob that the object belongs to should be updated. So there must be a way to get from an object to its main object, so that the blob can be recomputed. Unfortunately, this is problematic with this implementation: VIEW only lists the relations to follow outward from an object, so nothing records which main objects lead to it.

Progress

A number of different models have been proposed for how to build this system.

Model 1

Use the current system, but with a database added. This database will contain the static information about the contents of each blob. It will be recomputed from time to time. Via the database, the lookup from an object to its main object can be achieved.
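A minimal sketch of such a lookup table, using Python's built-in `sqlite3` module as a stand-in for whatever database would actually be chosen. The table name `blob_member`, the function names and the sample PIDs are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical precomputed blobs: main object PID -> PIDs of all members.
BLOBS = {
    "doms:program1": ["doms:program1", "doms:file1", "doms:shared"],
    "doms:program2": ["doms:program2", "doms:shared"],
}


def rebuild_index(conn, blobs):
    """Drop and refill the membership table (the periodic recomputation)."""
    conn.execute("DROP TABLE IF EXISTS blob_member")
    conn.execute(
        "CREATE TABLE blob_member ("
        " main_pid TEXT NOT NULL,"
        " member_pid TEXT NOT NULL,"
        " PRIMARY KEY (main_pid, member_pid))"
    )
    conn.executemany(
        "INSERT INTO blob_member (main_pid, member_pid) VALUES (?, ?)",
        [(main, member) for main, members in blobs.items() for member in members],
    )
    conn.commit()


def main_objects_for(conn, pid):
    """Return the main objects of every blob that contains pid."""
    rows = conn.execute(
        "SELECT main_pid FROM blob_member WHERE member_pid = ? ORDER BY main_pid",
        (pid,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
rebuild_index(conn, BLOBS)
```

Because the table is only rebuilt from time to time, a lookup made between rebuilds can miss recently added relations; the sketch does nothing to hide that staleness.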

Model 2

Current system with a restriction. There is just one kind of blob, so that the GUI and all other tools operate on the same blobs. Changes are only ever performed through a system that reads in the entire blob, and so knows the main object when performing the change.

This has the disadvantage that all tools we use must understand the concept of the blob and the VIEW datastream.

Model 3

Current system with an addition. The content model VIEW datastream is bi-directional. Not only does it list the relations to follow from this object, it also lists the possible relations to this object, which would make it part of a VIEW. So, the content model tells you which relations leading into the object should be followed in reverse, when going from an object to the main object.

This will unfortunately mean that the content models will start to define and restrict things that are under the control of other content models. This creates unfortunate interdependencies between the content models.
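To make the reverse lookup concrete, here is a rough sketch of walking from an object back to its main objects using inverse relations listed in the VIEW. The `inverse_relations` and `main` keys, the helper names and the sample data are assumptions for illustration; the source does not say how main objects are marked or how incoming relations would be queried.

```python
# Hypothetical bi-directional VIEW per content model. "inverse_relations" names
# the incoming relations that should be followed backwards from this object.
VIEWS = {
    "cm:Program": {"relations": ["hasFile"], "inverse_relations": [], "main": True},
    "cm:File": {"relations": [], "inverse_relations": ["hasFile"], "main": False},
}

OBJECTS = {
    "doms:program1": {"model": "cm:Program",
                      "relations": {"hasFile": ["doms:file1", "doms:shared"]}},
    "doms:program2": {"model": "cm:Program",
                      "relations": {"hasFile": ["doms:shared"]}},
    "doms:file1": {"model": "cm:File", "relations": {}},
    "doms:shared": {"model": "cm:File", "relations": {}},
}


def incoming(pid, relation, objects):
    """Return the PIDs of objects that point at pid with the given relation."""
    return [source for source, obj in objects.items()
            if pid in obj["relations"].get(relation, [])]


def find_main_objects(pid, objects=OBJECTS, views=VIEWS):
    """Follow the listed inverse relations from pid until main objects are found."""
    found, seen, stack = set(), set(), [pid]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        view = views[objects[current]["model"]]
        if view["main"]:
            found.add(current)
        for relation in view["inverse_relations"]:
            stack.extend(incoming(current, relation, objects))
    return sorted(found)
```

Note that `cm:File` has to name `hasFile`, a relation that belongs to `cm:Program`. That is the interdependency between content models described above.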

Model 4

Not the current system.

The VIEW datastream only lists the local datastreams to include. There are no main objects. For each blob, there exists a special aggregate object that has an "aggregates" relation to each object in the blob. Whenever an object is changed, the aggregate object is notified, so that the blob can be recomputed.
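The notification idea can be sketched with a small in-memory repository. This is an assumption-laden illustration, not a description of how Fedora would do it: the `Repository` class and its methods are hypothetical, and for brevity the sketch includes every datastream of a member instead of filtering by VIEW.

```python
class Repository:
    """Toy repository that recomputes blobs whenever a member object changes."""

    def __init__(self):
        self.objects = {}     # pid -> {datastream name: content}
        self.aggregates = {}  # aggregate pid -> set of member pids ("aggregates" relations)
        self.blobs = {}       # aggregate pid -> most recently computed blob

    def add_aggregate(self, aggregate_pid, member_pids):
        """Register an aggregate object and compute its blob for the first time."""
        self.aggregates[aggregate_pid] = set(member_pids)
        self._recompute(aggregate_pid)

    def write_datastream(self, pid, name, content):
        """Change an object, then notify every aggregate that contains it."""
        self.objects.setdefault(pid, {})[name] = content
        for aggregate_pid, members in self.aggregates.items():
            if pid in members:
                self._recompute(aggregate_pid)

    def _recompute(self, aggregate_pid):
        # Snapshot the current datastreams of every member object.
        self.blobs[aggregate_pid] = {
            pid: dict(self.objects.get(pid, {}))
            for pid in sorted(self.aggregates[aggregate_pid])
        }
```

A caller of `write_datastream` needs no knowledge of blobs; the repository finds the affected aggregates itself.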

This solution has both advantages and disadvantages. An advantage is that we can automate the blob update, so that all changes to objects notify the aggregate objects automatically, and tools do not need to understand the blob idea. The disadvantage is that we introduce new objects, which perform some of the same tasks as the old main objects. Objects which are not aggregated will also not be visible to any blob-enabled tool. Finally, it changes the concept of how a new object is made.

It will make search easier, as the search should only search in the dissemination output of the aggregation objects, which will all be of a particular type. The search output disseminator should be on the aggregation object.

The aggregation type should probably be subclassed for each of the collections, to help define the main objects.

Note: This has been heavily inspired by the OAI-ORE model for resource maps. Look at their primer LINK.

This system breaks with the principle of having the blobs defined in the content models. The blobs are now defined in special data objects, one for each blob.

Making a new object by this system

The process for making a new object with templates will be the following:

  1. A new aggregation object is made, empty, but with the title of the blob, if any. The aggregation object is of the kind special to the collection, if any, or just a generic aggregation.
  2. If the aggregation is subclassed, it specifies a content model for the first object. If not, one must be selected.
  3. When the content model for the first object is selected, find the prototypes for objects of this content model. Make a new object from these.
  4. Based on the main object's ontology and contents, make the other objects it should relate to.
  5. Write the data objects to the repository.
  6. Make the aggregation object with relations to each of the objects you made. Write this object to the repository.
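The six steps can be sketched as one function. Everything here is a simplification under stated assumptions: the repository is a plain dictionary, the prototype, ontology and aggregation-kind tables are hypothetical, and step 4 only creates directly related objects, one level deep.

```python
import copy
import itertools

_pid_counter = itertools.count(1)


def new_pid(prefix="doms"):
    """Hypothetical PID generator; a real repository would allocate PIDs itself."""
    return f"{prefix}:{next(_pid_counter)}"


# Hypothetical prototype objects, one per content model.
PROTOTYPES = {
    "cm:Program": {"model": "cm:Program", "datastreams": {"DC": ""}, "relations": {}},
    "cm:File": {"model": "cm:File", "datastreams": {"CHARACTERISATION": ""}, "relations": {}},
}
# Hypothetical ontology: content model -> {relation: content model of the target}.
ONTOLOGY = {"cm:Program": {"hasFile": "cm:File"}, "cm:File": {}}
# Hypothetical subclassed aggregation kinds and the first-object model they imply.
AGGREGATION_KINDS = {"agg:RadioTV": "cm:Program"}


def create_blob(repository, title, kind="agg:Generic", first_model=None):
    # Step 1: make an empty aggregation object carrying the title of the blob.
    aggregation = {"model": kind, "title": title, "relations": {"aggregates": []}}
    # Step 2: a subclassed aggregation dictates the model; otherwise one is selected.
    model = AGGREGATION_KINDS.get(kind, first_model)
    if model is None:
        raise ValueError("a generic aggregation needs an explicitly selected content model")
    # Step 3: make the first object from the prototype for that content model.
    first_pid = new_pid()
    created = {first_pid: copy.deepcopy(PROTOTYPES[model])}
    # Step 4: make the objects the first object should relate to.
    for relation, target_model in ONTOLOGY[model].items():
        target_pid = new_pid()
        created[target_pid] = copy.deepcopy(PROTOTYPES[target_model])
        created[first_pid]["relations"].setdefault(relation, []).append(target_pid)
    # Step 5: write the data objects to the repository.
    repository.update(created)
    # Step 6: relate the aggregation to each new object, then write it too.
    aggregation["relations"]["aggregates"] = list(created)
    aggregate_pid = new_pid("agg")
    repository[aggregate_pid] = aggregation
    return aggregate_pid
```

The aggregation object is written last, so the data objects it points to already exist in the repository when its relations are stored.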

Conclusion

Checklist For Working On An Action

The Life Cycle of an Action:

Please make sure that you address the below issues, when working on an action:

ActionViewDatastream (last edited 2010-03-17 13:09:15 by localhost)