2008-06-20 DOMS Architecture Discussion Meeting

A meeting to discuss how to set up Fedora to give the GUI access to DOMS.

We briefly discussed the draft GUI specification from Mjølner.

KFC drew a proposed diagram of the DOMS architecture, which we then augmented. It contains a front-end Fedora and a back-end Fedora. The back-end Fedora provides presentations to, for example, Summa through an OAI API, while the GUI communicates with the front-end Fedora through the Fedora 3 API. The idea was to use journaling to import/reconstruct journaled objects in the back-end once they had been validated in the front-end, but this turned out not to be feasible. The purpose of journaling is therefore to provide consistent backups and possibly to distinguish XACML rights.

http://merkur/viewvc/trunk/docs/architecture/Architecture.jpeg?root=doms&view=co

Validation

We demand that the DOMS system should always be in a consistent state. This makes the creation of new objects difficult, as they will often have invalid relations. To remedy this, we introduce the notion of "states" of objects. Objects can be in three different states

We agreed to validate objects going through the Fedora 3 API only when the GUI marks them as published. Relations are considered valid even if they refer to objects that are intermediate.

Points about datamodel

We split the validation of the datamodel into two systems. One is the hard datamodel, derived from the ContentModels, but more restrictive. It should check

This validator is run as soon as an object is published in the DOMS repository. The hard datamodel should be expressed in a format that can easily be reversed, so that Mjølner can check these demands before saving.

The soft datamodel expresses the overall graph structure of the DOMS. It considers all relations to objects that are not "published" as valid, but checks the rest. This way, the validator will not mark the repository as invalid, even in the split second when new objects are created.
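The rule for the soft datamodel can be illustrated with a minimal Python sketch. It is not the agreed implementation: the class DomsObject, the function soft_validate, the rule table, and the predicate name "hasFile" are all hypothetical, and treating a reference to a non-existent object as a problem is an assumption the meeting did not discuss.

```python
from __future__ import annotations

from dataclasses import dataclass, field

PUBLISHED = "published"  # the only state the minutes name as a flag


@dataclass
class DomsObject:
    pid: str
    type_name: str
    state: str
    # (predicate, target pid) pairs; the representation is illustrative
    relations: list[tuple[str, str]] = field(default_factory=list)


def soft_validate(repo: dict[str, DomsObject], rules: dict[str, str]) -> list[str]:
    """Check relations between published objects against `rules`.

    `rules` maps a relation predicate to the type its target must have.
    """
    problems = []
    for obj in repo.values():
        if obj.state != PUBLISHED:
            continue  # only published objects are validated
        for predicate, target_pid in obj.relations:
            target = repo.get(target_pid)
            if target is None:
                # Assumption: a dangling reference counts as a problem.
                problems.append(f"{obj.pid} {predicate} {target_pid}: no such object")
            elif target.state != PUBLISHED:
                continue  # relation to an unpublished object is accepted
            elif predicate in rules and target.type_name != rules[predicate]:
                problems.append(
                    f"{obj.pid} {predicate} {target_pid}: expected type {rules[predicate]}"
                )
    return problems
```

With this shape, a newly created object that is still unpublished can be the target of a relation without making the repository invalid.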

File objects

File objects will get their technical metadata created by validation tools upon the original ingest. The format is extracted from this metadata and marked in the file. The user metadata should be filled in by the creator of the object, based on a schema that the creator can specify.

To make preservation file formats recognisable by the hard datamodel, we merge the object types File and Format, so we have for instance TypeFileBaselineTIFF. We only do this for preservation formats.

To capture other formats we keep TypeFile, and the technical metadatastream will record what the file format is. A part of the technical metadatastream should be a human readable description of the file format.
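The naming scheme for file object types could look roughly like the following Python sketch. The function name object_type_for and the contents of PRESERVATION_FORMATS are assumptions; the minutes only give TypeFileBaselineTIFF as an example and do not list the preservation formats.

```python
# Hypothetical set of preservation formats; only BaselineTIFF is implied
# by the example type name in the minutes.
PRESERVATION_FORMATS = frozenset({"BaselineTIFF"})


def object_type_for(file_format: str) -> str:
    """Return the object type name for a file of the given format."""
    if file_format in PRESERVATION_FORMATS:
        # Preservation formats get a merged File+Format type.
        return "TypeFile" + file_format
    # Other formats stay generic; the technical metadata stream
    # records the actual format.
    return "TypeFile"
```

For example, object_type_for("BaselineTIFF") gives "TypeFileBaselineTIFF", while any format outside the set gives plain "TypeFile".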

Fedora 3 API

We decided to use the Fedora 3 API as the interface to the DOMS. There will be repository-wide XACML policies to restrict certain operations. At the moment these are PurgeObject and PurgeDatastream, as they destroy the audit trail.
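The effect of the repository-wide restriction can be sketched in Python as below. This only models the decision and is not XACML; the operation names follow the Fedora API-M methods purgeObject and purgeDatastream, while the function is_allowed and the constant are hypothetical.

```python
# Operations denied repository-wide because they destroy the audit trail.
DENIED_OPERATIONS = frozenset({"purgeObject", "purgeDatastream"})


def is_allowed(operation: str) -> bool:
    """Return False for operations blocked by the repository-wide policy."""
    return operation not in DENIED_OPERATIONS
```

All other operations pass this check and remain subject to whatever object-level policies apply.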

XACML

User authentication will be handled by the LDAP server already serving SB. The user tokens will contain attributes detailing which licenses the user can read and which collections the user can alter. Each object has a Policy datastream that refers to a License object. That object contains an XACML stream that evaluates whether the user has the necessary attribute to read this license. Write rights were not discussed.

This gives us the ability to generate lists of which "groups" a given user is part of, and which "groups" can access a given object.

This does impose certain restrictions on the XACML policies, however. A policy cannot name a user directly; it can only evaluate the user's attributes.
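The attribute-based scheme can be illustrated with a small Python sketch. It models the decision logic only and is not XACML; the names User, DomsObject, can_read, groups_of and groups_with_access are hypothetical, and it assumes each object refers to exactly one license.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    # Attribute carried in the user token: licenses this user may read.
    readable_licenses: frozenset[str]


@dataclass(frozen=True)
class DomsObject:
    pid: str
    # License referenced by the object's policy datastream.
    license: str


def can_read(user: User, obj: DomsObject) -> bool:
    """Decide read access from attributes only; user.name is never consulted."""
    return obj.license in user.readable_licenses


def groups_of(user: User) -> set[str]:
    """The "groups" a user is part of, read off the token attributes."""
    return set(user.readable_licenses)


def groups_with_access(obj: DomsObject) -> set[str]:
    """The "groups" that can access an object, read off its license."""
    return {obj.license}
```

Because the decision depends only on attributes, both lists can be produced without evaluating any policy against a specific user name.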

We do not need the same XACML policies on the back-end, as it will only be accessed through API-A. However, journaling ensures that they are synchronized. We need to find out whether we can disable object policies on a general basis without changing the objects.

http://merkur/viewvc/trunk/docs/architecture/License.jpeg?root=doms&view=co

Journaling

The reason for journaling has become less clear. Being able to shut down the front-end while the back-end still services requests, or the other way around, is one of the big issues. Having the trail of all changes saved is also useful.

Minutes/2008-06-20 DOMS Architecture Meeting (last edited 2010-03-17 13:13:23 by localhost)