Analysis of Task DOMS Client
DOMS Client - Analysis/brainstorm
So far, four DOMS components will use the DOMS Client library. Below are suggestions for functionality that each component may need.
Summa integration
Knows when it has last queried the DOMS Client for changes. DOMS Client functionality:
- Given a specific time, return a bundle of all objects that have changed since that time, with enough metadata for indexing, and enough for identifying the files needed for presentation.
- Given a specific PID (of a main object), get the object view contents
- Given the PID of a collection object, return the time something in this collection last changed.
The above assumes that the information identifying the files lets Summa access them directly in bitstorage, so that the files need not pass through the DOMS Client. We have also discussed making all metadata accessible to anybody and restricting access only to the actual files in bitstorage, but that would be handled outside the DOMS Client.
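As a rough illustration, the change-tracking queries listed above could be expressed as a small Java interface. This is only a sketch: the names `SummaFacingClient`, `ObjectSummary` and `InMemoryClient` are hypothetical and do not exist in DOMS, and the in-memory implementation merely demonstrates the intended semantics. Retrieval of object view contents is left out.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical summary of one changed object, as Summa might need it.
final class ObjectSummary {
    final String pid;
    final String collectionPid;
    final Instant lastModified;
    final List<String> fileIds; // identifiers Summa could resolve in bitstorage

    ObjectSummary(String pid, String collectionPid, Instant lastModified, List<String> fileIds) {
        this.pid = pid;
        this.collectionPid = collectionPid;
        this.lastModified = lastModified;
        this.fileIds = fileIds;
    }
}

// Hypothetical interface covering two of the three queries above.
interface SummaFacingClient {
    // All objects changed strictly after the given time.
    List<ObjectSummary> changedSince(Instant since);

    // Time of the latest change in a collection, empty if nothing is known.
    Optional<Instant> lastChangeInCollection(String collectionPid);
}

// In-memory stand-in, only meant to illustrate the semantics.
final class InMemoryClient implements SummaFacingClient {
    private final List<ObjectSummary> objects = new ArrayList<>();

    void put(ObjectSummary summary) {
        objects.add(summary);
    }

    @Override
    public List<ObjectSummary> changedSince(Instant since) {
        List<ObjectSummary> result = new ArrayList<>();
        for (ObjectSummary o : objects) {
            if (o.lastModified.isAfter(since)) {
                result.add(o);
            }
        }
        return result;
    }

    @Override
    public Optional<Instant> lastChangeInCollection(String collectionPid) {
        Instant latest = null;
        for (ObjectSummary o : objects) {
            if (o.collectionPid.equals(collectionPid)
                    && (latest == null || o.lastModified.isAfter(latest))) {
                latest = o.lastModified;
            }
        }
        return Optional.ofNullable(latest);
    }
}
```

With such an interface, the Summa integration would only need to remember the time of its last query and pass it to `changedSince`.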
OAI-PMH endpoint
Knows when it has last queried the DOMS Client for changes. For searching, the DOMS Client functionality needed here is the same as that needed by the Summa integration. For uses other than searching, more metadata may be useful. Unlike the Summa integration, the OAI-PMH endpoint should only return data that anybody is allowed to view. This can be achieved by having the OAI-PMH component pass a "minimal" role to the DOMS Client.
The requests that we should handle to comply with OAI-PMH:
- GetRecord - Given a PID, return object view contents for that PID. The returned record should include a prefix that identifies the metadata format. Three possible errors.
- Identify - Return info on the repository. One possible error.
- ListIdentifiers - Given from-until timestamps, return headers (not whole records), an abbreviated form of ListRecords below. There are other arguments too; how much of the functionality is required, e.g. set-based selective harvesting? Five possible errors.
- ListMetadataFormats - Given an (optional) PID, list the metadata formats available from the repository (for the item with that PID). Do we need to have schemas ready for these formats? Three possible errors.
- ListRecords - As ListIdentifiers, but returns whole records.
- ListSets - Given a flow-control token (for continuing a previous ListSets request), return the set structure of the repository, useful for selective harvesting. Do we need to implement this? Three possible errors.
What is needed in DOMS Client to handle the above?
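One possible starting point is to tabulate the six verbs together with the error codes that OAI-PMH 2.0 defines for each, so that the endpoint can dispatch requests and the DOMS Client can be checked against each case. The sketch below is illustrative only; the enum `OaiVerb` and its methods are hypothetical names, while the verb names and error codes are taken from the OAI-PMH 2.0 specification.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical enum listing each OAI-PMH 2.0 verb with its verb-specific error codes.
enum OaiVerb {
    GET_RECORD("GetRecord",
            "badArgument", "cannotDisseminateFormat", "idDoesNotExist"),
    IDENTIFY("Identify",
            "badArgument"),
    LIST_IDENTIFIERS("ListIdentifiers",
            "badArgument", "badResumptionToken", "cannotDisseminateFormat",
            "noRecordsMatch", "noSetHierarchy"),
    LIST_METADATA_FORMATS("ListMetadataFormats",
            "badArgument", "idDoesNotExist", "noMetadataFormats"),
    LIST_RECORDS("ListRecords",
            "badArgument", "badResumptionToken", "cannotDisseminateFormat",
            "noRecordsMatch", "noSetHierarchy"),
    LIST_SETS("ListSets",
            "badArgument", "badResumptionToken", "noSetHierarchy");

    private final String protocolName;
    private final List<String> errorCodes;

    OaiVerb(String protocolName, String... errorCodes) {
        this.protocolName = protocolName;
        this.errorCodes = Collections.unmodifiableList(Arrays.asList(errorCodes));
    }

    String protocolName() {
        return protocolName;
    }

    List<String> errorCodes() {
        return errorCodes;
    }

    // Map the "verb" request parameter to an enum constant.
    // An unknown verb corresponds to the protocol-level "badVerb" error.
    static OaiVerb fromRequest(String verb) {
        for (OaiVerb v : values()) {
            if (v.protocolName.equals(verb)) {
                return v;
            }
        }
        throw new IllegalArgumentException("badVerb: " + verb);
    }
}
```

Several of these errors (for example `idDoesNotExist` and `noRecordsMatch`) depend on information only the repository has, which suggests the DOMS Client must be able to report "no such PID" and "no matching objects" distinctly.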
DOMS GUI
The DOMS GUI will need a lot of functionality for accessing, creating, editing and deleting objects. For now, the DOMS Client methods that come to mind are:
- Given a query string, return a list of PIDs for objects found (search done by call to Summa integration component)
- Return a list of all collection objects
- Given the PID of an existing object in Fedora (for main objects, the PID is either entered by keyboard or obtained from a search), return a Java object representing it. This Java object will at least have methods to do the following:
  - Delete object
  - Publish object
  - Mark object as "in progress"
  - Return a list of the datastreams on the object
  - Return title of object
  - Given a name and content, add a datastream with that name and content to object
  - Given a name and PID, create a relation with the given name from current object to object with given PID
  - Given a name and a template, return a new object cloned from the template and make a relation with the given name from current object to the new object
    - (may need to receive name of view as well, to check that relation is allowed in the view)
  - Given view name, return list of templates for objects that are allowed as targets of relations from current object in given view
  - Return outgoing relations from this object
  - Return allowed relations from this object, and content models allowed for targets
  - Given a name, return outgoing relations of this name from current object
  - Return the object that is the main object for current object's view
  - Return a compound content model (combined from its different content models) for this object
- Template objects in addition have the following methods:
  - A method to return a clone object of the template in question (needed for creating the first objects in a collection)
- Main objects in addition have the following methods:
  - Given a view name, return a list of the objects in this main object's view of that name
- Content model objects in addition have the following methods:
  - Return list of all objects that subscribe to this content model
- File objects in addition have the following methods:
  - Given filename, content, and checksum, attach a file with the given name and content to the file object
  - Delete file from file object
- Collection objects in addition have the following methods:
  - Return all templates for this collection (one of these may produce a main object)
- Datastream objects have methods to do the following:
  - Delete datastream
  - Given new content, replace the existing content in the datastream with the new content
  - Return datastream content
- Relation objects at least have methods to do the following:
  - Return PID that is target of relation
  - Delete relation
  - Given name, set name of relation
  - Given PID, set target of relation
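To make the shape of these Java objects more concrete, here is a minimal in-memory sketch of the generic object and its relations. All class and method names (`DomsObject`, `Relation`, `State`, and so on) are hypothetical, nothing here talks to Fedora, and the initial "in progress" state is an assumption. The specialised object kinds (template, main, file, collection, content model) could then extend `DomsObject`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical relation from one object to another, identified by target PID.
final class Relation {
    private String name;
    private String targetPid;

    Relation(String name, String targetPid) {
        this.name = name;
        this.targetPid = targetPid;
    }

    String getName() { return name; }
    String getTargetPid() { return targetPid; }
    void setName(String name) { this.name = name; }
    void setTargetPid(String targetPid) { this.targetPid = targetPid; }
}

// Hypothetical generic object; state handling is an assumption.
class DomsObject {
    enum State { IN_PROGRESS, PUBLISHED }

    private final String pid;
    private final String title;
    private State state = State.IN_PROGRESS; // assumed default
    private final Map<String, byte[]> datastreams = new LinkedHashMap<>();
    private final List<Relation> relations = new ArrayList<>();

    DomsObject(String pid, String title) {
        this.pid = pid;
        this.title = title;
    }

    String getPid() { return pid; }
    String getTitle() { return title; }
    State getState() { return state; }
    void publish() { state = State.PUBLISHED; }
    void markInProgress() { state = State.IN_PROGRESS; }

    void addDatastream(String name, byte[] content) {
        datastreams.put(name, content);
    }

    List<String> getDatastreamNames() {
        return new ArrayList<>(datastreams.keySet());
    }

    // Create a named relation from this object to the object with the given PID.
    Relation addRelation(String name, String targetPid) {
        Relation relation = new Relation(name, targetPid);
        relations.add(relation);
        return relation;
    }

    // All outgoing relations.
    List<Relation> getRelations() {
        return new ArrayList<>(relations);
    }

    // Outgoing relations with the given name.
    List<Relation> getRelations(String name) {
        List<Relation> result = new ArrayList<>();
        for (Relation r : relations) {
            if (r.getName().equals(name)) {
                result.add(r);
            }
        }
        return result;
    }

    void deleteRelation(Relation relation) {
        relations.remove(relation);
    }
}
```

A real implementation would additionally have to validate each new relation against the object's compound content model, which this sketch omits.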
Mass ingest
Will facilitate two kinds of ingest. The first is ingesting FoxML objects and files received from a pre-ingester. The second is ingesting files from a hot-folder, in which a specifically named configuration file identifies the file-object template and the collection (both already existing in Fedora) to use. The template is used to generate corresponding file objects for the freshly digitized files, which are then added to the collection. For both pre-ingest-driven ingest and hot-folder ingest, the methods mentioned under DOMS GUI above should suffice.
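The format of the hot-folder configuration file is not decided here. Purely as an illustration, and assuming a Java properties file, reading it could look like the sketch below. The class `HotFolderConfig` and the keys `fileObjectTemplatePid` and `collectionPid` are made up for this example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for the two PIDs named by the hot-folder configuration file.
final class HotFolderConfig {
    final String templatePid;
    final String collectionPid;

    private HotFolderConfig(String templatePid, String collectionPid) {
        this.templatePid = templatePid;
        this.collectionPid = collectionPid;
    }

    // Parse a properties-style configuration; the key names are assumptions.
    static HotFolderConfig parse(Reader in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String template = props.getProperty("fileObjectTemplatePid");
        String collection = props.getProperty("collectionPid");
        if (template == null || collection == null) {
            throw new IllegalArgumentException(
                    "Configuration must name both a template and a collection");
        }
        return new HotFolderConfig(template.trim(), collection.trim());
    }
}
```

The mass ingester would then fetch the template and collection objects by PID, clone the template once per file, attach the file, and relate the new file object to the collection, using only the methods listed under DOMS GUI.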
Prerequisites and Assumptions