Differences between revisions 1 and 9 (spanning 8 versions)
Revision 1 as of 2008-06-26 12:26:05
Size: 1866
Editor: kfc
Comment: Created by the PackagePages action.
Revision 9 as of 2008-10-17 13:18:30
Size: 5239
Editor: bam
Comment:
Line 3: Line 3:
To add files to DOMS, three things will need to happen: To add files to DOMS, three things need to happen:
Line 5: Line 5:
 * The file must be uploaded to bitstorage  * The file must be uploaded to bitstorage, and characterized
Line 7: Line 7:
 * The file must be characterised (see the API for this), and the result added to a Technical Metadata-stream in the object (the file and the characterisation result must match, see the data model documentation)  * The characterization output must be added to a Technical Metadata-stream in the object (the file and the characterisation result must match, see the data model documentation)
Line 9: Line 9:
Uploading the file to bitstorage must be done with the following API: Uploading the file to bit storage must be done with API calls to methods {{{uploadFile}}}, {{{approveFile}}}, {{{disapproveFile}}} and {{{spaceLeft}}},
each of which is further specified under Bitstorage API below.
Line 11: Line 12:
(WORK IN PROGRESS! WE NEED TO WRAP THIS IN A WEBSERVICE!) Basically, you upload a file by presenting a URI that bit storage can resolve, together with the expected MD5 sum. The file will then be made available, and you will get back the URI the file is stored at and the characteristics of the file.
Line 13: Line 14:
The interface is command line based, and the client must have ssh access to halley.statsbiblioteket.dk. Those characteristics MUST be added in the proper places of the accompanying Fedora object.
Line 15: Line 16:
The interface is as follows
{{{$ ssh doms@halley <<command>> <<filename>>}}}
{{{approveFile}}} is used when the accompanying Fedora object is first set to published.
Line 18: Line 18:
Commands are one of
{{{
 save-md5) : save a file and get md5sum back
 get-md5) : get md5sum of a file
 approve) : approve a file
 delete) : delete a file not appoved
 get) : get a file
 getmd5s) : report md5 sums of stored files
 space-left) : report space left
}}}
{{{disapproveFile}}} is used if the accompanying Fedora object is marked as deleted before publishing.
Line 29: Line 20:
Files are exchanged using {{{stdin}}} and {{{stdout}}}. Errors are reported to {{{stderr}}}. == Examples ==
Line 31: Line 22:
To upload a file, use {{{save-md5}}}, and check that the checksum is as expected. Otherwise {{{delete}}} and retry. Adding a file and an accompanying fedora metadata object:
Line 33: Line 24:
Once all Fedora objects referring to the file are validated, use {{{approve}}} to publish the file.  * Make a new fedora object of type File in draft mode
 * Call {{{uploadFile}}} on the bitstorage API
 * Add pronomid and characterisation to the appropriate datastreams
 * Set URL and MD5 on the CONTENT datastream
Line 35: Line 29:
Preferably, use {{{space-left}}} first as sanity check. Publishing a file fedora object

 * Set fedora metadata objects to state "published"
 * Call {{{approveFile}}} from the bitstorage API
Line 39: Line 36:
== Bitstorage API ==
Line 40: Line 38:
== Examples == WSDL: attachment:Bitstorage.xml
The following describes those methods of the Bitstorage API that may be called by the GUI.
Line 42: Line 41:
Adding a file to a fedora metadata object: ==== uploadFile ====
Upload the provided file to the temporary area of bitstorage,
giving it the provided file name.
Return a bitstorage object, containing different characteristics about the uploaded file.
The file is only uploaded to a temporary approve-area of
the bitstorage, and needs to be approved by calling approveFile before
it is actually moved to the permanent bitstorage.
Line 44: Line 49:
 * Make a new XML object of type File
 * Call characterisation service on file, add metadata to the object, add file type id to the technical metadata
 * Call {{{save-md5}}} on bitfinder API, check checksum
 * Set fedora metadata objects to state "intermediate"
 * Ingest the fedora file object
 * Set fedora metadata objects to state "published"
 * Call {{{approve}}} from the bitfinder API
If you try to upload a file that is already there, the service checks the provided
md5 against the md5 of the file on the server. If they match, nothing is
uploaded and you just get back the details of the file already there.
If they do not match, an exception is thrown.

Input parameters:
 * {{{String fileName}}} The filename to store the file by.
 * {{{URI localurl}}} The url to where the bitstorage webservice can get the file.
 * {{{String provided_md5}}} The locally generated md5sum of the file.

Returns:
 * {{{BitstorageFile}}} A bitstorage object, detailing the characteristics and public url of the uploaded file. Data structure summarized below.

Throws:
 * {{{RemoteException}}} If anything went wrong.
 * {{{CannotGetFile}}} If the service cannot get the file from the localurl.
 * {{{InvalidFileName}}} If the provided fileName is invalid.
 * {{{CannotStoreFile}}} If the service cannot store the file on the server.
 * {{{WrongChecksum}}} If the provided checksum does not match.
 * {{{CharacterizationFailed}}} If the characterization service failed somehow.
 * {{{DifferentFileWithThatNameExist}}} If there is already a file with that name, but a different checksum.


==== approveFile ====
Check the earlier uploaded file against the provided checksum, and if
this succeeds, and possibly other criteria are met, move the file
from the temporary area of bitstorage to the permanent bitstorage.

If you call this method on an already approved file, with the correct checksum, nothing happens.
If the checksum is wrong, you get an exception.

Input parameters:
 * {{{URI fileurl}}} The url to the file (in bitstorage)
 * {{{String md5sum}}} The md5sum of the file

Throws:
 * {{{RemoteException}}} If anything went wrong with the connection.
 * {{{UnknownURI}}} If the URI is not to this bitstorage.
 * {{{WrongChecksum}}} If the provided checksum does not match.
 * {{{CannotStoreFile}}} If the file cannot be stored in the permanent storage area or if the file is not available in the temporary area.


==== disapproveFile ====
Delete the named file from bitstorage. Only works for files that have
not yet been approved.

If the file is not in temporary bitstorage nothing happens.

Input parameters:
 * {{{URI fileurl}}} The url to the file (in bitstorage).
 * {{{String md5sum}}} The md5sum of the file.

Throws:
 * {{{RemoteException}}} If anything went wrong with the connection.
 * {{{UnknownURI}}} If the URI is not to this bitstorage.
 * {{{WrongChecksum}}} If the checksum does not match the file in the temporary area.


==== spaceLeft ====
Return the number of bytes left in bitstorage.

Returns:
 * {{{long}}} The number of bytes left in bitstorage.

Throws:
 * {{{RemoteException}}} If something unforeseen broke while performing the request.


=== Data structures ===

==== BitstorageFile ====
Returned by uploadFile.

Contains the following public methods.

 * {{{URI getFileurl}}}
 * {{{String getFileName}}}
 * {{{byte[] getCharacterizationOutput}}}
 * {{{String getMd5CheckSum}}}
 * {{{String getPronomID}}}
 * {{{String getValidationStatus}}}

Bitstorage API

To add files to DOMS, three things need to happen:

  • The file must be uploaded to bitstorage, and characterized
  • The file must be connected to a Fedora object datastream of type TypeFile

  • The characterization output must be added to a Technical Metadata-stream in the object (the file and the characterisation result must match, see the data model documentation)

Uploading the file to bit storage must be done with API calls to methods uploadFile, approveFile, disapproveFile and spaceLeft, each of which is further specified under Bitstorage API below.

Basically, you upload a file by presenting a URI that bit storage can resolve, together with the expected MD5 sum. The file will then be made available, and you will get back the URI the file is stored at and the characteristics of the file.

Those characteristics MUST be added in the proper places of the accompanying Fedora object.

approveFile is used when the accompanying Fedora object is first set to published.

disapproveFile is used if the accompanying Fedora object is marked as deleted before publishing.

Examples

Adding a file and an accompanying fedora metadata object:

  • Make a new fedora object of type File in draft mode
  • Call uploadFile on the bitstorage API
  • Add pronomid and characterisation to the appropriate datastreams
  • Set URL and MD5 on the CONTENT datastream

Publishing a file fedora object:

  • Set fedora metadata objects to state "published"
  • Call approveFile from the bitstorage API
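
The two examples above can be read as one ordered sequence of calls. The following is only a sketch of that ordering, not the real client: the nested Bitstorage interface is a hypothetical, simplified view of the WSDL-generated stubs (here uploadFile returns just the stored URI instead of a BitstorageFile), and the Fedora steps are merely logged because the Fedora calls are not described on this page.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

/** Sketch of the add-then-publish ordering; all names here are illustrative. */
class AddFileWorkflowSketch {

    /** Hypothetical, simplified client view of the two bitstorage calls used below. */
    interface Bitstorage {
        URI uploadFile(String fileName, URI localUrl, String providedMd5);
        void approveFile(URI fileUrl, String md5sum);
    }

    /** Runs the steps in the documented order and returns a log of what was done. */
    static List<String> addAndPublish(Bitstorage bitstorage, String fileName,
                                      URI localUrl, String md5) {
        List<String> steps = new ArrayList<>();

        // Adding a file and an accompanying fedora metadata object.
        steps.add("create draft fedora object of type File");
        URI storedAt = bitstorage.uploadFile(fileName, localUrl, md5);
        steps.add("uploadFile -> " + storedAt);
        steps.add("add pronomid and characterisation to datastreams");
        steps.add("set URL and MD5 on CONTENT datastream");

        // Publishing: the object is set to published, then the file is approved.
        steps.add("set fedora object to published");
        bitstorage.approveFile(storedAt, md5);
        steps.add("approveFile " + storedAt);
        return steps;
    }
}
```

Keeping approveFile as the very last step mirrors the rule that a file only moves to permanent bitstorage once its Fedora object is published.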

Bitstorage API

WSDL: attachment:Bitstorage.xml

The following describes those methods of the Bitstorage API that may be called by the GUI.

uploadFile

Upload the provided file to the temporary area of bitstorage, giving it the provided file name. Return a bitstorage object, containing different characteristics about the uploaded file. The file is only uploaded to a temporary approve-area of the bitstorage, and needs to be approved by calling approveFile before it is actually moved to the permanent bitstorage.

If you try to upload a file that is already there, the service checks the provided md5 against the md5 of the file on the server. If they match, nothing is uploaded and you just get back the details of the file already there. If they do not match, an exception is thrown.
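
The duplicate-upload rule can be pictured as a small decision function. This is an in-memory model for illustration only, with made-up names (UploadRuleSketch, Outcome); the real service reports the mismatch case by throwing an exception, presumably DifferentFileWithThatNameExist, rather than returning a value.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative in-memory model of the duplicate-upload rule; not the real service. */
class UploadRuleSketch {
    enum Outcome { STORED, ALREADY_PRESENT, REJECTED }

    // Maps a file name to the md5sum of the file stored under that name.
    private final Map<String, String> storedMd5ByName = new HashMap<>();

    Outcome upload(String fileName, String providedMd5) {
        String existing = storedMd5ByName.get(fileName);
        if (existing == null) {
            // No file with this name yet: store it.
            storedMd5ByName.put(fileName, providedMd5);
            return Outcome.STORED;
        }
        // Same name and same checksum: nothing is uploaded again.
        // Same name but different checksum: the real service throws an exception.
        return existing.equals(providedMd5) ? Outcome.ALREADY_PRESENT : Outcome.REJECTED;
    }
}
```

A practical consequence is that retrying an upload with the same name and checksum is harmless.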

Input parameters:

  • String fileName The filename to store the file by.

  • URI localurl The url to where the bitstorage webservice can get the file.

  • String provided_md5 The locally generated md5sum of the file.

Returns:

  • BitstorageFile A bitstorage object, detailing the characteristics and public url of the uploaded file. Data structure summarized below.

Throws:

  • RemoteException If anything went wrong.

  • CannotGetFile If the service cannot get the file from the localurl.

  • InvalidFileName If the provided fileName is invalid.

  • CannotStoreFile If the service cannot store the file on the server.

  • WrongChecksum If the provided checksum does not match.

  • CharacterizationFailed If the characterization service failed somehow.

  • DifferentFileWithThatNameExist If there is already a file with that name, but a different checksum.
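
The provided_md5 parameter has to be generated locally before calling uploadFile. A minimal sketch of doing that with the standard java.security.MessageDigest class follows. The class and method names (Md5Sums, md5Hex) are illustrative, and it is an assumption that the service expects the usual 32-character lower-case hex form printed by md5sum.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative helper for producing an md5sum-style checksum string. */
final class Md5Sums {
    private Md5Sums() { }

    /** Returns the MD5 digest of the given bytes as 32 lower-case hex characters. */
    static String md5Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] hash = digest.digest(data);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to 0..255 so negative bytes are not sign-extended.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

For large files the digest would normally be fed in chunks with MessageDigest.update instead of loading the whole file into memory.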

approveFile

Check the earlier uploaded file against the provided checksum, and if this succeeds, and possibly other criteria are met, move the file from the temporary area of bitstorage to the permanent bitstorage.

If you call this method on an already approved file, with the correct checksum, nothing happens. If the checksum is wrong, you get an exception.

Input parameters:

  • URI fileurl The url to the file (in bitstorage)

  • String md5sum The md5sum of the file

Throws:

  • RemoteException If anything went wrong with the connection.

  • UnknownURI If the URI is not to this bitstorage.

  • WrongChecksum If the provided checksum does not match.

  • CannotStoreFile If the file cannot be stored in the permanent storage area or if the file is not available in the temporary area.

disapproveFile

Delete the named file from bitstorage. Only works for files that have not yet been approved.

If the file is not in temporary bitstorage nothing happens.

Input parameters:

  • URI fileurl The url to the file (in bitstorage).

  • String md5sum The md5sum of the file.

Throws:

  • RemoteException If anything went wrong with the connection.

  • UnknownURI If the URI is not to this bitstorage.

  • WrongChecksum If the checksum does not match the file in the temporary area.
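
Taken together, approveFile and disapproveFile describe a small lifecycle: a file starts in the temporary area, approval moves it to permanent storage, and disapproval only affects files that are still temporary. The sketch below models those rules in memory. Every name in it is hypothetical, and for brevity it uses unchecked exceptions where the real API declares WrongChecksum and CannotStoreFile.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative in-memory model of the approve/disapprove rules; not the real service. */
class ApprovalLifecycleSketch {
    enum Area { TEMPORARY, PERMANENT }

    private static final class Entry {
        final String md5;
        Area area = Area.TEMPORARY;
        Entry(String md5) { this.md5 = md5; }
    }

    private final Map<String, Entry> entriesByUrl = new HashMap<>();

    /** Models a successful upload: the file lands in the temporary area. */
    void upload(String fileUrl, String md5) {
        entriesByUrl.put(fileUrl, new Entry(md5));
    }

    /** Moves the file to permanent storage; repeating the call changes nothing. */
    void approve(String fileUrl, String md5) {
        Entry entry = entriesByUrl.get(fileUrl);
        if (entry == null) {
            throw new IllegalStateException("file not available: " + fileUrl);
        }
        requireMatch(entry, md5);
        entry.area = Area.PERMANENT;
    }

    /** Deletes the file if it is still temporary; otherwise nothing happens. */
    void disapprove(String fileUrl, String md5) {
        Entry entry = entriesByUrl.get(fileUrl);
        if (entry == null || entry.area != Area.TEMPORARY) {
            return;
        }
        requireMatch(entry, md5);
        entriesByUrl.remove(fileUrl);
    }

    /** Returns the area the file is in, or null if it is not stored at all. */
    Area areaOf(String fileUrl) {
        Entry entry = entriesByUrl.get(fileUrl);
        return entry == null ? null : entry.area;
    }

    private static void requireMatch(Entry entry, String md5) {
        if (!entry.md5.equals(md5)) {
            throw new IllegalArgumentException("wrong checksum");
        }
    }
}
```

In this model an approved file can no longer be removed through disapprove, which matches the statement that disapproveFile only works for files that have not yet been approved.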

spaceLeft

Return the number of bytes left in bitstorage.

Returns:

  • long The number of bytes left in bitstorage.

Throws:

  • RemoteException If something unforeseen broke while performing the request.
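
One plausible use of spaceLeft is a sanity check before uploading. The sketch below is an assumption about how a caller might do that, not a documented requirement: the remote call is passed in as a LongSupplier so the helper stays independent of the generated client, and the names SpaceCheckSketch and hasRoomFor are illustrative.

```java
import java.util.function.LongSupplier;

/** Illustrative pre-upload check built on a spaceLeft-style call. */
final class SpaceCheckSketch {
    private SpaceCheckSketch() { }

    /**
     * Returns true if the reported free space is at least the file size.
     * The supplier stands in for the remote spaceLeft call.
     */
    static boolean hasRoomFor(LongSupplier spaceLeft, long fileSizeBytes) {
        if (fileSizeBytes < 0) {
            throw new IllegalArgumentException("file size must not be negative");
        }
        return spaceLeft.getAsLong() >= fileSizeBytes;
    }
}
```

The check is only advisory: other uploads may consume space between the check and the upload, so uploadFile can still fail with CannotStoreFile.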

Data structures

BitstorageFile

Returned by uploadFile.

Contains the following public methods.

  • URI getFileurl

  • String getFileName

  • byte[] getCharacterizationOutput

  • String getMd5CheckSum

  • String getPronomID

  • String getValidationStatus
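
For readers who prefer code to a method list, the structure above could look roughly like the following immutable Java class. This is a sketch under assumptions: the real BitstorageFile is defined by the WSDL, and the class name BitstorageFileSketch, the constructor, and the defensive copying of the byte array are choices made here for illustration.

```java
import java.net.URI;

/** Illustrative value class mirroring the getters listed above. */
final class BitstorageFileSketch {
    private final URI fileurl;
    private final String fileName;
    private final byte[] characterizationOutput;
    private final String md5CheckSum;
    private final String pronomID;
    private final String validationStatus;

    BitstorageFileSketch(URI fileurl, String fileName, byte[] characterizationOutput,
                         String md5CheckSum, String pronomID, String validationStatus) {
        this.fileurl = fileurl;
        this.fileName = fileName;
        // Copy the array so later changes by the caller do not leak in.
        this.characterizationOutput = characterizationOutput.clone();
        this.md5CheckSum = md5CheckSum;
        this.pronomID = pronomID;
        this.validationStatus = validationStatus;
    }

    URI getFileurl() { return fileurl; }
    String getFileName() { return fileName; }
    // Return a copy so the stored characterization output stays unchanged.
    byte[] getCharacterizationOutput() { return characterizationOutput.clone(); }
    String getMd5CheckSum() { return md5CheckSum; }
    String getPronomID() { return pronomID; }
    String getValidationStatus() { return validationStatus; }
}
```

The getters correspond to the places the values go in the Fedora object: the URL and MD5 on the CONTENT datastream, and the PRONOM ID and characterization output in the technical metadata.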

Bitstorage API (last edited 2010-03-17 13:09:38 by localhost)