Differences between revisions 1 and 6 (spanning 5 versions)
Revision 1 as of 2008-06-26 12:26:05
Size: 1866
Editor: kfc
Comment: Created by the PackagePages action.
Revision 6 as of 2008-10-16 14:03:45
Size: 5246
Editor: jrg
Comment:
Line 5: Line 5:
 * The file must be uploaded to bitstorage  * The file must be uploaded to bitstorage, and characterised
Line 7: Line 7:
 * The file must be characterised (see the API for this), and the result added to a Technical Metadata-stream in the object (the file and the characterisation result must match, see the data model documentation)  * The characterisation output must be added to a Technical Metadata-stream in the object (the file and the characterisation result must match, see the data model documentation)
Line 9: Line 9:
Uploading the file to bitstorage must be done with the following API: Uploading the file to bit storage must be done with API calls for
Line 11: Line 11:
(WORK IN PROGRESS! WE NEED TO WRAP THIS IN A WEBSERVICE!)

The interface is command line based, and the client must have ssh access to halley.statsbiblioteket.dk.

The interface is as follows
{{{$ ssh doms@halley <<command>> <<filename>>}}}

Commands are one of
Line 20: Line 12:
 save-md5) : save a file and get md5sum back
 get-md5) : get md5sum of a file
 approve) : approve a file
 delete) : delete a file not approved
 get) : get a file
 getmd5s) : report md5 sums of stored files
 space-left) : report space left
  uploadFile - in: filename,URLToFile,md5 out:characterizationOutput,fileName,fileUrl,md5,pronomId,validationStatus
  approveFile - in: fileUrl,md5
  disapproveFile - in: fileUrl,md5
  spaceLeft - out: bytes
Line 29: Line 18:
Files are exchanged using {{{stdin}}} and {{{stdout}}}. Errors are reported to {{{stderr}}}. Basically, you upload a file by presenting the bit storage resolvable URI, and the expected MD5 sum. The file will then be made available, and you will get a URI the file is stored on, and characteristics of the file back.
Line 31: Line 20:
To upload a file, use {{{save-md5}}}, and check that the checksum is as expected. Otherwise {{{delete}}} and retry. Those characteristics MUST be added in the proper places of the accompanying Fedora object.
Line 33: Line 22:
Once all Fedora objects referring to the file are validated, use {{{approve}}} to publish the file. Approve is used when the accompanying Fedora object is first set to published.
Line 35: Line 24:
Preferably, use {{{space-left}}} first as sanity check. Disapprove is used if the accompanying Fedora object is marked deleted before publishing.

== Examples ==

Adding a file and an accompanying fedora metadata object:

 * Make a new fedora object of type File in draft mode
 * Call upload on the bitstorage API
 * Add pronomid and characterisation to the appropriate datastreams
 * Set URL and MD5 on the CONTENT datastream

Publishing a file fedora object

 * Set fedora metadata objects to state "published"
 * Call {{{approve}}} from the bitstorage API
Line 39: Line 42:
=== Bitstorage API ===
Line 40: Line 44:
== Examples == WSDL: attachment:Bitstorage.wsdl
Line 42: Line 46:
Adding a file to a fedora metadata object: The following describes those methods of the Bitstorage API that may be called by the GUI.
Line 44: Line 48:
 * Make a new XML object of type File
 * Call characterisation service on file, add metadata to the object, add file type id to the technical metadata
 * Call {{{save-md5}}} on bitfinder API, check checksum
 * Set fedora metadata objects to state "intermediate"
 * Ingest the fedora file object
 * Set fedora metadata objects to state "published"
 * Call {{{approve}}} from the bitfinder API
==== uploadFile ====
Upload the provided file to the temporary area of bitstorage,
giving it the provided file name.
Return the MD5 checksum of the file.
The file is only uploaded to a temporary approve-area of
the bitstorage, and needs to be approved by calling approveFile before
it is actually moved to the permanent bitstorage.

If you try to upload a file that is already there, the service checks the provided
md5 against the checksum of the file on the server. If they match, there is no
upload; you just get the return value describing the file already there.
If they do not match, an exception is thrown.

Input parameters:
 * {{{String fileName}}} The filename to store the file by.
 * {{{URI localurl}}} The url to where the bitstorage webservice can get the file.
 * {{{String provided_md5}}} The locally generated md5sum of the file.

Returns:
 * {{{BitstorageFile}}} A bitstorage object, detailing the characteristics and public url of the uploaded file. Data structure summarized below.

Throws:
 * {{{RemoteException}}} If anything went wrong.
 * {{{CannotGetFile}}} If the service cannot get the file from the localurl.
 * {{{InvalidFileName}}} If the provided fileName is invalid.
 * {{{CannotStoreFile}}} If the service cannot store the file on the server.
 * {{{WrongChecksum}}} If the provided checksum does not match.
 * {{{CharacterizationFailed}}} If the characterization service failed somehow.
 * {{{DifferentFileWithThatNameExist}}} If there is already a file with that name, but a different checksum.


==== approveFile ====
Check the earlier uploaded file against the provided checksum, and if
this succeeds, and possibly other criteria are met, move the file
from the temporary area of bitstorage to the permanent bitstorage.

If you call this method on an already approved file, with the correct checksum, nothing happens.
If the checksum is wrong, you get an exception.

Input parameters:
 * {{{URI fileurl}}} The url to the file (in bitstorage)
 * {{{String md5sum}}} The md5sum of the file

Throws:
 * {{{RemoteException}}} If anything went wrong with the connection.
 * {{{UnknownURI}}} If the URI is not to this bitstorage.
 * {{{WrongChecksum}}} If the provided checksum does not match.
 * {{{CannotStoreFile}}} If the file cannot be stored in the permanent storage area or if the file is not available in the temporary area.


==== disapproveFile ====
Delete the named file from bitstorage. Only works for files that have
not yet been approved.

If the file is not in temporary bitstorage nothing happens.

Input parameters:
 * {{{URI fileurl}}} The url to the file (in bitstorage).
 * {{{String md5sum}}} The md5sum of the file.

Throws:
 * {{{RemoteException}}} If anything went wrong with the connection.
 * {{{UnknownURI}}} If the URI is not to this bitstorage.
 * {{{WrongChecksum}}} If the checksum does not match the file in temporary bitstorage.


==== spaceLeft ====
Return the number of bytes left in bitstorage.

Returns:
 * {{{long}}} The number of bytes left in bitstorage.

Throws:
 * {{{RemoteException}}} If something unforeseen broke while performing the request.


=== Data structures ===

==== BitstorageFile ====
Returned by uploadFile.

Contains the following public methods.

 * {{{URI getFileurl}}}
 * {{{String getFileName}}}
 * {{{byte[] getCharacterizationOutput}}}
 * {{{String getMd5CheckSum}}}
 * {{{String getPronomID}}}
 * {{{String getValidationStatus}}}

Bitstorage API

To add a file to DOMS, three things must happen:

  • The file must be uploaded to bitstorage, and characterised
  • The file must be connected to a Fedora object datastream of type TypeFile

  • The characterisation output must be added to a Technical Metadata-stream in the object (the file and the characterisation result must match, see the data model documentation)

Uploading the file to bitstorage must be done with the following API calls:

  uploadFile - in: filename, URLToFile, md5; out: characterizationOutput, fileName, fileUrl, md5, pronomId, validationStatus
  approveFile - in: fileUrl, md5
  disapproveFile - in: fileUrl, md5
  spaceLeft - out: bytes

To upload a file, you give bitstorage a URI from which it can fetch the file, together with the expected MD5 sum. Bitstorage then makes the file available and returns the URI at which the file is stored, along with the characteristics of the file.

These characteristics MUST be added in the proper places of the accompanying Fedora object.

Approve is used when the accompanying Fedora object is first set to published.

Disapprove is used if the accompanying Fedora object is marked deleted before publishing.

Examples

Adding a file and an accompanying Fedora metadata object:

  • Make a new Fedora object of type File in draft mode
  • Call uploadFile on the bitstorage API
  • Add the PRONOM ID and characterisation output to the appropriate datastreams
  • Set URL and MD5 on the CONTENT datastream

Publishing a file Fedora object:

  • Set Fedora metadata objects to state "published"
  • Call approveFile from the bitstorage API
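
The two step lists above can be sketched in Java as follows. This is only an illustration of the order of operations, not the DOMS implementation: the Bitstorage interface, the Uploaded holder, the FileObject class and every datastream key except CONTENT are hypothetical stand-ins, and checked exceptions such as RemoteException are left out to keep the sketch short.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

class AddAndPublishSketch {

    /** Hypothetical, simplified view of the two bitstorage calls used here. */
    interface Bitstorage {
        Uploaded uploadFile(String fileName, URI localUrl, String md5);
        void approveFile(URI fileUrl, String md5);
    }

    /** Hypothetical holder for the values this sketch needs from uploadFile. */
    static class Uploaded {
        final URI fileUrl;
        final String md5;
        final String pronomId;
        final String characterisation;

        Uploaded(URI fileUrl, String md5, String pronomId, String characterisation) {
            this.fileUrl = fileUrl;
            this.md5 = md5;
            this.pronomId = pronomId;
            this.characterisation = characterisation;
        }
    }

    /** Hypothetical stand-in for a Fedora object of type File. */
    static class FileObject {
        String state = "draft";
        final Map<String, String> datastreams = new LinkedHashMap<String, String>();
    }

    /** "Adding a file": draft object, upload, then record the returned values. */
    static FileObject addFile(Bitstorage bitstorage, String fileName, URI localUrl, String md5) {
        FileObject object = new FileObject(); // starts in draft mode
        Uploaded uploaded = bitstorage.uploadFile(fileName, localUrl, md5);
        // Keys other than CONTENT are placeholders, not names from the data model.
        object.datastreams.put("PRONOM_ID", uploaded.pronomId);
        object.datastreams.put("CHARACTERISATION", uploaded.characterisation);
        object.datastreams.put("CONTENT.url", uploaded.fileUrl.toString());
        object.datastreams.put("CONTENT.md5", uploaded.md5);
        return object;
    }

    /** "Publishing": mark the object published, then approve the file. */
    static void publish(Bitstorage bitstorage, FileObject object) {
        object.state = "published";
        bitstorage.approveFile(URI.create(object.datastreams.get("CONTENT.url")),
                               object.datastreams.get("CONTENT.md5"));
    }
}
```

The sketch keeps the URL and MD5 returned by the upload on the object, so that the later approve call uses exactly the values stored on the CONTENT datastream.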

Bitstorage API

WSDL: attachment:Bitstorage.wsdl

The following describes those methods of the Bitstorage API that may be called by the GUI.

uploadFile

Upload the provided file to the temporary area of bitstorage under the provided file name, and return a BitstorageFile describing it, including its MD5 checksum. The file is only uploaded to a temporary approval area of bitstorage, and must be approved by calling approveFile before it is moved to permanent bitstorage.

If a file with that name is already there, the service checks the provided md5 against the checksum of the file on the server. If they match, nothing is uploaded and you get the usual return value, describing the file already there. If they do not match, an exception (DifferentFileWithThatNameExist) is thrown.

Input parameters:

  • String fileName The filename to store the file by.

  • URI localurl The URL from which the bitstorage webservice can get the file.

  • String provided_md5 The locally generated md5sum of the file.

Returns:

  • BitstorageFile A bitstorage object, detailing the characteristics and public URL of the uploaded file. Data structure summarized below.

Throws:

  • RemoteException If anything went wrong.

  • CannotGetFile If the service cannot get the file from the localurl.

  • InvalidFileName If the provided fileName is invalid.

  • CannotStoreFile If the service cannot store the file on the server.

  • WrongChecksum If the provided checksum does not match.

  • CharacterizationFailed If the characterization service failed somehow.

  • DifferentFileWithThatNameExist If there is already a file with that name, but a different checksum.
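
The provided_md5 parameter has to be computed on the client before calling uploadFile. A minimal sketch of that computation follows, using only the standard java.security.MessageDigest class. It assumes the service expects the digest as a lowercase hexadecimal string, as printed by md5sum; the page does not state the format. The class name Md5Sketch is made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class Md5Sketch {

    /** Returns the MD5 digest of everything readable from the stream, as lowercase hex. */
    static String md5Hex(InputStream in) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[8192];
            int read;
            // Feed the stream to the digest in chunks so large files need not fit in memory.
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available on this JVM", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a file on disk, the stream would typically come from java.nio.file.Files.newInputStream. Comparing this value with the md5 in the returned BitstorageFile is a cheap extra check on the client side.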

approveFile

Check the previously uploaded file against the provided checksum. If the check succeeds, and any other criteria the service applies are met, move the file from the temporary area of bitstorage to permanent bitstorage.

If you call this method on an already approved file, with the correct checksum, nothing happens. If the checksum is wrong, you get an exception.

Input parameters:

  • URI fileurl The url to the file (in bitstorage)

  • String md5sum The md5sum of the file

Throws:

  • RemoteException If anything went wrong with the connection.

  • UnknownURI If the URI is not to this bitstorage.

  • WrongChecksum If the provided checksum does not match.

  • CannotStoreFile If the file cannot be stored in the permanent storage area or if the file is not available in the temporary area.

disapproveFile

Delete the named file from bitstorage. Only works for files that have not yet been approved.

If the file is not in temporary bitstorage nothing happens.

Input parameters:

  • URI fileurl The url to the file (in bitstorage).

  • String md5sum The md5sum of the file.

Throws:

  • RemoteException If anything went wrong with the connection.

  • UnknownURI If the URI is not to this bitstorage.

  • WrongChecksum If the checksum does not match the file in temporary bitstorage.
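
The behaviour described for uploadFile, approveFile and disapproveFile amounts to a small state model with a temporary and a permanent area. The following in-memory sketch illustrates those rules; it is not the service's implementation. It keys files by name rather than by URL, uses unchecked exceptions that merely borrow the names listed above, and assumes that "already there" covers both areas, which the page leaves open.

```java
import java.util.HashMap;
import java.util.Map;

class StagingModelSketch {

    // Unchecked stand-ins named after the exceptions listed on this page.
    static class WrongChecksum extends RuntimeException {}
    static class DifferentFileWithThatNameExist extends RuntimeException {}
    static class CannotStoreFile extends RuntimeException {}

    private final Map<String, String> temporary = new HashMap<String, String>(); // name -> md5
    private final Map<String, String> permanent = new HashMap<String, String>(); // name -> md5

    /** Upload: a repeated upload with the same checksum changes nothing. */
    void upload(String name, String md5) {
        String existing = temporary.containsKey(name) ? temporary.get(name) : permanent.get(name);
        if (existing == null) {
            temporary.put(name, md5);
        } else if (!existing.equals(md5)) {
            throw new DifferentFileWithThatNameExist();
        }
    }

    /** Approve: move from temporary to permanent; repeating it is harmless. */
    void approve(String name, String md5) {
        String approved = permanent.get(name);
        if (approved != null) {
            if (!approved.equals(md5)) throw new WrongChecksum();
            return; // already approved with the correct checksum: nothing happens
        }
        String staged = temporary.get(name);
        if (staged == null) throw new CannotStoreFile(); // not in the temporary area
        if (!staged.equals(md5)) throw new WrongChecksum();
        permanent.put(name, temporary.remove(name));
    }

    /** Disapprove: delete from the temporary area only. */
    void disapprove(String name, String md5) {
        String staged = temporary.get(name);
        if (staged == null) return; // not in temporary bitstorage: nothing happens
        if (!staged.equals(md5)) throw new WrongChecksum();
        temporary.remove(name);
    }

    boolean isTemporary(String name) { return temporary.containsKey(name); }
    boolean isPermanent(String name) { return permanent.containsKey(name); }
}
```

In this model all three operations can be repeated safely with the same arguments, which is what lets a client retry after a connection failure.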

spaceLeft

Return the number of bytes left in bitstorage.

Returns:

  • long The number of bytes left in bitstorage.

Throws:

  • RemoteException If something unforeseen broke while performing the request.
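
One plausible use of spaceLeft is a sanity check before an upload. The sketch below shows such a check; the class SpaceCheckSketch is hypothetical, and the LongSupplier stands in for the actual service call. The result is only advisory, since the free space can change between the check and the upload.

```java
import java.util.function.LongSupplier;

class SpaceCheckSketch {

    /** True when a file of the given size fits in the space the service reports. */
    static boolean fits(long fileSizeBytes, LongSupplier spaceLeft) {
        if (fileSizeBytes < 0) {
            throw new IllegalArgumentException("negative file size");
        }
        return fileSizeBytes <= spaceLeft.getAsLong();
    }
}
```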

Data structures

BitstorageFile

Returned by uploadFile.

Contains the following public methods.

  • URI getFileurl

  • String getFileName

  • byte[] getCharacterizationOutput

  • String getMd5CheckSum

  • String getPronomID

  • String getValidationStatus
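
For orientation, a class with these six getters could look like the sketch below. The real class is defined by the WSDL and may differ; the name BitstorageFileSketch, the constructor and the immutability are assumptions made for this example.

```java
import java.net.URI;

final class BitstorageFileSketch {
    private final URI fileurl;
    private final String fileName;
    private final byte[] characterizationOutput;
    private final String md5CheckSum;
    private final String pronomID;
    private final String validationStatus;

    BitstorageFileSketch(URI fileurl, String fileName, byte[] characterizationOutput,
                         String md5CheckSum, String pronomID, String validationStatus) {
        this.fileurl = fileurl;
        this.fileName = fileName;
        // Copy the array so later changes by the caller do not leak into this object.
        this.characterizationOutput = characterizationOutput.clone();
        this.md5CheckSum = md5CheckSum;
        this.pronomID = pronomID;
        this.validationStatus = validationStatus;
    }

    public URI getFileurl() { return fileurl; }
    public String getFileName() { return fileName; }
    public byte[] getCharacterizationOutput() { return characterizationOutput.clone(); }
    public String getMd5CheckSum() { return md5CheckSum; }
    public String getPronomID() { return pronomID; }
    public String getValidationStatus() { return validationStatus; }
}
```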

Bitstorage API (last edited 2010-03-17 13:09:38 by localhost)