Analysis of 20.1 Task

Prerequisites and Assumptions

TSH, ABR, and JRG had a long discussion about how to implement Planets integration with the DOMS characterizer. We decided that the best solution would be to:

A major point of discussion was what the Jhove and Droid services should return when called with a file to characterize. The code that parses the raw Jhove/Droid output and makes a decision based on it has to live somewhere. The discussion came down to a choice between two solutions:

  1. Raw tool-output handled by DOMS-characterizer: Jhove and Droid wrapping webservices return only an output dump from the tools, and the DOMS characterise module takes care of parsing and deciding.

    • Advantages:
      • Simpler interface, because the values returned to the DOMS characterizer are simpler.
      • Less code, because we avoid parsing twice: in the second solution the Planets webservices first parse the raw tool output, and the DOMS characterizer then has to "parse" the returned properties.
    • Disadvantages:
      • Not as modular a solution.
  2. Raw tool-output handled by Planets-services: Jhove and Droid wrapping webservices parse tool output and return a few properties (file type URIs) that the DOMS characterise module decides on, as well as the full dump for storing on the file objects.

    • Advantages:
      • Because all webservices return the same properties, the DOMS characterizer would largely remain unchanged when Planets characterize webservices are replaced or added.
      • Tool specifics are handled near the tool (in each characterizing webservice).
    • Disadvantages:
      • By choosing a specific properties format for selected information from the full characterization output dump, we may fail to foresee and include all the data that may some day be needed, e.g. for searching for files of specific types in DOMS. (This information would then have to be generated later from the full characterization dumps stored on file objects in DOMS.)

We have to make a choice.

As ABR and TSH each preferred a different one of the above solutions, JRG had to choose. The choice is solution 2, based on the thought that modularity matters more. (I could be wrong, but we have to make a choice.)
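
To make solution 2 more concrete, the following is a minimal sketch of what the return value of a characterizing webservice and the DOMS-side decision might look like. All names here (Option2Sketch, CharacterizationResult, CharacterizeService, isAcceptable) are hypothetical and not taken from the Planets or DOMS code. The decision rule (accept the file if any reported file type URI is on an allowed list) is only an assumption for illustration, and so is the example URI. The sketch assumes Java 10 or later.

```java
import java.net.URI;
import java.util.List;

public class Option2Sketch {

    /** Hypothetical return value shared by all characterizing webservices. */
    static final class CharacterizationResult {
        private final List<URI> formatUris; // the few parsed properties: file type URIs
        private final String rawOutput;     // full tool dump, to be stored on the file object

        CharacterizationResult(List<URI> formatUris, String rawOutput) {
            this.formatUris = List.copyOf(formatUris);
            this.rawOutput = rawOutput;
        }

        List<URI> getFormatUris() { return formatUris; }
        String getRawOutput() { return rawOutput; }
    }

    /** Hypothetical interface that both the Jhove and the Droid wrapper would implement. */
    interface CharacterizeService {
        CharacterizationResult characterize(byte[] fileContent);
    }

    /**
     * Assumed DOMS-side decision: the file is acceptable if at least one
     * reported file type URI is in the list of allowed formats.
     * No tool-specific parsing happens here.
     */
    static boolean isAcceptable(CharacterizationResult result, List<URI> allowedFormats) {
        for (URI uri : result.getFormatUris()) {
            if (allowedFormats.contains(uri)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Stand-in for a real wrapper service; the URI and dump are made-up values.
        CharacterizeService fakeService = content -> new CharacterizationResult(
                List.of(URI.create("info:pronom/fmt/18")),
                "<raw tool output would go here>");

        CharacterizationResult result = fakeService.characterize(new byte[0]);
        List<URI> allowed = List.of(URI.create("info:pronom/fmt/18"));

        System.out.println("Acceptable: " + isAcceptable(result, allowed));
        System.out.println("Dump to store: " + result.getRawOutput());
    }
}
```

In this shape, replacing Jhove or Droid with another tool would only require a new CharacterizeService implementation, while the deciding code stays the same. The full dump still travels along with the result, so data not foreseen in the properties can be extracted from it later.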

Other points are:

Resources