Analysis of 20.1 Task
Prerequisites and Assumptions
TSH, ABR, and JRG had a long discussion about how to implement Planets integration with the DOMS characterizer. We decided that the best solution would be to:
- make new wrappings (inside the Planets code base) for the tools Droid and Jhove. Each wrapping should implement a characterize webservice that will be called by the DOMS characterize module.
A major point of discussion was what Jhove and Droid should return when called with a file to characterize. The code that parses the raw Jhove/Droid output and makes decisions based on it has to live somewhere. The discussion came down to a choice between two solutions:
1. Raw tool-output handled by DOMS-characterizer: Jhove and Droid wrapping webservices return only an output dump from the tools, and the DOMS characterize module takes care of parsing and deciding.
- Simpler interface, because of simpler return-values to DOMS-characterizer.
- We avoid double "parsing", and thus need less code: in the second solution the raw tool-output is first parsed in the Planets webservices, and the returned properties must then be "parsed" again by the DOMS characterizer.
- Not as modular a solution.
2. Raw tool-output handled by Planets-services: Jhove and Droid wrapping webservices parse the tool output and return a few properties (file type URIs) that the DOMS characterize module decides on, as well as the full dump for storing on the file objects.
- Because all webservices return the same properties, the DOMS characterizer would remain largely unchanged when Planets characterize-webservices are replaced or added.
- Tool-specifics are handled near the tool (each characterizing webservice).
- In choosing a specific properties-format for selected information from the full characterization output dump, we may fail to foresee and include all the data that may some day be needed, e.g. for searching for files of specific types in DOMS. (This information would then have to be generated later from the full characterization dumps stored on the file objects in DOMS.)
We have to make a choice.
As ABR and TSH each preferred a different one of the above solutions, JRG had to choose. The choice is solution 2, raw tool-output handled by Planets-services, on the thought that modularity matters more (I could be wrong, but a choice had to be made).
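To make the chosen solution (raw tool-output handled by the Planets services) more concrete, here is a minimal Java sketch of what a characterize webservice could return. The names CharacterizeResult and CharacterizeService are hypothetical and do not come from the Planets or DOMS code bases. The sketch only assumes what is described above: a few parsed file type URIs plus the full dump.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical return value of a characterize webservice under solution 2.
// All names are illustrative; none come from the Planets or DOMS code.
final class CharacterizeResult {
    private final String toolName;        // e.g. "Droid" or "Jhove"
    private final List<URI> fileTypeUris; // parsed by the wrapping; empty if the tool did not recognize the file
    private final String rawOutput;       // full tool dump, kept for storing on the file object

    CharacterizeResult(String toolName, List<URI> fileTypeUris, String rawOutput) {
        this.toolName = toolName;
        // Defensive copy so the result cannot be changed after creation.
        this.fileTypeUris = Collections.unmodifiableList(new ArrayList<>(fileTypeUris));
        this.rawOutput = rawOutput;
    }

    String getToolName() { return toolName; }
    List<URI> getFileTypeUris() { return fileTypeUris; }
    String getRawOutput() { return rawOutput; }
}

// Hypothetical contract shared by every tool wrapping, so that the DOMS
// characterizer can treat Droid, Jhove and later tools alike.
interface CharacterizeService {
    CharacterizeResult characterize(byte[] fileContent);
}
```

With a shape like this, the DOMS characterizer would only read getFileTypeUris() when deciding and would store getRawOutput() untouched. This is where the modularity argument comes from: a new tool only needs a new implementation of the interface.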
Other points are:
- If none of the characterizing tools (Droid/Jhove/...) know the type of the given file, the DOMS characterizer should reject the file. (Right?)
- An open question is which file type URIs should be returned to DOMS. From Planets we sometimes get Pronom URIs, sometimes ad-hoc URIs, and maybe sometimes even no URIs. When mapping these answers to some common "standard", we could to some extent re-use the existing URIs received from Planets. We probably also need to create our own URIs for the cases where Planets does not give one.
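The two points above can be sketched in Java as follows. This is only an illustration under assumptions: the class FileTypeMapper, the local prefix "info:doms/filetype/", the idea of minting a URI from a tool's format label, and rejecting by throwing an exception are all hypothetical, and the sketch re-uses every URI received from Planets as-is, which may be more than we want.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;

// Hypothetical helper; not part of the DOMS or Planets code.
final class FileTypeMapper {
    // Assumed namespace for URIs we create ourselves.
    static final String LOCAL_PREFIX = "info:doms/filetype/";

    // Maps one tool answer to a URI in the common "standard".
    // Call this only when the tool recognized the file.
    static URI toCommonUri(String reportedUri, String formatLabel) {
        if (reportedUri != null && !reportedUri.isEmpty()) {
            // Re-use the URI received from Planets (Pronom or ad-hoc).
            return URI.create(reportedUri);
        }
        // No URI given: create our own from the tool's format label.
        String slug = formatLabel.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-");
        return URI.create(LOCAL_PREFIX + slug);
    }

    // Collects the URIs from all tools, dropping duplicates, and rejects
    // the file if no tool recognized it.
    static List<URI> decide(List<URI> urisFromAllTools) {
        if (urisFromAllTools.isEmpty()) {
            throw new IllegalStateException("File rejected: no tool recognized its type");
        }
        return new ArrayList<>(new LinkedHashSet<>(urisFromAllTools));
    }
}
```

For example, toCommonUri("info:pronom/fmt/18", "PDF 1.4") would pass the Pronom URI through unchanged, while toCommonUri(null, "PDF 1.4") would create info:doms/filetype/pdf-1-4.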