Differences between revisions 13 and 34 (spanning 21 versions)
Revision 13 as of 2009-02-19 16:45:07
Size: 5565
Editor: jrg
Comment:
Revision 34 as of 2010-03-17 13:09:14
Size: 6060
Editor: localhost
Comment: converted to 1.6 markup

Action Code Full Preingester

Assigned
PKO+JRG

Prev assigned

Tasks addressed

Collection

Time estimated
9 md

Time used
13+ md

Priority
1

Status
Finished

Iteration
19

Notes

(Test with actual objects, ingesting a few of the results into test-Fedora using the Admin application.) Here, the object templates written in an earlier action will be used. Any TODOs in those templates should be finished.

First step

In DDAPreingester.java, make a method that generates FoxML objects for towns. This will happen as part of the process where the FoxML objects for each paper are created.

As each paper is processed to generate the relevant FoxML object, its PID is collected and the number of its associated town is extracted. Each time a new town is encountered, or no more papers exist, a town FoxML object is generated and relations to the relevant papers are inserted, along with metadata on the town itself.
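The grouping described above could be sketched roughly as follows. This is only an illustration, assuming that the papers arrive ordered so that all papers of one town are adjacent. The `TownGrouper` and `Paper` classes and the PID strings are hypothetical and are not taken from DDAPreingester.java; the `doms:hasPaper` element is modelled on the relation expected by the town template and should be treated as an assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and PID formats are assumptions, not the real DDAPreingester code.
class TownGrouper {

    /** Minimal stand-in for a processed paper: its PID and the number of its town. */
    static final class Paper {
        final String pid;
        final int townNumber;

        Paper(String pid, int townNumber) {
            this.pid = pid;
            this.townNumber = townNumber;
        }
    }

    /**
     * Walks the papers in order and closes a town whenever the town number changes
     * or the input ends. Returns town number -> hasPaper relations for that town.
     * Assumes all papers of one town are adjacent in the input.
     */
    static Map<Integer, String> groupByTown(List<Paper> papers) {
        Map<Integer, String> towns = new LinkedHashMap<>();
        StringBuilder relations = new StringBuilder();
        Integer currentTown = null;
        for (Paper paper : papers) {
            if (currentTown != null && currentTown != paper.townNumber) {
                // A new town was encountered: the previous town is complete.
                towns.put(currentTown, relations.toString());
                relations.setLength(0);
            }
            currentTown = paper.townNumber;
            relations.append("<doms:hasPaper rdf:resource=\"info:fedora/")
                     .append(paper.pid)
                     .append("\"/>\n");
        }
        if (currentTown != null) {
            // No more papers exist: flush the last town.
            towns.put(currentTown, relations.toString());
        }
        return towns;
    }
}
```

In the real preingester, the two points where a town is stored in the map would be the places where the town FoxML object is generated from the town template.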

Second step

In DDAPreingester.java, make a method that generates FoxML objects for regions. This will be done in three passes.

The first pass builds a table that maps each town to the region+volume it belongs to. The following proposes a way to build this. (Could there be a better way?) We are given a table (in an object of class DDARegionPageList) that maps each volume number and region name to the page number on which that region appears in the given volume. Using DDATownList we loop through all volumes and towns. For each town number, we find (using DDATownList) the page number of that town. Then, using the table mapping volume+region name to page numbers, we find the two region pages that this town's page lies between. That should give the volume+region this town belongs to.
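One possible way to do the "lies between two region pages" lookup is a sorted map per volume. The sketch below is an assumption-laden illustration: it replaces DDARegionPageList with a plain `TreeMap` from region start page to region name, the `RegionLookup` class is hypothetical, and it assumes that a town on the same page as a region's start belongs to that region.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: DDARegionPageList and DDATownList are replaced by plain collections here.
class RegionLookup {

    /**
     * regionStartPages maps the first page of each region in ONE volume to the region name.
     * A town belongs to the region with the greatest start page that is <= the town's page,
     * i.e. its page lies between that region's start and the next region's start.
     * Returns null if the town's page precedes every region in the volume.
     */
    static String regionForPage(TreeMap<Integer, String> regionStartPages, int townPage) {
        Map.Entry<Integer, String> entry = regionStartPages.floorEntry(townPage);
        return entry == null ? null : entry.getValue();
    }
}
```

With one such map per volume, each town lookup is a single `floorEntry` call rather than a scan over all region pages.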

The second pass gathers the relations that will (in the third pass) be inserted in the region templates at <!-- INSERT_HASTOWN_RELATIONS_HERE -->. This pass constructs an array of strings, one entry for each region(+volume), each entry containing the hasTown relations for that region. To construct a hasTown relation, we need (for the town PID) the volume number and the town number. Using DDATownList we loop through all volumes and towns and read the number of each town. The newly constructed hasTown relation is then appended to the string for the relevant region. To find out which region+volume the town belongs to, we look it up in the table constructed in the first pass.
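As a rough sketch of this pass for a single volume, assuming the first-pass table is available as a map from town number to region name: the `HasTownRelations` class, the `townPid` format, the "volume/region" key, and the `doms:hasTown` element are all invented for illustration and are not confirmed by the templates.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: key format, PID format, and relation element are assumptions.
class HasTownRelations {

    /** Builds an illustrative town PID from volume and town number (format is an assumption). */
    static String townPid(int volume, int town) {
        return "doms:town_" + volume + "_" + town;
    }

    /**
     * townToRegion: for one volume, town number -> region name (the table from the first pass).
     * Returns "volume/region" -> concatenated hasTown relations for that region.
     */
    static Map<String, String> collect(int volume, Map<Integer, String> townToRegion) {
        Map<String, String> relations = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> e : townToRegion.entrySet()) {
            String regionKey = volume + "/" + e.getValue();
            String relation = "<doms:hasTown rdf:resource=\"info:fedora/"
                    + townPid(volume, e.getKey()) + "\"/>\n";
            // Append to the region's existing string, or start a new one.
            relations.merge(regionKey, relation, String::concat);
        }
        return relations;
    }
}
```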

The third pass, finally, creates the actual FoxML files. It loops over volumes and regions using DDARegionPageList, inserting the needed values in a copy of the region template. For the PID, we get the region name and volume from DDARegionPageList. The region name is also inserted as the title. The hasTown relations to be inserted are taken from the table built in the second pass. For the firstpage relation, the PID is simply built using the page number from DDARegionPageList.
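The template-filling step could look roughly like this. Only the placeholder string comes from the description above; the `RegionTemplateFiller` class is hypothetical, and the real region template has further fields (PID, title, firstpage relation) that would be filled in the same manner.

```java
// Hypothetical sketch: only the placeholder text is taken from the action description.
class RegionTemplateFiller {

    static final String HASTOWN_PLACEHOLDER = "<!-- INSERT_HASTOWN_RELATIONS_HERE -->";

    /** Returns a copy of the template with the hasTown relations inserted at the placeholder. */
    static String fill(String template, String hasTownRelations) {
        if (!template.contains(HASTOWN_PLACEHOLDER)) {
            throw new IllegalArgumentException("template lacks " + HASTOWN_PLACEHOLDER);
        }
        // String.replace does literal (non-regex) replacement, so the relations go in verbatim.
        return template.replace(HASTOWN_PLACEHOLDER, hasTownRelations);
    }
}
```

Failing loudly when the placeholder is missing helps catch a template that still has unfinished TODOs before any objects are ingested.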

Third step

Hand-craft FoxML objects for the three volumes and the super volume.



Checklist For Working On An Action

The Life Cycle of an Action:

  • Assign people for action definition: Done at the start of the iteration status meeting. Fill out Assigned.

  • Define the action: Describe information about what is to be done and how. Fill out Tasks Addressed and Time Estimated.

  • Review the definition: Get another project group member to review the action definition, and update it.

  • Assign people for action implementation: Done by project manager, usually the same persons who wrote the definition. Fill out Assigned and Prev assigned if new persons are assigned.

  • Implement the action: See details below

  • Review the action: Get another project group member to review what is implemented (code and documentation), and update it.

  • Finish the action: Change the status to "Finished" and update the "time used" field on the action page.

Please make sure that you address the issues below when working on an action:

  • Update the state of the action to "In Progress" when you start working on it.
  • Check if the tasks addressed by this action have their status set to "In Progress". If that is not the case, then change the state of them.
  • Keep track of how much time has been spent working on the action. If it addresses more than one task, then make a note on the action page about how much of the elapsed time has been spent on the individual tasks. Hint: Continually updating the "Time used" field will make it easier for you.

  • Update the "Progress History" and documentation pages of each task addressed by this action when appropriate. This depends on the situation, but in general, the task pages should hold all important related information about the work done, experiences gathered, identified requirements and so on.

ActionCodeFullPreingester (last edited 2010-03-17 13:09:14 by localhost)