Revision 18 as of 2009-02-18 11:50:55 (size: 6489, editor: jrg)
Revision 19 as of 2010-03-17 13:09:13 (size: 6491, editor: localhost, comment: converted to 1.6 markup)

Action Code Basic Preingester

Assigned: PKO+JRG
Prev assigned:
Tasks addressed: Collection
Time estimated: 9 md
Time used: 7 md
Priority: 2
Status: Finished
Iteration: 18

Notes

Code the basic ("bottom-most") steps of the preingester for De Danske Aviser, that is, the code for preingesting pngfile, page, and paper objects.

The object templates written in an earlier action will be used here. Any TODOs in those templates should be finished. Test with actual objects by ingesting a few of the results into the test Fedora using the Admin application.

First step

In DDAPreingester.java, fill in the method generateFoxMLFilesForEachPage.
Generate a FoxML object for each page. To do this, run through the filenames of the png files. For each filename, make a copy of the page template and use the volume number and page number (taken from the filename) to construct the strings to insert at INSERT_PID_HERE and INSERT_TITLE_HERE in the copy. Insert them, and save the copy to disk under a filename like dda_page_vol3_page123.xml.

The PID should be constructed as described under page on the datamodel page. That page also describes how the title should look.
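The first step could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name PageFoxmlSketch and the PID and title formats are hypothetical (the real formats are defined on the datamodel page), and the volume and page numbers are assumed to have been parsed from the png filename already. Only the placeholder names and the output filename pattern come from the description above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative sketch only; not the actual DDAPreingester code. */
final class PageFoxmlSketch {

    // HYPOTHETICAL PID format; the real one is defined on the datamodel page.
    static String pagePid(int volume, int page) {
        return "dda:page_vol" + volume + "_page" + page;
    }

    // HYPOTHETICAL title format; the real one is defined on the datamodel page.
    static String pageTitle(int volume, int page) {
        return "Volume " + volume + ", page " + page;
    }

    // Output filename pattern taken from the step description.
    static String outputFileName(int volume, int page) {
        return "dda_page_vol" + volume + "_page" + page + ".xml";
    }

    // Works on a copy of the template text: String.replace returns a new string.
    static String fillTemplate(String template, int volume, int page) {
        return template
                .replace("INSERT_PID_HERE", pagePid(volume, page))
                .replace("INSERT_TITLE_HERE", pageTitle(volume, page));
    }

    // Fills the template and saves the result in outDir.
    static Path writePage(String template, int volume, int page, Path outDir)
            throws IOException {
        Path out = outDir.resolve(outputFileName(volume, page));
        byte[] bytes = fillTemplate(template, volume, page)
                .getBytes(StandardCharsets.UTF_8);
        return Files.write(out, bytes);
    }
}
```

Keeping the string construction separate from the file writing makes it possible to check the generated PIDs and titles without touching the disk.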

Second step

In DDAPreingester.java, fill in the method generateFoxMLFilesForEachPngFile.
Generate a FoxML object for each pngfile. To do this, run through the filenames of the png files. For each filename, make a copy of the pngfile template and use the volume number and page number (taken from the filename) to construct the strings to insert at INSERT_PID_HERE, INSERT_FILENAME_HERE, and INSERT_URL_HERE in the copy. Insert them, and save the copy to disk under a filename like dda_pngfile_vol3_page123.

The PID should be constructed as described under pngfile on the datamodel page. That page also describes how the filename and URL should look.
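A possible shape for the second step is sketched below. Several things in it are assumptions rather than facts from this page: the png filename pattern (vol3_page123.png), the PID format, the way the URL is built from a base URL, and the class name PngFileFoxmlSketch are all hypothetical. The datamodel page remains the authority for the real PID, filename, and URL formats.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; not the actual DDAPreingester code. */
final class PngFileFoxmlSketch {

    // ASSUMPTION: png files are named like "vol3_page123.png".
    private static final Pattern PNG_NAME =
            Pattern.compile("vol(\\d+)_page(\\d+)\\.png");

    /** Returns {volume, page} parsed from the filename, or empty if it does not match. */
    static Optional<int[]> parse(String fileName) {
        Matcher m = PNG_NAME.matcher(fileName);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new int[] {
                Integer.parseInt(m.group(1)), Integer.parseInt(m.group(2))});
    }

    /** Maps each placeholder to its value. PID and URL formats are HYPOTHETICAL. */
    static Map<String, String> replacements(String fileName, int volume, int page,
                                            String baseUrl) {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("INSERT_PID_HERE", "dda:pngfile_vol" + volume + "_page" + page);
        r.put("INSERT_FILENAME_HERE", fileName);
        r.put("INSERT_URL_HERE", baseUrl + "/" + fileName);
        return r;
    }

    /** Replaces every placeholder in a copy of the template text. */
    static String fill(String template, Map<String, String> replacements) {
        String result = template;
        for (Map.Entry<String, String> e : replacements.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

Filenames that do not match the expected pattern yield an empty Optional, so the caller can skip or report them instead of generating a broken object.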

Third step

In DDAPreingester.java, expand the method generateFoxMLFilesForEachPage so that it handles the relations hasFile, previousPage, and nextPage. Replace the strings INSERT_PNGFILE_PID_HERE, INSERT_PREVIOUSPAGE_PID_HERE, and INSERT_NEXTPAGE_PID_HERE with the appropriate PIDs. See the datamodel page for how the PIDs of pages and pngfiles are constructed.
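The relation handling in the third step might look something like the sketch below. The PID formats and the class name PageRelationsSketch are hypothetical, and the sketch assumes the page numbers of one volume are available as a sorted list. This page does not say what should happen for the first page (no previous page) or the last page (no next page), so the sketch simply leaves the corresponding placeholder untouched in those cases; the real code would need a deliberate decision there.

```java
import java.util.List;
import java.util.Optional;

/** Illustrative sketch only; not the actual DDAPreingester code. */
final class PageRelationsSketch {

    // HYPOTHETICAL PID formats; the real ones are defined on the datamodel page.
    static String pagePid(int volume, int page) {
        return "dda:page_vol" + volume + "_page" + page;
    }

    static String pngFilePid(int volume, int page) {
        return "dda:pngfile_vol" + volume + "_page" + page;
    }

    /** PID of the page before sortedPages[index], or empty for the first page. */
    static Optional<String> previousPagePid(int volume, List<Integer> sortedPages,
                                            int index) {
        return index > 0
                ? Optional.of(pagePid(volume, sortedPages.get(index - 1)))
                : Optional.empty();
    }

    /** PID of the page after sortedPages[index], or empty for the last page. */
    static Optional<String> nextPagePid(int volume, List<Integer> sortedPages,
                                        int index) {
        return index < sortedPages.size() - 1
                ? Optional.of(pagePid(volume, sortedPages.get(index + 1)))
                : Optional.empty();
    }

    /** Fills the three relation placeholders for the page at the given index. */
    static String fillRelations(String template, int volume,
                                List<Integer> sortedPages, int index) {
        int page = sortedPages.get(index);
        String result = template.replace(
                "INSERT_PNGFILE_PID_HERE", pngFilePid(volume, page));
        Optional<String> prev = previousPagePid(volume, sortedPages, index);
        if (prev.isPresent()) {
            result = result.replace("INSERT_PREVIOUSPAGE_PID_HERE", prev.get());
        }
        Optional<String> next = nextPagePid(volume, sortedPages, index);
        if (next.isPresent()) {
            result = result.replace("INSERT_NEXTPAGE_PID_HERE", next.get());
        }
        // Boundary pages keep their unmatched placeholder (behaviour not specified here).
        return result;
    }
}
```

Looking up neighbours by list index rather than by page number plus or minus one means gaps in the page numbering do not produce PIDs for pages that do not exist.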

Fourth step

In DDAPreingester.java, fill in the method generateFoxMLFilesForEachPaper.
Generate a FoxML object for each paper xml file. To do this, run through the three volumes and, within each volume, run through all papers. For each paper, do the following. Load the paper xml file. Extract the number from the outermost tag in the xml (preferably via xpath). Construct the PID from the volume and the number (as explained under paper on the datamodel page). Extract the title from name of the outermost tag in the xml (xpath again). Then, in a copy of the paper template, insert the whole contents of the paper xml file at <!-- INSERT_EMBEDDED_XML_CONTENT_HERE -->, the constructed PID at INSERT_PID_HERE, and the constructed title at INSERT_TITLE_HERE.
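The per-paper work in the fourth step could be sketched as below, using the standard javax.xml XPath API. The sketch rests on several assumptions that this page does not confirm: that the number and title are stored as attributes called number and name on the outermost element, that the PID has the hypothetical format shown, and that an XML declaration at the top of the paper file should be dropped before embedding (a declaration in the middle of a document is not well-formed). The class name PaperFoxmlSketch is likewise made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Illustrative sketch only; not the actual DDAPreingester code. */
final class PaperFoxmlSketch {

    /** Minimal escaping so the extracted title can be inserted into XML again. */
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    static String fillPaperTemplate(String template, String paperXml, int volume) {
        String number;
        String title;
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(paperXml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            // ASSUMPTION: attributes "number" and "name" on the outermost element.
            number = xpath.evaluate("/*/@number", doc);
            title = xpath.evaluate("/*/@name", doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read paper xml", e);
        }

        // HYPOTHETICAL PID format; the real one is defined on the datamodel page.
        String pid = "dda:paper_vol" + volume + "_paper" + number;

        // ASSUMPTION: drop a leading XML declaration before embedding the content.
        String embedded = paperXml.replaceFirst("^\\s*<\\?xml[^>]*\\?>\\s*", "");

        // Fill PID and title first so the embedded paper content is never rewritten.
        return template
                .replace("INSERT_PID_HERE", pid)
                .replace("INSERT_TITLE_HERE", escapeXml(title))
                .replace("<!-- INSERT_EMBEDDED_XML_CONTENT_HERE -->", embedded);
    }
}
```

The outer loops over the three volumes and their papers are left out; they would call fillPaperTemplate once per paper xml file and save each result to disk.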



Checklist For Working On An Action

The Life Cycle of an Action:

  • Assign people for action definition: Done at start of iteration status meeting. Fill out Assigned.

  • Define the action: Describe information about what is to be done and how. Fill out Tasks Addressed and Time Estimated.

  • Review the definition: Get another project group member to review the action definition, and update it.

  • Assign people for action implementation: Done by project manager, usually the same persons who wrote the definition. Fill out Assigned and Prev assigned if new persons are assigned.

  • Implement the action: See details below

  • Review the action: Get another project group member to review what is implemented (code and documentation), and update it.

  • Finish the action: Change the status to "Finished" and update the "time used" field on the action page.

Please make sure that you address the issues below when working on an action:

  • Update the state of the action to "In Progress" when you start working on it.
  • Check whether the tasks addressed by this action have their status set to "In Progress". If not, change their status.
  • Keep track of how much time has been spent working on the action. If it addresses more than one task, make a note on the action page of how much of the elapsed time has been spent on each individual task. Hint: Continually updating the "Time used" field will make this easier for you.

  • Update the "Progress History" and documentation pages of each task addressed by this action when appropriate. This depends on the situation, but in general, the task pages should hold all important related information about the work done, experiences gathered, identified requirements and so on.

ActionCodeBasicPreingester (last edited 2010-03-17 13:09:13 by localhost)