Action Code Basic Preingester

Assigned
PKO+JRG

Prev assigned

Tasks addressed

Collection

Time estimated
9 md

Time used
7 md

Priority
2

Status
Finished

Iteration
18

Notes

Code the basic ("bottom-most") steps of the preingester for De Danske Aviser, that is, the code for preingesting pngfile, page, and paper objects.

Test with actual objects by ingesting a few of the results into the test Fedora via the Admin application. The object templates written in an earlier action will be used here; any TODOs left in those templates should be finished.

First step

In DDAPreingester.java, fill in the method generateFoxMLFilesForEachPage.
Generate a FoxML object for each page. This is done by running through the filenames of the png files. For each filename, make a copy of the page template and use the volume number and page number (taken from the filename) to construct the strings to be inserted at INSERT_PID_HERE and INSERT_TITLE_HERE in the copy. Insert them, and save the copy to disk with a filename like dda_page_vol3_page123.xml.

The PID should be constructed as described under page on the datamodel page, which also describes how the title should look.
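The first step can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and helper names are hypothetical, the png filename pattern (vol3_page123.png) is a guess, and the PID and title formats are placeholders, since the real formats are defined on the datamodel page.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PageFoxmlSketch {
    // ASSUMPTION: png filenames look like "vol3_page123.png".
    private static final Pattern PNG_NAME = Pattern.compile("vol(\\d+)_page(\\d+)\\.png");

    // ASSUMPTION: placeholder formats; the real ones are on the datamodel page.
    static String pagePid(int volume, int page) {
        return "dda:page_vol" + volume + "_page" + page;
    }

    static String pageTitle(int volume, int page) {
        return "Volume " + volume + ", page " + page;
    }

    // Output filename as given in the task description.
    static String outputFilename(int volume, int page) {
        return "dda_page_vol" + volume + "_page" + page + ".xml";
    }

    // Fills the two placeholders in a copy of the page template.
    static String fillTemplate(String template, int volume, int page) {
        return template
                .replace("INSERT_PID_HERE", pagePid(volume, page))
                .replace("INSERT_TITLE_HERE", pageTitle(volume, page));
    }

    // Handles one png filename: parse it, fill the template, write the result.
    static Path generateForPngFile(String pngFilename, String template, Path outputDir)
            throws IOException {
        Matcher m = PNG_NAME.matcher(pngFilename);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected png filename: " + pngFilename);
        }
        int volume = Integer.parseInt(m.group(1));
        int page = Integer.parseInt(m.group(2));
        Path target = outputDir.resolve(outputFilename(volume, page));
        Files.write(target, fillTemplate(template, volume, page).getBytes(StandardCharsets.UTF_8));
        return target;
    }
}
```

The method generateFoxMLFilesForEachPage would then call something like generateForPngFile once per png filename, with the page template loaded once up front.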

Second step

In DDAPreingester.java, fill in the method generateFoxMLFilesForEachPngFile.
Generate a FoxML object for each pngfile. This is done by running through the filenames of the png files. For each filename, make a copy of the pngfile template and use the volume number and page number (taken from the filename) to construct the strings to be inserted at INSERT_PID_HERE, INSERT_FILENAME_HERE, and INSERT_URL_HERE in the copy. Insert them, and save the copy to disk with a filename like dda_pngfile_vol3_page123.

The PID should be constructed as described under pngfile on the datamodel page, which also describes how the filename and URL should look.
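With three placeholders, it may be convenient to drive the substitution from a map. The sketch below shows one way to do that. The class name, the base URL, and the PID and filename formats are all hypothetical stand-ins for whatever the datamodel page specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PngFileFoxmlSketch {
    // ASSUMPTION: hypothetical base URL; the real one is defined elsewhere.
    static final String BASE_URL = "http://example.org/dda/";

    // Builds the placeholder-to-value map for one pngfile object.
    static Map<String, String> substitutions(int volume, int page) {
        // ASSUMPTION: placeholder filename and PID formats (see the datamodel page).
        String filename = "vol" + volume + "_page" + page + ".png";
        Map<String, String> subs = new LinkedHashMap<>();
        subs.put("INSERT_PID_HERE", "dda:pngfile_vol" + volume + "_page" + page);
        subs.put("INSERT_FILENAME_HERE", filename);
        subs.put("INSERT_URL_HERE", BASE_URL + filename);
        return subs;
    }

    // Replaces every placeholder and fails loudly if the template lacks one,
    // which helps catch a template that has drifted from the code.
    static String fill(String template, Map<String, String> subs) {
        String result = template;
        for (Map.Entry<String, String> e : subs.entrySet()) {
            if (!result.contains(e.getKey())) {
                throw new IllegalStateException("Template lacks placeholder " + e.getKey());
            }
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }
}
```

Failing on a missing placeholder is a design choice of this sketch, not something the task requires. It makes an unfinished TODO in a template show up at generation time rather than at ingest.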

Third step

In DDAPreingester.java, expand the method generateFoxMLFilesForEachPage so that it handles the relations hasFile, previousPage, and nextPage. Replace the strings INSERT_PNGFILE_PID_HERE, INSERT_PREVIOUSPAGE_PID_HERE, and INSERT_NEXTPAGE_PID_HERE with the appropriate PIDs. See the datamodel page for the appropriate PIDs for pages and pngfiles.
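The neighbour lookup for previousPage and nextPage could look something like the sketch below. It assumes the page numbers of a volume are available as a sorted list, and it uses hypothetical PID formats and helper names. The task does not say what should happen for the first and last page of a volume, so this sketch simply leaves the corresponding placeholder untouched in those cases.

```java
import java.util.List;

class PageRelationsSketch {
    // ASSUMPTION: placeholder PID formats; the real ones are on the datamodel page.
    static String pagePid(int volume, int page) {
        return "dda:page_vol" + volume + "_page" + page;
    }

    static String pngfilePid(int volume, int page) {
        return "dda:pngfile_vol" + volume + "_page" + page;
    }

    // Returns the PID of the page at index + offset, or null if there is none.
    static String neighbourPid(int volume, List<Integer> sortedPages, int index, int offset) {
        int i = index + offset;
        return (i >= 0 && i < sortedPages.size()) ? pagePid(volume, sortedPages.get(i)) : null;
    }

    // Inserts hasFile, previousPage and nextPage PIDs for the page at the given index.
    static String insertRelations(String template, int volume, List<Integer> sortedPages, int index) {
        int page = sortedPages.get(index);
        String previous = neighbourPid(volume, sortedPages, index, -1);
        String next = neighbourPid(volume, sortedPages, index, +1);

        String result = template.replace("INSERT_PNGFILE_PID_HERE", pngfilePid(volume, page));
        // First and last pages have no neighbour on one side; the placeholder
        // is left in place here because the task does not specify the handling.
        if (previous != null) {
            result = result.replace("INSERT_PREVIOUSPAGE_PID_HERE", previous);
        }
        if (next != null) {
            result = result.replace("INSERT_NEXTPAGE_PID_HERE", next);
        }
        return result;
    }
}
```

Looking neighbours up by list index rather than by page number plus or minus one means gaps in the page numbering do not produce PIDs for pages that do not exist.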

Fourth step

In DDAPreingester.java, fill in the method generateFoxMLFilesForEachPaper.
Generate a FoxML object for each paper xml file. This is done by running through the three volumes and, within each volume, running through all papers. For each paper, do the following. Load the paper xml file. Extract the number from the outermost tag in the xml (preferably via xpath). Construct the PID from the volume and the number (as explained under paper on the datamodel page). Extract the title from the name of the outermost tag in the xml (xpath again). Then, in a copy of the paper template, insert the whole contents of the paper xml file at <!-- INSERT_EMBEDDED_XML_CONTENT_HERE -->, the constructed PID at INSERT_PID_HERE, and the constructed title at INSERT_TITLE_HERE.
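The per-paper work might be sketched as below, using the standard javax.xml.xpath API. Several things here are assumptions: that the number and title are stored as attributes named number and name on the root element, the PID format, and the class and method names. Stripping the XML declaration before embedding is also an addition of this sketch, since a declaration is not allowed in the middle of a document.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class PaperFoxmlSketch {
    // ASSUMPTION: placeholder PID format; the real one is on the datamodel page.
    static String paperPid(int volume, String number) {
        return "dda:paper_vol" + volume + "_no" + number;
    }

    // Fills a copy of the paper template from the contents of one paper xml file.
    static String fillPaperTemplate(String template, String paperXml, int volume)
            throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(paperXml)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        // ASSUMPTION: number and title are attributes of the outermost element.
        String number = xpath.evaluate("/*/@number", doc);
        String title = xpath.evaluate("/*/@name", doc);

        // Drop a leading XML declaration so the content can be embedded.
        String embedded = paperXml.replaceFirst("^\\s*<\\?xml[^>]*\\?>\\s*", "");

        // PID and title go in first, so that placeholder-like text inside the
        // embedded paper xml can never be replaced by accident.
        // Note: the title is inserted as-is; XML-escaping it is not handled here.
        return template
                .replace("INSERT_PID_HERE", paperPid(volume, number))
                .replace("INSERT_TITLE_HERE", title)
                .replace("<!-- INSERT_EMBEDDED_XML_CONTENT_HERE -->", embedded);
    }
}
```

The method generateFoxMLFilesForEachPaper would wrap this in the two loops described above, over the three volumes and over the papers in each volume.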




ActionCodeBasicPreingester (last edited 2010-03-17 13:09:13 by localhost)