Design of 33 Mass Ingest

The HotFolder functionality watches selected folders for filenames matching selected patterns, automatically ingests the metadata of these files into DOMS, and moves each file into BitStorage. The HotFolder process has a configuration file which maps files matching a given pattern, or files in a specific folder, to a specific action. This action fetches metadata and a template, puts the metadata into the template, submits the result to DOMS and moves the file to BitStorage. The HotFolder acts as surveillance on the folders: it keeps checking for new files at a given interval, so when files are moved there, the given action takes place. Each time the HotFolder is done looking through the folders, it checks the ConfigFile for new lines, so we can add new places to be surveilled on the fly. A line is a 3-tuple consisting of a regexp to match some files, a folder to look in and an action to be executed on the matching files.
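The polling cycle described above could be sketched roughly as follows. This is only an illustration under assumptions, not the actual HotFolder code: the tab-separated line format, the `#` comment convention, and the names `Rule`, `parse_config`, `scan_once` and `watch` are all hypothetical, and the actions are plain Python callables standing in for the real "fetch metadata, fill template, submit to DOMS, move to BitStorage" step.

```python
import re
import time
from pathlib import Path
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    """One config line: (regexp, folder, action name)."""
    pattern: re.Pattern
    folder: Path
    action: str


def parse_config(text: str) -> list[Rule]:
    """Parse config text; assumes one tab-separated 3-tuple per line."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # assumed comment syntax
            continue
        regexp, folder, action = line.split("\t")
        rules.append(Rule(re.compile(regexp), Path(folder), action))
    return rules


def scan_once(rules: list[Rule],
              actions: dict[str, Callable[[Path], None]]) -> list[Path]:
    """Run the mapped action on every matching file; return handled files."""
    handled = []
    for rule in rules:
        if not rule.folder.is_dir():
            continue
        for path in sorted(rule.folder.iterdir()):
            # The whole filename must match the rule's regexp.
            if path.is_file() and rule.pattern.fullmatch(path.name):
                actions[rule.action](path)
                handled.append(path)
    return handled


def watch(config_path: Path,
          actions: dict[str, Callable[[Path], None]],
          interval_seconds: float = 60.0) -> None:
    """Scan all folders, then re-read the config, then sleep; forever."""
    rules = parse_config(config_path.read_text())
    while True:
        scan_once(rules, actions)
        # Re-reading after each pass picks up lines added on the fly.
        rules = parse_config(config_path.read_text())
        time.sleep(interval_seconds)
```

In this sketch a real action is expected to move the file out of the watched folder, so the next pass does not see it again.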

hotfolder.png

Prerequisites and Design Decisions

We had to pick between a global and a local design. The global idea is a single process looking at one or more folders; the local idea is one process per folder being surveilled. Both ideas have merit, but the local idea would probably not scale well on larger systems, since it requires one running process per folder. The only concern with the global idea is the possibility of a crash: how should we recover if we crash in the middle of an "action"? We set a lock on the file before we attempt to execute the action on it. This prevents the HotFolder from detecting the same file again and possibly crashing the same way again. The problem then has to be logged in some way so it can be solved manually.
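One possible way to realise the lock is a marker file created next to the file being processed. The sketch below is an assumption for illustration only: the `.lock` suffix, the logger name and the functions `lock_path`, `try_lock` and `process` are hypothetical, and the design above does not say how the lock is actually implemented.

```python
import logging
import os
from pathlib import Path
from typing import Callable

log = logging.getLogger("hotfolder")  # hypothetical logger name


def lock_path(path: Path) -> Path:
    """Marker file next to the data file, e.g. 'a.xml' -> 'a.xml.lock'."""
    return path.with_name(path.name + ".lock")


def try_lock(path: Path) -> bool:
    """Create the marker atomically; return False if it already exists."""
    try:
        fd = os.open(lock_path(path), os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def process(path: Path, action: Callable[[Path], None]) -> bool:
    """Lock the file, run the action, and unlock only on success."""
    if not try_lock(path):
        log.warning("skipping %s: locked by an earlier, unfinished run", path)
        return False
    try:
        action(path)
    except Exception:
        # Leave the lock in place so the file is not picked up again,
        # and log the problem so it can be solved manually.
        log.exception("action failed for %s; lock left in place", path)
        return False
    lock_path(path).unlink()
    return True
```

Because the marker is written before the action starts, a hard crash of the whole process also leaves it behind, and the next scan skips the file. The marker files live in the watched folder, so the configured patterns must not match names ending in `.lock`.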

Required Software and Modules

Resources

Tasks/33/DesignDocument (last edited 2010-06-08 08:47:01 by mar)