MDF - A Metadata Processing Framework

MDF is a combination of a simple approach to creating reusable modules for the processing of metadata and an implementation of that approach using Java.

The driving concept behind MDF is that the processing of metadata involves a number of distinct stages. Depending on the source and eventual use of the metadata, any or all of the following four stages may be required: discovery, extraction, cleaning and aggregation.

Within each of these stages, any number of different approaches could be taken. For example, discovery could be by web-crawling, by executing searches or by recursing through file system directories. Extraction may require processing specific to the format of the resource retrieved. Cleaning could involve simple lexical processing (such as forcing all strings to a single case or splitting a string on particular boundaries) or complex extraction processing (such as named entity recognition on text). Finally, the aggregation step might write RDF, a topic map in the XTM interchange syntax or a topic map in ISO 13250, or it might update a database or other datastore.
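As a rough illustration of the simple lexical cleaning mentioned above, the following Java sketch forces strings to a single case and splits a string on a boundary. The class and method names (LexicalCleaningSketch, toSingleCase, splitOnBoundary) are assumptions made for this example and are not taken from the MDF implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class for illustration only; not part of the MDF API.
class LexicalCleaningSketch {

    // Force a string to a single case (lower case is chosen here).
    static String toSingleCase(String value) {
        return value.toLowerCase(Locale.ROOT);
    }

    // Split a string on a boundary, trimming whitespace and dropping empty pieces.
    static List<String> splitOnBoundary(String value, String boundaryRegex) {
        List<String> parts = new ArrayList<>();
        for (String piece : value.split(boundaryRegex)) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        // A keyword field as it might arrive from an extraction step.
        String rawKeywords = "Topic Maps; RDF ;Metadata";
        for (String keyword : splitOnBoundary(rawKeywords, ";")) {
            System.out.println(toSingleCase(keyword));
        }
    }
}
```

Run as written, this prints "topic maps", "rdf" and "metadata" on separate lines. More complex cleaning, such as named entity recognition, would sit behind the same kind of small function but is beyond a sketch of this size.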

MDF attempts to improve the reusability of the different processing functions for each of these stages by defining a framework in which the functions may be designed and implemented separately and then linked together in any combination to provide the desired processing.
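The idea of writing functions separately and then linking them in any combination can be sketched in a few lines of Java. This is a minimal illustration under stated assumptions, not the MDF API: the Module interface, the link method and the three example modules are all hypothetical names invented for this example, and metadata is reduced to a plain list of strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.function.Function;

// Hypothetical sketch; these names are illustrative and not taken from MDF.
class PipelineSketch {

    // A processing module: receives metadata values and passes values on.
    interface Module extends Function<List<String>, List<String>> {}

    // Link independently written modules, in the given order, into one module.
    static Module link(Module... modules) {
        return input -> {
            List<String> current = input;
            for (Module module : modules) {
                current = module.apply(current);
            }
            return current;
        };
    }

    // Cleaning module: split every value on a boundary and trim the pieces.
    static Module splitOn(String boundaryRegex) {
        return values -> {
            List<String> out = new ArrayList<>();
            for (String value : values) {
                for (String piece : value.split(boundaryRegex)) {
                    String trimmed = piece.trim();
                    if (!trimmed.isEmpty()) {
                        out.add(trimmed);
                    }
                }
            }
            return out;
        };
    }

    // Cleaning module: force every value to lower case.
    static Module lowerCase() {
        return values -> {
            List<String> out = new ArrayList<>();
            for (String value : values) {
                out.add(value.toLowerCase(Locale.ROOT));
            }
            return out;
        };
    }

    // Aggregation module: append the values to a store (a list stands in
    // for an RDF writer, topic map writer or database).
    static Module collectInto(List<String> store) {
        return values -> {
            store.addAll(values);
            return values;
        };
    }

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        Module pipeline = link(splitOn(","), lowerCase(), collectInto(store));
        pipeline.apply(Arrays.asList("XTM, Topic Maps", "RDF"));
        System.out.println(store); // prints [xtm, topic maps, rdf]
    }
}
```

Because every module shares one input and output type, the same modules can be relinked in a different order or with different neighbours without being changed.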

MDF is described in more detail in the technical specification. You can also download the Java MDF implementation or read about the MDF modules provided with it.
