The following table lists the modules that are currently part of the Java implementation of MDF, which is available for download. All the modules are found in subpackages of the package com.techquila.mdf.impl.
| Module | Description |
| basic.MDFApp | This is not a module, but a simple MDF application which is configured by an XML file to build a chain of modules and pass one or more sets of data into the chain. |
| basic.BasicPrinterModule | Writes the received metadata set to an output stream (stdout by default). |
| basic.SplitterModule | Creates multiple metadata values from a single value by dividing the string on specified characters. |
| basic.TranslatorModule | Changes the key of specific entries in the metadata set. |
| html.SpiderModule | Spiders over a specified URL, passing each spidered URL and a DOM representation of the HTML found to the downstream module. |
| rdf.SimpleRDFMapper | Creates an RDF model from the received metadata sets. It is possible to specify which metadata properties define resources and which properties define values of specific RDF properties of those resources. |
| xtm.SimpleXTMMapper | Creates an XTM model from the metadata sets. It is possible to specify which metadata properties define topics and which properties define names, occurrences, subject indicators or associations between those topics. |
| xtm.ConfigurableXTMMapper | Extends SimpleXTMMapper to allow configuration of the mapping from meta data sets to topics, associations and occurrences to be done using an XML file. This file may be passed to the module at initialisation time. |
| xml.XPathExtractor | Extracts metadata sets from a DOM model using XPath expressions. It is possible to create multiple metadata sets from a single XML source. |
| xml.ConfigurableXPathExtractor | Extends xml.XPathExtractor to allow configuration of the extraction using an XML file. This file may be passed to the module at initialisation time and defines the metadata sets to be extracted and the properties to be found for each metadata set. |
A control framework application for constructing and running an MDF processing chain with one or more sets of meta data.
MDFApp reads a single XML configuration file which must be passed in to the application as the only command line parameter. This configuration file may contain the following elements which are recognised and processed by MDFApp. The following element descriptions all assume that the prefix 'mdf' is mapped to the MDFApp namespace: 'http://www.techquila.com/mdfapp/1.0'
The configuration file must have the mdf:chain element as its root document element. This element may contain the following elements recognised by MDFApp:
The mdf:module element contains only #PCDATA. The content of the element must be the full Java class name of a class which implements the com.techquila.mdf.framework.Module interface. Modules are added to the processing chain in the order in which the mdf:module elements appear within the parent mdf:chain element.
This element contains a list of mdf:property elements and defines a single meta data set which will be passed to each of the modules in the chain for initialisation prior to the first processing run.
This element contains a list of mdf:property elements and defines a single meta data set which will be passed into the first module in the chain for processing. The parent mdf:chain element may contain multiple mdf:run child elements. These elements will be processed in the order in which they occur in the XML file.
The mdf:property element defines a single property/value pair in a meta data set. The element may contain only #PCDATA, which is treated as the value of the property/value pair. The element also has a single, required attribute, mdf:key, which specifies the property name of the property/value pair.
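The configuration structure described above can be illustrated with a small sketch. It is not MDFApp's own code: it only parses a sample configuration with the standard JAXP DOM API and reads back the module order and the run sets. The class name ChainConfigSketch, the property key "title" and its values are assumptions made for the example. The element which holds the initialisation set is left out because its name is not given in the text above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class ChainConfigSketch {
    static final String MDF_NS = "http://www.techquila.com/mdfapp/1.0";

    // Sample configuration. The key "title" and its values are invented for illustration.
    static final String SAMPLE =
          "<mdf:chain xmlns:mdf='" + MDF_NS + "'>"
        + "<mdf:module>com.techquila.mdf.impl.basic.SplitterModule</mdf:module>"
        + "<mdf:module>com.techquila.mdf.impl.basic.BasicPrinterModule</mdf:module>"
        + "<mdf:run><mdf:property mdf:key='title'>First run</mdf:property></mdf:run>"
        + "<mdf:run><mdf:property mdf:key='title'>Second run</mdf:property></mdf:run>"
        + "</mdf:chain>";

    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required to resolve the mdf prefix
            return factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse configuration", e);
        }
    }

    // Module class names in document order, which is also the order of the chain.
    static List<String> moduleClassNames(Document doc) {
        List<String> names = new ArrayList<>();
        NodeList modules = doc.getDocumentElement().getElementsByTagNameNS(MDF_NS, "module");
        for (int i = 0; i < modules.getLength(); i++) {
            names.add(modules.item(i).getTextContent().trim());
        }
        return names;
    }

    // One key/value map per mdf:run element, in document order.
    static List<Map<String, String>> runs(Document doc) {
        List<Map<String, String>> result = new ArrayList<>();
        NodeList runElements = doc.getDocumentElement().getElementsByTagNameNS(MDF_NS, "run");
        for (int i = 0; i < runElements.getLength(); i++) {
            Map<String, String> set = new LinkedHashMap<>();
            NodeList props = ((Element) runElements.item(i))
                    .getElementsByTagNameNS(MDF_NS, "property");
            for (int j = 0; j < props.getLength(); j++) {
                Element property = (Element) props.item(j);
                set.put(property.getAttributeNS(MDF_NS, "key"), property.getTextContent());
            }
            result.add(set);
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = parse(SAMPLE);
        System.out.println(moduleClassNames(doc));
        System.out.println(runs(doc));
    }
}
```

Note that mdf:key is written with the prefix, so the sketch reads it as a namespaced attribute.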
Writes the contents of the received metadata set to an output stream.
This module simply writes the key and string value of every entry in the received metadata set. String values are determined by calling .toString() on the value object.
Divides the string value of a specified property into multiple values for another property.
If the input metadata set contains the key defined by com.techquila.mdf.impl.basic.SplitterModule.SPLIT_FIELD then the string value of that property is split on the characters defined by com.techquila.mdf.impl.basic.SplitterModule.SPLIT_CHARS and each split value is written to a property with the key string specified by com.techquila.mdf.impl.basic.SplitterModule.OUTPUT_FIELD with _n appended where n is an integer value starting from 0.
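The splitting rule can be sketched as follows. This is not the module's source: the class SplitterSketch and its method are hypothetical, and the split field, split characters and output field are passed as plain arguments rather than read from the initialisation set. Two details are assumptions, since the description does not state them: empty tokens are dropped (the behaviour of java.util.StringTokenizer) and the original property is left in place.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

class SplitterSketch {
    // Returns a copy of the input set with outputField_0, outputField_1, ... added.
    static Map<String, Object> split(Map<String, Object> input, String splitField,
                                     String splitChars, String outputField) {
        Map<String, Object> output = new LinkedHashMap<>(input);
        Object value = input.get(splitField);
        if (value == null) {
            return output; // split field absent: the set passes through unchanged
        }
        // StringTokenizer treats every character of splitChars as a delimiter
        // and skips empty tokens (an assumption about the real module).
        StringTokenizer tokens = new StringTokenizer(value.toString(), splitChars);
        int n = 0;
        while (tokens.hasMoreTokens()) {
            output.put(outputField + "_" + n, tokens.nextToken());
            n++;
        }
        return output;
    }
}
```

For example, splitting a property holding "xml,rdf;xtm" on the characters ",;" into the output field "subject" yields the keys subject_0, subject_1 and subject_2.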
Implements the translation of property names. An instance of this class could be placed between two processing modules, converting the names of properties generated by the upstream module into names that may be recognised and processed by the downstream module.
This class alters the names of properties only. It does not remove the property generated by the upstream module, but simply creates a new property with the same value but a different name.
A translation is specified as a pair of names, the key name to translate from and the key name to translate to. A translation may be specified in one of two ways:
See description (above)
Any property in the received metadata set which has a key name which matches that in the translation table will be removed and its value reinserted with the translated key name specified in the translation table.
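A minimal sketch of the translation step is given below. The class description says the original property is kept, while the processing note says it is removed, so the sketch takes a flag that selects either reading. The class TranslatorSketch, its method and the flag are hypothetical and are not part of the module's API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TranslatorSketch {
    // table maps "key name to translate from" to "key name to translate to".
    static Map<String, Object> translate(Map<String, Object> input,
                                         Map<String, String> table,
                                         boolean removeOriginal) {
        Map<String, Object> output = new LinkedHashMap<>(input);
        for (Map.Entry<String, String> translation : table.entrySet()) {
            String from = translation.getKey();
            if (!input.containsKey(from)) {
                continue; // nothing to translate for this entry
            }
            // Same value, new key name.
            output.put(translation.getValue(), input.get(from));
            if (removeOriginal) {
                output.remove(from);
            }
        }
        return output;
    }
}
```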
Spiders a specified URL, generating a DOM representation of the HTML found there. The spidering follows only links in <a> tags and is performed in a depth-first manner. The maximum depth of the spidering may be controlled by module initialisation.
Note: Each metadata set processed will result in one metadata set being generated for each page spidered. As well as the SOURCE_URL and SOURCE_DOM properties, all other properties in the initial metadata set will be copied to each generated metadata set.
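The traversal order and the copying of properties can be sketched without any network access. In the sketch below an in-memory map of links stands in for fetching pages and reading their <a> tags, and no DOM is produced. The class SpiderSketch, the literal key "SOURCE_URL", the convention that the start page has depth 0, and the skipping of pages already visited are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SpiderSketch {
    // links: page URL -> URLs found on that page (stands in for real fetching).
    static List<Map<String, Object>> spider(Map<String, Object> initial, String startUrl,
                                            Map<String, List<String>> links, int maxDepth) {
        List<Map<String, Object>> generated = new ArrayList<>();
        visit(startUrl, 0, maxDepth, initial, links, new HashSet<>(), generated);
        return generated;
    }

    private static void visit(String url, int depth, int maxDepth,
                              Map<String, Object> initial, Map<String, List<String>> links,
                              Set<String> seen, List<Map<String, Object>> generated) {
        if (depth > maxDepth || !seen.add(url)) {
            return; // too deep, or already spidered
        }
        // One metadata set per page: all initial properties plus the page URL.
        Map<String, Object> set = new LinkedHashMap<>(initial);
        set.put("SOURCE_URL", url);
        generated.add(set);
        // Depth-first: follow each link fully before moving to the next one.
        for (String next : links.getOrDefault(url, Collections.emptyList())) {
            visit(next, depth + 1, maxDepth, initial, links, seen, generated);
        }
    }
}
```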
Maps the metadata in the received metadata sets into statements in an RDF model.
After processing, the RDF model may be retrieved by calling getModel() on the module, or it may be written to a file by calling the write() function.
Currently this module must be initialised programmatically by calling the function addProperty(). See the javadoc for this module for more information on this function. Calling this function specifies an RDF property, the key which is used to locate the subject(s) of the property and the key which is used to locate the object(s) of the property. Both subject and object may be found in multi-valued keys. When an object key may have multiple values, you may also specify whether the multiple values are collected together in an RDF Seq, Alt or Bag.
The mapper module may have any number of property definitions, and it is allowed for different property definitions to use the same metadata set values.
For each metadata set received, each property definition is processed and if the keys for both subject and object are found in the metadata set, a new RDF statement is generated.
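The rule that a statement is generated only when both keys are present can be sketched as below. The sketch does not use an RDF library and does not reproduce the signature of addProperty(): the class RdfMapperSketch and its nested PropertyDef are hypothetical, statements are rendered as plain strings, and multi-valued keys and RDF containers are not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class RdfMapperSketch {
    // One property definition: an RDF property plus the keys locating subject and object.
    static final class PropertyDef {
        final String rdfProperty;
        final String subjectKey;
        final String objectKey;

        PropertyDef(String rdfProperty, String subjectKey, String objectKey) {
            this.rdfProperty = rdfProperty;
            this.subjectKey = subjectKey;
            this.objectKey = objectKey;
        }
    }

    // Applies every definition to one metadata set and returns the resulting statements.
    static List<String> map(Map<String, Object> set, List<PropertyDef> defs) {
        List<String> statements = new ArrayList<>();
        for (PropertyDef def : defs) {
            Object subject = set.get(def.subjectKey);
            Object object = set.get(def.objectKey);
            if (subject == null || object == null) {
                continue; // both keys must be present for a statement to be generated
            }
            statements.add("<" + subject + "> <" + def.rdfProperty + "> \"" + object + "\" .");
        }
        return statements;
    }
}
```

Different definitions may name the same subject key, which matches the statement that definitions are allowed to share metadata set values.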
This module creates or updates a topic map from the received metadata sets.
Each metadata set may contain keys which define topics. For each topic, other keys in the same metadata set may define subject indicators, name strings or occurrences. Additionally, associations may be created between topics which are created from the same metadata set.
The initialisation of this module is complex and can only be done programmatically (as opposed to using the metadata set passed to the init() function). See the javadoc for more details.
For each metadata set received, each topic definition is processed and if the key which 'triggers' the creation of that topic is found then a new topic is created. For each characteristic definition of the topic definition, if the key for that characteristic definition is present, then the characteristic is created using the value mapped to the key for that definition. For names, the string value of the value is used as the name string. For occurrences, the string value is used as either the occurrence reference or occurrence data (depending upon the configuration of the characteristic definition). For subject identities, the string value is prepended with a fixed string defined in the subject identity characteristic definition.
This module extracts one or more metadata sets from an XML source. A metadata set is defined for each node which matches a specified XPath expression. Properties within that set are defined for each node which matches another XPath expression (using the node which defines the metadata set as the root of the expression).
This module may generate many metadata sets for downstream modules as the result of processing a single metadata set from an upstream module.
Definition of the metadata set and property XPath expressions is currently possible only through additional functions in the module's API. See the javadoc for more details.
The module xml.ConfigurableXPathExtractor allows an XPathExtractor to be configured by passing an XML file name in the initialisation metadata set.
For each metadata set received, the property under the key defined by the com.techquila.mdf.impl.xml.XPathExtractor.SOURCE_PROPERTY initialisation property will be located and the source DOM/URL will be extracted.
For each metadata set definition specified for this module, the set of nodes which match the XPath expression for that metadata set definition will be extracted from the XML. For each node in that set, one metadata set will be passed on to the downstream module. For each metadata set to be passed on, the property definitions of that set will be enumerated and for each property definition, the XPath expression will be executed using the node of the parent metadata set as the root. For each node which matches, the string value of that node will be inserted under the key defined for that property.
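The two-level evaluation described above can be sketched with the standard javax.xml.xpath API. This is not the module's implementation: the class XPathExtractSketch and its method are hypothetical, the source is passed as an XML string, and a single metadata set definition is handled. When several nodes match a property expression the sketch concatenates their string values, which follows the non-multi behaviour described for xml.ConfigurableXPathExtractor and is only an assumption for this module.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathExtractSketch {
    // setXPath locates the root node of each metadata set;
    // propertyXPaths maps a property key to an expression evaluated from that root.
    static List<Map<String, String>> extract(String xml, String setXPath,
                                             Map<String, String> propertyXPaths) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList roots = (NodeList) xpath.evaluate(setXPath, doc, XPathConstants.NODESET);
            List<Map<String, String>> sets = new ArrayList<>();
            for (int i = 0; i < roots.getLength(); i++) {
                Node root = roots.item(i); // one metadata set per matching node
                Map<String, String> set = new LinkedHashMap<>();
                for (Map.Entry<String, String> def : propertyXPaths.entrySet()) {
                    NodeList matches = (NodeList) xpath.evaluate(
                            def.getValue(), root, XPathConstants.NODESET);
                    if (matches.getLength() == 0) {
                        continue; // no matching node: the property is not set
                    }
                    StringBuilder value = new StringBuilder();
                    for (int j = 0; j < matches.getLength(); j++) {
                        value.append(matches.item(j).getTextContent());
                    }
                    set.put(def.getKey(), value.toString());
                }
                sets.add(set);
            }
            return sets;
        } catch (Exception e) {
            throw new IllegalStateException("Extraction failed", e);
        }
    }
}
```

With the set expression /library/book and the property expressions title and author, each book element in the source produces its own metadata set.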
This module performs the same processing as the xml.XPathExtractor module, but is configurable from an XML file.
The configuration file contains elements to define the XML source to parse, the metadata sets to be created and the properties to be defined for each of those metadata sets. In the following element descriptions, the prefix 'ex' should be mapped to the namespace http://www.techquila.com/mdf/xpathextractor/1.0
The element which defines the XML source has a type attribute which may have one of the following values: FILE, DOM or STRING. If type is FILE, then the element may have an attribute src which specifies the file name of the source to be parsed. Otherwise, the element must have an attribute property which defines the property which specifies the source to be parsed. If type is FILE, then the property specified will be expected to contain a file name. If type is DOM, then the property will be expected to contain a DOM Document. If type is STRING, then the property will be a string which can be parsed as XML. This element has no content model.
The element which defines a metadata set has the attribute name which specifies an identifier for the metadata set definition. It must have the attribute xpath which specifies the XPath expression for locating the root node of the metadata set. Note that one metadata set will be created for each node that the expression resolves to.
The element which defines a property has the attribute property which defines the key string of the property. It must have the attribute xpath which specifies the XPath expression to be executed from the root node of the metadata set to determine values for the property. It may have the attribute multi which, if present and specified with any value, indicates that the resulting nodes should be treated as separate values for a multi-valued property. If the multi attribute is specified, then the string value of each node which results from evaluating the XPath expression will be assigned to a key value property_n, where property is the property name specified by the property attribute and n is an integer value starting from 0. If the multi attribute is not specified, then only a single value is added to the metadata set, using the value of the property attribute as the key and the concatenation of the string values of all nodes matching the XPath expression as the value.
This module performs the same function as the SimpleXTMMapper, but is configurable from an XML file.
The configuration file contains elements to define the mapping from properties in meta data sets to topics and / or associations between topics to be created. In the following element descriptions, the prefix 'xtm' should be mapped to the namespace http://www.techquila.com/mdf/xtm/mapper/1.0
xtm:property - REQUIRED - Specifies the property key which must be present in the meta data set for the topic to be created. If a meta data set is received which contains a property with this key value, then a topic will always be created. If a meta data set is received which does not contain a property with this key value, a topic will never be created.
This element defines a mapping from two or more properties in a meta data set to an association structure in the output topic map. An association is created only if one or more topics have been created for each of the members of the association during the processing of the current meta data set.
This element may contain the following child elements:
This element may appear as a child of the following elements:
This element contains only #PCDATA
The #PCDATA content of this element defines the subject indicator URI of a topic which will be used to type the topic map object created by the parent element. By default, the processor will create a single, unnamed topic with this subject indicator URI before beginning the mapping process, and all objects which specify this URI as their type-identity value will regard the generated topic as defining their type (or their roleSpec, in the case of association members).
This element wraps a list of xtm:name elements which are used to provide base name strings for the topic used to represent the type of the object generated by the parent element. This element has no attributes and contains one or more xtm:name elements.
This element appears only within an xtm:type-names element, an xtm:assoc-names element, or an xtm:topic element.
This element may contain only #PCDATA
This element has the following attributes:
xtm:property - OPTIONAL - If specified, then the property named in the attribute value will be used to provide the content of the name string. In this case, the content of this element will be ignored.
xtm:prefix - OPTIONAL - The content of this attribute will be prefixed to the generated name string.
xtm:suffix - OPTIONAL - The content of this attribute will be appended to the generated name string.
This element defines the value of a base name string to be assigned to either the typing topic of the object generated by the parent of the xtm:type-names element which contains this element, or to the topic generated as a result of processing the parent xtm:topic element of this element.
The name string value is either simply copied from the content of this element, or if the xtm:property attribute is specified, the name string value is taken from the value of the named property in the meta data set. Each xtm:name element defines a separate base name string. Multiple names may be assigned by the use of multiple xtm:name elements, where allowed by the containing element.
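The way a name string is assembled from the element content and the xtm:property, xtm:prefix and xtm:suffix attributes can be sketched as follows. The class NameStringSketch and its method are hypothetical. Two points are assumptions, since the text does not settle them: no name is produced when xtm:property names a property that is absent from the meta data set, and the prefix and suffix are applied to literal element content as well as to property values.

```java
import java.util.Map;

class NameStringSketch {
    // content:  #PCDATA content of the xtm:name element
    // property: value of xtm:property, or null if the attribute is not given
    // prefix, suffix: values of xtm:prefix and xtm:suffix, or null if not given
    static String nameString(Map<String, Object> set, String content,
                             String property, String prefix, String suffix) {
        String base;
        if (property != null) {
            Object value = set.get(property);
            if (value == null) {
                return null; // assumption: no name when the property is missing
            }
            base = value.toString(); // element content is ignored in this case
        } else {
            base = content;
        }
        return (prefix == null ? "" : prefix) + base + (suffix == null ? "" : suffix);
    }
}
```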
This element may appear only as a child of an xtm:topic element.
This element must be empty
This element has the following attributes:
xtm:property - REQUIRED - The property which will provide the value string for the subject indicator. If this property is not present in the processed meta data set, then no subject indicator will be generated. If the property is present, then the generated subject indicator will use the value of this property.
xtm:prefix - OPTIONAL - A string which will be prefixed to the value of the property specified by xtm:property in order to create the subject indicator.
This element may appear as a child of the xtm:topic element.
This element may contain the following child elements:
This element may have the following attributes:
xtm:property - REQUIRED - Specifies the key of the meta data set property which provides the value for this occurrence. If a property with this key is not found in the received meta data set, then no occurrence will be generated.
xtm:inline - OPTIONAL - If specified with the value "1", then the value of the occurrence will be resourceData rather than resourceRef; that is, the value of the property specified by xtm:property will be used as an inline resource string rather than as the address of an out-of-line occurrence resource.
xtm:multi-valued - OPTIONAL - If specified with the value "1", then the property named by xtm:property may occur more than once in the processed meta data set. The processor will create one occurrence for each property. NOTE that the representation of multiple values is (property name)_x where x is an integer value starting from 0. The value of the xtm:property attribute should only be the property name; the suffix will be determined automatically by the processor.
This element defines a mapping from a property in the processed meta data set to one or more players of a given role in the association. For this mapping to take place, the same property must be used to create topics in the topic map (i.e. the same property key must also appear as a value of an xtm:property attribute of an xtm:topic element).
This element may appear only as a child of an xtm:association element.
This element may contain the following child elements:
This element may have the following attributes:
xtm:property - REQUIRED - Specifies the property which is used to create the role players of this member. This property must be used to generate one or more topics in the topic map (that is, the same value must appear in an xtm:property attribute of an xtm:topic element).
xtm:multi-valued - OPTIONAL - If specified with the value "1", then the property named in the xtm:property attribute will be treated as multi-valued. Each value of the property will result in the creation of a single player of this role.
Defines a list of base name strings for the topic which types the association created by the containing xtm:association element, where each name is scoped by the topic which defines the roleSpec created by the containing xtm:member or xtm:root-member element.
This element may occur only as a child of an xtm:member or xtm:root-member element.
This element may contain the following child elements:
This element has no recognised attributes.
Defines the anchor member of an association. The processing of this element is almost exactly the same as the processing of an xtm:member element. The only difference is that if this element is specified as being multi-valued (by having the value "1" for its xtm:multi-valued attribute), then one association is created for each value of the property. In each association thus created, the players of other members which are also defined as multi-valued will be taken only from the value with the matching index of the root-member.
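The index matching between a multi-valued root member and other multi-valued members can be sketched as below. The class RootMemberSketch is hypothetical and simplifies heavily: players are represented by the raw property values rather than by topics, single-valued members are left out, and a member with no value at a given index is simply omitted from that association (an assumption). Multiple values are read using the (property name)_x convention, with x starting from 0.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RootMemberSketch {
    // Collects property_0, property_1, ... until an index is missing.
    static List<Object> values(Map<String, Object> set, String property) {
        List<Object> result = new ArrayList<>();
        for (int i = 0; set.containsKey(property + "_" + i); i++) {
            result.add(set.get(property + "_" + i));
        }
        return result;
    }

    // One association per root value; each maps a member property to its player value.
    static List<Map<String, Object>> associations(Map<String, Object> set,
                                                  String rootProperty,
                                                  List<String> otherMultiValued) {
        List<Map<String, Object>> result = new ArrayList<>();
        List<Object> roots = values(set, rootProperty);
        for (int i = 0; i < roots.size(); i++) {
            Map<String, Object> association = new LinkedHashMap<>();
            association.put(rootProperty, roots.get(i));
            for (String other : otherMultiValued) {
                List<Object> players = values(set, other);
                if (i < players.size()) {
                    // Only the value with the matching index plays this role.
                    association.put(other, players.get(i));
                }
            }
            result.add(association);
        }
        return result;
    }
}
```

With author_0, author_1, affiliation_0 and affiliation_1 in the meta data set and author as the root property, the sketch pairs author_0 with affiliation_0 and author_1 with affiliation_1, rather than joining every author to every affiliation.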