Identity

So far, the topics that we have created have little in the way of machine-processable identity. For example, the topic with id "sort-string" is used to indicate that a variant name is a normalised sort string. But if we were to load the topic map into a topic map application, how would the application know that it is this particular topic that marks a variant name as suitable for sorting? What is needed is some commonly agreed identity for the topic that represents the concept of "suitable for sorting".

This problem is not limited to the "structural" topics, which might be used to express processing options to a topic map application; it applies equally to every other topic we create. For example, if we have created a topic for the concept of a "company", ideally we would like to assign a commonly agreed identity for that concept to the topic so that the topic map can be interchanged more easily.

In addition to application-specific use, a topic map processor also takes note of the identity assigned to topics. When the processor determines that two topics are 'about' the same thing, those topics will be merged. How a processor determines that two topics are 'about' the same thing may be application-specific. However, XTM does define some basic principles, based on the different forms of identity described below.

The subject that a topic represents can be identified either by reference to a resource that is itself the subject, or by reference to a resource that describes the subject in a way that is meaningful to a human being. These resources are known as subject-constituting and subject-indicating resources respectively. In addition to these formal indicators of identity, the topic map paradigm also includes a mapping of the names of topics to an identity. This name-to-identity mapping is defined by a rule called the topic naming constraint.

Subject-Constituting Resources

If we want to make some assertions about a Web page, or some other retrievable resource with a unique address, we can use the address of the resource as the identifier for the topic we create to represent it. In this case, the identifier is said to reference a subject-constituting resource. A topic may only reference a single subject-constituting resource - this makes sense because a topic can only ever be about a single thing. It is considered an error if an attempt is made to merge two topics with different subject-constituting resources. When two topics have the same subject-constituting resource, a topic map processor will regard them as being about the same thing and will merge them. The mark-up for a subject-constituting resource is the <resourceRef> child element of the <subjectIdentity> element. The resource pointed to by the XLink href attribute of the <resourceRef> element is the subject-constituting resource for the topic.

<topicMap>
  <topic id="redmond-home-page">
    <subjectIdentity>
      <resourceRef xlink:href="http://www.redmondcomputers.com/" />
    </subjectIdentity>
    <baseName>
      <baseNameString>The Redmond Computers Inc. Home Page</baseNameString>
    </baseName>
  </topic>
</topicMap>
          

Sample 7 - Using a subject-constituting resource
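
The merging behaviour described above can be sketched in a few lines of Python. This is only an illustration of the rule, not a description of how any particular topic map processor is implemented. The Topic class, the two functions, and the second topic (id "rci-site", name "RCI web site") are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Topic:
    """Hypothetical in-memory topic; not part of XTM or of any real API."""
    id: str
    resource_ref: Optional[str] = None  # subject-constituting resource, if any
    names: set = field(default_factory=set)


def same_subject_by_resource(a: Topic, b: Topic) -> bool:
    """True when both topics reference the same subject-constituting resource."""
    return a.resource_ref is not None and a.resource_ref == b.resource_ref


def merge(a: Topic, b: Topic) -> Topic:
    """Return a merged topic, refusing conflicting subject-constituting resources."""
    if a.resource_ref and b.resource_ref and a.resource_ref != b.resource_ref:
        raise ValueError("topics have different subject-constituting resources")
    return Topic(
        id=a.id,
        resource_ref=a.resource_ref or b.resource_ref,
        names=a.names | b.names,
    )


home = Topic("redmond-home-page", "http://www.redmondcomputers.com/",
             {"The Redmond Computers Inc. Home Page"})
# Assumed second topic, as it might arrive from another topic map.
other = Topic("rci-site", "http://www.redmondcomputers.com/", {"RCI web site"})

if same_subject_by_resource(home, other):
    merged = merge(home, other)
    print(sorted(merged.names))
```

The merged topic here simply keeps the id of the first topic and the union of the names; a real processor would also have to combine occurrences and association roles.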

Subject-Indicating Resources

In the example above, we use the address of the company's web site in order to make some assertions about the site itself. If we want to make some assertions about the company itself, we could instead use the address of the company home page as an identifier for the topic. In using the address of the company web site in this way, we are assuming that a person who reads the home page will understand that it is the company we are describing in our topic map. In this case, the identifier is said to reference a subject-indicating resource. A topic may have any number of subject-indicating resources because each of these resources describes the thing that the topic represents and is not the represented thing itself. For the same reason, a topic may have both a subject-constituting resource and one or more subject-indicating resources. When two topics have one or more subject-indicating resources in common, a topic map processor will consider them to be about the same subject and will merge them. In addition, if one of the subject-indicating resources of a topic is the address of another topic, the topic map processor will consider those topics to be about the same subject and will merge them.

At first, this last constraint may seem a little strange, so it is worth describing in more detail. Consider two topics, A and B. Topic A represents the concept of a company. Topic B has the address of topic A as a subject-indicating resource. This means that topic B is about the subject described by topic A. This situation is shown in Figure 5, below. As we already know that topic A represents the concept of a company, the one subject that A describes best must be that concept. Therefore B must also represent the concept of a company, because A is the descriptor for that subject. This means that A and B are 'about' the same subject and should be merged.

Figure 5 - One way in which two topics can represent the same subject

Sample 8 shows the syntax for treating a URI as a subject-indicating resource. The <subjectIndicatorRef> element, which is a child of the <subjectIdentity> element, uses an XLink simple link to point to the resource that describes the subject of the topic.

<topicMap>
  <topic id="xyzzy">
    <subjectIdentity>
       <subjectIndicatorRef
           xlink:href="http://www.redmondcomputers.com/"
       />
    </subjectIdentity>
    <baseName>
      <baseNameString>Redmond Computers Inc.</baseNameString>
    </baseName>
  </topic>
</topicMap>
          

Sample 8 - Using a subject-indicating resource
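
The two merging rules for subject-indicating resources described in this section can be expressed as a small Python sketch. It is a simplified illustration under stated assumptions rather than a definitive implementation: the Topic class, the same_subject function and all of the example.org addresses are hypothetical, and addresses are compared as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical in-memory topic; not part of XTM or of any real API."""
    address: str                                  # URI of the topic element itself
    indicators: set = field(default_factory=set)  # subject-indicating resources


def same_subject(a: Topic, b: Topic) -> bool:
    """Apply the two subject-indicator rules described in the text."""
    # Rule 1: the topics share at least one subject-indicating resource.
    if a.indicators & b.indicators:
        return True
    # Rule 2: one topic uses the address of the other as a subject indicator.
    return a.address in b.indicators or b.address in a.indicators


# Topic A represents the concept of a company (addresses are assumptions).
topic_a = Topic("http://example.org/map.xtm#company")
# Topic B points at topic A as its subject-indicating resource.
topic_b = Topic("http://example.org/other.xtm#firm",
                {"http://example.org/map.xtm#company"})
# Topic C is about something else entirely.
topic_c = Topic("http://example.org/other.xtm#person",
                {"http://example.org/psi/person"})

print(same_subject(topic_a, topic_b))
print(same_subject(topic_a, topic_c))
```

Topics A and B correspond to the situation discussed around Figure 5: B names the address of A as a subject indicator, so the two are treated as being about the same subject.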

The Topic Naming Constraint

The topic naming constraint states that, in the words of the XTM specification, "any topics having the same base name in the same scope implicitly refer to the same subject". This rule essentially makes the label assigned to a topic into a form of identity for the topic. It is important that authors be aware of this rule when creating labels for topics. When creating a new base name, an author should either qualify the name within the label string itself or scope it appropriately.

Sample 9 shows how this rule can lead to some unexpected results. The topics with id "rci-sales" and "abc-sales" are intended to represent the sales departments of Redmond Computers Inc. and ABC Software respectively, but because each has the name "Sales" in the unconstrained scope, a topic map processor will assume that those topics refer to the same subject and will merge them. Obviously, in this case such a merge would be incorrect. In order to prevent the merge from happening, the author of this topic map must apply more qualified names to these topics.

<topicMap>
  <topic id="xyzzy">
    <baseName>
      <baseNameString>Redmond Computers Inc.</baseNameString>
    </baseName>
  </topic>

  <!-- Departments in Redmond Computers Inc. -->

  <topic id="rci-sales">
    <baseName>
      <baseNameString>Sales</baseNameString>
    </baseName>
  </topic>
  ...

  <topic id="abc">
    <baseName>
      <baseNameString>ABC Software</baseNameString>
    </baseName>
  </topic>

  <!-- Departments in ABC Software -->

  <topic id="abc-sales">
    <baseName>
      <baseNameString>Sales</baseNameString>
    </baseName>
  </topic>
  ...
</topicMap>
          

Sample 9 - Unexpected topic merging
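
To see why the two "Sales" topics collide, it can help to think of each base name as a (name string, scope) pair and to look for pairs that occur on more than one topic. The following Python sketch does this for the four topics of Sample 9. The data layout and the naming_collisions function are assumptions made for illustration only; in particular, name strings are compared exactly, which may not match the string comparison a given processor applies.

```python
from collections import defaultdict

# The empty frozenset stands for the unconstrained scope (an assumption of
# this sketch; it is not XTM syntax).
UNCONSTRAINED = frozenset()

# Topic id -> list of (base name string, scope) pairs, taken from Sample 9.
topics = {
    "xyzzy": [("Redmond Computers Inc.", UNCONSTRAINED)],
    "rci-sales": [("Sales", UNCONSTRAINED)],
    "abc": [("ABC Software", UNCONSTRAINED)],
    "abc-sales": [("Sales", UNCONSTRAINED)],
}


def naming_collisions(topics):
    """Return groups of topic ids that share a base name in the same scope."""
    by_name = defaultdict(set)
    for topic_id, names in topics.items():
        for name, scope in names:
            by_name[(name, scope)].add(topic_id)
    return [sorted(ids) for ids in by_name.values() if len(ids) > 1]


print(naming_collisions(topics))  # [['abc-sales', 'rci-sales']]
```

Each group returned is a set of topics that the topic naming constraint would cause to be merged.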

Sample 10 shows two different ways in which more qualified names can be created. One approach is to keep the same name string and add a differentiating scope; the other way is to modify the name string to include some differentiating information. In this example, we use the company itself as a differentiator. The assumption made here is that any given company will have only one sales department (for a multinational company, of course, both company and geographic region might be required for complete differentiation).

To differentiate using scope, we simply add a <scope> element to the <baseName>, containing a <topicRef> pointing to the topic that represents the appropriate company. To differentiate using a modified name string, we include the company name in the department name string. If it is intended that the topics representing the departments should be accessible outside the context of the company, a combination of these two approaches is most appropriate, as the more qualified name string will be useful for display in cases where the two departments might occur in the same set of search results.

<topicMap>
  <topic id="xyzzy">
    <baseName>
      <baseNameString>Redmond Computers Inc.</baseNameString>
    </baseName>
  </topic>

  <!-- Departments in Redmond Computers Inc. -->

  <topic id="rci-sales">
    <baseName>
      <!-- "#xyzzy" is a fragment reference to the topic with id "xyzzy" -->
      <scope><topicRef xlink:href="#xyzzy"/></scope>
      <baseNameString>Sales</baseNameString>
    </baseName>
    <baseName>
      <baseNameString>Redmond Computers Inc., Sales</baseNameString>
    </baseName>
  </topic>
  ...

  <topic id="abc">
    <baseName>
      <baseNameString>ABC Software</baseNameString>
    </baseName>
  </topic>

  <!-- Departments in ABC Software -->

  <topic id="abc-sales">
    <baseName>
      <!-- "#abc" is a fragment reference to the topic with id "abc" -->
      <scope><topicRef xlink:href="#abc"/></scope>
      <baseNameString>Sales</baseNameString>
    </baseName>
    <baseName>
      <baseNameString>ABC Software, Sales</baseNameString>
    </baseName>
  </topic>
  ...
</topicMap>
          

Sample 10 - Two ways to avoid name-based merging
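
As a rough check that the names in Sample 10 no longer trigger the topic naming constraint, we can again treat each base name as a (name string, scope) pair and test whether the two department topics have any pair in common. This Python sketch is illustrative only: the representation of scope as a frozenset of topic ids and the shares_scoped_name function are assumptions, not part of XTM.

```python
# (base name string, scope) pairs taken from Sample 10. Scope is modelled as a
# frozenset of topic ids; the empty frozenset stands for the unconstrained scope.
rci_sales = {
    ("Sales", frozenset({"xyzzy"})),
    ("Redmond Computers Inc., Sales", frozenset()),
}
abc_sales = {
    ("Sales", frozenset({"abc"})),
    ("ABC Software, Sales", frozenset()),
}


def shares_scoped_name(a, b):
    """True when two topics have a base name in common in the same scope."""
    return bool(a & b)


print(shares_scoped_name(rci_sales, abc_sales))  # False
```

Both topics still carry the name string "Sales", but because the scopes differ, the pairs are distinct and no name-based merge takes place.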
