
Type or class hierarchies for e-portfolio interoperability with Semantic Web architecture

Simon Grant – 2008-01-04

The context and the challenge

There is much personal and personal-related information created, stored or used in the course of practice associated with e-portfolio systems. Early attempts at e-portfolio-related interoperability such as IMS LIP chose to try to represent a large sample of that information, and in the attempt to reduce ambiguity, to place all of the elements into hierarchical structures. The problem was that this did not fit many uses.

It is perfectly possible to use blogs or blog-like tools to support e-portfolio-related practice. Blogs are in any case a natural way of supporting learning logs, reflective diaries, and other related practice, provided the blogging system ensures appropriate privacy for entries that are not intended for wider reading. A blog entry can serve to record any fact, desire or reflection. If appropriate access controls are in place, a blogging system could be used to present these things to an external audience.

In a blogging system used for portfolio practice, the meaning and significance of each entry is represented only by text, which is largely acceptable for human readers, but does not help automatic machine processing.

On the other hand, other current systems with e-portfolio capabilities have a more detailed model than blogging systems. Such models may need to support various kinds of personal development practice, and also the construction and presentation of information supporting progression, such as CVs. Because there is much general commonality in CVs, in application forms, and in personal development records, there is scope for an interoperability specification that enables information to be transferred from one system to another in a way that allows some or all of the information to be placed so that it is useful for supporting portfolio-related practice in the context of the receiving system.

The challenge is therefore to create an interoperability specification which can be used for transfers between any of the kinds of systems mentioned above, without needing to impose changes in practice. The more similar two systems are, the less work should be needed to restore any unavoidable loss of information when information is transferred.

The concept of a type hierarchy in this context

When considering information from the kinds of systems outlined above, it is clear that some information represented simply as entries in a blogging system may be represented as some more specific type in another system designed for more specific use with portfolios. For example, a meeting would be a specific type of entry.

This is reflected in the information structure associated with each type. For a standard blog entry, the typical information is given by standards such as RSS and Atom. The terms differ between RSS, Atom and elsewhere, but conceptually the most basic list could be:

Any activity, event or experience in general has both start and finish dates and times, whether actual in the past or projected in the future, and this is useful to record properly, in a similar way to the dates of creation and modification. These dates can then be used to order the records for chronological display, to determine which events overlap, how events relate chronologically to goals or targets, and probably many other uses.

This kind of record therefore needs

A meeting is a specific kind of activity. Generally, both from common sense and current practice, this will also have start and finish dates and times, but also perhaps fields such as:

Thus, it is reasonable to see a meeting as a kind of event, and an event as a kind of entry.

Something a bit like an inheritance hierarchy

We can now imagine these three as being in a kind of class hierarchy tree, with entry being the most general, event in the middle, as a subclass of entry, and meeting being the most specific, as a subclass of event. Somewhat as in object-oriented systems, we can imagine subclasses inheriting the fields, properties or attributes of their superclasses. If one type is defined as a subclass of another type, the only fields that need to be specified for the subclass type would be the ones that are not defined in the superclass: so if an event is defined as a subclass of entry, only the three fields listed above would be defined for an event.
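As a rough illustration of this inheritance idea, the sketch below stores, for each type, only its superclass and the fields it adds, and then collects the full field set by walking up the tree. The field names are illustrative assumptions, not the field lists from this page, and the table layout is just one possible encoding.

```python
# Hypothetical type table: each type records its superclass and ONLY the
# fields it adds. All field names here are assumed for illustration.
TYPES = {
    "entry":   {"parent": None,    "fields": {"title", "content", "created", "modified"}},
    "event":   {"parent": "entry", "fields": {"start", "finish", "place"}},
    "meeting": {"parent": "event", "fields": {"attendees", "agenda"}},
}


def all_fields(type_name):
    """Return the fields of a type plus everything inherited from its superclasses."""
    fields = set()
    while type_name is not None:
        fields |= TYPES[type_name]["fields"]
        type_name = TYPES[type_name]["parent"]
    return fields
```

Under these assumptions, `all_fields("meeting")` contains the entry and event fields as well as the two that meeting adds itself.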

This hierarchy can be represented in several ways. Relating this to the W3C's Semantic Web work, the obvious predicate to use is rdfs:subClassOf in conjunction with other RDFS, SKOS and/or OWL structures, to make a model, which could be called an ontology. This predicate is used not only in Semantic Web documentation, but also in the DCMI Abstract Model semantics.
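To make the rdfs:subClassOf idea concrete, here is a minimal sketch that holds the hierarchy as plain (subject, predicate, object) triples and follows the predicate transitively. The `http://example.org/portfolio-types#` namespace and the class names are hypothetical; a real deployment would more likely use an RDF library and a published ontology.

```python
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
BASE = "http://example.org/portfolio-types#"  # hypothetical namespace

# The three-level hierarchy as (subject, predicate, object) triples.
TRIPLES = [
    (BASE + "Event", RDFS_SUBCLASS_OF, BASE + "Entry"),
    (BASE + "Meeting", RDFS_SUBCLASS_OF, BASE + "Event"),
]


def superclasses(type_uri, triples=TRIPLES):
    """Return every superclass reachable via rdfs:subClassOf, nearest first."""
    found = []
    frontier = [type_uri]
    while frontier:
        current = frontier.pop(0)
        for subject, predicate, obj in triples:
            if subject == current and predicate == RDFS_SUBCLASS_OF and obj not in found:
                found.append(obj)
                frontier.append(obj)
    return found
```

Because rdfs:subClassOf is transitive, a Meeting is also an Entry even though no triple states this directly; the function recovers that by following the chain.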

Placing new types in the hierarchy

If someone wanted a new type, whose desired fields were:

that would clearly be a subclass of event, not a subclass of meeting. The principle is that a new type is placed immediately below the type which has the maximal subset of its fields. However, care needs to be taken that the fields actually mean the same thing in all cases.
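The placement principle can be sketched as a search for the existing type whose complete field set is the largest subset of the new type's desired fields. The type names and field names below are illustrative assumptions, and the sketch compares field names only: it cannot check that the fields actually mean the same thing.

```python
# Hypothetical full field sets (inherited fields included) for known types.
ENTRY = {"title", "content", "created", "modified"}
EVENT = ENTRY | {"start", "finish", "place"}
MEETING = EVENT | {"attendees", "agenda"}
KNOWN_TYPES = {"entry": ENTRY, "event": EVENT, "meeting": MEETING}


def place_new_type(desired_fields, known=KNOWN_TYPES):
    """Return the type a new type should sit immediately below, or None.

    Candidates are the known types whose fields are all among the desired
    fields; the one with the most fields (the maximal subset) wins.
    """
    candidates = [name for name, fields in known.items() if fields <= desired_fields]
    if not candidates:
        return None
    return max(candidates, key=lambda name: len(known[name]))
```

For instance, a new type wanting all the assumed event fields plus a hypothetical `fee` field would be placed below event, not below meeting, because it lacks the meeting-specific fields.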

Changes in the hierarchy

Definitions don't last forever, and at some point people may want to change or extend the definition of a particular type. To allow for that, we need to apply good versioning practice. Particular versions of types never change, and the "latest" version points to different versions as time goes on. A particular version of a type would be defined as a subtype of another particular version of a different type. It may or may not be possible to coordinate the versions, so that a version set is released all at once. While this is tidy, it may also hold up progress, so may be a bad idea from that point of view.

Newer versions that are simply extensions of older versions should be noted as such, by including in the ontology the fact that the newer version is a subclass of the older version. As well as subclass relationships between versions of the same date, subclass relationships should be added to superclasses of earlier dates, to allow a degree of (degraded) backward compatibility.

Sending and receiving information

When using such a hierarchy, things can be represented in more than one way. For example, something that is a meeting could be represented as an entry. A sensible rule is for any sending system to use the most specific type that is applicable. If the system represents something as a meeting, and has the appropriate information to fill in the expected fields for the meeting class, then it should be sent as a meeting. It turns out that the type information is just a helpful indication of what information is contained, not something that is necessarily strictly defined. The only constraint is that all the fields defined for the type must be present within the information communicated. Extra fields may be included, but any such extra fields should be considered as non-essential to the meaning of the information transferred.
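One way a sending system might apply this rule is sketched below: start from the type the system itself uses for the item, and fall back to the superclass whenever some required field is missing. The hierarchy and field names are assumed for illustration, and `choose_send_type` is a hypothetical helper, not part of any specification.

```python
# Hypothetical hierarchy: superclass of each type, and the fields each adds.
PARENT = {"entry": None, "event": "entry", "meeting": "event"}
OWN_FIELDS = {
    "entry": {"title", "content", "created", "modified"},
    "event": {"start", "finish", "place"},
    "meeting": {"attendees", "agenda"},
}


def required_fields(type_name):
    """All fields defined for a type, including inherited ones."""
    fields = set()
    while type_name is not None:
        fields |= OWN_FIELDS[type_name]
        type_name = PARENT[type_name]
    return fields


def choose_send_type(internal_type, item):
    """Pick the most specific type whose required fields are all in the item.

    Extra fields in the item are allowed and do not affect the choice.
    Returns None if even the most general type cannot be satisfied.
    """
    type_name = internal_type
    while type_name is not None:
        if required_fields(type_name) <= set(item):
            return type_name
        type_name = PARENT[type_name]
    return None
```

An item held as a meeting but lacking an agenda would, under these assumptions, be sent as an event, with the attendees carried along as an extra field.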

A receiving system will maintain a list of types that it can deal with: these will be expressed as URIs.

  1. If the sent type is on the list of recognised types, then the receiving system uses its known method of processing the item.
  2. If the sent type is not on the list of types recognised by the receiving system, then the receiving system consults the place where the type is defined (an ontology or similar), walks up the type class hierarchy and interprets the item as an instance of the nearest superclass that it can deal with.
  3. In the latter case (and possibly the former) there may be fields present within the sent information which are not defined as basic to that type. The optimum response is for the extra information to be stored in a form corresponding to the form it was sent in, and displayed along with the content or description. However, no specific action can be expected of the receiving system.
  4. If the receiving system works with an older version of the same type, it may or may not be able to cope with the newer type. The superclass relationships given in the ontology should guide here.
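The first three steps above can be sketched roughly as follows. Short names stand in for the type URIs, the `PARENT` table stands in for looking the type up in an ontology, and all field names are assumptions; the version handling of step 4 is not covered.

```python
# Hypothetical stand-ins for what an ontology lookup would provide.
PARENT = {"entry": None, "event": "entry", "meeting": "event"}
OWN_FIELDS = {
    "entry": {"title", "content", "created", "modified"},
    "event": {"start", "finish", "place"},
    "meeting": {"attendees", "agenda"},
}
# Types this particular (hypothetical) receiving system can process.
RECOGNISED = {"entry", "event"}


def required_fields(type_name):
    """All fields defined for a type, including inherited ones."""
    fields = set()
    while type_name is not None:
        fields |= OWN_FIELDS[type_name]
        type_name = PARENT[type_name]
    return fields


def receive(sent_type, item, recognised=RECOGNISED):
    """Interpret an item as its nearest recognised type.

    Returns (type, core, extras): the type actually used, the fields defined
    for that type, and the remaining fields, kept as sent so they can be
    stored and displayed alongside the content.
    """
    type_name = sent_type
    while type_name is not None and type_name not in recognised:
        type_name = PARENT.get(type_name)  # step 2: walk up the hierarchy
    if type_name is None:
        return None, {}, dict(item)
    known = required_fields(type_name)
    core = {k: v for k, v in item.items() if k in known}
    extras = {k: v for k, v in item.items() if k not in known}  # step 3
    return type_name, core, extras
```

With these assumptions, a meeting sent to this system is handled as an event, and its attendees and agenda are preserved as extra fields instead of being discarded.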

An alternative approach to receiving information would be to ignore the type altogether, and just look at the fields or predicates themselves. Given a set of well-defined fields or predicates, it would be a matter of a tree search to find the best fitting type. While this is possible, it involves more processing. This leads back to the idea that type labelling is helpful, but not essential in principle. The type label specifies the set of fields which are to be expected.

Inheritance of possible relationships

Similar things can be said of relationships. It is not clear whether this is correct, but it would seem good for more general classes to partake in more general relationships, and more specific classes to partake in more specific relationships. See rdfs:subPropertyOf. But it could be, for instance, that the same relationship applies to several levels of type. A relationship at one level may or may not mean that there is a more general relationship at a more general level.

Implications

Arranging things in this way means avoiding the rigid constraints of having just one "type" and perhaps one "subtype". The type class hierarchy would be maintained centrally, and each class would have its own URI of course. Most of the required architecture or infrastructure is already there in the W3C's Semantic Web materials.

It should be noted that this whole discussion deals only with types that are distinct in terms of content. There could be another distinction, which one can see in some unmaintained IMS LIP type vocabularies, between items of identical structure but different human semantic implications. Thus, a "work" activity could be understood as being different from a "voluntary" activity, even if it has the same kinds of information attached. It may be that the linked resources are expected to be of a different type, or that other predicates may be imagined, but are not yet represented. In any case, care is needed.

page maintained by and © Simon Grant, edition 2008-01-04