Topic Representation and Learning Object Metadata

Jan 25, 2002
By Stephen Downes

Topic Maps:

   From "The TAO of Topic Maps":
"Topic maps are a new ISO standard for describing knowledge structures and associating them with information resources. As such they constitute an enabling technology for knowledge management. Dubbed “the GPS of the information universe”, topic maps are also destined to provide powerful new ways of navigating large and interconnected corpora."

   http://www.xml.com/pub/a/2000/06/xmleurope/maps.html
   http://www.gca.org/papers/xmleurope2000/papers/s11-01.html
   http://www.infoloom.com/whitepaper.htm

The maps are derived by identifying topics, noting occurrences, and drawing associations. Though it is not explicitly stated in the documentation I read, the grouping of topics in this way - properly called 'clustering' in AI literature (see Rumelhart & McClelland, eds., Parallel Distributed Processing, Volume 1, Introduction; I forget the exact name of the paper and my books are in a moving van somewhere in Ontario) - may be generated using neural net technologies.
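
To make the three ingredients concrete, here is a minimal sketch of a topic map as a plain data structure. It is not the ISO 13250 syntax; the class names, the relationship labels and the example.org URL are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Topic:
    """A named subject, plus the resources (occurrences) that discuss it."""
    name: str
    occurrences: List[str] = field(default_factory=list)  # URLs or document ids


@dataclass
class TopicMap:
    """Hypothetical in-memory topic map: topics, occurrences, associations."""
    topics: Dict[str, Topic] = field(default_factory=dict)
    # Each association is (topic name, relationship label, topic name).
    associations: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_occurrence(self, topic_name: str, resource: str) -> None:
        """Record that a resource discusses a topic, creating the topic if needed."""
        topic = self.topics.setdefault(topic_name, Topic(topic_name))
        topic.occurrences.append(resource)

    def associate(self, source: str, relation: str, target: str) -> None:
        """Link two topics, creating either one if it does not exist yet."""
        for name in (source, target):
            self.topics.setdefault(name, Topic(name))
        self.associations.append((source, relation, target))

    def related(self, topic_name: str) -> Set[str]:
        """Topics directly linked to topic_name, in either direction."""
        linked: Set[str] = set()
        for source, _, target in self.associations:
            if source == topic_name:
                linked.add(target)
            elif target == topic_name:
                linked.add(source)
        return linked


# Illustrative data only; the URL and relationship labels are invented.
tm = TopicMap()
tm.add_occurrence("right triangles", "http://example.org/geometry/lesson3")
tm.associate("right triangles", "is-part-of", "geometry")
tm.associate("Pythagorean theorem", "applies-to", "right triangles")
```

Navigating the map then amounts to following associations from one topic to its neighbours, for example with tm.related("right triangles").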

I personally think this is the way to go and have sketched an approach in my paper "Relevant Similarity" (unpublished but available at http://www.downes.ca/cgi-bin/website/view.cgi?dbs=Article&key=964214207&format=full ).

While topic maps conform to an ISO standard (ISO 13250:2000), they are unfortunately not expressed in RDF (Resource Description Framework, W3C - http://www.w3.org/RDF/ ). As a result, the discussion around topic maps appears to have peaked in mid-2000.

RelaxNG

Entities and entity relationships are defined in RDF using schemas ( http://www.w3.org/TR/rdf-schema/ ). In RDF, a schema describes how to use RDF to describe RDF vocabularies. Think of a schema as a dictionary that XML files can use to define the names, meanings and allowed values of XML tags.

RelaxNG is an alternative schema language. It replaces the W3C's schema language with something that is easier to use and (importantly) easier to write software for. RelaxNG is a combination of two previous alternative schema languages, RELAX and TREX. The advantage is that it puts the capacity to write schemas - and therefore, various new types of XML documents - into the hands of many more people than the W3C specification would.

http://www.thaiopensource.com/relaxng/design.html
http://www.oasis-open.org/committees/relax-ng/
http://www.oasis-open.org/committees/relax-ng/tutorial-20010810.html
http://sourceforge.net/projects/relaxng

SCORM (and learning object metadata)

Created by ADL (Advanced Distributed Learning), a partnership of commercial, educational and government organizations, the Sharable Content Object Reference Model (SCORM) is a set of standards intended for use by organizations providing learning to the U.S. military (and therefore, to military organizations in allied nations). ADL’s SCORM draws on the Instructional Management System (IMS) protocols and extends them, defining course metadata and program interface specifications.

SCORM and IMS have essentially two components:
·   A set of application program interfaces (APIs) that define how learning objects communicate with learning management systems
·   A schema for describing learning object metadata in XML
Of these, only the second part is relevant to the current discussion.

SCORM - http://www.adlnet.org/
IMS - http://www.imsproject.org/

See my essay, Learning Objects: http://www.downes.ca/files/Learning_Objects.htm

Despite their detail (SCORM defines roughly 60 tags), both IMS and SCORM are limited in their application. Many of the tags define bibliographical information - the author of the learning object, the publisher, etc. - and other tags describe its educational functions - degree of difficulty, length, level. There is limited description of the topic area of the learning object.

One purpose of SCORM and IMS metadata is to facilitate the location of learning objects from a repository. In small or medium sized repositories, an instructor or student can locate a small set of learning objects by identifying the topic area and the level of learning required. However, as the granularity of learning objects decreases (that is, as a learning object is a smaller chunk of learning material) and as the size of learning object repositories increases, there will be a need for much more fine-grained topic descriptions than SCORM or IMS provide.

It should be pointed out that similar issues will arise in other areas related to the application of SCORM. The metadata defined by SCORM may be found to be insufficient for a large number of learning object properties. For example, rating of learning objects by professional associations, pricing and copyright information required under digital rights management schemes, and cultural appropriateness guidelines established by church and cultural groups, all fall outside the bounds of SCORM. Additionally, because SCORM was defined for a military market (and because IMS is defined in a largely American context), the definitions provided by SCORM and IMS may be inappropriate in a global context (hence we see the development of additional basic learning object metadata schemas in Canada, Australia and Europe).

Some examples:

Tutorial Markup Language - http://www.ilrt.bris.ac.uk/netquest/about/lang/
Translation Memory eXchange - http://www.lisa.org/tmx/tmx.htm
Chemical Markup Language - http://www.xml-cml.org/

Third Party Schemas in Learning Object Metadata

One way to generate the additional information needed is to define additional schemas defining additional tags available for use by learning object metadata (this may be a major reason why the most recent version of SCORM explicitly adopts RDF). Additionally, the use of multiple schemas allows description metadata to be generated from multiple sources. Complex metadata about a learning object, for example, could be generated by reading SCORM-compliant metadata about the object generated by the object's author and rating and certification metadata about the same object generated by an independent professional association. These two sets of tags would be combined to form a single, complex metadata description, which in turn could form the basis of an instructor or student's learning object selection.
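
As a rough sketch of that combination step, the following merges an author's record and a professional association's record for the same object into one description. The element names, the "object" attribute, the identifier "lo-1234" and the association's name are all hypothetical; they are not SCORM or IMS tags, and a real implementation would also keep each source's tags in its own XML namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical author-supplied record (element names are illustrative only).
AUTHOR_METADATA = """
<metadata object="lo-1234">
  <title>The definition of right triangles</title>
  <author>A. Teacher</author>
  <difficulty>easy</difficulty>
</metadata>
"""

# Hypothetical record published separately by a professional association.
ASSOCIATION_METADATA = """
<metadata object="lo-1234">
  <review status="approved" board="Hypothetical Mathematics Association"/>
</metadata>
"""


def combine(*documents: str) -> ET.Element:
    """Merge metadata records that describe the same learning object."""
    roots = [ET.fromstring(doc) for doc in documents]
    object_ids = {root.get("object") for root in roots}
    if len(object_ids) != 1:
        raise ValueError(f"records describe different objects: {object_ids}")
    combined = ET.Element("metadata", {"object": object_ids.pop()})
    for root in roots:
        # Copy every child element of each source record into the result.
        combined.extend(list(root))
    return combined


combined = combine(AUTHOR_METADATA, ASSOCIATION_METADATA)
```

A selection tool could then filter on both kinds of information at once, for example keeping only objects whose review element has status "approved".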

In order for third parties to generate their own metadata about a learning object, they need to be able to create schemas describing this new metadata. For example, a professional association may have a tag indicating whether the material was approved or rejected by their review board. A schema language like RelaxNG may be used to generate this schema, since it will be easier to use to define the schema and hence perhaps more appropriate for a body that is not composed of computer professionals (that said, most likely they will use a schema generation tool).

http://rtiess.tripod.com/dtdxml.htm
http://www.xmlscript.org/samples/Schema/docs/Schema.S.1.html
http://www.sun.com/software/xml/developers/instancegenerator/

Topic Representation in XML

To create the fine-grained topic descriptions needed to search for smaller chunks of learning material in larger learning object repositories, it would be useful to have something like topic maps, for two reasons:
·   topic maps can be generated automatically from a set of data using neural network processing
·   topic maps can achieve fine levels of description

As defined above, however, topic maps cannot be used to define learning object metadata schemas because they are not expressed in an RDF-compliant format. Even so, a great deal of work has already been done in the area of representing topic-specific information in XML.

Semantopic Web, for example ( http://www.universimmedia.com/semantopic.htm ), is a merger of the concepts of RDF semantics and topic maps. Semantopic Web is mostly a project directed toward the representation of semantic topic maps (for example, there is a graphical version using generated Scalable Vector Graphics (SVG - http://www.w3.org/Graphics/SVG/Overview.htm8 )). Input into the Semantopic Web is limited to forms-based input, and output is restricted to data using a small number of schemas.

That said, Semantopic Web is capable of creating high degrees of organization of data. Of particular interest are:
·   Ontologies - descriptions of the concepts and relationships that can exist for an agent or a community of agents http://mondeca-publishing.com/s/anonymous/title10044.html
·   Semantic networks - associative models of data http://mondeca-publishing.com/s/anonymous/title10200.html

Topic Representation and Learning Object Metadata

There are two major ways topic representation is applicable to learning object metadata:
·   As a mechanism for generating topic-specific schemas for learning object classification
·   As a means of dynamically organizing learning object repositories for search and retrieval functions

Learning object classification: in order to place learning objects within a curriculum framework, it is necessary to have a language that represents the elements of that framework, and then to have a set of tags in learning objects that contain elements of that language. For example, a mathematics course may have a short segment on 'the definition of right triangles'. In order to locate this element of the curriculum it is necessary to have a description of mathematics as a field of study such that 'the definition of right triangles' occupies a certain niche in the curriculum.

The curriculum, therefore, would be an XML file that employed, among other schemas, the schema used to delineate topic areas within the field of mathematics (indeed, the curriculum just is an ordered subset of the values of instances of that schema). A person creating a curriculum would select and order topics from the topic area map of mathematics.

At the same time, learning objects covering mathematical topics would also employ the schema used to delineate topic areas in mathematics. This allows the author of the learning object to precisely place the learning object into a specific topic area and to add metadata (over and above the SCORM standard) describing that placement.

An instructor or student searching for learning materials may therefore work with a curriculum. The curriculum identifies specific topics within a field of study. Learning objects may be identified according to their location in that field of study. Therefore, the student or instructor would be able to obtain a list of learning objects specific to a particular element of the curriculum.

Such a structure allows for a great deal of dynamism. For example, it allows the owners of a curriculum (a school board, say, or a provincial government) to change the curriculum without thereby requiring a change in the metadata describing thousands or millions of learning objects. Moreover, this new curriculum would be immediately reflected in the subsequent creation or selection of any courses. It also allows new or extended domains of learning to be added without requiring changes in learning object or curriculum definitions.
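
The arrangement can be sketched in a few lines, assuming that both the curriculum and the learning objects refer to topics by shared identifiers. The dotted identifiers, the object ids and the function name below are invented for illustration; they come from no existing schema.

```python
from typing import Dict, List

# Hypothetical learning object metadata: each object lists the topic
# identifiers (from a shared mathematics topic schema) that it covers.
LEARNING_OBJECTS: Dict[str, List[str]] = {
    "lo-001": ["math.geometry.triangles.right.definition"],
    "lo-002": ["math.geometry.triangles.right.pythagoras"],
    "lo-003": ["math.geometry.circles.area"],
    "lo-004": ["math.geometry.triangles.right.definition",
               "math.geometry.triangles.right.pythagoras"],
}

# A curriculum is an ordered selection of topics from the same schema.
CURRICULUM_ORIGINAL: List[str] = [
    "math.geometry.triangles.right.definition",
    "math.geometry.triangles.right.pythagoras",
]


def objects_for(curriculum: List[str],
                objects: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each curriculum topic, in order, to the objects tagged with it."""
    return {
        topic: sorted(oid for oid, topics in objects.items() if topic in topics)
        for topic in curriculum
    }


# Revising the curriculum touches only the curriculum list; the learning
# object metadata above is left unchanged.
CURRICULUM_REVISED: List[str] = ["math.geometry.circles.area"] + CURRICULUM_ORIGINAL
```

Calling objects_for with either curriculum yields the matching objects for each topic, which is the sense in which the curriculum owner can change course content without anyone re-tagging the repository.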

Finally, because topic mappings may be generated dynamically from a body of knowledge using neural net technologies, the use of topic maps may also provide an alternative method for searching learning object repositories (and also, an alternative method for generating curricula). This, of course, does not preclude schemas and topic maps created by hand by bodies of experts or professional associations who may have an interest in describing a field of study a particular way (for example, biologists would probably prefer to use existing schemes for classifying plants and animals rather than to allow a neural net to do it for them).
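
To give a feel for what generating topic groupings from the data itself might look like, here is a deliberately simple stand-in. It is not a neural network: it swaps in a greedy keyword-overlap (Jaccard similarity) grouping, and the object ids, keywords and the 0.3 threshold are assumptions chosen for illustration.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of keywords two documents have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(documents: Dict[str, Set[str]],
            threshold: float = 0.3) -> List[List[str]]:
    """Greedy grouping: join the first cluster whose seed is similar enough."""
    clusters: List[List[str]] = []
    for doc_id, keywords in documents.items():
        for group in clusters:
            # Compare against the first (seed) document of each cluster.
            if jaccard(keywords, documents[group[0]]) >= threshold:
                group.append(doc_id)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([doc_id])
    return clusters


# Hypothetical keyword sets extracted from four learning objects.
DOCUMENTS: Dict[str, Set[str]] = {
    "lo-001": {"triangle", "right", "angle", "definition"},
    "lo-002": {"triangle", "right", "hypotenuse", "pythagoras"},
    "lo-003": {"circle", "radius", "area"},
    "lo-004": {"circle", "area", "pi"},
}
groups = cluster(DOCUMENTS)
```

With this data the triangle objects and the circle objects fall into separate groups without anyone having defined those topics in advance. A neural net approach would replace the similarity measure and the grouping rule, but the input and output would have the same shape.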