Principles of Distributed Representation
This is the edited text of a talk delivered at the EDUCAUSE Seminars on Academic Computing, Snowmass, Colorado, August 9, 2005. Resources: MP3 Audio of the talk, PowerPoint Slides, Unedited Transcript, MS-Word Version of this paper.
Introduction
Thank you. It's a pleasure to be here. I have my water. I need my water. You know, I feel a little bit out of my element here. I live at sea level and so I don't spend a whole lot of time at this altitude. So if I sit down part way through my talk...
And I'm also a bit out of my element because I'm kind of outside EDUCAUSE, I live in another country, three time zones away, and being a bit on the outside, I come here, I come here to a talk like this and it's almost like kind of coming to see the establishment. And I don't get a chance to talk to this particular group of people a whole lot, and so, what am I going to say?
And of course I'm here at this conference in this beautiful location - and thank you so much for inviting me - and thirty years of tradition, and I'm wondering, you know, how can I do anything different, be distinct, because you know I'm kind of an outsider, and I was thinking, yesterday, that maybe I should do something like quote some Che Guevara or something like that, and I thought that, and I looked through but couldn't find any good quotes.
I did bring the book. So I figure, I'm probably the only person at this conference to wave a copy of Che Guevara at the podium, and if not, then I'm probably in very good company.
OK, well I thought that was pretty funny. It took me a long time to come up with that.
Today's talk is 'Principles of Distributed Representation' and when you looked at the outline you were probably thinking to yourself, "Oh no, another metadata paper." And if you're like me you're probably tired of metadata papers. And today's talk is sort of about metadata, and I will talk about metadata because I did kind of promise that I would, and I've learned from hard hard experience that you really should talk about whatever's in the abstract.
But today's talk is also not about metadata. It's about knowledge, it's about the changing model or picture of learning that new technology brings to us, it talks about networks in particular, and then, near the end, I begin to apply this to metadata. Now, I'm going to talk a lot about things that aren't metadata, therefore, but as I go through this you should be thinking as I go along that each thing that I say is about metadata. I know that doesn't make a whole lot of sense, but it'll come a bit clearer, I hope it comes clearer.
Knowledge: The Traditional Theory
We begin with knowledge, and I have to begin with knowledge because my training is in philosophy, and if I don't begin with knowledge I'm kind of lost, like in a rowboat without a paddle, or whatever. This would be different here.
And we have this picture, don't we, of what knowledge is. And the picture - I've sort of caricatured it on the screen here - knowledge is like entities in the brain corresponding to sentences like, 'Paris is the capital of France.' And if somebody asks you, 'What is your knowledge like?', 'Paris is the capital of France,' you probably talk about the sentence, and the meanings of the words, and how the words go together, and the syntax and forms of grammar.
And there's a fairly established theory, and some of the more recent writers, people like Chomsky and Fodor, talk about the, if you will, the writing in the brain. And we can think about that literally, people like Fodor think about that literally, or we can think about that metaphorically, but even so, that is kind of the picture of knowledge that we have.
It is what I sometimes think of as the 'information theoretic view' where communication involves getting a bit of knowledge, like that sentence, 'Paris is the capital of France,' from point A to point B. From professor to student. From speaker - me - to you - people in the audience.
And the whole theory of distance learning is wrapped up around this concept, so you get for example Moore's concept of transactional distance where you try to bridge this gap. You know, there's been so many conferences, 'bridging the gap' between this and that, and we try to improve the communication and create interaction - when I read about interaction, I do have a background in computers, I think 'checksum', oh yeah, he's invented checksums.
Moore, transactional distance: "the physical separation that leads to a psychological and communications gap, a space of potential misunderstanding between the inputs of instructor and those of the learner." So it's communication theoretic, isn't it, and they're talking signals sent and signals received, noise, feedback, all of that sort of thing.
That's the traditional picture. In effect, knowledge is like sentences. Those of you who are familiar with RDF, you're probably all familiar with RDF, the 'subject verb object' type of formation. Vocabulary in a language is unambiguous. The fact that you invited me from another country three time zones away, you presumed that when I used words, I'd probably use words much the same way you use words, and if I said the word 'Paris' you'd pretty much get what I meant. I depend on that sometimes, and I'm certainly depending on that at the moment, because otherwise it would be like that commercial where I'm talking Russian or something.
Description is pretty much concrete. This is a bottle of water. This has a blue label. A 'horse' is a horse (of course, of course). Ah, I couldn't resist. Yeah, you get on a roll when you're typing these slides. And that has gotten me in trouble before. I don't always delete... anyhow.
Revising The Traditional Picture
But, none of this is true. And not only is it not true empirically, it can't be true. Because, if it were true, then context would have no effect on truth or meaning. But context sensitivity is everywhere. And I've sort of spewed a list of references there for you.
Wittgenstein: meaning is use. Quine: the indeterminacy of reference, the indeterminacy of language. When a native points to something and says 'gavagai' does he mean 'rabbit' or does he mean 'spirits of my ancestor'?
van Fraassen: scientific explanation. 'A is the cause of B' can only be understood in the context of an alternative event, C. Why did the plants grow? Well the plants grow because we put seeds in the ground. As opposed to, the plants grow because I put fertilizer in the ground. As opposed to, the plants grow because, well, there's photosynthesis, and there's sunlight, and all of that. As opposed to, the plants grow because God wills it. The explanation depends on your context.
Hanson: causation. What was the cause of the accident? Well, it was the brakes, it was the drunken driver, it was the bush at the side of the road. George Lakoff: categorization. Different cultures organize the world in different ways. There is indeed, says Lakoff, a culture out there that classifies 'women, fire and dangerous things' as one category, and everything else as another category.
Robert Stalnaker, David Lewis: modality, the logics of necessity and possibility. They're based on the most similar possible world. But what makes a possible world the most similar? Well that depends on how you view the world that you're in.

What we know, crucially, depends on our point of view. Now I tried to come up with a bit of a diagram here, this is a new one for me, but, in the centre there, that's reality, properly so-called, and then around the outside of that diagram we have four points of view and you can see that as we each look at reality from our different points of view our view of reality is slightly different, which I've represented by reorganizing the letters in the little boxes.
But in fact, all we have is our point of view, all we have are the things in the little boxes. And language, which is what we use to try to get at what's in the middle, is at best an approximation, and at worst a parody of what knowledge is actually there.

Now that's a hard concept. So I'm going to draw it out a bit. Some of the implications of this. And again, remember, I'm talking about knowledge, but I'm also talking about metadata.
Implications of the Revised Theory
1. Knowledge is subsymbolic. That is to say, what we know is not isomorphic with the words that express what we know. Another way of saying the same thing is, and those of you who are educators I'm sure have seen this in practice, the mere possession of the words is not the same as knowing something. The knowing of something depends not simply on the words but on the application of the words in the appropriate context.
And since I'm... I'll refer to Michael Polanyi here as well, and point out that a lot of knowledge indeed cannot be expressed in words, personal knowledge, tacit knowledge, the skill of how to throw a dart. Believe me, if that knowledge could be expressed in words, I would be a good dart player.
2. Second, crucially, knowledge is distributed. There is no specific entity that constitutes the knowledge that 'Paris is the capital of France.' Now think about how that contrasts with the picture I drew at the beginning of this talk, where we have this thing in our mind that's the knowledge that Paris is the capital of France. Well that knowledge doesn't occupy a particular place in the mind. It's spread out, it's in billions of neurons.
But not only that, it's not even completely entirely contained in the mind. My knowledge that 'Paris is the capital of France' is, partially, contained in you. Because I need to know what the word 'Paris' means, what the concept of a 'capital' is, what the word 'is' is; the Oxford English Dictionary has, what, fifteen pages trying to define the word 'is'. There is no given person who has that particular paradigm bit of knowledge 'Paris is the capital of France'.
Now I know it sounds unintuitive, so let me give you a slightly more intuitive way, an intuitive way, of representing this. This morning, if you were awake, and I sincerely hope you weren't, we saw the space shuttle come in for a landing. And it did in fact land. Rock and roll; we like that.
Where does the knowledge of how to launch, fly and land a shuttle reside? What person has this knowledge? And clearly, as soon as you reflect on that, you realize, nobody. Nobody could. There is so much involved in the launching, flying and landing of a shuttle that no one person could possibly have that knowledge. Some people know how to make shuttle tires. Other people know how to make shuttle tiles. Other people know how to do the launch sequence, somebody knows how to do that countdown, '10, 9, 8...' I guess it's a skill. Somebody in the shuttle knows how to go out of the shuttle and pull the little bit of paper out from in between the tiles. Somebody else knows... and you get the idea.
What I'm saying is that all knowledge is like that, not just the complicated stuff, because, again, this is, like, my background in philosophy, as soon as you begin pushing even the simple stuff, like 'Paris is the capital of France', it gets really complicated in a hurry. What do you mean by 'capital'? What do you mean by 'is'?
3. Knowledge is interconnected. This is very different from the traditional picture. The traditional picture, you have a sentence, 'Paris is the capital of France', that's it, you're done, you've got your knowledge. But 'Paris is the capital of France' - that bit of knowledge is actually a part of other bits of knowledge, and other bits of knowledge are part of the knowledge that 'Paris is the capital of France'.
The knowledge that 'countries have capitals' is part of that knowledge. The sentence 'Paris is the capital of France' wouldn't make any sense to you if countries didn't have capitals. And it's playing with these sorts of connections that is the basis for a whole lot of jokes. "What's the capital of France? About 23 dollars." That sort of thing, and you mess around with the preconceived understandings of the words.
Even sentences like 'ducks are animals' are related, in a complex chain, to the sentence 'Paris is the capital of France', it's like Quine says, it's a web.
4. Knowledge is personal. And you probably, if you go to a knowledge management conference, you hear Polanyi Polanyi Polanyi and they talk about, oh let's extract all this tacit knowledge and we'll put it in a database, and, if you read Polanyi, it's exactly what you can't do, because the knowledge that's in your head, it's embedded, it's personal, it's sitting there in a context. If you pull it out and put it up, it doesn't make sense any more.
Your belief that 'Paris is the capital of France' is quite literally - I don't mean this metaphorically - it's literally different from my belief that 'Paris is the capital of France'. And if you think about it, think about the word 'Paris'. All right. How many of you thought about the word 'plaster'? One, two? OK. How many of you thought about the word 'Hilton'?
Now, I've used two examples here, we got a few people raising their hands, and everyone else not raising their hands, and those are the first two things that come up in my mind, and I'm wondering - you know what I said, I'm out of my element here, right? - I say the word 'Paris' I have certain associations, you say 'Paris', you have different associations, and now I'm wondering what they are.
I have one set of thoughts when I think of 'Paris', you have (a) different set of thoughts, why aren't they the same? If knowledge is according to that traditional picture, they should be the same. If I mean 'Paris' I mean the exact same thing as you. But it's clearly and evidently not the case.
5. Fifth. Knowledge is emergent. And, yeah, I know, we've got Steven Johnson and others, and emergent this, and emergent that, it's the new buzzword. The knowledge that 'Paris is the capital of France', we have this kind of abstract idea that we share, the knowledge that 'Paris is the capital of France', the Platonic ideal almost that we're trying to get at, and what I'm saying here is that this concept is emergent from the many individual bits of knowledge inside all of yourselves that 'Paris is the capital of France'.
Now the thing about emergence, and I don't see people write about this, maybe it's me but I don't know, but maybe I'm just naive, emergence is not a causal phenomenon. Well, yeah, OK, it is a causal phenomenon, you go to the micro levels and bits and atoms and all of that, and draw a causal picture, but the causal picture is so complicated nobody could understand it, it's like the weather is a causal picture but who's going to draw the line from this to this to this and make an accurate prediction forty-three years from now? It's not going to happen.
But at the higher level, emergence is a phenomenon of recognition. You need a viewer. You need a perceiver. You don't get away without having one. Think about a picture of Richard Nixon on the television. You see the television, well, what you really see are all those little pixels. And you know this, you've heard this story before, you look at all those, and the way those pixels are all organized, the way those pixels are coloured, the picture of Richard Nixon emerges from the television.
But, if you had never heard of Richard Nixon you would not recognize that as a picture of Richard Nixon. At the very best it would be 'some guy'. And if you're an alien from another planet, you're visiting with the people on the space shuttle - I like to go with a theme - then you're not even sure whether it's a human or a rock formation, could be anything.
Emergence requires perception. It requires a perceiver. That is why it is context sensitive and that is why knowledge is context sensitive.
Knowledge Creation and Acquisition in Networks
Here's another buzzword: the wisdom of crowds. What does that mean? Knowledge is distributed. Each one of us is a piece of the puzzle. And we don't acquire this piece, it's not like somebody comes to a podium and talks a piece and you're sitting there and OK you have it. It doesn't work that way.
As you sit there, indeed even as this talk is happening, you are not simply acquiring the words that I give you, and I sincerely hope not, though maybe I'll start reading some Che and see what happens... no, I'm kidding. Right? The stuff's coming in, but then it meshes and shmooshes with everything else that you've got going on, and what happens in your mind is you create something new out of it, and then that new thing becomes another piece of the puzzle, and it gets fed back in. And back and forth it goes. Back and forth over and over again.
Creation, on this model, is a process of acquisition, you get the input in, the talk, the website, the paper, the television show, the trip through the forest, you remix it, you take a bit here, a bit here, a bit here, a bit here, you put it together, and sometimes in a new arrangement, sometimes in an arrangement you're comfortable with, you repurpose it, you reshape it, you frame it according to your own background knowledge, your own beliefs, your own understandings of the words. This guy at the front of the room says the word 'Paris', you take that word, and shape that, fold that, into a place where it fits in your mind.
And then you feed it forward. You complain to the organizing committee after the talk. Just kidding. Or if Alan Levine's in here, he's probably blogging this. You pass it along. And this process happens over and over again. And each individual person does this, and it creates this network of meaning.
It's not simply a physical network. You read people like Barabási or Watts and they talk a lot about the structures and the structural properties of networks, but what's interesting and important are the semantical properties of networks, and the semantical properties are what is found, what are found, when we look at these concepts, as they're being molded, as they're being passed along, and what emerges from them.
Hence, for example, we've seen this before, in the literature, we'll go back to the 1970s, Thomas Kuhn, Structure of Scientific Revolutions. What it is to know in, as he says, normal science, to know, to learn a science, to learn a discipline, says Kuhn, is not to know a whole bunch of facts, but to learn how to solve the problems at the back of the chapter. And as someone who's struggled with those problems at the back of the chapter, I can tell you, the stuff that you need to solve the problems isn't in the text that preceded the problems. I have analyzed this.
More modern: Etienne Wenger. Learning is participation in a community of practice, and again, this is the same concept here, that's coming out. This instead of learning as being the acquisition of facts, rather, learning as immersion into an environment. Well your metadata should be like that too.
Properties of Successful Networks
Properties of successful networks. I like to adapt. And so yesterday we heard Charles Vest talking about the three attributes of (a) successful university system, you had that nice list. Of course, I'm sitting there in the back, the very back of the room, sitting there, "yeah, but it's the Times of London, they have an agenda." Everything is context, right?
But anyhow, but the attributes, the attributes were important. The attributes that he identified were right. I think they're vital, and they're fundamental, and it's kind of neat, because I come in and think I'm going to do a talk on principles of representation, and I come in, and I pick up the principles from the opening talk. But that's how this works. You thought, probably, you thought the talk was static, dynamic, something that existed before it actually happened, but that's not how it works. When the knowledge is in this network, in this flow, interacting back and forward, it quite literally changes from day to day.
Charles Vest, three key attributes.
Diversity. I kind of recast that as 'many objectives'. He was talking about the different types of institutions, you've got your land grants, you've got your publics, et cetera. All the different types of institutions, there were many.
Interwoven. For Charles Vest interwoven is teaching and research. (Note: I was trying a play from a Simpsons episode; "We play all kinds of music: country and western.") OK, that didn't come out quite right. But the idea is, you're not focused on a single thing, you're not doing just one thing.
And then crucially, and this is the core of course, behind MIT's Open Courseware, and the many other projects that he mentioned, it's open. The system is open. The network is open. It admits many minds, many points of view. And that openness is what enables the communication and the exchange of concepts and ideas to happen, that creates that network effect.

Diversity. That means, if diversity is true, diversity is a virtue, then its converse, is not. So the idea of making everything the same, making everything of anything the same, is fundamentally misguided. Now, many of you work on something called 'standards'. Standards is, by definition, the making of everything the same. So we have a tension here.

Interwoven. The idea that our different activities are distinct is fundamentally misguided. Those of you who took in the talk on what the next net generation expects of us will have caught the flavour of this.
There is no real distinction between home and work and school and hobbies; it's all part of a great tapestry, isn't it? And yet, look at not only how we've structured institutions, we've got entire buildings dedicated for 'school' only, and you sort of scratch your head. If school's not distinct from work, why is there a separate building for school? And it seems sort of odd.
Metadata, we have metadata that is like 'school' metadata, and then we have other metadata which is 'work' metadata, and they will never meet.

Open. The idea that we can store knowledge in closed repositories, and I'm thinking here specifically of things like Learning Content Management Systems (LCMSs), but more generally, of the whole range of institutional repositories that require passwords, authentication, IP checks and blood types in order to get access to - that idea is fundamentally misguided.
And to illustrate - and that's why I'm so pleased to come here to a conference like this and look at all the sessions on open source and open software, open content, and I'm beginning to think, it's great, people are beginning to get this - the argument, in my mind the argument in favour of open content and open software is really very simple: if you picture the network of knowledge as being like the network of neurons in the mind, then barriers, like copyright limitations, password access and all of that, that's like putting blockages in the connections between the neurons in your mind. And if that happens to a person, if their neurons stop sending signals freely and openly to each other, we consider that to be very sick, fundamentally ill, in need of major care and treatment and support. It's not a healthy knowing mind at that point, is it? It's certainly not a remembering mind.
The Properties Applied to Metadata
So anyhow, I did say I'd talk about metadata some time, so what about metadata? I'm going to shift gears a little bit here and take these properties and apply them specifically to metadata.
The properties, the three properties, that I've just described, are not merely properties of universities. Because after all, the basic unit of knowledge is not the university. It's... well, I was going to say, when I was writing this first was something much smaller, try to come down a little, well, what is the basic unit of knowledge, and I realized, uh oh, I've stumbled into philosophy again. So, I'll dither.

Here's the picture that I've been trying to draw so far, I don't know if the different colours come out clearly, but they're there. Those circles, they're not all actually the same, it's just, you try to do a graphic in five minutes, you go for the predefined circle. But think of those circles, they're all different, they're all diverse, they're all autonomous, they're all doing their own thing, and they're connected.
And the knowledge itself consists in the connections between the circles. I've got one set of black lines, that represents 'Idea A', and another set of red lines, that represents 'idea B'. That, that's our knowledge network.
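For those who think better in code, here is a minimal sketch of that picture in Python. The node names and the particular lines drawn are invented for illustration and are not taken from the slide; the only point is that an 'idea' is represented as a set of connections, so there is no single circle you could point to and say 'that is where Idea A lives'.

```python
# Hypothetical nodes standing in for the circles on the slide.
nodes = {"p1", "p2", "p3", "p4", "p5"}

# Each idea is a set of connections (undirected edges between nodes),
# not a value stored inside any one node.
ideas = {
    "Idea A": {
        frozenset({"p1", "p2"}),
        frozenset({"p2", "p3"}),
        frozenset({"p3", "p5"}),
    },
    "Idea B": {
        frozenset({"p1", "p4"}),
        frozenset({"p4", "p5"}),
        frozenset({"p2", "p3"}),
    },
}


def participants(idea):
    """Return every node that takes part in at least one connection of the idea."""
    return set().union(*ideas[idea])


def shared_connections(a, b):
    """Return the connections that two ideas have in common."""
    return ideas[a] & ideas[b]
```

Asking `participants("Idea A")` gives back several nodes rather than one, and `shared_connections("Idea A", "Idea B")` shows how two ideas can run through the same connection.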
The same picture applies at all levels. And I know I'm making a very strong claim here, but I believe, I don't have time to go into all of the detail here on that, I believe there is significant empirical evidence to support this. The same principles that govern the interactions between bloggers also cover the distribution of rivers in a river valley, also cover the way crickets chirp in unison. There's the picture.

At the lowest level, if you will, there are neurons, but also the interconnection between ideas that I've been talking about, interconnection between metadata, interconnection between people, which these days has reached hype status under the heading of 'social networking', and then, of course, at the top, the interconnection between universities, the mechanism by which you develop an excellent university system.
And just for good measure, and I'm not going to linger on this, as you go from the smaller to the larger, you have your causal relationships, but equally importantly, as you go from the larger to the smaller you have your perceptual relationships, that's the being able to recognize the picture of Richard Nixon, as opposed to, say, having a picture of Richard Nixon be 'caused', the recognition of Richard Nixon be 'caused'.
Now, this is drawn in a nice neat line. It is not a nice neat line. I left out all kinds of... I left out crickets, for one thing. I wasn't sure I should put ideas above metadata or metadata above ideas. I don't want to convey the idea here that it's nice neat layers all the way up. It is not; it is a chaotic mess. But, if we abstract it, apply words to it, this is kind of what we get.
That's the background into which I approach metadata. Now thinking about metadata, and thinking about the way metadata ought to be organized and structured, I came up with the concept that I call 'resource profiles', I wrote a paper on that a couple of years ago, a couple of people read it, which is nice. And in that paper I described three major features of metadata.
First of all there are different types of metadata. What very recently we would call microformats. I'll talk more about all of these. Second, the information in metadata is distributed. And then, third, any given perspective, any given point of view, any given context of recognition, is the result of aggregation, of bringing things in.
Now just last thing this morning, before I came here, as I was reviewing these, I realized, oh yeah, wait, these are the same principles that I just talked about, so: different types of metadata - diversity; information is distributed - interwoven; different perspective is aggregated - open. So there's a correspondence there, I'm not sure of the significance of that, but it's certainly matched, at least it matched at seven-thirty this morning.
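The second and third features, distributed information and aggregation into a point of view, can be sketched roughly as follows. This is only an illustration under assumptions: the source names, the resource identifier, the fields and the `aggregate` function are all hypothetical and are not drawn from the resource profiles paper.

```python
from collections import defaultdict

# Hypothetical fragments of metadata about resources, each held by a
# different source rather than in one central record.
fragments = [
    {"source": "author", "resource": "talk-1", "field": "title",
     "value": "Principles of Distributed Representation"},
    {"source": "reviewer", "resource": "talk-1", "field": "rating", "value": 4},
    {"source": "teacher", "resource": "talk-1", "field": "audience",
     "value": "graduate"},
    {"source": "reviewer", "resource": "talk-2", "field": "rating", "value": 2},
]


def aggregate(fragments, resource, trusted_sources):
    """Build one viewer's profile of a resource from the sources that viewer trusts."""
    profile = defaultdict(list)
    for frag in fragments:
        if frag["resource"] == resource and frag["source"] in trusted_sources:
            profile[frag["field"]].append((frag["source"], frag["value"]))
    return dict(profile)


# Two viewers, two points of view on the same resource.
view_one = aggregate(fragments, "talk-1", {"author", "reviewer"})
view_two = aggregate(fragments, "talk-1", {"author", "teacher"})
```

The two views share the title but differ in everything else, which is the sense in which a profile belongs to a perspective rather than to the resource.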
Learning Object Metadata: Microformats
So let's look at learning object metadata specifically. I am going to work from the assumption that you are all familiar with learning object metadata.
All right. Learning object metadata is one of your classic standardization exercises, and when I look at learning object metadata it is the oddest thing in the world to me to see every metadata record having exactly the same structure no matter what kind of learning object is described. That just seems wrong. And it seems to me that we lose a lot when we do that.
If you look at different types of resources, I've got a couple of examples here but you can multiply them yourselves, you've got a video resource and an audio resource. Now these are two very different types of resources. One will have a bitrate, that'll be the audio. Another will have a framerate, that'll be the video. And the video will have a size. And the audio, size makes no sense.
So, there are, or there ought to be, what we might call LOM microformats. If we have learning object metadata that describes an audio resource, then the metadata appropriate to audio resources ought to be a part of that learning object metadata. If, on the other hand, the resource is, say, an essay, in Microsoft Word, you use a different type of metadata. If it's a learning object properly so-called, with learning outcomes and activities, the model, then you use different metadata. If the learning object is an opportunity for a one-on-one personal engagement with an online mentor, then you use different metadata. And the different metadata varies, so you have different technical metadata, you have different educational properties, and so on.
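To make the idea a little more concrete, here is a rough sketch in Python of a record whose technical part depends on the kind of resource. None of this is the LOM schema; the class and field names (`AudioTechnical`, `VideoTechnical`, `bitrate_kbps` and so on) are assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class AudioTechnical:
    """Technical metadata that only makes sense for audio."""
    bitrate_kbps: int


@dataclass
class VideoTechnical:
    """Technical metadata that only makes sense for video."""
    framerate_fps: float
    width_px: int
    height_px: int


@dataclass
class LearningResource:
    """A small common core plus a type-specific technical block."""
    title: str
    link: str
    technical: Union[AudioTechnical, VideoTechnical, None] = None


def describe(resource):
    """Summarize a resource using whichever technical block it carries."""
    tech = resource.technical
    if isinstance(tech, AudioTechnical):
        return f"{resource.title}: audio, {tech.bitrate_kbps} kbps"
    if isinstance(tech, VideoTechnical):
        return (f"{resource.title}: video, {tech.framerate_fps} fps, "
                f"{tech.width_px}x{tech.height_px}")
    return f"{resource.title}: no technical metadata"


audio = LearningResource("Talk audio", "http://example.org/talk.mp3",
                         AudioTechnical(bitrate_kbps=128))
video = LearningResource("Talk video", "http://example.org/talk.mp4",
                         VideoTechnical(framerate_fps=25.0, width_px=640, height_px=480))
```

The audio record never has to carry an empty 'size' field, and the video record never has to carry an empty 'bitrate' field; each brings only the description that fits it.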
We think of learning object metadata as though it is just one big monolithic format. But in so doing, we mis-shape the descriptions of the objects. Look at the technical elements of metadata. Really, you don't learn anything. Well, you don't learn much about the technical properties of the resource you're describing, because we've tried to get one size fits all, and we've just sort of fudged it.
Look at rights. "Yes, yes, description." What kind of rights metadata is that? I mean, it doesn't work at all, but again, because we're trying to get one size fits all, we just wipe out the detail and just go for something, oh, you know, this will work for everyone, I guess that'll do.
Learning object metadata, too, it just seems odd to me, it's almost like it's in this world apart, like I said earlier, it's 'school' metadata. And when we're thinking of learning object metadata as metadata that could be constructed out of other types of metadata, that draws us to the conclusion that we should see learning object metadata as metadata that is situated in an environment where there are other types of metadata surrounding it. And learning object metadata and these other types of metadata interconnect, interact, and indeed, you would take, say, personal metadata, such as Friend of a Friend, and actually bring it in to the learning object metadata. Oh we got, we got vcards instead. I've always been scratching my head over that one, why there's vcard metadata in learning object metadata.
Rights metadata: instead of "yes, yes, description" we have rich, expressive languages that could be used to express rights in learning objects. But we have to learn to stop seeing learning object metadata as something separate and stand-alone, where we have to invent it all from scratch.
When I think of metadata, I think of RSS. RSS is beautiful. RSS: title, description, link. You're done. And then you just add other stuff to it as needed. And learning object metadata has even recreated title, description, link. It has its own special fields for it. Now there have been crosswalks built between learning object metadata and Dublin Core, but I sort of wonder, why didn't they just take the core of Dublin Core and, "we'll use that." That's what RSS does. Need creator metadata? Dublin Core, dc:creator. And you're done, you didn't need a special RSS element for creator.
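Here is roughly what that looks like, sketched with Python's standard `xml.etree.ElementTree`. The namespace URI is the Dublin Core element set namespace; the `make_item` helper and the example values are made up for illustration, and this builds only a bare item, not a complete feed.

```python
import xml.etree.ElementTree as ET

# Dublin Core element set namespace, bound to the conventional "dc" prefix.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def make_item(title, description, link, creator=None):
    """Build an RSS-style item: three core elements, plus dc:creator if given."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "description").text = description
    ET.SubElement(item, "link").text = link
    if creator is not None:
        # Borrow the creator element from Dublin Core instead of inventing one.
        ET.SubElement(item, f"{{{DC_NS}}}creator").text = creator
    return item


item = make_item(
    "Principles of Distributed Representation",
    "Edited text of a talk.",
    "http://example.org/talk",
    creator="Stephen Downes",
)
xml_text = ET.tostring(item, encoding="unicode")
```

The core stays at three elements, and everything else arrives by namespace from a vocabulary somebody else already maintains.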
I want you to think about how limited our conception of what a learning resource could be has become because of the way we've shaped our metadata. Picture a learning object in your mind for a moment, of course it's all different pictures, and ask yourself, how do you represent an event in learning object metadata? Where is the field, 'start time'? I mean, you can make it work, but then you're taking these standard fields and kind of using them to your own purpose. You're ignoring the 'real meaning', properly so-called, of what that field means, you'd probably stuff it in technical resources or something. Yeah, why not? If everybody else who's getting the metadata knows what you mean it doesn't really matter what the word says.
Learning object metadata, as it is structured now, actually collapses our view of what a learning resource can be into this static 'knowledge as something like a sentence' picture of learning. But if we break the constraints of vocabulary imposed on us by learning object metadata we also break the conceptual constraints of what a learning object can be. And then it can be a mentoring session. Then it can be a seminar. Then it can be an organization. Now what does an organization look like as a learning object? I don't know, but I'd like to be able to describe it.
Learning Object Metadata: Distributed Metadata
We have this thing, learning object repositories, metadata repositories, and we have this picture of the metadata being like the card catalogue. How many times have you heard that analogy? The other one is the label on the soup can. But people love the card catalogue analogy. And so you have each individual record, each individual card describes a resource for us, so when we want to go locate a learning object, we're going to do just like we do in a library, we go search the card catalogue.
Most knowledge isn't organized this way. Think about how we would describe a person in metadata. Think of yourselves as a prospective employer of that person. So, what do you want? Well, you don't know the person, well, you're not supposed to anyways, so what you want is person metadata. Which these days is called the c.v. So the c.v.s come in, you've got this pile on your desk, that's all the metadata, now you're going to go through the search process and try to retrieve the records, the people, that you want for your position.
The question here is, as a potential employer, are you going to depend completely and exclusively on the c.v. in order to come to conclusions about the attributes of that person? I contend that you would be nuts to do so. And nobody would. At the very minimum, we have interviews so that we can get other data. But typically, we'd do things like, we'd run a reference check. I don't know how it works here, but in Canada we'd check and see if they have a criminal record, we'd run it through that sort of database. I don't think we do it in Canada, but they may do it here so I put it in, you may check their credit history, to make sure they're not a bad risk. If they give you a name and an address you might confirm that in a phone book.
The point here is, what we know about a person is not contained in a single metadata record, and indeed, it's not contained even in a single location. And that is crucial for our understanding of, our knowledge about, that person.
And of course, it's all point of view. A prospective employer is interested in one set of personal metadata, a prospective date is interested in a very different set of metadata. And because I couldn't resist, I diagrammed that.

So we have different types of metadata, the classic c.v., which I consider bibliographic metadata, that's the stuff you were born with. Actually it doesn't even include name when you're born, unless your parents planned ahead and did ultrasound or whatever. When I grew up the name came after the birth. But, age, that's known right from Day One, stuff like that.
Then you'll have health metadata, which would be located in the doctor's office or in the hospital, or I guess down here they'd be, what are they, HMOs? Grades, which would be held at the school, because you're not going to trust the person to provide an accurate transcript of their grades, because if you did, everybody gets As. The police criminal record, again, you get that directly from the police. The bank, or I guess you have it here too, Equifax, you get the credit information, which is sometimes accurate, sometimes less so. Information about teeth from the dentist.
Now your employer is going to aggregate this information, bring it in, and remix it, and organize it, in order to form their own view, their own perspective. C.V., grades, health, criminal record. The date doesn't really care about the c.v., well, most dates don't. They're interested in health, criminal record, well they usually do care about that, credit, and so I'm told, teeth. And you could go on. I could make this list much longer and I could come up with different points of view.
Learning object metadata is the same thing. You have a resource. It is born, created, the fruits of creativity, you know what I mean. And it has a creation date, it has a parent or author, you'll give it a name, you'll say it's a nice learning object, it's about rockets, and so on. And then it goes out into the world, and as it's out there in the world, then it begins to acquire different properties. Fred Penner used it in a nature class. Joe Jackson thought it was really good and gave it a rating of 5. The Mennonite Central Committee had a look at it and gave it the approval for LDS classes. The Siskel and Ebert of e-learning gave it two thumbs up.
In general, we can identify three major types of metadata. First party metadata - metadata created by the author. Bibliographic metadata. Second party metadata. Metadata created by the user of a resource. Evaluation. Context of use. "I used this resource in a math class." Third party metadata. Metadata created by an observer of some sort. The Mennonite Central Committee. The rights broker. The Siskel and Ebert of e-learning. First, second and third party metadata.
Learning object metadata of the future will be composed of these three types of metadata, and the microformats within these three types of metadata will be mixed and matched, mixed and matched according to the nature of the resource, but also mixed and matched according to the perspective, point of view, or context of use of this metadata.
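The mixing and matching can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Record` fields, the `kind` labels and the two perspectives ("teacher", "school_board") are invented here, and the sample values reuse the rockets example from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    resource: str  # what the metadata is about
    party: str     # "first", "second" or "third"
    kind: str      # hypothetical microformat label
    source: str    # who holds the canonical copy
    value: str


RECORDS = [
    Record("rockets", "first", "bibliographic", "author", "title: Rockets"),
    Record("rockets", "second", "context", "Fred Penner", "used in a nature class"),
    Record("rockets", "second", "rating", "Joe Jackson", "5"),
    Record("rockets", "third", "certification", "Mennonite Central Committee", "approved"),
    Record("rockets", "third", "review", "Siskel and Ebert of e-learning", "two thumbs up"),
]

# Each point of view names the kinds of metadata it cares about (invented here).
PERSPECTIVES = {
    "teacher": {"bibliographic", "context", "rating"},
    "school_board": {"bibliographic", "certification"},
}


def profile(resource, perspective, records=RECORDS):
    """Mix and match the records for one resource from one point of view."""
    wanted = PERSPECTIVES[perspective]
    return [r for r in records if r.resource == resource and r.kind in wanted]


def by_party(records):
    """Group records into first-, second- and third-party metadata."""
    groups = {"first": [], "second": [], "third": []}
    for r in records:
        groups[r.party].append(r)
    return groups


for record in profile("rockets", "teacher"):
    print(record.party, record.kind, record.source, record.value)
```

The same pool of records yields a different profile for each perspective, which is the point: no single record is "the" metadata for the resource.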
Learning Object Metadata: Referencing
Think about your metadata environment. Think about your personal metadata. Even think about your c.v., maybe think about it a bit more abstractly, because your c.v. is typically a paper document and has the limitations inherent in physical objects.
The metadata about you isn't simply the metadata about you. If you think about it. I live in a house, for example, it's a nice little house, it's on a quiet street in Moncton, New Brunswick, in eastern Canada. That house has metadata. That house is older than I am. It had metadata before I did. It has a creation date, which is approximately 80 years ago, it's not very reliable metadata because that was before they invented metadata. The house has an address, a street address, a lot description number and all of that. It has its history of owners, its provenance and all of that.
That metadata describing my house actually exists separately from me, it's down at City Hall. I, when I give you my metadata, I refer you to my house metadata, typically I'll just refer you with an address. I'll simply refer you to where you can get more metadata.
Same with pets. I got a cat, and the cat came with papers. Cat had its own metadata. Cat's metadata isn't my metadata because cat might go away. And I continue. I might give the cat, with its associated metadata, to someone else. Cat might die, in which case I close the file and archive it. Your car, same sort of thing, car has papers.
An entity does not exist in isolation, it's not a sentence like 'Paris is the capital of France.' An entity is related to other entities. Inherently related. And we need to express this in metadata.
So I call this 'metadata referencing'. And other people call it other things, none of this is unique to me, but what it isn't is in LOM. Now metadata about a given resource is not stored in a single file. And, as you go through, say, some learning object metadata, from point to point as you refer to different types of resources, instead of embedding the metadata right in there, you simply point, or reference, an external metadata file.
I've proposed this on a number of fronts. I wrote a paper about expressing digital rights in metadata, and one way of doing it is you take your digital rights, your ODRL file, or your XrML file (I still have trouble saying MPEG-REL) and embed it, the 80 lines, in the description field of the learning object metadata, that's one way of doing it. And what that means is that if you have a million learning objects, then you have this rights information replicated a million times, and if you want to change your price, you're in trouble. But if you take your rights metadata and create a rights model, and you put that in a specific spot, I call it a rights broker, and then in your learning object metadata you simply point to the location of your rights metadata.
And that's what Creative Commons does. Creative Commons, you have a web page, read through the web page, there's a little Creative Commons logo, and if you look at the source of the page, you'll see the rights metadata encoded in the page, but what that does is it's a pointer to the canonical definition of, say, 'non-commercial share-alike' on the Creative Commons website. And that's how it's done. Now, of course, learning object metadata, we've got "yes, yes, description".
It's not just that, the authors of resources, again, we refer to people about half a dozen times in learning object metadata and every time we've got this embedded vcard, and I sort of, I sit there and look at these learning object metadata files, and I say well what happens if the person changed jobs and got a new email address? Who's going to go out and change the 25,000 learning object metadata records to reflect this new information? That makes no sense.
But if a person had their own metadata record - Friend of a Friend is a popular format, not necessarily the definitive format - then in the learning object metadata you simply have a reference to that person's metadata, 'creator: where that person is'. Then a person can change their job, change their address, change their name, and they would not obsolete one learning object metadata file.
You see this already in RSS, or I should say more accurately, Atom, with the different link elements. Atom allows you to have several links associated with a resource, one of the links will be the actual location of the resource, and another link will be a back-up location, and another link will be a resource that the current resource talks about, and so on, they're all defined in the Atom 1.0 specs. And you're beginning to find them in web pages as well. I'll talk a bit about that shortly.
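A small Python sketch of reading several typed links off one Atom entry follows. The entry and its example.org URLs are made up. The `rel` values used here (alternate, related, via, enclosure) are ones registered for the Atom 1.0 format, where a link with no `rel` is treated as `alternate`; the sketch does not try to say which value corresponds to a "back-up location".

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical entry, for illustration only.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Rockets 101</title>
  <link href="http://example.org/rockets"/>
  <link rel="related" href="http://example.org/propulsion"/>
  <link rel="via" href="http://example.net/where-i-found-it"/>
  <link rel="enclosure" href="http://example.org/rockets.mp3" type="audio/mpeg"/>
</entry>"""


def links_by_rel(entry_xml):
    """Group an entry's link targets by their rel value."""
    entry = ET.fromstring(entry_xml)
    links = {}
    for link in entry.findall("{%s}link" % ATOM_NS):
        # A missing rel attribute means "alternate" in Atom 1.0.
        rel = link.get("rel", "alternate")
        links.setdefault(rel, []).append(link.get("href"))
    return links


for rel, hrefs in links_by_rel(ENTRY).items():
    print(rel, hrefs)
```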

So here's the picture. So pretend that this is learning object metadata, I adapted the vocabulary for my own purposes, so on the learning object website, the name, the description, the location. The author, now the author isn't a string 'Stephen Downes', because that's not a good way to store that information, the author instead is a pointer to the author website. In my FOAF file. And indeed, I work for a company, biggest one in Canada - well I don't know if that's true - but the company, it doesn't just say 'National Research Council', it's a pointer to the company metadata, describing that company. If I change jobs, I just change that pointer. If my company changes names - it's a government entity, could happen - then they change their thing, I don't need to change anything on mine. The rights on the broker website. And so on, I've just picked a few things here, but we could expand this.
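The picture can be sketched in Python as records that hold pointers rather than copies. Everything named here is an assumption made for illustration: the `WEB` dictionary stands in for records stored at their own canonical locations, and the URLs, field names and email addresses are invented.

```python
# A stand-in for the web: URL -> the one canonical record stored there.
# All URLs, field names and addresses are hypothetical.
WEB = {
    "http://example.org/people/downes": {
        "name": "Stephen Downes",
        "email": "old-address@example.org",
        "employer": "http://example.org/orgs/nrc",
    },
    "http://example.org/orgs/nrc": {"name": "National Research Council"},
    "http://example.org/rights/nc-sa": {"terms": "non-commercial share-alike"},
}

# Fields whose values are pointers to other entities' metadata.
REFERENCE_FIELDS = {"author", "employer", "rights"}


def make_learning_object(n):
    """A learning object record that points at its author and rights."""
    return {
        "name": "Learning object %d" % n,
        "location": "http://example.org/objects/%d" % n,
        "author": "http://example.org/people/downes",
        "rights": "http://example.org/rights/nc-sa",
    }


def resolve(record, web=WEB):
    """Return a copy of record with each pointer replaced by what it points to.

    Assumes the references contain no cycles.
    """
    resolved = {}
    for field, value in record.items():
        if field in REFERENCE_FIELDS and value in web:
            resolved[field] = resolve(web[value], web)
        else:
            resolved[field] = value
    return resolved


objects = [make_learning_object(n) for n in range(3)]

# The author changes address: one edit, in one place.
WEB["http://example.org/people/downes"]["email"] = "new-address@example.org"

for obj in objects:
    print(obj["name"], resolve(obj)["author"]["email"])
```

None of the learning object records was touched, yet every one of them now resolves to the new address. With embedded vcards, the same change would mean editing every record.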
Two Principles of Distributed Metadata
This picture gives me two basic principles of distributed metadata. And those of you who are involved in database design should be thinking 'normalization'. Those of you who are not involved in database design may want to Google the concept; this is not original to me.
1. Metadata - and put in the caveat, where possible - metadata for any given entity should not be stored in more than one place. There should be one canonical location for my name. And that's on my website. Not your websites, those of you who are university people. It's on my website, because it's my name. And that's the only place it's stored.
Now it can be mirrored, it can be reflected, because you're thinking about database design, you don't want to be doing lookups across the entire internet every time you go to see a record. So you pull this information in, you mirror it on your own site, sure, no problem.
But the canonical information is stored on my website and from time to time you aggregate my information, you bring it in, just to make sure that your information still coincides with my information. Now for mission critical information you'd be aggregating a lot, and for bibliographical information you might do it once a month.
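A minimal Python sketch of such a mirror, assuming a caller-supplied `fetch` function that reads the canonical source. The `Mirror` class and its parameter names are invented for illustration, and the thirty-day interval simply echoes the "once a month" remark above.

```python
import time


class Mirror:
    """Local copy of a canonical record, refreshed once it gets too old."""

    def __init__(self, fetch, max_age_seconds, clock=time.time):
        self._fetch = fetch              # callable that reads the canonical source
        self._max_age = max_age_seconds  # how long the local copy is trusted
        self._clock = clock              # injectable so the sketch can be tested
        self._value = None
        self._fetched_at = None

    def get(self):
        """Return the mirrored record, re-aggregating it if it is stale."""
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._max_age
        if stale:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


# Hypothetical canonical record living on the owner's own site.
canonical = {"name": "Stephen Downes"}
ONE_MONTH = 30 * 24 * 3600

mirror = Mirror(lambda: dict(canonical), ONE_MONTH)
print(mirror.get()["name"])
```

Mission-critical records would use a short `max_age_seconds`; bibliographic records can tolerate a long one. Either way the local copy is only a cache, and the canonical location stays authoritative.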
And the reason for that simply is data integrity. You multiply the location of a piece of data, say, my name, you multiply the possibility for errors. My name is spelled 'Stephen Downes'. I can give you eight different ways of getting that wrong. And people always get it wrong. Steven with a v. Downs without my e. Sometimes they do both. I've had 'Stephe'. And so on. And some of them I do myself, typing my name in all these fields all the time.
2. The second principle, and this is the one that I think is most violated by LOM, metadata for a given entity should not (except as a mirror, cache or whatever) contain metadata for a second entity. We need to keep our entities straight and have separate metadata for the different entities. Now if you think about it, it's going to give us a lot more expressive power because it is going to allow us - how do I want to describe this? - it allows us to do, for example, much more finely grained searches.
I did a paper called The Semantic Social Network where I talked about some of these principles, and the idea is, you have social networks which is, you have a person, they list all their friends, and then you have content metadata, like RSS where you describe all your blog posts or your essays or whatever, and right now these are two separate things. But if you merge them together, that puts friends together with content, as I put in my newsletter the other day, my social network is my content network, they're one and the same thing. They just have different types of entities.
So, I could in principle, if I was a better software author, do a search, 'Find all the papers written by people who are friends of David Wiley.' Now, why would I do this? Well, I don't know. What if I narrow it down? 'Find all the papers on learning objects written by people who are friends of David Wiley.' That is going to give me, I would bet, an authoritative collection of papers on learning objects, because I know David is an authority on learning objects. His friends are probably also authorities. At least those who write about learning objects.
So you get that kind of - I'm looking for the word there - multi-type entity search capacity. Trying to come up with a phrase off the top of my head, it's always a bit hard.
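Once people and papers are separate entities that reference each other, that search is a simple join. Here is a toy Python sketch: the friend lists, paper titles and author names are placeholders invented for illustration (only the name David Wiley comes from the talk), and a real system would aggregate them from distributed FOAF and RSS files rather than from in-memory data.

```python
# Person metadata: who lists whom as a friend (placeholder data).
FRIENDS = {
    "David Wiley": {"Person A", "Person B"},
    "Person A": {"David Wiley"},
}

# Content metadata: each paper points at its author (placeholder data).
PAPERS = [
    {"title": "Paper 1", "author": "Person A", "topics": {"learning objects"}},
    {"title": "Paper 2", "author": "Person B", "topics": {"assessment"}},
    {"title": "Paper 3", "author": "Person C", "topics": {"learning objects"}},
    {"title": "Paper 4", "author": "Person B", "topics": {"learning objects", "metadata"}},
]


def papers_by_friends_of(person, topic=None, friends=FRIENDS, papers=PAPERS):
    """Find papers written by friends of person, optionally on one topic."""
    circle = friends.get(person, set())
    return [
        p["title"]
        for p in papers
        if p["author"] in circle and (topic is None or topic in p["topics"])
    ]


print(papers_by_friends_of("David Wiley"))
print(papers_by_friends_of("David Wiley", topic="learning objects"))
```

The search crosses two types of entity, people and papers, which is only possible because neither kind of record swallowed the other.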
Web 2.0: The Principles More Widely Applied
What's important now, remember all my layers, these principles apply not just to metadata. They apply to learning resources themselves. We now have this picture of learning resources in our mind of, well, it is like a can of soup and you stick it in the back shelf and you pull it out when you want it. But it's not like that. The learning resource itself is distributed, itself brings in different types of entities.
It applies to applications themselves. Now I'm not talking, like, Java and all that sort of thing but I'm talking more along the lines of separate free-standing applications loosely connected through communication channels, not integrated into one large piece of enterprise software.
The web is changing, and it's changing in this very direction. You may have heard the concept 'Web 2.0'. That's not just a slogan. It's a shifting of the idea of the web from being a medium to the idea of the web as a platform, or if you will, an environment. It just is the shift from the idea of the web being communications, like in that old picture of knowledge, to an environment, or a network, or pick your own metaphor, where you're not just dealing with content, you're actually immersed in it, part of it. It becomes a place where you do things, it becomes even a place where you live.

E-learning 2.0 - I've got a whole other slide show on e-learning 2.0. Here's the picture. It isn't my picture, Scott Wilson did the original and Dave Tosh has done more. The idea of the future virtual learning environment, that's your space, and then, you are connected to all these applications, to all this content, to all this data, to all this metadata, around the web.
Those of you - because I've witnessed this - most of you, all of you, are working on university-centric systems. E-Learning 2.0 is not university-centric. E-learning 2.0 is where you're one of those bubbles, you're part of the student, the person's overall learning environment, and your metadata, and your interactions, your identity sign-ons, have to play nice with all of these other applications, not just other universities, but newspapers, blogging sites, dating sites. Different points of view. Or as I've got here, Flickr photo sites.
Learning becomes a network phenomenon. It becomes not just a place where we receive the service or the content of learning, but it becomes an interactive back and forward network environment, where everybody's receiving and everybody's creating, everybody's remixing. We see social networks and communities, and as I've talked about before, the semantic social network. Networks of interactions. The personal learning centre.
We're beginning to see this already in Web 2.0. There's a link there at the bottom, microformats.org, where these microformats are beginning to be developed for embedding in an XHTML file. So they've got things like hCalendar, hCard, rel-license, a whole bunch of things. There's a new one just came in the other day for video. And these microformats are embedded in web pages for now, because this is just an XHTML initiative, but in the future they'll be embedded in RSS and other types of XML metadata as well.
The Web 2.0 checklist. This is another take on the principles of distributed metadata. Structured microcontent, like I described. The data is outside. It comes in through the interactions. The bits of the network - it's not all one big monolithic piece of software like what is running on my computer, but different, small pieces of software that talk to each other in application-specific and resource-specific microformats and APIs. That's why you get the Flickr API. That's why you get the Google Maps API. And you use these APIs the way you use media-specific metadata. The single identity, the single place for that personal thing - I drafted a proposal on that, it's at that URL there. User-generated, user-managed content, applications, network as a whole.
Michael Feldstein yesterday wrote, and I quite agree with this, "We need a system that is optimized toward slotting in new pieces as they become available, not as an after-thought or an add-on, but as a fundamental characteristic of the system." Try doing that with Blackboard or WebCT.
Concluding Remarks
The take-away. And I am going to come in under time. Charles Vest talked yesterday about the meta-university. If I may be so audacious, this - what I've described here - is the information architecture for the meta-university. Now you might not agree with all of the details and everything, but it is going to be very much like that, and it is going to be very much like that because, really, that's the only way to do it. The key here is not large integrated systems but small flexible bits that are interconnected. And that's true of applications, it's true of content - like websites, pictures, images, graphics, sound - and it's true of metadata.
And that leads us to this. Learning object metadata will be rewritten. Or maybe bypassed entirely. That's a prediction. I'll stake my reputation as a pundit on it. It's going to be rewritten. And it's going to be rewritten because it has to be, because as we work with learning object metadata as it is currently incarnated, unless we're working within a large monolithic entity like the U.S. military, learning object metadata will be found to be too rigid, too inflexible, too narrowly defined, to do the sorts of things that we want to do with it.
And instead, we're going to get the type of learning object metadata that will be similar to - although, I know these committees, so it will be different from - the resource profiles that I've described here, where it will bring in the different types of microformats, where metadata will be distributed, will do things like harvest second-party and third-party metadata.
And that is my last slide, I thank you very much for inviting me, it has been a pleasure, and I really appreciate you staying for the whole talk. Thank you very much.
References
Barabási, Albert-László. Linked: The New Science of Networks. Perseus Books Group; 1st edition (May, 2002).
Blackboard. Website. August 14, 2005.
Bryant, Lee. Smarter, Simpler Social: An introduction to online social software methodology. Headshift, April 18, 2003.
Buchanan, Mark. Nexus: Small Worlds and the Groundbreaking Theory of Networks. W. W. Norton & Company (June, 2003).
Chomsky, Noam. Syntactic Structures. Walter De Gruyter Inc; Reprint edition (June, 1978).
Creative Commons. Website. August 14, 2005.
Downes, Stephen. Resource Profiles. Journal of Interactive Media in Education, 2004 (5).
Downes, Stephen. The Semantic Social Network. Unpublished. February 14, 2004.
Downes, Stephen. E-Learning 2.0 - Alberta Cut. ADETA, June 10, 2005.
Downes, Stephen. mIDm - Self-Identification the World Wide Web. Unpublished. May 4, 2005.
Downes, Stephen, et al. Distributed Digital Rights Management: The EduSource Approach to DRM. The Open Digital Rights Language Initiative - Workshop 2004.
Dublin Core Metadata Initiative. Website.
Feldstein, Michael. The Long Tail of Learning Applications. E-Literate. August 7, 2005.
Fergus, Paul, et al. Capturing Tacit Knowledge in P2P Networks. PGNet 2003.
Flickr. Website. August 14, 2005.
Flickr. Flickr API Documentation. Website. August 14, 2005.
EDUCAUSE. Website.
Fodor, Jerry A. The Language of Thought. Harvard University Press (January 1, 1980).
Friend of a Friend. The foaf project. Website. August 13, 2005.
Google. Google Maps API. Website. August 14, 2005.
Google. Search results. "Bridging the Gap" conference. August 13, 2005.
Guevara, Ernesto (Che). The Motorcycle Diaries : A Latin American Journey. Ocean Press (September 15, 2004).
Guimera, R., et al. Self-similar community structure in a network of human interactions. Physical Review E, vol. 68, 065103(R), (2003).
Hanson, Norwood Russell. Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science. Cambridge University Press (January 1, 1958).
IEEE. Draft Standard for Learning Object Metadata. 1484.12.1-2002, 15 July 2002.
Internet Mail Consortium. vCard: The Electronic Business Card. January 1, 1997.
Johnson, Steven. Emergence: The Connected Lives of Ants, Brains, Cities, and Software. Scribner (September 19, 2001).
Kuhn, Thomas S. The Structure of Scientific Revolutions. University Of Chicago Press; 3rd edition (December 15, 1996).
Lakoff, George. Women, Fire, and Dangerous Things. University Of Chicago Press; Reprint edition (April 15, 1990).
Landon, Bruce, and Robson, Robby. Technical Issues in Systems for WWW-Based Course Support. International Journal of Educational Telecommunications, 1999, 5(4), 437-453.
Leene, Arnaud. Web 2.0 checklist 2.0. Hovering Above. July 21, 2005.
Lewis, David K. Counterfactuals. Blackwell Publishers (December 1, 2000).
microformats.org. Website. August 14, 2005.
MIT OpenCourseWare. Website. Massachusetts Institute of Technology. August 13, 2005.
Moore, Michael G. Distance Education Theory. The American Journal of Distance Education. Volume 5, Number 3, 1991.
Nottingham, M., and Sayre, R. The Atom Syndication Format. IETF, July 11, 2005.
Patterson, Thom. 'Discovery is home'. CNN, August 10, 2005.
Polanyi, Michael. Personal Knowledge: Towards a Post-Critical Philosophy. University Of Chicago Press (August 15, 1974).
Quine, W.V.O. Word and Object. The MIT Press (March 15, 1964).
Quine, W.V.O. and Ullian, J.S. The Web of Belief. McGraw-Hill Humanities/Social Sciences/Languages; 2nd edition (February 1, 1978).
Rein, Lisa. Getting Started On A Harmonized Video Metadata Model. Microformats Wiki. August 6, 2005.
Richards, Griff, et al. The Evolution of Learning Object Repository Technologies: Portals for On-line Objects for Learning. Journal of Distance Education. Vol. 17, No 3, 67-79, 2003.
Seminars on Academic Computing. August 7-10, 2005. EDUCAUSE. Conference website.
Snowmass Village. Website.
Stalnaker, Robert C. Inquiry. The MIT Press (March 13, 1987).
Surowiecki, James. The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations. Doubleday (May 25, 2004).
Tosh, Dave. A concept diagram for the Personal Learning Landscape. ERADC. April 08, 2005.
van Fraassen, Bas C. The Scientific Image. Oxford University Press (January 1, 1982).
Vest, Charles M. Claire Maple Address: OpenCourseWare and the Emerging Global Meta University. Notes by Alan Levine. Seminars on Academic Computing, August 8, 2005.
Watts, Duncan J. Six Degrees: The Science of a Connected Age. W. W. Norton & Company (February, 2003).
WebCT. Website. August 14, 2005.
Wenger, Etienne. Communities of Practice: Learning, Meaning, and Identity. Cambridge University Press (December 1, 1999).
Wikipedia. Checksum. August 13, 2005.
Wikipedia. Che Guevara. August 13, 2005.
Wikipedia. RSS (file format). August 13, 2005.
Wikipedia. Web 2.0. August 14, 2005.
Wilson, Scott. Future VLE - The Visual Version. Scott's Workblog. January 25, 2005.
Wittgenstein, Ludwig. Philosophical Investigations. G.E.M. Anscombe, trans. Prentice Hall; 3rd edition (1999).
World Wide Web Consortium. Resource Description Framework (RDF). October 21, 2004.
