From Findability to Awareness: Metadata in Music and Technology Enhanced Learning (Vinden en beseffen: Metadata in muziek en E-learning)

Book - Dissertation

Nowadays, a vast amount of content is available to, and can be created by, anyone, anywhere. To make this content available, search services make use of the content itself, the context in which content is used, or metadata that describes what the content is about and how to locate, retrieve and use the content. Based on metadata, content can be organised and searched for more efficiently. Repositories are databases that store metadata and potentially content. Compared to search engines, repositories typically provide features for content management, as well as copyright and access management. The repository and its metadata can be tailored to support specific application needs. In this thesis, we have researched the use of metadata in two different domains that typically apply repositories: music information retrieval (MIR) and technology-enhanced (human) learning (TEL).

Metadata needs to be created and added to the content. For repositories, this process is still often done manually. Because this is expensive and does not scale, we have investigated how metadata creation can be automated by using web-based techniques, i.e. by using mashups. Mashups are web applications that combine content, presentation or functionality of web sources to create new useful applications or services. Two MIR case studies investigated how music metadata fields can be created automatically using mashups.

First, we researched the use of data mashups to determine the country of origin of an artist. The approach combines location data from three data sources, taking the accuracy of each data source into account. The method performs well, but one downside of our approach is that it cannot determine the origin of artists who are not described in the data sources. To our knowledge, this research is the first attempt to generate the country of origin of an artist.

The second case study investigates how search engines can be used as a general metadata generation solution. An existing method was used that is based on the distribution of search result counts over artist queries with changing metadata values (e.g. for musical genre, by comparing the search result counts of "madonna rock" and "madonna jazz"). We re-evaluated the approach for musical genre with multiple search engines and identified situations where the approach underperforms. Although the approach does not outperform other generators for musical genre, the method offers perspectives for automatically generating a variety of metadata fields. This technique is also a mashup, because search engines already aggregate data from different sources.

Typically, metadata is used extensively for search and recommendation applications. These applications often rely on a specific metadata standard that enables a common understanding to support exchange and interoperability. These standards and schemas describe content in different ways, at different granularity levels and for different application purposes. To guide the selection of a suitable schema for a specific application, we have elaborated a framework that defines clusters of semantically related metadata fields as well as application domains for metadata schemas. This framework can be used to compare the expressiveness and richness of metadata schemas. More specifically, our survey can guide the selection of a metadata schema suitable for application requirements.

In a next step, we researched how content that is stored in a variety of repositories, each described according to a metadata standard, can be made available through a single search service. This is called federated search and works as follows: a query is sent to each repository, the results received from each repository are aggregated, the metadata standards of each repository are mapped to the metadata format used by the federated search service, and the aggregated search results are presented to the user. In TEL, we evaluated the usability and usefulness of this approach to provide student access to a variety of learning material in three different real-world settings. Most users would benefit from a federated search solution, but the search speed could be improved.

In addition, we researched how recommendations can enhance these search results with both explicit relevance indicators (e.g. ratings) and implicit relevance indicators that are extracted from interaction data of users with content (e.g. downloads, reads, etc.). Such 'attention metadata' describe not content as such, but what people do with it. The use of these data was researched for two purposes: first, for personalisation and recommendation functionality as explained above, and second, to provide awareness and to enable self-reflection for learners and teachers. More specifically, we researched (i) how information visualisation techniques can be used to support this process and (ii) whether it is useful to create awareness and enable self-reflection for teachers and learners. Results from evaluations in four real-world settings indicate that the provided visualisations are useful to teachers and students in improving awareness. This research contributed to the momentum of the emerging research field of learning analytics.
Number of pages: 290
Publication year: 2012