Dealing with Data: Workshop II on Metadata

7 November 2013 | Category: larm-begivenheder

Dealing with Data: Developing a Roadmap
Workshop II: Metadata

The entire workshop will be livestreamed.

Date: 11th November
Time: 10am – 4pm
Location: Emil Holms Kanal, Building 22, Aud. 22.0.11, 2300 Copenhagen S


10:00 – 10:05

Welcome and Introduction to the Programme

Project Coordinator at LARM Audio Research Archive, Bente Larsen

10:05 – 10:50

The LARM infrastructure. Multiple metadata challenges – multiple metadata solutions

Birger Larsen, Professor of Information Analysis and Information Retrieval at the Department of Communication, Aalborg University Copenhagen; Associate Professor Jacob Thøgersen, The LANCHART Center, University of Copenhagen; and Associate Professor Haakon Lund, Royal School of Library and Information Science, University of Copenhagen

10:50 – 11:25

Metadata in public health research

Associate Professor Rikke Lund, Department of Public Health, University of Copenhagen

11:25 – 12:00

Meta data – the key to unlocking potential

Head of Data and Information, Mærsk Oil, Christian Andreassen

12:00 – 13:00


13:00 – 13:45

Metadata in adaptive information retrieval

Professor Andreas Nürnberger, Faculty of Computer Science, Otto von Guericke University, Magdeburg

13:45 – 14:10

Distributed data management for biodiversity science

Director Donald Hobern, Global Biodiversity Information Facility, GBIF

14:10 – 14:45

Steps towards a national data management solution

Senior Consultant Diba Markus, Danish e-Infrastructure Cooperation, DeIC

14:45 – 15:00


15:00 – 15:35

Metadata and the law: Data ownership and processing restrictions

Associate Professor Clement Salung Petersen, Faculty of Law, University of Copenhagen

15:35 – 16:00

Discussion: Metadata – Challenges and Solutions



The LARM infrastructure. Multiple metadata challenges – multiple metadata solutions

Part 1: Birger Larsen and Haakon Lund: Metadata in

Traditionally, thinking and research on bibliographic metadata have focussed on standards, control and uniformity. Increasingly, however, such metadata are not available and, when they are, may not correspond well to rapidly evolving user needs. We present results from a study of the requirements of audio culture researchers in relation to metadata in a large repository of digitised audio, and present the metadata scheme designed to encompass both original archive metadata and researcher annotations.

Part 2: Jacob Thøgersen: Metadata in sociolinguistics

Sociolinguistics is, in Labov’s famous dictum, the study of the social significance of “different ways of saying the same”. This definition encompasses three levels of metadata: the level which defines ‘sameness’ (the linguistic variable), the level which defines ‘difference’ (the linguistic variants), and the level which defines a non-linguistic variable, e.g. gender, class, time, style or situation. The talk will show examples of practical implementation of these three levels of metadata in different sociolinguistic projects carried out under the LARM project. The overarching claim of the talk is that metadating always involves a process of analysis, and thus that all metadating is open to variant interpretations and renegotiations and is therefore, in principle, always open-ended.


Rikke Lund: Metadata in public health research

Within public health, the concept of metadata is not commonly used. However, the major part of our research is based on information collected in populations concerning risk factors for public health problems, ranging from perinatal health and outbreaks of infectious disease, through non-communicable diseases such as cancer, cardiovascular disease, rheumatic disorders and mental health, to risk factors for accelerated aging processes. A variety of sources feed data into our research projects, e.g. registers, surveys, and biological and physical tests. To make the enormous amount of data collected useful for present and future users, careful documentation of datasets is needed. Furthermore, modern technologies such as data collection by mobile phones pose new challenges for data management, description and documentation within public health research. This information about data and its ‘proper’ use is suggested as what constitutes metadata within this field. Two cases of datasets will be described: ‘Copenhagen Aging and Midlife Biobank’ and the mobile phone based project ‘Social Fabric’.


Christian Andreassen: Meta data – the key to unlocking potential

The exploration for oil and gas has evolved from being a paper-and-pen discipline (pre-1980s), with cardboard 3D models and seismic paper sections, to a highly digital science spanning many earth disciplines and an ever-growing volume of data and number of data types. Advances in computing and storage technologies now allow for endless iterations and analyses of data. Identifying the right metadata early on is key to managing both the data itself and the metadata over time, and in so doing maintaining a safe and compliant global operation as well as the value invested in data acquisition and research.


Andreas Nürnberger: Metadata in Adaptive Information Retrieval

Analyzing and exploring huge object collections, or retrieving specific information from them, is a task we frequently have to perform when, e.g., searching for news or specific publications on the web. Interactive and exploratory data analysis tasks have very similar characteristics and would strongly benefit from flexible interfaces that adapt to the requirements and context of the mining task at hand. However, the currently available tools for exploring, mining, searching and organizing still provide only very limited support with respect to context-based interactive structuring and visualization. In this talk, a motivation is first given for the need to develop information systems which focus on user- and context-specific support in the interaction process. Then approaches are discussed that tackle specific problems of this process, especially mining methods that are able to use contextual information, extracted from user interaction and ontologies, as a bias or constraint in the learning process in order to structure and/or visualize data collections. Finally, the extent to which metadata are needed by, and aid, such adaptive information retrieval systems is discussed.


Donald Hobern: Distributed data management for biodiversity science

Research into the world’s biodiversity has progressed through dispersed global effort over the last few centuries, centered on natural history collections. Approximately 2 million species have been described by researchers over this period and the science has depended heavily on a massive printed literature, much of which remains in use many decades after its production. However, the knowledge held in these non-digital resources is not easily accessed in an efficient way to address modern questions regarding the complexity of biological systems or applied use in areas such as species conservation, land use, agricultural sustainability and human health. Consequently, many institutions and countries have made significant investments to transform this historical information into digital formats and to combine it with contemporary measurements and observations. The Global Biodiversity Information Facility is an inter-governmental initiative established to support this process and to deliver globally integrated access to biodiversity data. This effort has evolved over the last decade and now focusses on a combination of low-complexity data publishing tools and centralised data harvesting and aggregation processes. Critical factors requiring continued attention include adoption of stable and persistent data repositories for primary data; consistent use of persistent identifiers for data sets and data elements; use of machine-readable licences to maximise reuse of data; improved processes for crediting data publishers; and development of acceptable processes for community curation of data. These are challenges that are shared by all research domains as they embrace digital data management. Consequently, we need to work together to build consensus and common practices to support all research data.

Diba Markus: Steps towards a national data management solution

Clement Petersen: Metadata and the law: Data ownership and processing restrictions

This presentation will focus on ownership of metadata and on the legal restrictions for processing metadata.




Birger Larsen

Birger Larsen is Professor of Information Analysis and Information Retrieval at the Department of Communication, Aalborg University Copenhagen. He has a passion for research that involves the activities, processes and experiences arising in the meeting between users, information, and information systems in a given context – with the goal of optimising these to empower users in their task and problem solving. His main research interests include Information Retrieval (IR), structured documents in IR, XML IR and user interaction, exploiting context in IR, Informetrics/Bibliometrics, citation analysis and quantitative research evaluation.

Haakon Lund

Associate professor at the Royal School of Library and Information Science, working with a number of different aspects of information access. This includes metadata and metadata schemes and the relation between user access to information and metadata. Information access also covers user behavior and methodologies for studying user behavior.

Jacob Thøgersen

Sociolinguist working with (among other things) language change, language attitudes and language norms, social evaluation of linguistic variation, interviews and interview interactions, language policy, language choice in higher education. PhD 2008 on an investigation of Danes’ attitudes towards the English influence on Danish and ‘the language of attitudes’. Within LARM, Jacob has been looking at the role of the Danish National Broadcasting Corporation as a shaper and changer of language attitudes and language norms.

Rikke Lund

Associate professor at the Institute of Public Health, Section of Social Medicine, studying the health impacts of social relations among middle-aged and older people. She is responsible for several cohort studies including the Copenhagen Aging and Midlife Biobank (CAMB) as well as the newly established mobile phone based project ‘Social Fabric’.

Christian Andreassen

Data Management Discipline Lead and Head of Data and Information Management in Copenhagen. Joined Mærsk Olie og Gas A/S in 1990. He holds an M.Sc. in sedimentary geology from Copenhagen University. In his recent time with Mærsk he has been responsible for providing data and information management services to the production/development and exploration/new business departments. Although these departments reside in Copenhagen, the geographic scope of the work is worldwide. He manages the local team in Copenhagen, which comprises a broad range of professional skills, with specialists in geology, geophysics, astrophysics and geography who mainly deal with structured data, as well as librarians, information managers and drafting personnel who mainly deal with more unstructured information.

Andreas Nürnberger

Andreas Nürnberger is Professor of Data and Knowledge Engineering at Otto von Guericke University, Magdeburg. He studied computer science and economics at the Technical University of Braunschweig, Germany, and received his Ph.D. in computer science from the University of Magdeburg, Germany, in 2001. After this he spent two years as a postdoctoral researcher at UC Berkeley, where he worked on adaptive soft computing and visualization techniques for information retrieval systems. From 2003 to 2007 he was assistant professor (“Junior professor”) for information retrieval at the University of Magdeburg. His current research focuses on user-centered approaches for information access and organization.

Donald Hobern

Donald Hobern’s career spans over twenty years in software development and biodiversity informatics. He is currently the Executive Secretary of the Global Biodiversity Information Facility (GBIF), headquartered in Copenhagen, Denmark. In this role he is responsible for coordinating the activities of a global network for sharing biodiversity data including 57 countries and many international organisations. Mr Hobern also worked as a technical lead for GBIF between 2002 and 2007 with responsibility for adopting and promoting data standards and an international culture of data sharing. Between 2007 and 2011, Mr Hobern worked for CSIRO as the inaugural Director of the Atlas of Living Australia (ALA), overseeing the development of the Atlas’ architecture and core tools within the context of the many ALA collaborators. Mr Hobern has also served as Chair of Taxonomic Database Working Group (TDWG), the international organisation responsible for development of standards for exchange of biodiversity data, and is currently Chair of the Council for the Encyclopedia of Life (EOL). Mr Hobern has had a lifelong passion for natural history, participating in ornithological survey work and photographing insects, including a large selection of Australian moths.

Clement Salung Petersen

Clement Salung Petersen is associate professor and vice-head of the PhD programme at the UCPH Faculty of Law. He is also legal advisor to the Committee for the Protection of Scholarly and Scientific Works under the Danish Confederation of Professional Associations (Udvalget til Beskyttelse af Videnskabeligt Arbejde (UBVA)).




The Global Biodiversity Information Facility (GBIF)

The Global Biodiversity Information Facility (GBIF) is an international open data infrastructure, funded by governments. It allows anyone, anywhere to access data about all types of life on Earth, shared across national boundaries via the Internet. By encouraging and helping institutions to publish data according to common standards, GBIF enables research not possible before, and informs better decisions to conserve and sustainably use the biological resources of the planet. GBIF operates through a network of nodes, coordinating the biodiversity information facilities of Participant countries and organizations, collaborating with each other and the Secretariat to share skills, experiences and technical capacity. GBIF’s vision: “A world in which biodiversity information is freely and universally available for science, society and a sustainable future.”

LARM Audio Research Archive

LARM Audio Research Archive is an interdisciplinary project, the goal of which has been to establish a platform that will allow researchers and university students to stream, search, annotate and otherwise interact with radiophonic cultural heritage sources. The interface gives access to more than one million hours of radio, a national bibliography of radio and television as well as digital tools adaptable to collaborative and individual research projects. User-driven innovation is a key element in LARM. The development of the infrastructure is based on user needs. It is developed in close collaboration between technicians, cultural researchers and designers. The technology is tested in a series of cases where radio broadcasts are analyzed from a variety of research perspectives. LARM Audio Research Archive is a collaboration between a number of research and cultural institutions: The University of Copenhagen, Roskilde University, The University of Southern Denmark, Aalborg University, Aarhus University, The Danish Broadcasting Corporation, The State and University Library, Danish e-Infrastructure Cooperation, Kolding School of Design and The Museum of Media. The project is funded by a 25 million DKK grant from the Danish Ministry of Science, Innovation and Higher Education.
