DPE: Digital Preservation Challenge

The Second Digital Preservation Challenge was sponsored in part by Xerox.

Electronic resources are a central part of our cultural and intellectual heritage, but this material is at risk. Digital memory needs constant management, using new techniques and processes, to contain risks such as technological obsolescence. The DPE digital preservation challenge aims to raise awareness amongst researchers of the issue of digital preservation. Three competitions will be held over the lifetime of the DPE project, each including several tasks to solve. The challenge invites participants to overcome the barriers hindering access to (sets of) digital objects. Each object is accompanied by a scenario based on a real-life situation. These scenarios are intended to make the challenge more accessible to participants from all backgrounds while not trivialising the serious nature of the digital preservation challenges facing society.

The winner of the Second Digital Preservation Challenge was Alex Mason (Durham University, UK)

Alex Mason was able to solve all the provided challenges, showing great skill in data archaeology. The proposed digital preservation solutions were outstanding, and Alex also provided tools to handle problems emerging in the tasks. His work clearly showed a deep understanding of digital preservation and the challenges associated with it.
Alex was the only participant to implement a preservation solution for the database scenario that allows direct access to, and further use of, the complete data set. He also proposed a video-based approach to the task of multimedia art preservation, which the jury considered an outstanding solution.
Altogether, he has demonstrated a remarkably deep understanding of the issues at hand and is therefore awarded the first prize of this digital preservation challenge.

2nd Prize: Juan-José Boté Vericad (Universitat Oberta de Catalunya, Universitat de Barcelona)
The second prize goes to Juan-José Boté Vericad. He also managed to solve most of the given tasks, focussing specifically on emulation approaches. Juan-José was able to identify and open all the files provided and access all the data. He suggested good solutions for the problems occurring in the tasks. The solutions were very well designed and accompanied by good analyses and explanations of why the specific solutions were recommended.

3rd Prize: Mac Kobus (Stuttgart Media University)
Mac Kobus solved some of the tasks and showed a good understanding of the problems associated with the other tasks. His submission focussed specifically on task 2, where he managed to display all the images. He also provided recommendations for generally applicable strategies.

Read the Second Challenge Poster [PDF, 137 KB]



Scenarios for the second Digital Preservation Challenge

Legacy Application File

Scenario 1

Your company archivist discovered an old tape in a store-room. The content is not known, but the label "Master Backup" suggests that it is highly valuable. There were four types of files on it: one type of text document, one type of graphics file and two unknown file types. You are asked to identify the unknown file types and display the content of the given sample files in such a way that they may be used in a different application. You are also asked to design an appropriate preservation strategy that will facilitate access to such records, and that can be applied, as far as possible, in an automatic manner. Moreover, you are asked to estimate the cost/effort required to deploy the strategies you propose.

Task

Your task is to:

  1. Identify the type of content of the unknown files.
  2. Propose one or more suitable preservation strategies and provide a thorough description that highlights their advantages and disadvantages.
  3. Implement a preservation strategy capable of mass handling of files of this type, giving an estimate of the cost/effort for deploying the strategies.
  4. Apply the preservation strategy to the objects and display the files.
  5. Analyse the benefits and the shortcomings of the preservation strategy.
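
As an illustration of the first task, one common starting point for identifying unknown files is to compare their leading bytes against known file signatures ("magic numbers"). The sketch below is a minimal example of that idea and not part of the challenge material: the function names `identify_bytes` and `identify_tree` are hypothetical, and the signature table is a small illustrative subset of well-known formats. Legacy formats such as those on the tape would most likely need additional signatures gathered from format registries or manual inspection.

```python
from pathlib import Path

# Illustrative subset of well-known file signatures (assumption: the
# formats on the tape are probably NOT in this list and must be added).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"GIF87a", "GIF image"),
    (b"GIF89a", "GIF image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "ZIP container"),
]


def identify_bytes(header):
    """Return a format label for the given leading bytes, or 'unknown'."""
    for magic, label in SIGNATURES:
        if header.startswith(magic):
            return label
    return "unknown"


def identify_tree(root):
    """Identify every file below `root`, which allows mass handling."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            with path.open("rb") as fh:
                # 16 bytes is enough for every signature in the table above.
                results[str(path)] = identify_bytes(fh.read(16))
    return results
```

Files reported as "unknown" are the ones that need closer analysis, for example with a hex viewer, before a migration or emulation strategy can be chosen.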

Evaluation

Evaluation will be based on the progress participants make in completing the tasks outlined above and also on the thorough description of the steps taken to complete each task. The submission will be evaluated according to the following points in particular:

  1. The quality of rendering the files with respect to the original files as demonstrated by description and screenshots.
  2. The level of detail in your explanation of the proposed preservation strategies.
  3. Quality and level of detail of the comparison of strategies for long term preservation in relation to the ease of use, stability and robustness of your implementation.
  4. Quality and suitability of the code for mass handling of documents of the example types.
  5. Quality and level of detail of the estimation of cost and effort.

Images from a Legacy Computer Gaming Platform

Scenario 2

An image archive received a donation from an artist representing working material from his early years. While ingesting the data into the image archive repository, the system failed to identify some of the file formats. The artist cannot remember the name of the particular application or the computer platform. He also found a related file for one type of the images, but he does not know what the file is. Can you display the images? Include the images in your report in an appropriate form.

Task

Your task is to:

  1. Identify the application and computer platform that the files come from.
  2. Display the images in an appropriate form.
  3. Propose valid preservation alternatives pointing out their advantages and disadvantages.
  4. Implement one or more of the preservation strategies (for example emulation or migration).
  5. Evaluate and compare their performance.

Evaluation

Evaluation will be based on the progress participants make in completing the tasks outlined above and also on the thorough description of the steps taken to complete each task. The submission will be evaluated according to the following points in particular:

  1. Correct identification of the computer platform and the applications.
  2. Correct representation of the images.
  3. The level of detail in your explanation of the proposed preservation strategies.
  4. The level of detail in the analysis of the advantages and disadvantages.
  5. Quality, feasibility, ease of use and stability of the developed preservation strategies for the example files.

Obsolete Database

Scenario 3

At the beginning of 2003, the Porto Regional Archive (ADP) initiated a project called DigitArq. The goal of the project was to bring together its various finding aids, previously scattered throughout the archive in many different forms and formats, into a single centralised repository based on international standards such as ISAD(G) and EAD. The planned repository would enable the standardisation of all archival procedures and the development of new data services such as search mechanisms and description tools. However, in late 2007 a fire broke out in the server room, destroying the server that held all the information produced over the last 25 years. In addition, all the backup tapes, which were kept in a cabinet in the adjoining room, were destroyed. Around 80% of the information had been synchronised with a similar repository at the National Archives in Lisbon, and this information was easily recovered. The other 20% had been migrated from an old database that was still kept at the archive but had not been used since 1990. The ADP staff were unable to use the database, so they decided to hire a digital preservation expert to do the job.

Task

Your task is to:

  1. Identify the database system that is necessary to interpret the provided data files.
  2. Provide access to the data of the database.
  3. Propose an adequate procedure to migrate the records from the old system to the new one (please provide output files, scripts or standards to support your answer).
  4. Propose a disaster recovery plan so that if a similar incident takes place the disruption will be kept to a minimum.
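
To illustrate what a migration script for the third task might look like, the sketch below maps one recovered record onto an EAD-style component using Python's standard `xml.etree.ElementTree` module. It is only an example under stated assumptions: the legacy field names (`ref_code`, `title`, `date`, `level`) and the sample values are invented for illustration, since the structure of the old database is not described here. The element names `c`, `did`, `unitid`, `unittitle` and `unitdate` are taken from EAD.

```python
import xml.etree.ElementTree as ET


def record_to_ead_component(record):
    """Map one legacy record (a dict) onto an EAD <c> component.

    The dict keys are hypothetical; a real migration would use the
    field names found in the recovered database.
    """
    component = ET.Element("c", {"level": record.get("level", "item")})
    did = ET.SubElement(component, "did")
    ET.SubElement(did, "unitid").text = record["ref_code"]
    ET.SubElement(did, "unittitle").text = record["title"]
    if record.get("date"):
        ET.SubElement(did, "unitdate").text = record["date"]
    return component


# Invented sample row, standing in for one exported database record.
legacy_row = {"ref_code": "EXAMPLE/0001", "title": "Sample register", "date": "1850-1860"}
xml_text = ET.tostring(record_to_ead_component(legacy_row), encoding="unicode")
print(xml_text)
```

Keeping the mapping in one small function makes it easy to review field by field and to rerun over the complete export once the old database can be read.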

Evaluation

The evaluation of this scenario will be based on the correct identification of the database. A thorough description of the steps taken to access the data should be provided. Additional credit may be awarded on the basis of the quality of the proposed migration procedure and disaster recovery plan. The submission will be evaluated according to the following points in particular:

  1. Correct identification of the database system.
  2. Level of detail in the description of the steps taken to gain access to the database data.
  3. The level of detail in your explanation of the proposed preservation strategy.
  4. Quality and suitability of the code for the preservation strategy.
  5. Level of detail, quality, and feasibility of the proposed disaster recovery plan.
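
One element that a feasible disaster recovery plan might include is routine fixity checking of off-site copies, so that a missing or corrupted replica is noticed before it is needed. The sketch below is a minimal illustration of that idea using SHA-256 checksums from Python's standard `hashlib` module. The function names and the two-directory layout (a primary store and a replica) are assumptions made for the example, not a description of the ADP setup.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to `root`) to its checksum."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def compare_manifests(primary, replica):
    """Report files that are absent from, or differ in, the replica."""
    missing = sorted(set(primary) - set(replica))
    changed = sorted(
        name for name in primary
        if name in replica and primary[name] != replica[name]
    )
    return {"missing": missing, "changed": changed}
```

Running such a comparison on a schedule, with the replica held at a different site from the primary store, addresses the specific failure in the scenario, where the server and its backups were lost in the same fire.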

Electronic Art

Scenario 4

Founded in 1987, the Prix Ars Electronica is an interdisciplinary platform for digital art and media culture. The Prix Ars Electronica is one of the most important awards for creativity and pioneering spirit in the field of digital media. With the rapid change of software tools and frameworks for multimedia authoring, digital artworks are in danger of becoming inaccessible and unusable. You have been asked to preserve four of these historical digital artworks for future generations and to develop appropriate digital preservation strategies.

Task

Your task is to:

  1. Display the multimedia art, provide screenshots and a description of the steps taken.
  2. Decide which aspects of the artworks to preserve, and identify their significant properties.
  3. Develop a set of different preservation strategies for the four pieces of multimedia art provided, that have the potential to address different aspects of the artwork.
  4. Point out the differences in the strategies with respect to the characteristics of the preserved artworks and their suitability.
  5. (Optional) Implement part of the preservation strategies you have developed and submit code for this.

Evaluation

Evaluation will be based on the progress participants make in completing the tasks outlined above and also on the thorough description of the steps taken to complete each task. Particular attention will be paid to the arguments given for and against chosen preservation strategies. The submission will be evaluated according to the following points in particular:

  1. Correct representation of the multimedia art and detail in the description of the steps taken.
  2. Level of detail in the description of the significant properties and the characteristics to be preserved.
  3. Quality and suitability of the developed preservation strategies.
  4. Level of detail and quality in the comparison of the different preservation strategies.
  5. Quality, feasibility, ease of use, and stability of the submitted code.

Web Archiving

Scenario 5

Your company wants to preserve its website to document its growth and evolution over time. You are asked to devise different preservation strategies for websites, apply them to two internet domains, and analyse their advantages and disadvantages.

Task

Your task is to:

  1. Devise at least two preservation strategies for websites, highlighting their respective advantages and disadvantages.
  2. Harvest the following two internet domains, documenting the date, time and method of harvesting:
    • digitalpreservationeurope.eu with a depth of 3 (approximately 15MB of data)
    • www.rai.it with a depth of 2 (approximately 40MB of data)
  3. Apply the identified preservation strategies to the harvested websites, compare and document the results (for example storage size, processing time, presentation quality). Give an estimate of the resources (such as time, storage, effort, costs) required to deploy the strategies.
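
The harvesting step in the second task can be sketched as a depth-limited, breadth-first crawl that stays on the starting host and records when each page was fetched. The example below uses only the Python standard library and is an illustration under several assumptions: the names `LinkCollector`, `fetch_html` and `harvest` are hypothetical, the start page is counted as depth 0, only `<a href>` links are followed, and the sketch logs pages without storing them or consulting robots.txt. A production harvest would more likely rely on a dedicated web archiving crawler.

```python
from collections import deque
from datetime import datetime, timezone
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href values of all <a> tags in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Download a page as text (default fetcher; replaceable for testing)."""
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def harvest(start_url, max_depth, fetch=fetch_html):
    """Crawl breadth-first from start_url, at most max_depth links deep.

    Returns a log with the URL, depth and UTC timestamp of each fetch.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    log = []
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        log.append({
            "url": url,
            "depth": depth,
            "harvested_at": datetime.now(timezone.utc).isoformat(),
        })
        if depth == max_depth:
            continue  # do not follow links beyond the requested depth
        parser = LinkCollector()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href).split("#", 1)[0]  # drop fragments
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return log
```

Passing the fetch function as a parameter keeps the crawl logic testable without network access, and the returned log supplies the date, time and method documentation that the task asks for.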

Evaluation

Evaluation will be based on the progress participants make in completing the tasks outlined above and also on the thorough description of the steps taken to complete each task. The submission will be evaluated according to the following points in particular:

  1. Quality, level of detail and suitability of the description of the preservation strategies and their comparison.
  2. Suitability of the preservation strategies covering special document types (such as script languages, Flash, videos, etc.).
  3. Quality of the preservation of the interlinking between the objects.
  4. Level of detail in the explanation given for the method of harvesting.
  5. Quality and level of detail in the comparison and evaluation of the results.
  6. Quality, feasibility, ease of use and stability of the implementation of the preservation strategy and the capability of mass handling.