What is Digital Preservation?
Three definitions of 'long term' Digital Preservation
Digital preservation is a set of activities required to make sure digital objects can be located, rendered, used and understood in the future. This can include managing the object names and locations, updating the storage media, documenting the content and tracking hardware and software changes to make sure objects can still be opened and understood.
- "Digital preservation combines policies, strategies and actions to ensure access to reformatted and born digital content regardless of the challenges of media failure and technological change. The goal of digital preservation is the accurate rendering of authenticated content over time." (ALA 2007:2)
- "The act of maintaining information, in a correct and Independently Understandable form, over the Long Term." (CCSDS 2002: 1-11)
- "All activities concerning the maintenance and care for/curation of digital or electronic objects, in relation to both storage and access." (Research Councils UK 2008: 6)
What does 'long-term' mean in the context of Digital Preservation?
- "five years or more" (Verheul 2006: 20)
- "a period of time long enough for there to be concern about the impacts of changing technologies, including support for new media and data formats, and of a changing user community, on the information being held in a repository. This period extends into the indefinite future." (CCSDS 2002: 1-11)
- "Data should normally be preserved and accessible for not less than 10 years for any projects, and for projects of clinical or major social, environmental or heritage importance, the data should be retained for up to 20 years, and preferably permanently within a national collection, or as required by the funder's data policy." (Research Councils UK 2008: 6)
What do we need to preserve?
Several aspects of a digital object may need to be preserved.
The lowest level of preservation is preservation of the bit stream. This does not, however, ensure that the digital object remains understandable, readable or useful. The biggest risk to understandability is that the meaning (and even the names) associated with values in a dataset, although known to the data producers, is not available to the users; without this, the data is essentially useless.
Another aspect is that, even for users within the same sub-discipline, terminology drifts and meaning is lost; users in different (sub)disciplines will require even more help with the semantics of the data.
- A more complex approach may strive to preserve not only the 1s and 0s but also the meaning, so that the object remains readable and understandable. Such an approach requires the preservation of additional information (representation information, technical metadata, etc.).
- Even more ambitious preservation approaches try to preserve understandable content in such a way that the provenance and source of the digital object also remain clear. Users can then trust that the object is authentic, accurate and complete.
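The three levels described above can be illustrated with a minimal Python sketch. It is not a standard or an established tool: the record layout, the field names (`sha256`, `format`, `columns`, `source`) and the sample file are all assumptions made for illustration. A checksum stands in for bit stream preservation, the format and column descriptions stand in for representation information, and a free-text source note stands in for provenance.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's bit stream."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe(path: Path, file_format: str, columns: dict, source: str) -> dict:
    """Build a sidecar record for one file (hypothetical layout)."""
    return {
        "file": path.name,
        "sha256": sha256_of(path),  # level 1: the bits themselves
        "format": file_format,      # level 2: how to read the bits
        "columns": columns,         # level 2: what the values mean
        "source": source,           # level 3: where the object came from
    }


def is_intact(path: Path, record: dict) -> bool:
    """True if the file's current bits still match the recorded digest."""
    return sha256_of(path) == record["sha256"]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "readings.csv"  # made-up sample dataset
        data.write_text("t_c,site\n12.5,A\n", encoding="utf-8")
        record = describe(
            data,
            file_format="CSV, UTF-8, comma-separated",
            columns={
                "t_c": "air temperature in degrees Celsius",
                "site": "station code",
            },
            source="example station log (made-up)",
        )
        print(json.dumps(record, indent=2))
        print("intact:", is_intact(data, record))
```

A matching checksum only shows that the bits are unchanged. Whether a future user can interpret them depends on the descriptive fields, which is why the record keeps both.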
Why should we care about Digital Preservation?
- Storage media/data carrier problem
Digital objects are much more 'fragile' than traditional analogue documents such as books or other hard copy media. They are fragile because they require various layers of technological mediation before they can be heard, seen or understood by people. Digital objects are also much more vulnerable to physical damage. One scratch on a CD-ROM containing 100 e-books can make the content inaccessible, whereas damaging 100 hard copy books with a single scratch is, fortunately, impossible. A flash memory stick can fall into a glass of water, and a portable hard drive or laptop can slip from your hands and be irreparably damaged in a second.
Digital objects require pro-active intervention to remain accessible. You can put a book on a shelf, return to it 100 or more years later, and still open it and see the content as the author or publisher intended. Applying the same benign neglect to a digital object almost guarantees that it will be inaccessible in the future.
- Hardware obsolescence
Even if you return to the digital object in five years to find the disk in perfect condition, and you have software that can open the file, you will not be able to access it if the file is on a disk your computer has no drive for.
- Software and format obsolescence problem
Alternatively, the software or file format can become obsolete for a number of reasons. For example, software upgrades may not support legacy files; take-up of the format may be low, so that the industry does not produce compatible software; or software that supports the format may be bought by a competitor and withdrawn from the marketplace. Without the intervention of digital preservation techniques, the information contained in the file will no longer be accessible.
Real life examples
- Hardware and format obsolescence
You wrote your PhD thesis X years ago on a 286 PC, in a word processor called T602, in Kamenický encoding, under Windows 3.11. You now wish to show it to a colleague or student working on the same topic, but you do not have a computer with the right floppy disk drive, and you are not sure whether the disk would still be readable even if you had access to such a drive.
If you are very lucky, you might find a friend with the right drive on their computer. The drive reads the floppy disk and retrieves the data, and you can see your file thesis.t602, but you cannot open it: the format you used to store the file is obsolete. What will you do then?
To prevent the hardware obsolescence problem, you could have migrated the file to a new storage medium before you threw out your last PC with a floppy drive.
To prevent the format obsolescence problem, you could have continually moved the content from one format to another as each was superseded. In addition, every time you changed your software you could have kept a copy of the software with which you created the file.
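As a rough illustration of the media migration step, the following Python sketch copies a file to a new carrier and checks that the copy is bit-identical before the old carrier is retired. The function names `digest` and `migrate` and the directory layout are hypothetical. The sketch covers media migration only; moving content to a newer format would need a converter for the specific format, which is not shown here.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def digest(path: Path) -> str:
    """SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def migrate(source: Path, new_carrier: Path) -> Path:
    """Copy `source` into `new_carrier` and confirm the copy is bit-identical."""
    new_carrier.mkdir(parents=True, exist_ok=True)
    target = new_carrier / source.name
    shutil.copy2(source, target)  # copy2 also tries to keep timestamps
    if digest(source) != digest(target):
        target.unlink()  # do not leave a damaged copy behind
        raise IOError(f"copy of {source.name} does not match the original")
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        old_disk = Path(tmp) / "floppy"  # stands in for the old carrier
        old_disk.mkdir()
        thesis = old_disk / "thesis.t602"
        thesis.write_bytes(b"placeholder bytes, not a real T602 file")
        copy = migrate(thesis, Path(tmp) / "new_drive")
        print("migrated to", copy)
```

The original is left in place on purpose, so that it can be discarded as a separate step once the verified copy is safely stored.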
Add your own real life example to the DPE forum.
Some of the benefits of Digital Preservation
National legal frameworks often require organisations to provide adequate records of business processes, communications and many other types of data for many years after their creation.
- Accountability & protection from litigation
Recent legal cases have shown the importance of being able to search and recover archived emails quickly and in a legally admissible manner.
- Protecting the long term view
Access to digital data is critical to ensure business continuity and to support decision making with a long term view. For research in particular, preserving data may be crucial for identifying long-term trends.
- Protecting investment
The valuable intellectual assets of organisations are increasingly in digital form.
This data represents both intellectual property and a considerable investment of time, effort and money. It would therefore be foolish not to protect and preserve these assets adequately.
Repositories of digital information and the tools to mine, analyse and re-purpose them represent a society's intellectual capital.
Effective and affordable digital preservation solutions are essential to turn digital data into valuable assets for business.
References
- ALA (American Library Association) (2007). Definitions of digital preservation. Chicago: American Library Association. Available at: http://www.ala.org/ala/mgrps/divs/alcts/resources/preserv/defdigpres0408.pdf
- CCSDS (Consultative Committee for Space Data Systems) (2002). Reference Model for an Open Archival Information System (OAIS). Blue Book, Issue 1. Washington, DC (US): CCSDS Secretariat, January 2002. Technical report. CCSDS 650.0-B-1. Recommendation for Space Data System Standards. Available at: http://public.ccsds.org/publications/archive/650x0b1.pdf
- Research Councils UK (2008). Code of Conduct and Policy on the Governance of Good Research Conduct: Integrity, Clarity, and Good Management. Public Consultation Document. July – October 2008. Available at: http://www.rcuk.ac.uk/cmsweb/downloads/rcuk/reviews/grc/consultation.pdf
- Verheul, Ingeborg (2006). Networking for Digital Preservation: Current Practice in 15 National Libraries. München: K.G. Saur.
What is digital preservation? by http://www.digitalpreservationeurope.eu/ is licensed under a Creative Commons Attribution-Noncommercial 3.0 Unported License.