Digital Preservation: LOCKSS, CLOCKSS & Portico



Digital preservation is the process of safekeeping digitally stored information for the benefit of present and future generations. Over the last two decades, libraries have transitioned from acquiring paper-based collections to subscribing to web-based e-content. This shift from owning print collections to renting digital collections has had several troubling unintended consequences and has left libraries in a precarious, weakened position. Because digital information is more prone to damage and loss than print, it is imperative to adopt practices that protect it from media failure as well as from software and hardware obsolescence.

Various initiatives and policies are currently in place to ensure the long-term preservation of digital information. Notable preservation projects include:


Classic Software Preservation Project (CLASP): The Internet Archive set up CLASP in 2004 to collect and archive obsolete retail software in outdated formats dating from the 1970s to the early 1990s. The software is held in the archive until copyright obstacles are overcome; the data is then refreshed onto suitable current storage media and made freely available.

Digital Object Management System (DORIA): DORIA is a national project of Finland set up by Helsinki University Library to collect and preserve digital collections held by the universities and polytechnics of Finland.

PANDORA Digital Archiving System (PANDAS): PANDAS is a web-based tool that the National Library of Australia has developed since 2001 to provide an integrated web archiving management system.

KB e-Depot: The Koninklijke Bibliotheek (KB), the National Library of the Netherlands, started the ‘e-Depot’ digital information archiving system in 2002 for the long-term preservation of digital publications in the country.

An Effective Strategic Model for the Preservation and Disposal of Institutional Digital Assets (ESPIDA): The project was based at the University of Glasgow and sponsored by the Joint Information Systems Committee (JISC) to draw up a model for digital preservation at higher education institutions.

LOCKSS, CLOCKSS, and PORTICO are three major digital preservation initiatives aimed at protecting and preserving digital content for long-term access and use.


The LOCKSS (Lots of Copies Keep Stuff Safe) program, founded in 1999 by Dr. David S.H. Rosenthal and Victoria Reich under the sponsorship of Stanford University, develops and supports open source software for digital preservation. The program keeps archived content accessible even during a temporary loss of access to a publisher or its website, and preserves the data over the long term. Librarians use their local LOCKSS box as ‘digital stacks’ to store and take custody of subscribed and open access e-content, bringing the traditional purchase-and-own model to electronic materials and strengthening the role of libraries in the digital era.

The program, led and supported by libraries and librarians internationally, rests on three practices: replication, format migration, and repair through a polling mechanism. When LOCKSS is installed on a system, that system becomes a node in a large LOCKSS network comprising thousands of nodes. The nodes share information with one another, so multiple copies of the content come to exist at different geographical locations. LOCKSS has a unique polling and repair mechanism: if content on one node is deleted or differs from the copies held by other nodes, polling identifies that node as the odd one out, and the missing content is restored or the discrepancy corrected from the other copies, which protects the integrity of the preserved data. LOCKSS also supports format migration, i.e. the automatic migration of data from an obsolete format to a current, usable one.
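The odd-one-out idea can be illustrated with a short Python sketch. It is a deliberately simplified majority vote over content hashes, not the actual LOCKSS polling protocol, which is considerably more elaborate. The function names, node names, and sample content are all hypothetical.

```python
import hashlib
from collections import Counter
from typing import Dict, List, Optional


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest used to compare copies."""
    return hashlib.sha256(content).hexdigest()


def poll_and_repair(copies: Dict[str, Optional[bytes]]) -> List[str]:
    """Find nodes whose copy disagrees with the majority and repair them.

    `copies` maps a node name to its stored bytes (None means the copy is lost).
    Returns the names of the nodes that were repaired, in iteration order.
    """
    # Each node that still holds a copy "votes" with the hash of that copy.
    votes = Counter(digest(c) for c in copies.values() if c is not None)
    if not votes:
        raise ValueError("no node holds a copy; nothing to repair from")

    winning_hash, count = votes.most_common(1)[0]
    # Assumption of this sketch: trust a copy only if a strict majority
    # of all nodes agree on it.
    if count * 2 <= len(copies):
        raise ValueError("no majority agreement; cannot tell which copy is good")

    good_copy = next(
        c for c in copies.values() if c is not None and digest(c) == winning_hash
    )

    repaired = []
    for node, content in copies.items():
        if content is None or digest(content) != winning_hash:
            copies[node] = good_copy  # overwrite the missing or damaged copy
            repaired.append(node)
    return repaired


boxes = {
    "library_a": b"Volume 12, Issue 3",
    "library_b": b"Volume 12, Issue 3",
    "library_c": b"Volume 12, Issue 8",  # silently corrupted copy
    "library_d": None,                   # deleted copy
    "library_e": b"Volume 12, Issue 3",
}
repaired = poll_and_repair(boxes)
print(repaired)  # ['library_c', 'library_d']
```

After the call, every node in `boxes` holds the majority copy again. The sketch refuses to repair anything when no strict majority exists, since in that case it cannot tell a good copy from a damaged one.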

Although the LOCKSS software is free and open source, participating libraries join the LOCKSS Alliance and support it financially, which funds the development and documentation of the core LOCKSS technologies.


CLOCKSS (Controlled Lots of Copies Keep Stuff Safe) is an extension of LOCKSS. Founded as a project in 2006, it is a collaboration between the world’s leading academic publishers and research libraries that provides a sustainable dark archive to ensure the long-term survival of web-based scholarly content. Content is stored in the CLOCKSS archive with no user access unless a ‘trigger’ event occurs. CLOCKSS uses the same replication and odd-one-out polling mechanism as LOCKSS, but whereas LOCKSS is a light archive, CLOCKSS is a dark archive that provides access only after a trigger event.

A trigger event mainly occurs for the following reasons:
  • The publisher no longer exists or stops operations
  • The publisher stops publishing a particular title
  • The publisher stops providing access to some or all back issues
  • The publisher’s delivery platform or servers suffer a catastrophic and sustained failure
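The dark-archive behaviour described above, where content is held but unreadable until a trigger event is recorded, can be modelled in a few lines of Python. This is only an illustrative sketch: the class, method, and enum names are hypothetical and do not correspond to any CLOCKSS software interface.

```python
from enum import Enum, auto
from typing import Dict


class TriggerEvent(Enum):
    """The four trigger reasons listed above."""
    PUBLISHER_CEASED_OPERATIONS = auto()
    TITLE_DISCONTINUED = auto()
    BACK_ISSUES_WITHDRAWN = auto()
    PLATFORM_CATASTROPHIC_FAILURE = auto()


class DarkArchive:
    """Holds content that stays inaccessible until a trigger event is recorded."""

    def __init__(self) -> None:
        self._content: Dict[str, bytes] = {}
        self._triggered: Dict[str, TriggerEvent] = {}

    def deposit(self, title: str, content: bytes) -> None:
        """Store content; depositing never makes it readable."""
        self._content[title] = content

    def record_trigger(self, title: str, event: TriggerEvent) -> None:
        """Mark a deposited title as triggered, which opens it for reading."""
        if title not in self._content:
            raise KeyError(f"{title!r} has not been deposited")
        self._triggered[title] = event

    def read(self, title: str) -> bytes:
        """Return the content only if a trigger event has been recorded."""
        if title not in self._content:
            raise KeyError(f"{title!r} has not been deposited")
        if title not in self._triggered:
            raise PermissionError(f"{title!r} is dark: no trigger event recorded")
        return self._content[title]


archive = DarkArchive()
archive.deposit("Journal of Examples", b"full text of all issues")

try:
    archive.read("Journal of Examples")
except PermissionError as err:
    print(err)  # the title is still dark

archive.record_trigger("Journal of Examples", TriggerEvent.PUBLISHER_CEASED_OPERATIONS)
print(archive.read("Journal of Examples"))
```

A light archive such as LOCKSS would skip the trigger check in `read`; that single check is the difference between the two access models in this sketch.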


The PORTICO digital preservation service, started in 2002 as a JSTOR project, is now part of ITHAKA, a not-for-profit organization that helps the academic community use digital technologies. Portico preserves scholarly content (e-journals, e-books, digitized collections) so that it remains accessible to researchers, scholars, and students in the future. The service was launched in 2005.

Comparison of LOCKSS, CLOCKSS and Portico Accessibility

| Conditions for trigger event | LOCKSS | CLOCKSS | PORTICO |
|---|---|---|---|
| Library cancels its subscription with a publisher and needs access to back issues it had subscribed to | Yes | No | Yes, but if the library discontinues Portico participation, it will no longer get post-cancellation access to content through Portico |
| E-journal and its past issues are no longer available from the publisher | Yes | Yes, but the title would be made openly accessible to all | Yes, the title would be made accessible to all active participants, whether or not they previously subscribed to the content |
| Publisher ceases operation and e-content is no longer available | Yes | Yes, but the title would be made openly accessible to all | Yes, the title would be made accessible to all active participants, whether or not they previously subscribed to the content |
| Natural disaster / catastrophic failure | Yes | Yes, but only if the publisher is unable to provide service for that reason | Yes, but only if the publisher is unable to provide service for that reason |
| Temporary failure of publisher’s operations/servers | Yes | No | No |

Difference between LOCKSS, CLOCKSS and PORTICO

LOCKSS is a real-time backup solution. It provides access to stored content whenever publisher sites are unavailable, even for a brief period of downtime. Portico and CLOCKSS are dark archives that preserve digital content for the long term, and access to CLOCKSS content follows a model similar to Portico's. The difference between the two is that Portico archives digital content in a standard archival format, whereas CLOCKSS preserves content in the publisher's original format.
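The light-archive behaviour of LOCKSS, serving the local copy whenever the publisher cannot be reached, can be sketched as a simple fallback. This is an assumption-laden illustration, not the way a real LOCKSS box is wired into a library's network: the function names, the exception class, and the example URL are hypothetical, and no network access takes place.

```python
from typing import Callable, Dict, Tuple


class PublisherUnavailable(Exception):
    """Raised by the hypothetical publisher fetcher when the site cannot be reached."""


def fetch_with_fallback(
    url: str,
    fetch_from_publisher: Callable[[str], bytes],
    local_box: Dict[str, bytes],
) -> Tuple[bytes, str]:
    """Return (content, source), preferring the publisher over the local copy."""
    try:
        return fetch_from_publisher(url), "publisher"
    except PublisherUnavailable:
        if url in local_box:
            return local_box[url], "local LOCKSS box"
        raise  # nothing preserved locally, so the outage is passed on


def publisher_is_down(url: str) -> bytes:
    """Stand-in for a publisher site during an outage."""
    raise PublisherUnavailable(url)


article_url = "https://publisher.example/journal/vol1/issue1"
preserved = {article_url: b"preserved article text"}

content, source = fetch_with_fallback(article_url, publisher_is_down, preserved)
print(source)  # local LOCKSS box
```

Because the fallback happens on every failed request, even a brief outage is covered, which is what distinguishes this model from a dark archive that waits for a formal trigger event.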

System Approaches of LOCKSS, CLOCKSS & PORTICO

| LOCKSS | CLOCKSS | PORTICO |
|---|---|---|
| Open source software | Open source software | Proprietary software |
| Distributed, peer-to-peer platform with error detection | Distributed, peer-to-peer platform with error detection | Centralized, hosted platform |
| Small workstation required; it can be run from a CD | Specific server hardware required | No equipment required on the client side |
| Light archive (even for short-term preservation) | Dark archive (long-term preservation) | Dark archive (long-term preservation) |
| Preserves content in publisher's original format | Preserves content in publisher's original format | Uses a standard format for preservation |


SELF CHECK EXERCISES


1.      LOCKSS program is related to ___.
A.     Computer virus
B.     Institutional repository
C.     Long term digital preservation
D.     Programming language

2.      Portico digital preservation service is part of __.
A.     Elsevier
B.     McGraw-Hill
C.     ITHAKA
D.     EBSCO

3.      KB e-Depot is a digital information preservation system of __.
A.     France
B.     England
C.     Netherlands
D.     Australia

4.      Which of the following systems is considered ‘a light archive’?
A.     Portico
B.     LOCKSS
C.     CLOCKSS
D.     None of the above

5.      Select the correct combinations:
a)      Portico – Preserves scholarly publications in dark archives
b)      LOCKSS – Non-profit service that allows libraries to collect e-contents for digital preservation
c)      CLOCKSS – Non-profit digital preservation service for participating libraries
d)      DORIA – National preservation project of Finland

A.     c and d are correct
B.     a and c are correct
C.     b and c are correct
D.     a, b, c, and d are correct

6.      Select the correct combinations:
a)      DORIA – Helsinki University Library
b)      LOCKSS – Stanford University Libraries
c)      ESPIDA – University of Glasgow
d)      PANDAS – National Library of Australia

A.     c and d are correct
B.     a and c are correct
C.     b and c are correct
D.     a, b, c, and d are correct


