Devika Madalli.
Digital information is growing and exploding at a rapid rate; it is also available in heterogeneous forms, adding to its complexity. Hardware and software on which digital information is created are continuously changing. This presents a significant challenge in preserving digital resources and making them accessible for future use. There are also new challenges, particularly in the digital environment. When digital documents are added to a digital repository, it is necessary to ascertain that the software and tools lend support to long-term preservation of the digital content. Evaluation criteria from a digital preservation point of view are defined here based on a study undertaken for the purpose. The evaluation against the criteria is executed and reported here by installing selected OSS-DL in a test bed environment.
To organize, store, and retrieve digital content, many libraries as well as archiving centers are using either proprietary or open-source software. While it is accepted [...] techniques, digital media requires continuous processes to keep it compliant with current technology. It is not only [...] accessibility, sustainability, and retrieval across time. This paper presents an important analytical study along with observations [...]
Open-source software is available for free under open-source license terms and conditions where the source code of the software is also available.
The open-source code can be altered for further development, customization, and redistribution. In terms of processing, DROID fully supports auto and manual metadata creation, and partially supports rights management. It generates information only from what is embedded in the object, but not from external repositories or elsewhere.
Description: Dropbox is a free service that lets you bring all your photos, docs, and videos anywhere. This means that any file you save to your Dropbox will automatically save to all your computers, phones and even the Dropbox website. Even if you accidentally spill a latte on your laptop, have no fear! You can relax knowing that Dropbox always has you covered, and none of your stuff will ever be lost. Make sure your partition is mounted with support for extended attributes (xattrs).
Music playback is supported on devices with BlackBerry OS 4. Additional Notes: DROID uses internal signatures to identify and report the specific file format and version of digital files. For individual users, Dropbox provides 2 GB of storage for free, and users can acquire up to a total of 18 GB for free by recommending other users to Dropbox.
In addition to its web interface, Dropbox offers software clients for the major desktop operating systems (Windows, Mac and Linux) and a variety of mobile platforms (Android, iOS, Blackberry, etc.). Through the desktop clients, users are able to create folders on their local machine which, if they wish, they may synchronize with their folders on the cloud. Although Dropbox may in practice be used with some frequency as a tool to back up personal digital materials, it was not designed with the managed, long-term preservation of institutional archives in mind.
Dropbox does not perform a fixity check. Dropbox is more appropriate for personal archival purposes than for institutional repository services. While the services that Dropbox provides seem to offer a more reliable solution for saving data in a more permanent way, they cannot be relied upon for long-term bit preservation, and depending on the file size of what needs to be stored, it may not be a viable solution.
It meets and falls short of the objectives of the OAIS reference model. The DPSP is a collection of software applications which support the goal of digital preservation. System Requirements: You will need to install Java 1. Additional Notes: Auto Metadata Creation is minimal. Description: DSpaceDirect is a hosted service from the DuraSpace non-profit organization that allows users to store, organize, and manage DSpace repository content in the cloud.
DSpaceDirect can be used to preserve and provide access to academic faculty and student papers, projects, and research, making content easily searchable by end users and easily managed by content curators. Last Updated Checked on April 16th, :. DSpaceDirect is a fully hosted software as a service managed by DuraSpace. For customers to access their account, they simply need an Internet connection and an Internet browser. Customer Service: Ten support requests included in annual subscription.
Additional Notes: DSpaceDirect allows customers to get a repository up and running quickly with a feature set that customers can choose to customize. Included in the annual subscription are regular software upgrades as well as a complete integration with DuraCloud that provides an ongoing synchronized backup of all content stored in DSpaceDirect. Through the DSpaceDirect interface, content can be made open access via the web or restricted to specific users or groups of users.
Because the DSpaceDirect service is running the DSpace open source software, customers can choose to move their repository and its contents to another location at any time. Description: The Duke Data Accessioner provides a graphical user interface to aid in migrating data from physical media to a dedicated file server, documenting the process and using MD5 checksums to identify any errors introduced in transfer. The software also offers a way to integrate metadata tools into the migration workflow.
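The checksum step described above can be illustrated with a short sketch. This is not the Duke Data Accessioner's own code; the function names `md5_of` and `transfer_with_check` are hypothetical, and the sketch only assumes that an MD5 digest is taken before the copy and compared with one taken afterwards.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_with_check(source: Path, destination: Path) -> bool:
    """Copy source to destination and report whether the checksums match."""
    before = md5_of(source)            # digest of the file on the original media
    shutil.copy2(source, destination)  # copy contents and basic file metadata
    return md5_of(destination) == before


if __name__ == "__main__":
    # Hypothetical demo using a temporary directory in place of real media.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "original.dat"
        src.write_bytes(b"example payload")
        print(transfer_with_check(src, Path(tmp) / "copy.dat"))
```

A mismatch between the two digests signals that the copy differs from the original and the transfer should be repeated or investigated.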
The tool is primarily designed for use by technical services librarians in small institutions with little or no IT support. A download is also available for Version 0.
Description: DuraCloud is a hosted service from the DuraSpace non-profit organization that provides a centralized interface for organizations interested in using cloud storage as a part of their digital archiving and preservation programs. DuraCloud is integrated with commercial and academic cloud storage providers such as Amazon Web Services and the San Diego Supercomputer Cloud storage center, storing content within these existing infrastructures.
DuraCloud enables users to store multiple copies of their content on multiple clouds, all while keeping each copy synchronized in an ongoing fashion as well as checking the integrity of all content stored on a regularly scheduled basis. System Requirements: Web portal for most users.
Refer to the documentation about building DuraCloud from source for more information about building and running the DuraCloud software. Customer Service: Ten support requests included in annual subscription. Additional Notes: DuraCloud is a web-based hosted service that is integrated with multiple cloud storage providers.
Customers of the service can choose the level of access they provide to their account and content as well as the number of cloud storage providers their account is integrated with. When customers subscribe for more than one cloud storage provider on their account, all of the content is automatically copied and synchronized among all cloud storage providers.
All DuraCloud accounts include automatic, regularly scheduled bit integrity checking of all content stored, as well as storage reports available through the web interface.
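As a rough illustration of what a bit integrity check over several synchronized copies involves, the sketch below compares every copy against a manifest of expected digests. It does not use or represent the DuraCloud API: local directories stand in for cloud storage providers, SHA-256 is an assumed choice of algorithm, and `audit_copies` is a hypothetical name.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Tuple


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file (assumed algorithm)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def audit_copies(manifest: Dict[str, str], stores: List[Path]) -> List[Tuple[str, str, str]]:
    """Return (store, name, problem) for every missing or altered copy.

    manifest maps a file name to its expected digest; each entry in stores
    is a directory standing in for one storage provider.
    """
    problems = []
    for store in stores:
        for name, expected in manifest.items():
            candidate = store / name
            if not candidate.is_file():
                problems.append((str(store), name, "missing"))
            elif sha256_of(candidate) != expected:
                problems.append((str(store), name, "checksum mismatch"))
    return problems
```

Running such an audit on a schedule, and repairing a bad copy from a good one, is the general idea behind keeping multiple copies both synchronized and verified.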
DuraCloud is a Java-based, open-source, cloud-based preservation system created by DuraSpace (the not-for-profit organization responsible for the DSpace and Fedora repository platforms). DuraCloud also includes a set of tools and services that act as a preservation-centered interface for storage shares provided by third-party storage providers, including Amazon S3, Rackspace, etc.
DuraCloud provides an account and open-source software tools that manage these third-party cloud spaces, allowing for automated duplication across vendors, access and ingest, and fixity checking. This provides a buffer against the risks of vendor lock-in, the business cycle, and physical disaster while maintaining user control over the data itself. The user is responsible for tracking item metadata and rights information, and for integrating DuraCloud with their overall system.
While users can provide additional metadata, DuraCloud stores only the small amount of administrative metadata used to identify, retrieve, and authenticate each bit stream. DuraCloud does, however, integrate with systems that provide metadata management and user-facing access to items, including DSpace, Fedora, Archive-It, and DSpaceDirect.
Description: EMET is an image metadata extraction tool intended to facilitate the management and preservation of digital images and their incorporation into external databases and applications. The tool is able to handle single files or recurse through a folder.
Please note this tool is a prototype and adds very specific institutional metadata from The State University of New York at Binghamton, though the code can easily be changed to suit your own needs. This tool is for converting data captured when images and audio files are created; it is not directly about storage or preservation, though the data may assist in those activities. Description: Islandora is an open-source software framework designed to help institutions and organizations and their audiences collaboratively manage and discover digital assets using a best-practices framework.
Additional Notes: Fedora 4 promises to include some additional features. In terms of redundancy, Fedora 3 does support some replication, redundancy, failover, etc. There will be more options in Fedora 4, which offers better support for clustering and policy-driven storage. Fedora 4 will also allow for easier usage of a geographically dispersed data storage model. A self-healing auto-recovery ability will also be available in Fedora 4. Description: FFmpeg is a complete, cross-platform solution to record, convert and stream audio and video.
See the documentation for a complete feature list and the Changelog for recent changes. This is not a tool for preservation, but the resizing may appeal to some people for access purposes. It is designed for simple integration into automated work-flows. It converts signatures into regular expressions and applies them directly. Fido is free, Apache 2. Most importantly, Fido is very fast. This is not directly about storage or preservation though the data generated may assist in those activities.
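The idea of expressing format signatures as regular expressions and applying them directly to a file's bytes can be sketched as follows. This is an illustration only, not Fido's code or its signature set: the table below uses a handful of widely documented magic numbers, and `identify` is a hypothetical function name.

```python
import re
from typing import Optional

# Illustrative signatures only (well-known leading byte sequences);
# these are assumptions for the example, not the signatures Fido ships.
SIGNATURES = [
    ("PDF", re.compile(rb"\A%PDF-\d\.\d")),
    ("PNG", re.compile(rb"\A\x89PNG\r\n\x1a\n")),
    ("GIF", re.compile(rb"\AGIF8[79]a")),
    ("JPEG", re.compile(rb"\A\xff\xd8\xff")),
]


def identify(data: bytes) -> Optional[str]:
    """Return the name of the first signature matching the start of data."""
    for name, pattern in SIGNATURES:
        if pattern.match(data):
            return name
    return None


if __name__ == "__main__":
    print(identify(b"%PDF-1.7\n"))  # matches the PDF pattern
    print(identify(b"plain text"))  # no signature matches
```

Because each signature is compiled once and matched against only the leading bytes, this style of identification is cheap to run inside an automated workflow.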
Description: FITS allows data curators to identify, validate, and extract technical metadata for the objects in their digital repository. It does this by incorporating a range of mostly third-party open source tools, normalizing and consolidating their output. System Requirements: Instructions for command line use are given for Windows and Unix. Additional Notes: Identifies and extracts Technical Metadata only. Software offers a user discussion board. Description: Google Cloud Storage allows users to store, access, and manage their data.
Additional Notes: Customer Service operates as a discussion board for the time being. Customer service representative not able to answer detailed questions about how the storage actually works.
Like many Google products, Google Cloud is ready to use right out of the box. However, some people do not like using Google products because so much of the process is handled by the company, and they require information some may not be comfortable sharing.
Many features of Google Cloud could be implemented with a solid understanding of the Google API, which may or may not require some in-house programming or a significant amount of time spent searching out specific solutions.
Depending on the expertise of the employees in a given small archive, this may or may not be a significant limitation. Still, Google has a lot of support and documentation, therefore it is unlikely to prove overly challenging for those with computer literacy.
Furthermore this availability of resources may prove to be a benefit for the consumers; as ease of use and understandability for the designated community is essential, this bodes well for the compliance of Google Cloud with the OAIS reference model.
One of the biggest issues with Google Cloud is that the exit strategy is apparently not streamlined. For others who want to manage much of this in-house, it may be a less-than ideal solution. Description: Heritrix is an open-source web crawler, allowing users to target websites they wish to include in a collection and to harvest an instance of each site. The software is most often used as a powerful back-end tool incorporated into a web archiving workflow. Heritrix may run on other platforms, but this option is not supported.
JRE 1. Note that Java 6 update 23 and update 24 (and possibly later) cannot be used with Heritrix 3. As of Heritrix 3. Assign more of your available RAM to the heap if you are crawling thousands of hosts or experience Java out-of-memory problems.
See Heritrix Configuration. Additional Notes: Public Interface Archive-IT — Heritrix is installed via a command line interface, but once installed the user can launch a web-based interface for configuration. Setting up a crawl requires a significant number of adjustments. Installation requires solid knowledge of Linux and command line interfaces.
In addition, it may require a new large server to hold crawls. Date Unknown. The amount of memory can be an important factor, especially if you intend to work on large images. ImageMagick is a software tool for creating, editing, and composing bitmap images; it is not a tool for general archival purposes. ImageMagick does have a great deal of documentation available.
Users across all functional entities in the OAIS reference model will be able to find answers to questions about most tasks because the software has a great number of users. This system does not meet the OAIS reference model requirements simply because, while it is a wonderful tool for the creation, conversion, and metadata manipulation of images, it is by no means a robust archival tool for long-term preservation. Description: The Internet Archive is a 501(c)(3) non-profit that was founded to build an Internet library.
Its purposes include offering permanent access for researchers, historians, scholars, people with disabilities, and the general public to historical collections that exist in digital format. Founded in and located in San Francisco, the Archive has been receiving data donations from Alexa Internet and others. Given the importance of source code, open-source software (OSS) development holds great promise for facilitating our efforts to keep digital materials accessible into the future.
Computers don't understand what we want them to do. Well, not directly, anyway. This is because, at their most basic level, computers are just collections of devices that deal with streams of electricity. Through numerous magical tricks, electricity can be broken into tiny little chunks.
Engineers and technicians who develop, evaluate and maintain the physical devices must often deal with aspects of the electricity itself. Most computer professionals, however, can ignore those physical details and instead deal with the chunks, which they treat as signals or bits (binary digits), conveying one of two possible values (1 or 0, true or false, on or off). By processing, sharing and storing combinations of these signals, the devices can perform many different instructions.
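To make the notion of bits concrete, here is a small sketch that shows a piece of text as the sequence of 1s and 0s a computer might store for it, and then recovers the text from those bits. The choice of UTF-8 as the encoding and the function names `to_bits` and `from_bits` are assumptions made for the illustration.

```python
def to_bits(text: str) -> str:
    """Render text as space-separated groups of eight bits (UTF-8 assumed)."""
    return " ".join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Rebuild the text from the space-separated bit groups."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


if __name__ == "__main__":
    encoded = to_bits("Hi")
    print(encoded)             # 01001000 01101001
    print(from_bits(encoded))  # Hi
```

The same two characters could be stored as different bits under another encoding, which is one reason the rules for interpreting bits matter as much as the bits themselves.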
The story can't really end there, though, since not many people are really very good at thinking in terms of those instructions either. What the end user really wants to do is perform some task like write a letter, listen to some music, or (we would hope) find the hours of operation of an archival repository. If doing such things through a computer required an intimate knowledge of how all the bits were being created and processed, it would hardly be worth the trouble.
It would be much more helpful for us to be able to convey our needs to a computer in a way that made sense to us, and then let the computer figure out how to translate that into the appropriate bits. Luckily for us (at least, when it all works), modern computers have numerous components in place to do exactly that.
When I use the mouse to move the pointer over a file name and then double-click on it, for example, an image then appears on my monitor that looks to me like a document. I can make some changes to the document and then save it, allowing someone to view that changed version some time in the future. By anticipating the sorts of tasks people will want to perform, computer engineers have built a complex system that translates my mouse click into the appropriate generation and processing of bits.
This is done partially through hardware (the devices themselves), but mostly through software (combinations of bits that can perform operations on other bits). Since different pieces of hardware and software deal with bits in very different ways, this process actually has to take the form of a huge number of tiny little translations.
In order to make this massive job much more manageable, developers take advantage of an extremely powerful concept known as abstraction. This allows different parts of the problem to be addressed as layers, which then "talk" to each other. If I'm building a component in layer X that has to work with some other component that you're building for layer Y, I don't need to know all the intimate details of how your component works or even everything about how layer Y works.
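The layering idea can be sketched in a few lines. In this hypothetical example, a document-editing layer talks to a storage layer only through a small agreed interface (`read` and `write`), so it never needs to know how the storage layer keeps its bytes. All class and method names are invented for the illustration.

```python
from typing import Dict, Protocol


class Storage(Protocol):
    """The only things the upper layer is allowed to know about storage."""

    def read(self, name: str) -> bytes: ...

    def write(self, name: str, data: bytes) -> None: ...


class InMemoryStorage:
    """One possible lower layer; a disk or network store could replace it."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def read(self, name: str) -> bytes:
        return self._blobs[name]

    def write(self, name: str, data: bytes) -> None:
        self._blobs[name] = data


class DocumentEditor:
    """Upper layer: works with text and delegates the bytes to Storage."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def save(self, name: str, text: str) -> None:
        self._storage.write(name, text.encode("utf-8"))

    def open(self, name: str) -> str:
        return self._storage.read(name).decode("utf-8")


if __name__ == "__main__":
    editor = DocumentEditor(InMemoryStorage())
    editor.save("letter.txt", "Dear reader")
    print(editor.open("letter.txt"))
```

Swapping `InMemoryStorage` for another class with the same two methods would require no change to `DocumentEditor`, which is the practical payoff of abstraction.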