Open Access News

News from the open access movement


Monday, February 23, 2009

More on cloud storage and repositories

Leslie Carr, The Cloud, the Researcher and the Repository, RepositoryMan, February 21, 2009.

There's currently a lot of buzz about DuraSpace, the DSpace and Fedora project to incorporate cloud storage into repositories. ... [I]t sounds like a very positive agenda for repositories in general to adopt. I hope this is a good opportunity to make a few remarks about the work that EPrints is doing that also might make cloud services accessible to repositories and users of repositories. ...

The EPrints team have been working on projects that might help researchers looking to take advantage of the cloud's benefits, without being put off by its lack of home comforts.

We've previously announced that Dave Tarrant has extended EPrints to use cloud storage services as part of JISC's PRESERV2 project. The new EPrints storage controller (debuting in EPrints v3.2) allows the repository to offload the storage of its files to any external service - cloud storage, local storage area networks or even national archiving services. The repository can mix and match these services according to the characteristics of each deposited object - even storing each item in several places for redundancy or performance improvement.
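To make the "mix and match" idea concrete, here is a minimal sketch in Python of a rule-based storage controller. It is not the EPrints implementation. Every name in it (`DepositedObject`, `InMemoryBackend`, `Rule`, `StorageController`, the one-gigabyte threshold) is a hypothetical chosen for illustration, and the backends are in-memory stand-ins for real external services.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DepositedObject:
    """Hypothetical description of a deposited file."""
    name: str
    size_bytes: int          # declared size, used only for routing decisions
    mime_type: str
    data: bytes = b""


class InMemoryBackend:
    """Stand-in for an external store (cloud, SAN, national archive)."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.files: Dict[str, bytes] = {}

    def store(self, obj: DepositedObject) -> None:
        self.files[obj.name] = obj.data


@dataclass
class Rule:
    """If `matches` returns True, the object goes to every listed backend."""
    matches: Callable[[DepositedObject], bool]
    backends: List[InMemoryBackend]


class StorageController:
    """Routes each object to the backends of the first matching rule."""

    def __init__(self, rules: List[Rule], default: List[InMemoryBackend]) -> None:
        self.rules = rules
        self.default = default

    def store(self, obj: DepositedObject) -> List[str]:
        targets = self.default
        for rule in self.rules:
            if rule.matches(obj):
                targets = rule.backends
                break
        # Writing to several backends gives redundancy for the same item.
        for backend in targets:
            backend.store(obj)
        return [backend.label for backend in targets]


# Assumed policy: large files go to two remote stores, the rest stay local.
GIGABYTE = 1024 ** 3
local = InMemoryBackend("local-disk")
cloud = InMemoryBackend("cloud")
archive = InMemoryBackend("national-archive")

controller = StorageController(
    rules=[Rule(lambda o: o.size_bytes >= GIGABYTE, [cloud, archive])],
    default=[local],
)
```

Calling `controller.store(...)` with an object whose declared size is at least one gigabyte writes it to both the cloud and archive stand-ins, while smaller objects fall through to the local default. A real controller would presumably also need to record where each copy lives so it can be retrieved later.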

That tackles the technical part of the problem - how to join up repositories with the cloud - but it doesn't have much to say about how to better engage data-rich users with the cloud (or with the repository, come to that). As part of the JISC KULTUR project, Tim Brody has been looking at the problem of user deposit for lots of large media files. Not petabyte large, but gigabyte large. Even at that scale, the normal web infrastructure fails to deliver a reliable service ...

The solution that Tim has come up with is to allow the researcher's desktop environment to directly use EPrints as a file system - you can 'mount' the repository as a network drive on your Windows/Mac/Linux desktop using services like WebDAV or FTP. As far as the user is concerned, they can just drag and drop a whole bunch of files from their documents folders, home directories or DVD-ROMs onto the repository disk, and EPrints will automatically deposit them into a new entry or entries. Of course, you can also do the reverse - copy documents from the repository back onto your desktop, open them directly in applications, or attach them to an email. ...
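When a desktop mounts a WebDAV share, dragging a file onto it comes down to an HTTP PUT against a URL under the share. The Python sketch below builds such a request using only the standard library. The base URL `https://repository.example.org/dav/`, the function name `build_webdav_put`, and the use of HTTP Basic authentication are all assumptions made for illustration; the post does not say what endpoint or authentication scheme EPrints exposes.

```python
import base64
import urllib.parse
import urllib.request


def build_webdav_put(base_url: str, filename: str, payload: bytes,
                     user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a PUT request that uploads one file.

    base_url is a hypothetical WebDAV collection URL ending in "/".
    """
    # Percent-encode the file name so spaces and similar characters are safe.
    url = base_url + urllib.parse.quote(filename)
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={
            "Content-Type": "application/octet-stream",
            "Authorization": f"Basic {token}",
        },
    )


request = build_webdav_put(
    "https://repository.example.org/dav/",   # assumed endpoint, not a real one
    "field recording.wav",
    b"RIFF....",                             # placeholder bytes, not real audio
    "alice",
    "secret",
)
# To actually upload, a client would call urllib.request.urlopen(request).
```

The request is only constructed here, not sent, so the sketch can be read and run without a server. Whether the repository then creates a new entry per file or per folder is up to the server side, as the post describes.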

Now perhaps if you put the desktop front-end together with the cloud back-end, the repository might be able to offer institutional researchers a realistic path to cloud storage. ... Not naked cloud storage, but storage that is mediated, managed and moderated on the researcher's behalf by the institution ... In other words, a cloud you can depend on! ...

Desktop services have already been built on top of cloud storage - JungleDisk for example is a desktop backup and archiving service, but it still requires the user to have their own cloud account. Hopefully, a repository can take away all the necessity for special accounts, passwords and storage management from the user and provide them with a whole host of extra, valuable services. ...

Update. See also the comments by Wally Grotophorst and Dorothea Salo.