Selection and archiving
If you have created data which may be useful to others (or which you yourself may wish to access again in the future), consider depositing it in an appropriate data repository or archive.
Repositories generally offer a range of options: data may be made openly available, or embargoed for a fixed period (to permit publication of results, for example), or access may be restricted permanently.
Preservation means the storage of a project’s digital outputs in such a way that they remain usable, understandable, and accessible – beyond the end of funding, and ideally for the long term. In practice, therefore, preservation is often achieved by depositing the digital material in an archive/repository during the project, or shortly afterwards. Often, charges made by the archive for preparing and ingesting the data can be directly costed into your grant application (NB. such charges usually need to be paid within the lifetime of the project and not after it has finished).
Before archiving data, you should carefully consider what is and is not important to keep.
We advise you to use our institutional data repository DataSTORRE. It is an online digital repository of multi-disciplinary research datasets produced at the University of Stirling.
University of Stirling researchers who have produced research data associated with an existing or forthcoming publication, or which has potential use for other researchers, are invited to upload their dataset for sharing and safekeeping. A persistent identifier and suggested citation will be provided.
Selection
Not all data needs to be kept beyond the lifetime of a project; indeed in many cases it would be impractical to keep everything:
- Archived data must be discoverable: if no-one can find it, no-one will use it
- Archived data must be usable: if finding the few useful bits from a huge dataset is like looking for a needle in a haystack, it won't be used at all
- Storage space is expensive: archived data must be robustly backed up, which can more than double the cost.
More information
How to appraise and select research data for curation (Digital Curation Centre)
Curation reference manual (Digital Curation Centre). The Curation Reference Manual contains advice, in-depth information and criticism on current digital curation techniques and best practice. The Manual is an ongoing, community-driven project in which members of the DCC community suggest topics, author manual instalments and conduct peer reviews.
JISC have written a report with guidance on what research data to keep, where and how.
Archiving
A long-term archive of research data can have a number of benefits:
- It can be used to demonstrate compliance with national information access legislation, e.g. Freedom of Information Act 2000, Data Protection Act 2018, Environmental Information Regulations 2004, etc., and other funding body and sponsor requirements.
- It secures the ongoing accuracy, authenticity, reliability, integrity and completeness of research data by safeguarding it against loss, deterioration, unauthorised or inappropriate access, obsolescence and future incompatibility.
- It facilitates a consistency of approach which adds value to the University's overall research profile, saving effort and resources over time and enabling future sharing of research data.
- It increases the visibility of institutional research over time by providing robust evidence of past, current and ongoing University research activity, broadening, deepening and supporting its long-term impact.
For more information about the benefits of archiving data, see Why Deposit Data? from the UK Data Archive.
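One practical way to evidence the integrity and completeness mentioned in the list of benefits is to record a checksum for every file before deposit and re-check it later. The following is a minimal sketch only, not a description of how DataSTORRE or any other archive works: the function names (`sha256_of`, `build_manifest`, `changed_files`) are hypothetical, and the choice of SHA-256 is an assumption.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files from the manifest that are now missing or altered."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

A manifest built at deposit time can be stored alongside the dataset; running `changed_files` against it at a later date then flags any file that has been lost or has changed in the meantime.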
Data Retention:
The University of Stirling requires that research data is securely preserved in an appropriate format for a minimum of 10 years, or longer if specified by the funder. The 10-year period should run from the date of any publication that is based on the data or the date on which the data was last requested and accessed by a third party. Data should be deleted or destroyed with particular concern for confidentiality and security, and in accordance with research funder requirements.
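The retention rule can be sketched as a small date calculation. This is an illustration under stated assumptions, not an official tool: the policy text does not say which date takes precedence, so the sketch assumes the period runs from whichever of the two dates is later. The function names and the handling of 29 February are also assumptions.

```python
from datetime import date
from typing import Optional

RETENTION_YEARS = 10  # University minimum; a funder may specify longer


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def earliest_disposal_date(
    published: date,
    last_accessed: Optional[date] = None,
    years: int = RETENTION_YEARS,
) -> date:
    """Earliest date on which the data could be considered for disposal."""
    # Assumption: the clock runs from the later of the two events.
    start = max(published, last_accessed) if last_accessed else published
    # A funder period shorter than the University minimum is ignored.
    return add_years(start, max(years, RETENTION_YEARS))
```

For example, `earliest_disposal_date(date(2020, 3, 15))` gives 15 March 2030, and passing a later `last_accessed` date or a longer funder period pushes that date back accordingly.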
Other data repositories and archives
Whether you want to find data to reuse for your research, or archive your own data for the long-term, you'll need to know what data archive services are available in your field.
Maintaining an archive of data in the medium to long-term is a non-trivial activity, so it's generally best to deposit data with a dedicated archive which is set up to properly curate and look after data.
As well as our own institutional data repository DataSTORRE there are many other services available, depending on your research area, and the list is growing all the time. Here are some links to places you can search for data across a number of archives, and places you can find lists of specific archives to search or deposit in.
More information
Directories of archives
- re3data.org Registry of Research Data Repositories. You can filter the full list by subject discipline, content type or country, etc.
- OpenDOAR is an authoritative directory of academic open access repositories.
General purpose data sharing
Licensing data
When sharing data, it is important to consider how you want your data to be reused. You can then apply the licence that most closely matches those intended uses. Applying an explicit licence removes any ambiguity over what users can and cannot do with your data.
Lawyers can craft licences to meet specific criteria, but there are a number of open licences developed for widespread use on the internet that anyone can apply. Different types of subject matter call for different licences, because copyright law applies to them differently. Creative Commons (CC) licences were designed for 'generic' digital content and so are not always well suited to specific types of subject matter that attract different intellectual property rights. Indeed, Creative Commons themselves have recommended against using their licences (other than CC Zero - CC0, or 'no rights reserved') for data and databases.
The Open Knowledge Foundation's (OKF) definition of 'open knowledge' says that knowledge is open if 'one is free to use, reuse, and redistribute it without legal, social or technological restriction.'
Similarly, the Panton Principles for Open Data in Science state that 'for science to effectively function, and for society to reap the full benefits from scientific endeavours, it is crucial that science data be made open.'
For further information see
- The DCC's guide on How to License Research Data. This guide will help you decide how to apply a licence to your research data, and which licence would be most suitable. It should provide you with an awareness of why licensing data is important, the impact licences have on future research, and the potential pitfalls to avoid. It concentrates on the UK context, though some aspects apply internationally; it does not, however, provide legal advice. The guide should interest both the principal investigators and researchers responsible for the data, and those who provide access to them through a data centre, repository or archive.
- Open Data Commons - Making your Data Open
Disposal of research data
Once you have appraised and selected the data that needs to be kept beyond the lifetime of the project it is important to consider how the remaining data will be deleted.
Electronically stored data:
Optical media (CDs or DVDs): it is recommended that these are physically destroyed using a suitable shredder – please check the shredder's suitability before use.
Network File Store – it is sufficient that relevant files/datasets are deleted from the file store. Ultimate destruction of the data, or of the physical media on which it resides, will be performed by Information Services when the hardware reaches end of life.
Magnetic, solid state or disc based media:
USB sticks – given the way in which data is stored on these devices it is difficult to completely delete the data and these should be physically destroyed.
Disc based media - it is not sufficient to simply delete the information stored on these discs, as the files will still exist in some form. Data held on these devices should be destroyed through multiple overwriting using software such as DBAN.
Paper files:
Paper files identified as sensitive or confidential and which are not to be retained should be destroyed carefully. The University recommends the use of a cross cut shredder to dispose of these files. The shredded paper should then be disposed of in recycling sacks which are removed by estates and campus services staff.
Extra considerations:
Care should be taken to ensure that data is removed from all electronic devices, including cameras and voice recorders, even where this data has been stored in an encrypted format.
Where funders require a certificate of destruction to be completed (e.g. NHS Health Scotland) please ensure that the certificate is completed and returned to the funder in a timely manner. (NHS Health Scotland require that personal and project data must be securely destroyed three years after the Project Completion date.)