Creating a Data Management Plan
The concept of research data (also known as research materials or assets in the humanities) is relevant to all researchers, and good data management planning is essential for promoting high-quality research data across the university.
An introductory webinar on Data Management Plans was delivered through MS Teams to highlight the resources available to Ulster researchers to help with data management planning. During the webinar, the DMP was placed in the context of the FAIR data principles, and top tips for preparing a DMP were highlighted. Further information on the webinar can be found at RDM support.
Make plans for your data before you start collecting it
As part of a research grant application, you need to plan for your research data before you start collecting it. Most funders now require a DMP, and the University also requires one under its Research Data Management Policy.
What is a DMP?
A Data Management Plan (DMP) is a written document which describes the data that will be collected or generated during a research project and sets out a detailed plan for how the data will be managed throughout the project and what will happen to it after the project completes.
Creating a DMP is not simply a box-ticking exercise. It is a document which, when created properly, will help researchers understand how best to ensure data is findable, accessible, interoperable, and reusable throughout, and beyond, the project lifecycle.
Why do I need one?
Most of the major UK funders require applicants to include a DMP with their application for funding.
Maintaining a DMP for your research project is also good research practice. It can help you manage and look after your data throughout the lifetime of your research project and should be thought of as a "living" document that can be continually revised and updated to reflect your changing data management needs.
What are the benefits of a DMP?
- Helps to avoid or manage risk (e.g. of data loss; of accidental or malicious disclosure of sensitive or confidential data);
- Identifies in advance any extra costs and resources needed for carrying out data management activities;
- Identifies tasks and responsibilities that need to be planned for in advance (e.g. managing ethical and legal obligations);
- Makes it easier for you to find and understand your data when you need to use it;
- Saves time and resources.
DMP Templates
DMPonline is a web-based tool which allows users to create DMPs using customised templates and guidance from selected funders and institutions. Researchers can employ several features in DMPonline to create, edit and share their DMPs with colleagues.
You will need to create an account and link it to your University credentials if you have not already done so.
The Digital Curation Centre (DCC) have also produced a very useful DMPonline screencast video that gives instructions on how to register and use the tool.
Access funder-specific templates for your Data Management Plan (DMP)
Some funders mandate the use of DMPonline, while others point to it as a useful option. You can download funder templates without logging in, but when you log in, the tool provides tailored guidance and example answers from the Digital Curation Centre (DCC) and many research organisations.
The generic DMP template covers the following areas:
Introduction and Context
Basic project information
- Name of project
- Project ID
- Grant number (may be useful to populate this when reviewing your DMP post-award)
- PI
- Researcher ID (e.g. ORCID)
- Data manager/contact
- School/research group
- Partner Organisations
- Funding bodies
- University user name
- Telephone
- Email address
- Project start date (approximate)
- Duration of project
- Date of DMP creation
Data Collection
Use of existing data
- Have you reviewed existing data, in the University and from third parties, to confirm that new data creation is necessary?
- Name of existing dataset(s)
- Name of contact(s)/responsibility for data
- Location
- Contents
- Brief description
- Estimated size
- Any licence issues
- Comments (any additional information, e.g. use or restrictions)
Creation or capture of new data
- How will you create/capture new data?
- What types of data will be created and how will they be created?
- Observational: sensor data, survey data, sample data, neuroimages
- Experimental: gene sequences, chromatograms, toroid magnetic field data
- Simulation: climate models, economic models
- Derived or compiled: text and data mining, compiled database, 3D models
- Reference: gene sequence databanks, chemical structures, spatial data portals
- What file formats will you use for each type of collection and why?
- Do these formats and software enable sharing and long-term access to the data?
- Are there any tools or software needed to create/process/visualize this data?
- How will you structure and name your files?
Organisation of the data
- How will you handle versioning?
- What quality assurance procedures will you adopt?
Any other notes or comments on data collection
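As an illustration only, the questions above about naming files and handling versions could be answered with a simple, consistent convention. The Python sketch below builds names from a project label, a short description, a date and a two-digit version number. The `build_filename` function, its parameters and the naming pattern are hypothetical examples, not a University or funder requirement; adapt them to the conventions of your discipline.

```python
import re
from datetime import date


def build_filename(project, description, created, version, extension):
    """Build a file name of the form project_description_YYYY-MM-DD_vNN.ext.

    The pattern is a hypothetical convention chosen for this example.
    """
    def slug(text):
        # Lowercase the text and replace runs of other characters with hyphens
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{slug(project)}_{slug(description)}_"
        f"{created.isoformat()}_v{version:02d}.{extension.lstrip('.')}"
    )


if __name__ == "__main__":
    print(build_filename("DMP Demo", "Interview transcripts", date(2024, 3, 1), 2, ".csv"))
```

Putting the date in ISO format (year first) means files sort chronologically in a folder listing, and a zero-padded version number keeps v02 ahead of v10.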
Documentation and Metadata
Contextual information
- Is the data you will be capturing/creating self-explanatory or understandable in isolation?
- If not, what contextual details will be needed to make your data meaningful?
- How will you produce/capture this contextual information, and in what format?
Documentation
- For each type of data/material you produce what metadata will you need?
- Can you automate the creation of this metadata and if so, how?
- What metadata standards will you use, and why have you chosen these standards and approaches for metadata and contextual documentation?
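Some basic technical metadata can often be captured automatically. The Python sketch below is a minimal example, assuming that a small JSON "sidecar" file stored next to each data file is acceptable for your project. The functions `describe_file` and `write_sidecar` and the field names are hypothetical and do not follow any particular metadata standard; a real project would map them to the standard named in its DMP.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def describe_file(path):
    """Collect basic technical metadata for one file (hypothetical field names)."""
    path = Path(path)
    info = path.stat()
    return {
        "file_name": path.name,
        "format": path.suffix.lstrip(".").lower(),
        "size_bytes": info.st_size,
        # Last modification time, recorded in UTC in ISO 8601 format
        "last_modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
    }


def write_sidecar(path):
    """Write the metadata as JSON next to the data file and return the sidecar path."""
    path = Path(path)
    sidecar = path.with_name(path.name + ".metadata.json")
    sidecar.write_text(json.dumps(describe_file(path), indent=2), encoding="utf-8")
    return sidecar
```

Descriptive and contextual metadata (who collected the data, how, and why) still has to be written by the research team; a script like this only records what can be read from the file itself.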
Ethics and Legal Compliance
Ethical issues
- Are there ethical and privacy issues related to the data?
- If yes, list these issues and how you will deal with them; for example:
- Anonymisation of personal data;
- Retention or destruction of personal data.
Legal issues
- Who owns the copyright and Intellectual Property Rights (IPR) to the data?
- If more than one person owns the IPR, what agreement do you have on how this is to be handled?
- Freedom of Information (FoI) requests
- How will the data be licensed for reuse?
- Are there any restrictions on the reuse of third-party data?
- Will data sharing be postponed/restricted; e.g. to publish or seek patents?
Commercial issues
- Does the data have any unrealised commercial value which needs to be discussed with the Knowledge Exchange Manager - KTP?
Storage, Backup and Security during the Project
Anticipated data volumes
- How much data/associated materials in electronic form do you anticipate you will collect?
- How much data/associated materials in paper form do you anticipate you will collect?
Data storage
- Where do you intend to store the data during the project, and why?
- Whose responsibility is the storage of the data?
Data backup
- How will you back up the data?
- How regularly will backups be made?
- Who will be responsible for making backups?
Data security
- How will you ensure the security of your data, including any personal or sensitive data?
- How will you ensure that collaborators can access your data securely?
- If you are collecting data in the field, how will you ensure its safe transfer into your main secure systems?
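One common way to check that a backup or a file transferred from the field is identical to the original is to compare checksums. The Python sketch below is a minimal example of that idea using SHA-256 from the standard library. The function names `sha256_of` and `copies_match` are hypothetical, and the sketch does not replace the backup or transfer services provided by your institution.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so that large files do not have to fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copy):
    """Return True if both files have identical contents."""
    return sha256_of(original) == sha256_of(copy)
```

Recording the checksum alongside each file at the time of collection also makes it possible to detect accidental changes or corruption later in the project.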
Data Sharing, Access and Long-Term Preservation
Sharing data
- When will you make the data available?
- Who do you think may use this data in the future, and for what purposes?
- Who will ensure that data will be deposited into a suitable repository at the end of the project?
- How will potential users find out about your data?
- How will you pursue getting a persistent identifier (DOI) for your data?
Restrictions on sharing data
- Will there be any limits/restrictions on how people can use your data?
- What action will you take to overcome or minimise restrictions?
- For how long do you need exclusive use of the data, and why?
- Will a data sharing agreement be required?
Selecting data to keep
- What data must be retained/destroyed for contractual, legal or regulatory purposes?
- How will you decide which other data to keep?
Preserving data
- What work is required to prepare the files, so they are suitable for preservation, and have you costed in the time to do this?
- Where will you keep the data that is retained?
- If you are responsible for the long-term storage, how will you ensure it is preserved?
- Have you costed in time and effort to prepare the data for sharing/preservation?
- Are there costs that your chosen repository charges for preparing and storing the data long-term? If so, have you included these in your grant application's direct costs?
- How will you destroy data that won't be preserved?
Roles and Responsibilities
- Who is responsible for implementing the DMP, and ensuring it is frequently reviewed and revised where necessary?
- Who will be responsible for each data management activity (including names and/or roles)?
- Will (and how will) responsibilities be split across partner sites in collaborative research projects?
- Will data ownership and responsibilities for RDM be part of any consortium agreement or contract agreed between partners?
- Who will ensure that any published research papers include a short statement on how the underlying research data may be accessed?
Costing RDM
What do I need to consider when costing Research Data Management?
- Be aware of the types of costs that you should consider in your grant application
- Identify what your funder will or will not fund with respect to data management
- How much does storing my data cost?
View further information on the costing of research data management in writing a research data management plan.
Funder Requirements
Most funders of academic research now require researchers to comply with certain expectations as a condition of the award.
These expectations include but are not limited to:
- The creation of a data management plan;
- Attention to the security of active data;
- The citing of deposited datasets using persistent identifiers, normally a DOI;
- The deposit of completed data in an appropriate open access data repository for an agreed period.
Further Information
A series of funders' data policy statements are outlined below. You should always check your specific funder for updated policies.
AHRC
AHRC do not have their own data policy, and instead share the UKRI common principles on data sharing. The new AHRC Funding Guide was published in July 2021.
'Grant Holders in all areas must make any significant electronic resources or datasets created as a result of research funded by the Council available in an accessible and appropriate depository for at least three years after the end of their grant. The choice of depository should be appropriate to the nature of the project and accessible to the targeted audiences for the material produced' (p98). The AHRC requires Data Management Plans to be submitted with grant applications and outlines the points that applicants should address (p56).
BBSRC
BBSRC provide an overview of the data management plan and a detailed data sharing policy.
Adherence to the data management plan will be monitored and built into the Final Report score, which may be taken into account for future proposals.
Research data that supports publications must be stored for 10 years.
Grant holders are requested to capture and record data sharing activities, including details of where and how data have been shared, in the appropriate places on ResearchFish.
Cambridge University have discussed BBSRC policy directly with Michael Ball from the BBSRC. The discussion and resulting clarifications of the BBSRC policy are published here.
Cancer Research UK
Cancer Research UK requires that applicants applying for funding provide a data management and sharing plan as part of their application.
Any applicants who consider that the data arising from their proposal will not be suitable for sharing must provide clear reasons for not making it available.
Investigators carrying out research involving human participants must ensure that consent for data sharing is obtained from participants; research data should be anonymised prior to sharing.
Research data should be available for sharing for a minimum period of five years from the end of a research grant.
CRUK have also issued a list of FAQs on data sharing.
EC Horizon 2020
Since 2017, all Horizon 2020 projects have been part of the Open Research Data Pilot by default. The Principal Investigator must:
- Develop a DMP in the first 6 months of the project and keep it up-to-date throughout their project;
- Deposit their research data in a suitable research data repository;
- Make sure third parties can freely access, mine, exploit, reproduce and disseminate their data;
- Make clear what tools will be needed to use the raw data to validate research results or provide the tools themselves.
The H2020 Online Manual discusses both open access and data management requirements.
ERC
Open science requirements are embedded in an ERC grant agreement and they depend on the Framework Programme (and in some cases, the ERC call) under which funding was obtained.
The ERC step-by-step guide outlines how you can meet requirements on projects generating research data.
EPSRC
EPSRC has clear research data expectations of organisations in receipt of EPSRC research funding.
The University of Cambridge have prepared separate, dedicated guidelines to help achieve compliance with the EPSRC expectations.
Additionally, a useful list of FAQs has been developed in consultation with researchers at the University of Cambridge, and with Ben Ryan from the EPSRC.
UKRI
UKRI is one of the four organisations that wrote and published the UK Concordat on Open Research Data. The Concordat states that researchers should, wherever possible, make their research data open and usable within a short and well-defined period (based on disciplinary norms). Additionally, the Concordat asks for research data supporting publications to be accessible by the publication date.
"UKRI will be considering how to reward open data as part of the future REF assessments" (p. 40)
UKRI's set of seven common principles outlines its expectations on research data. You can also read the Guidance on best practice in the management of research data.
UKRI's New Open Access Policy requires all research publications arising from UKRI funding bodies to include a data access statement on how the supporting data and any other relevant research materials can be accessed.
Leverhulme Trust
The Leverhulme Trust currently has no dedicated research data policy in place.
MRC
MRC's data sharing policy gives an overview of the MRC principles of data sharing for all MRC-funded research.
The MRC expects valuable data arising from MRC funded research to be made available to the scientific community with as few restrictions as possible to maximise the value of the data for research and for eventual patient and public benefit. Such data must be shared in a timely and responsible manner.
Grant holders shall review and update their data management plans annually. MRC also provides detailed guidance on dealing with personal data in medical research.
Applicants are also expected to submit a Data Management Plan together with the grant proposal.
NERC
NERC have a well-established data policy setting the ground rules for managing data that applies to all those funded by NERC.
All applications for NERC funding need to include a one-page Outline Data Management Plan (ODMP). A fuller Data Management Plan must be provided to NERC within three months of the project’s starting date.
Data needs to be deposited into a NERC data centre within 2 years of collection.
NIH
Under NIH data sharing policies, investigators are encouraged to maximise the appropriate sharing of scientific data.
The Data Management and Sharing Policy Overview outlines what is expected of investigators and institutions under the 2003 NIH Data Sharing Policy and the 2023 NIH Data Management & Sharing Policy.
A series of Frequently Asked Questions is also provided.
NIHR
The NIHR position on the sharing of research data strongly supports the sharing of data in the most appropriate way.
To enable research data to be discoverable, and effectively and ethically re-used, NIHR encourage researchers' data to be deposited in an appropriate repository where possible.
However, the minimum requirements for the research teams are:
- Data sharing statements must be included when publishing the findings of the research describing how to access the underpinning research data.
- Data management and access plans are currently in the process of being introduced across all NIHR funding programmes. These must be completed during the startup of the research and will be available on the NIHR Funding and Awards website. NIHR will monitor the submission and implementation of data management and access plans.
The Royal Society
The Royal Society have a comprehensive Open Data Policy which outlines expectations regarding depositing data, licensing, citing datasets and data accessibility.
STFC
The STFC Scientific Data Policy applies to all scientific data produced as a result of STFC funding. The policy outlines a series of principles that should be followed and then recommendations for good practice.
The Data Management Plan attachment is mandatory for most STFC schemes. STFC also provides peer review guidance for data management plans.
Wellcome Trust
Wellcome's Data, Software and Materials Management and Sharing Policy outlines how researchers should manage and share the data, software and materials that arise from Wellcome-funded research, and covers outputs management plans.
An outputs management plan must be submitted as part of the grant application process, and the Wellcome Trust provides guidance on completing one.