Many funding agencies, especially federal agencies, require Principal Investigators (PIs) to share their project's research data and to provide a data management plan (DMP) at the time of application or prior to award. The National Science Foundation (NSF) began requiring DMPs in all NSF grant proposals in 2011 and the National Institutes of Health (NIH) will be rolling out additional DMP requirements in the near future.
Raynor Library has put together a set of resources to help you understand, plan for, and implement data management plans for your research. Additionally, Marquette University has registered as a participating institution with DMPTool. This tool enables users to log in with their MU credentials, receive MU-specific information and guidance as it becomes available, and save the DMPs they create.
Writing the Plan
Writing a DMP helps researchers formalize the data management process and identify weaknesses in their plan, and it gives them a record of what they intend to do. We recommend using the DMPTool to create your data management plan. This resource can help save time, protect your data investment, and increase your research efficiency. It also helps you think ahead about the data you will collect and how you will manage it, thus strengthening your research design.
The DMPTool allows researchers to:
Beyond the DMPTool, libraries and data centers have drafted guides to help researchers write and implement their data management plans. We have put together an annotated "guide to the guides" to help you locate the most relevant advice for your own research needs. This list will be continually updated.
While ICPSR is a social science organization, the data management framework they have developed is valuable across disciplines. The framework describes what ICPSR has determined to be the key elements of a good data management plan, the relative importance of each element, and the rationale for including this information in your plan, along with examples.
The Data Conservancy is an initiative based at Johns Hopkins University and aimed at developing "data curation infrastructure for cross disciplinary discovery of observational data".
Documenting Data
An important step toward making your data useful both to you and other researchers is to develop a framework for documenting and describing your data and the context in which it was created. The Pennsylvania State University Libraries suggest that data documentation might include the following:
(From "Data Management Planning at Penn State Libraries: Documenting Data")
Metadata is the data used to describe your data. This makes it easier to store and locate your data, and makes it much easier for future researchers to use your data. A number of metadata schemas exist to help you organize and structure your data description. Metadata schemas can be viewed at the JISC Digital Media website.
The Libraries at MIT have put together a guide to the most basic elements to document, regardless of discipline. These include Title, Creator, Identifier, Subject, Funders, Rights, Access information, Language, Dates, Location, Methodology, Data processing, Sources, List of file names, File Formats, File structure, Variable list, Code lists, Versions, and Checksums. For more detail, see the MIT Libraries' guide to metadata for data management.
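As an illustration only, a few of the elements listed above (title, creator, list of file names, file formats, and checksums) could be collected into a simple record stored alongside the data. The sketch below is not part of the MIT guide; the function names `sha256_of` and `build_record`, the dictionary keys, and the choice of SHA-256 are all assumptions made for this example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_record(data_dir: Path, title: str, creator: str) -> dict:
    """Collect a handful of documentation elements for the files in data_dir.

    The keys are hypothetical; a real project would follow the metadata
    schema recommended for its discipline.
    """
    files = sorted(p for p in data_dir.iterdir() if p.is_file())
    return {
        "title": title,
        "creator": creator,
        "file_names": [p.name for p in files],
        "file_formats": sorted({p.suffix.lstrip(".") for p in files}),
        "checksums": {p.name: sha256_of(p) for p in files},
    }


if __name__ == "__main__":
    # Demonstrate on a throwaway directory so the example is self-contained.
    with tempfile.TemporaryDirectory() as tmp:
        data_dir = Path(tmp)
        (data_dir / "survey.csv").write_text("id,score\n1,42\n")
        record = build_record(data_dir, "Example survey", "A. Researcher")
        print(json.dumps(record, indent=2))
```

Recomputing the checksums later and comparing them with the stored values is one way to confirm that files have not changed since they were documented.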
For a more comprehensive overview of metadata in general, NISO distinguishes between three types of metadata: descriptive, structural, and administrative. Descriptive metadata is the information used to search for and locate an object, such as title, author, subjects, keywords, and publisher; structural metadata describes how the components of the object are organized; and administrative metadata refers to technical information, including file type. Two sub-types of administrative metadata are rights management metadata and preservation metadata.
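One possible way to picture the three NISO types is as separate groups of fields within a single record. The sketch below is only an illustration: the class names, field names, and sample values are hypothetical and are not defined by NISO.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class DescriptiveMetadata:
    """Information used to search for and locate the object."""
    title: str
    author: str
    subjects: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)
    publisher: str = ""


@dataclass
class StructuralMetadata:
    """How the components of the object are organized."""
    components: List[str] = field(default_factory=list)


@dataclass
class AdministrativeMetadata:
    """Technical information, plus the two sub-types named in the text."""
    file_type: str
    rights: str = ""        # rights management metadata
    preservation: str = ""  # preservation metadata


@dataclass
class MetadataRecord:
    descriptive: DescriptiveMetadata
    structural: StructuralMetadata
    administrative: AdministrativeMetadata


# Sample values are invented purely for illustration.
record = MetadataRecord(
    descriptive=DescriptiveMetadata(
        title="Example survey", author="A. Researcher", keywords=["survey"]
    ),
    structural=StructuralMetadata(components=["survey.csv", "codebook.txt"]),
    administrative=AdministrativeMetadata(
        file_type="text/csv", rights="Restricted until publication"
    ),
)

if __name__ == "__main__":
    print(asdict(record))
```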
Source: NISO. Understanding Metadata. NISO Press. ISBN 1-880124-62-9. Accessed: 6 March 2011.
Management Tools
The methodology you choose for managing your data will vary depending on the collection method, nature of the data, and the types of analyses to be applied. Some more common methods for managing data are databases, spreadsheets, data management tools, and standard file systems. The lists below summarize the benefits of each approach and provide links to further resources on the Web.
For more information on the basic organization of your data files, see the MIT Libraries guide to "Organizing Files".
Storing/Securing Data
Where data is stored and backed up may depend on funding considerations, collection processes, the need for encryption or increased security, and available resources. Data storage locations may include one or more of the following options: an internal or external hard drive on a personal computer, a departmental or university server, Marquette-approved cloud storage (OneDrive/SharePoint), or other cloud storage such as Amazon S3. Subject archives and data repositories, such as GenBank, may also be an option, depending on your discipline, the nature of your data, funding guidelines, and other issues. See the "Sharing Data" tab for more information on external data repositories.
Know the implications of working with confidential, sensitive, or proprietary data. Restrictions on the ownership or sharing of student, patient, or other personal data may be governed by federal HIPAA or FERPA guidelines. Marquette's Office of Research and Sponsored Programs can help researchers working with sensitive data.
Sharing Data
NSF guidelines require grantees to detail how they will disseminate and share their research results: "Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants." NSF Award and Administration Guide, Chapter VI.D.4
Why is sharing data important?
Data sharing is essential for expedited translation of research results into knowledge, products and procedures to improve human health.
From NIH Data Sharing Policy, 2003
Many other academic and governmental groups have released their own statements on the importance of data sharing:
Some data may not be shared, based on policies from funding agencies or other relevant bodies. One example is the HIPAA (Health Insurance Portability and Accountability Act) Privacy Rule, which protects all "individually identifiable health information" derived from health care records and requires specification of data handling responsibilities. Marquette's Office of Research and Sponsored Programs can help researchers working with sensitive data.
Some issues you may want to consider:
Consider the following options:
The Distributed Data Curation Center (D2C2) at Purdue University Libraries has put together Databib, a listing of data repositories where researchers may be able to deposit and share their research data. The list is both browsable and searchable.