Research data may not be copyrightable in the U.S. However, certain intellectual property policies and licenses may inform how you and others can share and use data. Addressing copyright and selecting the right license are important data management practices that ensure your shared data is being used as intended and you are being properly attributed for its creation. By proactively applying a clear license, you empower the research community to use your work confidently, increasing the impact of your efforts and upholding the principles of open science.
A license is a legal document that specifies what others are permitted to do with your data. By applying a clear license, you remove ambiguity and encourage proper reuse. The best practice for academic data is to use an open, standardized license.
Creative Commons (CC) licenses are the global standard for sharing academic and creative works, including datasets.
License | What it Means | Best For |
---|---|---|
CC0 (Public Domain Dedication) | No Rights Reserved. Anyone can use, modify, and distribute the data for any purpose without restriction. You waive all your copyright claims. | Maximizing the reuse and interoperability of factual data. This is a highly recommended standard in the open science community. |
CC BY (Attribution) | Give Credit. Others can use, share, and adapt your data for any purpose (even commercially), as long as they give you credit (cite you). | The most common license for academic sharing. It ensures you receive credit while allowing for broad reuse. This is the default best choice if you are unsure. |
CC BY-SA (ShareAlike) | Give Credit & Share Alike. Others must credit you and make any derivative works they create available under the same CC BY-SA license. | Promoting a cycle of open sharing, often used in community-driven projects like Wikipedia. |
CC BY-NC (NonCommercial) | Give Credit, No Commercial Use. Restricts reuse to non-commercial purposes only. | Situations where you want to prevent your data from being used in for-profit products. Be aware this can limit valuable reuse by startups or in industry research. |
License to Avoid for Data: The NoDerivatives (ND) license should generally be avoided for datasets, as it prevents others from adapting, remixing, or performing analysis on your data, which is often the primary point of sharing it.
You apply a license at the point of sharing or publication.
Prepare Your Data and Documentation: Finalize your dataset and create a README.txt file. This is the perfect place to state the license you've chosen.
Your license is also applied when you deposit your data in a repository.
Deposit Data in a Repository: Choose a repository that supports long-term access and formal metadata, as this is where you will officially select your license. You are required to choose a license before you can complete your submission and receive a DOI.
The Repository Applies the License: Once you've selected a license in the repository's interface, it will be displayed on the landing page for your dataset, clearly informing users of the terms of use.