Best Practices: Preserve Your Data

So you’re finished with your research project. Now what? To make sure your data remains accessible and usable in the future, you need a plan for preserving it. Sharing data in a repository is not the same as preserving it!

Things to Consider

  • What files do you need to keep long-term?
  • What file formats are your data saved in?
  • Where will the data be stored long-term?
  • When and how will you back up your data?

Always keep a copy of your raw data as it was before processing. If you convert data to another format, keep a copy in the original format as well.
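One way to act on this advice is to copy each raw file into a separate archive folder before any processing and to record a checksum so the copy can be verified later. The sketch below is only an illustration, not a prescribed workflow: the function names `sha256_of` and `preserve_raw_copy`, the archive folder layout, and the `.sha256` sidecar file are all assumptions made for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def preserve_raw_copy(source: Path, archive_dir: Path) -> Path:
    """Copy an unprocessed file into archive_dir and verify the copy.

    Hypothetical helper: the folder layout and the sidecar checksum
    file are assumptions of this sketch.
    """
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / source.name
    if target.exists():
        # Never silently replace an existing raw copy.
        raise FileExistsError(f"{target} already exists; refusing to overwrite")

    shutil.copy2(source, target)  # copy2 also tries to keep timestamps

    source_digest = sha256_of(source)
    target_digest = sha256_of(target)
    if source_digest != target_digest:
        raise IOError(f"Checksum mismatch after copying {source}")

    # Store the checksum next to the copy so it can be re-checked later.
    sidecar = target.with_name(target.name + ".sha256")
    sidecar.write_text(f"{target_digest}  {target.name}\n", encoding="utf-8")
    return target
```

Calling `preserve_raw_copy(Path("readings.dat"), Path("raw_archive"))` would leave an untouched copy and its checksum in `raw_archive`; both file names here are placeholders. Re-running `sha256_of` on the archived copy at a later date and comparing it with the recorded value is a simple check that the file has not changed.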

Preservation Formats

Saving your data in proprietary formats puts it at risk of becoming inaccessible if you lose access to the software, or if the developers stop supporting and updating the program. Your data will also be unusable to anyone who does not have that proprietary software. Whenever possible, use the following non-proprietary formats.

  • XML: Documents or web
  • CSV: Spreadsheets
  • PDF: Documents
  • TIFF: Images
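As an illustration of saving tabular data in a non-proprietary format, the following sketch writes rows to a UTF-8 encoded CSV file using Python's standard `csv` module. The function name `write_csv` is hypothetical, and the sketch assumes the data is already available as a list of dictionaries that share the same keys.

```python
import csv
from pathlib import Path


def write_csv(rows: list[dict], destination: Path) -> None:
    """Write a list of row dictionaries to a UTF-8 encoded CSV file.

    Hypothetical helper: assumes every row has the same keys as the first.
    """
    if not rows:
        raise ValueError("no rows to write")
    fieldnames = list(rows[0].keys())
    # newline="" lets the csv module control line endings itself.
    with destination.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

The resulting file can be opened in any spreadsheet program or text editor, which is the point of choosing CSV over a proprietary spreadsheet format.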

Suggested File Formats

Use non-proprietary, open, uncompressed file types when possible. Some suggested file types are:

  • Spreadsheets/Databases/Tabular data: CSV, TSV
  • Geospatial: GeoTIFF, NetCDF
  • Moving images: MOV, MPEG, AVI, MXF
  • Sounds: WAVE, AIFF, MP3
  • Statistics: ASCII, DTA, POR, SAS, SAV
  • Still images: TIFF, PNG
  • Text: XML, PDF/A, TXT (plain text, encoded as US-ASCII or UTF-8)
  • Web archive: WARC
  • Containers: TAR, GZIP, ZIP
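Before archiving a project folder, it can help to list the files that are not in one of the suggested formats above. The sketch below does this by file extension only. The mapping from format names to extensions (for example `.nc` for NetCDF or `.tif` for GeoTIFF) and the function name `find_unlisted_files` are assumptions made for this example, and an extension check cannot confirm details such as whether a `.pdf` is really PDF/A.

```python
from pathlib import Path

# Extensions assumed to correspond to the suggested formats listed above.
# This mapping is an illustration, not an authoritative registry.
SUGGESTED_EXTENSIONS = {
    ".csv", ".tsv",                            # tabular data
    ".tif", ".tiff", ".nc",                    # GeoTIFF, TIFF, NetCDF
    ".mov", ".mpg", ".mpeg", ".avi", ".mxf",   # moving images
    ".wav", ".aif", ".aiff", ".mp3",           # sound
    ".dta", ".por", ".sas", ".sav",            # statistics
    ".png",                                    # still images
    ".xml", ".pdf", ".txt",                    # text
    ".warc",                                   # web archive
    ".tar", ".gz", ".zip",                     # containers
}


def find_unlisted_files(root: Path) -> list[Path]:
    """Return files under root whose extension is not in the suggested set."""
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() not in SUGGESTED_EXTENSIONS
    )
```

Running `find_unlisted_files(Path("my_project"))`, where `my_project` is a placeholder folder name, returns the files worth reviewing for conversion before long-term storage. Files without any extension are also reported, since their format cannot be guessed from the name.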