Publish and preserve

Last updated on 2026-04-14

Overview

Questions

  • What is data archiving?
  • What are data repositories?
  • What is a DOI, and why is it important?

Objectives

  • Understand the role of trusted repositories in long-term preservation.
  • Learn why repositories improve discovery and citation.
  • Understand how a DOI supports persistence and citation.

Callout

FAIR principles used in data archiving

Findable:

Accessible:

What is data archiving?


Data archiving is the long-term preservation of research data and related digital objects.

Funders and journals increasingly expect enough of the data and methods to be preserved so that results can be inspected, understood, and, where possible, replicated.

What are data repositories?


Data repositories are storage locations for digital objects such as datasets, code, supplementary files, and metadata.

Repositories make research outputs easier to find and can also provide:

  • preservation
  • backup
  • citation infrastructure
  • versioning
  • controlled access options

Examples:

Repository    About
DataverseNL   Community repository supporting Dutch universities and research centers
4TU           Data repository originally developed by Dutch technical universities
PANGAEA       Domain repository for Earth and environmental sciences
Figshare      General-purpose repository for many digital object types

Callout

General repository recommendations

Aim for a community repository when one exists for your discipline. If none exists, a trusted general repository is usually better than leaving data only on a local drive or project website.

Typical strengths of general-purpose repositories include:

  • persistent identifiers
  • rapid publication
  • versioning
  • backup and preservation
  • usage statistics
  • open or restricted access modes

[Figure: Repository comparison matrix]

Additional registries that help identify suitable repositories include re3data and FAIRsharing.

For context on the original Dataverse model, see Harvard Dataverse (https://dataverse.harvard.edu/).

Challenge

Exercise

Visit https://dataverse.org/ and inspect the current installation map and the DataverseNL portal.

Answer:

  • How many Dataverse installations are listed now?
  • How many are in the Netherlands?
  • Roughly how many datasets are visible in DataverseNL at the time of your visit?

These counts change over time, so record the values you observe at the moment you complete the exercise rather than relying on historical numbers from older lesson versions.
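
If you prefer to record the DataverseNL dataset count programmatically, the following minimal sketch builds a query for the Dataverse Search API and reads the total from a response. The base URL https://dataverse.nl and the helper names are assumptions made for illustration, and the sample response is made up so that the code runs without network access.

```python
import json
import urllib.parse


def dataset_count_url(base_url: str) -> str:
    """Build a Search API URL that matches every dataset but returns one item."""
    query = urllib.parse.urlencode({"q": "*", "type": "dataset", "per_page": 1})
    return f"{base_url.rstrip('/')}/api/search?{query}"


def total_count(response_text: str) -> int:
    """Read the total number of matching records from a Search API response."""
    payload = json.loads(response_text)
    return payload["data"]["total_count"]


# Assumed base URL of the DataverseNL installation.
print(dataset_count_url("https://dataverse.nl"))

# Made-up response body, used only to show which field holds the count.
sample = '{"status": "OK", "data": {"total_count": 3, "items": []}}'
print(total_count(sample))
```

Open the printed URL in a browser or HTTP client at the time of your visit and note the date next to the value you record.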

What is a DOI, and why is it important?


A DOI (Digital Object Identifier) is a persistent identifier commonly used for articles, datasets, reports, and other scholarly outputs.

Two major DOI registration services in research are Crossref and DataCite.

A DOI has three conceptual parts:

  • the resolver service
  • the registrant prefix
  • the locally assigned suffix

[Figure: Diagram explaining DOI structure]
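
The three parts listed above can be separated with a few lines of Python. This is a minimal sketch, not an official parser: the function name `split_doi` is hypothetical, and the DOI `10.1234/example.5678` is a made-up example that does not point to a real record.

```python
from typing import NamedTuple


class DoiParts(NamedTuple):
    resolver: str  # service that redirects the DOI to its current location
    prefix: str    # registrant prefix, which starts with "10."
    suffix: str    # identifier assigned locally by the registrant


def split_doi(doi_url: str) -> DoiParts:
    """Split a DOI written as a URL into resolver, prefix, and suffix."""
    position = doi_url.find("/10.")
    if position == -1:
        raise ValueError(f"no DOI prefix found in {doi_url!r}")
    resolver = doi_url[:position + 1]   # keep the trailing slash
    doi_name = doi_url[position + 1:]   # for example "10.1234/example.5678"
    # Only the first slash separates prefix and suffix; a suffix may contain more.
    prefix, separator, suffix = doi_name.partition("/")
    if not separator or not suffix:
        raise ValueError(f"DOI has no suffix: {doi_url!r}")
    return DoiParts(resolver, prefix, suffix)


parts = split_doi("https://doi.org/10.1234/example.5678")
print(parts.resolver)  # https://doi.org/
print(parts.prefix)    # 10.1234
print(parts.suffix)    # example.5678
```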

DOIs make resources easier to cite and track. A DOI continues to resolve even when the storage location changes, provided the registrant keeps the target URL registered for that DOI up to date.
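
Because a citation points at the resolver rather than at a storage location, software can ask the resolver for machine-readable metadata instead of scraping a landing page. The sketch below prepares such a request using content negotiation, which Crossref and DataCite DOIs generally support. The function names are hypothetical, the DOI is a made-up placeholder, and `fetch_metadata` only works with network access and a registered DOI.

```python
import json
import urllib.parse
import urllib.request

DOI_RESOLVER = "https://doi.org/"
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_metadata_request(doi: str) -> urllib.request.Request:
    """Prepare (but do not send) a metadata request for a DOI name."""
    url = DOI_RESOLVER + urllib.parse.quote(doi, safe="/")
    # The Accept header asks for citation metadata instead of the landing page.
    return urllib.request.Request(url, headers={"Accept": CSL_JSON})


def fetch_metadata(doi: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body; requires network access."""
    request = build_metadata_request(doi)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Placeholder DOI; replace it with a registered one before calling fetch_metadata.
request = build_metadata_request("10.1234/example.5678")
print(request.full_url)
print(request.get_header("Accept"))
```

The request targets the same resolver URL that appears in a citation, so the code keeps working if the repository later moves the files.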

Challenge

Exercise

Upload a test dataset to the DataverseNL demo or another training repository:

  1. Download the mock dataset: MOCK DATA
  2. Deposit it in a suitable demo or sandbox repository
  3. Record the DOI or persistent identifier you receive

There is no single correct answer. The point is to experience how a repository assigns persistent identifiers and what metadata are required before publication.
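
To get a feel for the metadata check a repository performs before publication, here is a minimal sketch that reports which required fields are still missing from a draft record. The field names are hypothetical and only loosely modelled on the kind of mandatory properties repositories ask for. Each repository defines its own list, so treat `REQUIRED_FIELDS` as an assumption.

```python
# Hypothetical set of required fields; real repositories define their own.
REQUIRED_FIELDS = ("title", "creators", "publisher", "publication_year", "resource_type")


def missing_fields(record: dict) -> list[str]:
    """Return the required fields that are absent or empty in a record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Made-up draft deposit with one empty field and one absent field.
draft = {
    "title": "Mock survey responses (training deposit)",
    "creators": ["Learner, Example"],
    "publisher": "",
    "publication_year": 2026,
}

print(missing_fields(draft))  # ['publisher', 'resource_type']
```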

Discussion

Scenario

You conducted personal interviews for an ethnographic migration study. The raw transcripts cannot be shared, not even under restricted access.

How should archiving be handled in this case? What metadata, derived outputs, or access statements should still be preserved and published?

Key Points

  • Repositories improve preservation, discovery, and citation.
  • Prefer a trusted community repository when one fits your discipline.
  • General repositories such as Zenodo still provide a strong baseline for FAIR publication.
  • A DOI is a key persistent identifier for datasets and publications.