Data Management

Research data is the foundation of scientific research. It is therefore important that the data is properly managed so that information cannot be lost, changed unintentionally or manipulated. Legislation and an increasing emphasis on the reproducibility and integrity of research demand transparency in working procedures. Funding bodies also increasingly set requirements for data management.

The Research Data Management department provides extensive information and advice on this matter. It has developed an SOP Research Data Management that describes the legal requirements and those of Amsterdam UMC for research data management, and offers guidelines and practical tips for handling research data, from data collection and storage to data processing and archiving. A few specific topics are highlighted here:

Data management plan

Prior to starting your research, you will draw up a mandatory detailed Data Management Plan (DMP), which shows how your research complies with the requirements described in the SOP Research Data Management. The DMP will describe, among other things, what kind of research data will be collected, how that data will be stored and managed during the investigation, and what will happen to the data after the project has been closed. Thus, a DMP is an important way of safeguarding data integrity. You can find the Amsterdam UMC template for your DMP here.

If you have any questions about research data management or drawing up a DMP, contact the Research Data Management helpdesk via the RDM Service Portal.

Preparing for data collection

For the collection of personal data in WMO-compliant research, a number of basic rules follow from legal regulations:

  • never collect more personal data than needed for the investigation;
  • the people involved must give written consent for collecting their personal data;
  • never use the personal data for purposes other than those to which the people involved have consented, and never share it with parties other than those to which they have consented;
  • make sure the data collected only contains the research data that has been set out in the research protocol (data minimisation).
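The data minimisation rule can be illustrated with a minimal sketch that keeps only the variables listed in the research protocol before a record is stored. The variable names and the `minimise` helper are hypothetical and serve purely as an illustration; a real study would take the list of permitted variables from its own protocol.

```python
# Hypothetical list of variables defined in the research protocol.
PROTOCOL_VARIABLES = {"subject_id", "age", "systolic_bp"}


def minimise(record: dict, allowed: set = PROTOCOL_VARIABLES) -> dict:
    """Return a copy of the record that holds only protocol variables."""
    return {key: value for key, value in record.items() if key in allowed}


raw_record = {
    "subject_id": "S001",
    "age": 54,
    "systolic_bp": 132,
    "home_address": "Example Street 1",  # not in the protocol, so it is dropped
}

minimal_record = minimise(raw_record)
print(minimal_record)
```

Filtering against an explicit allow-list, rather than removing known unwanted fields, means that any unexpected extra field is excluded by default.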

Data can be collected in different ways. You can:

  • reuse already existing data from other research databases, registers or the EPD;
  • create a new data set by entering the data from the human subjects manually in a database;
  • generate data using a machine or piece of equipment, such as laboratory experiments, DNA sequencing or radiology images.

Reuse of existing data

Reusing data avoids the unnecessary creation of new data sets and the unnecessary re-entering of details in a research database, with the attendant risk of typographical errors. It may be possible to make use of existing healthcare data from an EPD, for example. A great deal of structured healthcare data is available for investigation via the Research Data Platform (RDP).

Legal provisions nevertheless prohibit the reuse of such data for scientific research unless further requirements are met. Privacy and patient autonomy must also be respected when healthcare data are reused. The conditions that must be met are described in the SOP on the Reuse of care data for the purpose of research.

The website of the department of Research Data Management provides additional information on the re-use of data.

Creating a new data set

Amsterdam UMC has various applications and tools available for collecting and storing new research data. On the Department of Research Data Management (RDM) webpage you will find detailed information about the use of the right software for building a database (eCRF), setting up a questionnaire or randomisation programme, creating a data dictionary, validation controls, and more. RDM also offers courses in research data management and the application Castor.

All the information that is generated during the investigation and archived after research is completed must be safely stored while also remaining accessible for the research team.

If it is not possible to use existing healthcare data, the collection of new data will often take place by recruiting subjects through the clinic (inpatient or outpatient) or using advertisements, internet sites, leaflets or other tools.

When collecting data for your research, you will be working with source data and a Case Report Form (CRF), either in hard copy or electronic (database). The data management plan is used to document various details, including how you will collect the data, where you will store the data and how you will validate the data. The statistical analysis plan is used to document the analytical tools that you will use. Additional information on analytic databases is provided in the section on Analysis.

Audit trail and the CRF

A database/CRF is a document, either hard copy or digital, in which all of the requested study information for each subject is collected. The guidelines for good clinical practice (GCP) require that ‘Any change or correction to a CRF must be dated, initialled and (if necessary) explained, and it must not render the original text unreadable (i.e. an audit trail must be maintained). This requirement covers both manual and electronic changes or corrections’ (GCP Section 4.9.3).

Important: SPSS and Excel do not include an audit trail, and they are therefore not sufficient as CRFs/eCRFs.
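To make the audit trail requirement concrete, the following minimal sketch records every change to a form value together with the old value, the person making the change, a timestamp and an optional reason, so that the original entry always remains readable. The `AuditedCRF` class and its field names are hypothetical; this is not how Castor EDC or any other eCRF product is implemented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AuditedCRF:
    """Hypothetical form that logs every change instead of overwriting silently."""

    values: dict = field(default_factory=dict)
    audit_trail: list = field(default_factory=list)

    def set_value(self, field_name, new_value, initials, reason=""):
        # Store the previous value so the original entry is never lost.
        self.audit_trail.append(
            {
                "field": field_name,
                "old_value": self.values.get(field_name),
                "new_value": new_value,
                "changed_by": initials,
                "changed_at": datetime.now(timezone.utc).isoformat(),
                "reason": reason,
            }
        )
        self.values[field_name] = new_value


crf = AuditedCRF()
crf.set_value("weight_kg", 18, initials="AB")
crf.set_value("weight_kg", 81, initials="CD", reason="Digits transposed at entry")

for entry in crf.audit_trail:
    print(entry)
```

A spreadsheet cell that is simply overwritten keeps none of this history, which is why such tools fall short of the requirement.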

Amsterdam UMC offers the following option (free of charge) for an electronic CRF (eCRF): Castor EDC (additional information).

Researchers can use Castor EDC to perform randomisation, send email questionnaires to patients, invite monitors to the study, export data easily to Excel, CSV and SPSS, and carry out many other tasks.

Since October 2019, it has been possible to use the link between the Research Data Platform (RDP) and Castor. This makes it possible to download lab values into an eCRF through the RDP with only a few mouse clicks.

If you are interested in using these resources, please read the manual to learn how to request the link and which conditions must be met in order to use the link. The manuals for Castor explain how to prepare CRF pages for use, as well as the procedure for loading the data.

Validation procedures

To ensure the quality of the dataset, data entry must be performed in a standardized and transparent manner. It is advisable to set up validation procedures that make it possible to inspect the quality of research data as they are being entered.

The Research Data Management website provides additional information in this regard.
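As an illustration of checking data while they are being entered, the sketch below validates a single value against a type and a plausible range. The rules, the variable names and the `validate_entry` helper are hypothetical; in practice the limits would come from the study's own data dictionary.

```python
# Hypothetical validation rules; real limits belong in the data dictionary.
RULES = {
    "age": {"type": int, "min": 18, "max": 110},
    "systolic_bp": {"type": int, "min": 50, "max": 300},
}


def validate_entry(field_name, value, rules=RULES):
    """Return a list of problems; an empty list means the value is accepted."""
    rule = rules.get(field_name)
    if rule is None:
        return [f"{field_name}: not defined in the data dictionary"]
    if not isinstance(value, rule["type"]):
        return [f"{field_name}: expected {rule['type'].__name__}"]
    problems = []
    if value < rule["min"]:
        problems.append(f"{field_name}: {value} is below the minimum {rule['min']}")
    if value > rule["max"]:
        problems.append(f"{field_name}: {value} is above the maximum {rule['max']}")
    return problems


print(validate_entry("age", 54))
print(validate_entry("age", 540))
print(validate_entry("age", "54"))
```

Returning a list of problems rather than raising an error lets the person entering the data see and correct an implausible value immediately.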


Randomisation

The random assignment of subjects to a treatment arm of a clinical trial (i.e. randomisation) requires the least possible involvement of the researcher. For this reason, computer-driven assignment techniques are preferred.

Castor EDC offers several randomisation methods that can be built into the eCRF. Additional information is available from the Research Data Management helpdesk via the RDM Service Portal.
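As an illustration of computer-driven assignment, the sketch below implements permuted-block randomisation, a generic technique that keeps the treatment arms balanced after every completed block. It is not a description of the methods offered by Castor EDC; the function name, arm labels, block size and seed are all assumptions chosen for the example.

```python
import random


def block_randomisation(n_subjects, arms=("A", "B"), block_size=4, seed=None):
    """Return a list of arm assignments using permuted blocks."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # separate generator so the list is reproducible
    allocations = []
    while len(allocations) < n_subjects:
        # Each block holds every arm equally often, in a shuffled order.
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_subjects]


allocation = block_randomisation(8, seed=2024)
print(allocation)
```

In a real trial the allocation list and seed would be generated and held by the system or an independent party, so that the researcher cannot foresee the next assignment.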

Online Questionnaires

To use online questionnaires outside of Castor EDC, an agreement must be concluded with the company that will collect these data. Amsterdam UMC has agreements with several suppliers (e.g. Survalyzer, Limesurvey and KLIK PROM portal).

Additional information is available from the RDM Service Portal.