Chapter 1: Research Integrity

1.7 Data Management and Integrity

Good record keeping fosters the scientific norms of accuracy, replication and reliability. Failure to maintain records compromises the validity of the data, the utility of the research, and the protection of intellectual property.

Research sponsors obligate study teams to maintain appropriate records of experimental activities, but there is little guidance on accepted practice. Informing trainees and other research team members about practical issues related to data collection is the responsibility of the PI. While record keeping practices vary and often reflect the PI's customs and procedures, data must be recorded systematically in legible English so findings can be validated and replicated.

Ownership of data depends on the terms and conditions of awards. NIH awards ownership of the data to the institution while the investigator serves as the "custodian". Grantee institutions such as Children's Hospital usually give the PI maximum discretion over data collection, recording, storage, retention and disposal.

Data must be stored and maintained according to the terms and conditions of awards and institutional policy. NIH generally requires that data be retained for three years from the date of the last transaction. PIs can choose to keep data longer. The Children's Hospital of Philadelphia Research Institute has invested considerable resources in a SAN (a secure and redundant data storage network), an enterprise collaboration tool, and a virtual server environment to support research. More information is available from Research Information Systems.

The collection, storage and analysis of data in clinical research require strict adherence to HIPAA's Privacy Rule. All data sets should be stored with the minimum number of identifiers necessary to conduct the research. Protections should be in place to minimize the risk to subjects' privacy and confidentiality, and the approach should be tailored to the risks of the research. One or more of the following protections should be implemented: storing as much of the PHI as practical in a Master List separately from the remainder of the study data, password protecting computers used to store data, password protecting study files, encrypting study files, limiting access to authorized personnel, and locking offices and filing cabinets that contain subjects' data. Data that includes patient identifiers should not be stored on unsecured laptops, USB drives or other portable devices. Special consideration must be given before posting data on a Web site or other public forum.
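As an illustration of the Master List approach described above, the following is a minimal sketch in Python, not an institutional procedure. The field names (name, mrn, date_of_birth), the function split_identifiers and the sample record are all hypothetical; a real study would define its own list of identifiers and would still need to store the Master List in a separate, access-controlled location.

```python
import secrets

# Hypothetical identifier fields, assumed for illustration only.
# A real study would define its own list of PHI fields.
IDENTIFIER_FIELDS = ("name", "mrn", "date_of_birth")


def split_identifiers(records, identifier_fields=IDENTIFIER_FIELDS):
    """Split records into a Master List (identifiers) and study data.

    The two outputs are linked only by a randomly generated study ID,
    so the study data carries no direct identifiers.
    """
    master_list = []
    study_data = []
    for record in records:
        # Random code that is not derived from any identifier.
        study_id = secrets.token_hex(8)
        master_row = {"study_id": study_id}
        study_row = {"study_id": study_id}
        for field, value in record.items():
            if field in identifier_fields:
                master_row[field] = value
            else:
                study_row[field] = value
        master_list.append(master_row)
        study_data.append(study_row)
    return master_list, study_data


# Invented sample record, for illustration only.
records = [
    {"name": "Example Subject", "mrn": "000000",
     "date_of_birth": "2000-01-01", "weight_kg": 31.5},
]
master_list, study_data = split_identifiers(records)
```

In this sketch the study data can be analyzed and shared within the team without the identifiers, while re-identification requires access to the separately stored Master List.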

Data integrity is the assurance that data is consistent and correct. Procedures for data collection, entry, storage and analysis must ensure that the integrity of the data is maintained at all times. Data compromised by carelessness or fabrication is of no value and renders an experiment worthless.
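One narrow part of data integrity, detecting whether a stored file has changed since it was recorded, can be sketched with checksums. The Python example below is an assumption-laden illustration rather than a described institutional procedure: the function names and the manifest structure (a dictionary mapping file paths to recorded SHA-256 digests) are hypothetical. A checksum shows only that a file differs from its recorded state; it cannot detect data entry errors or fabrication that occurred before the checksum was taken.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record the current checksum of each file (hypothetical helper)."""
    return {str(path): sha256_of_file(path) for path in paths}


def changed_files(manifest):
    """Return paths that are missing or whose checksum no longer matches."""
    return [
        path for path, recorded in manifest.items()
        if not Path(path).is_file() or sha256_of_file(path) != recorded
    ]
```

A study team might build the manifest when a data set is locked and run changed_files before each analysis; an empty list means no recorded file has been altered or removed.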

Ask the Experts

Abbas Jawad, Ph.D.
Associate Professor  
Division of Biostatistics

Andrew Farella
Associate CIO and AVP
Business Application and Research IS


Developed by the Office of Responsible Research Training