Automated Metadata Transformation in a Medical Data Integration Center: Implementation of an Algorithm and Standardized Quality Analysis
Doctoral thesis
Date of Examination: 2023-10-13
Date of issue: 2023-11-02
Advisor: Prof. Dr. med. Tibor Kesztyüs
Referee: Prof. Dr. med. Tibor Kesztyüs
Referee: Prof. Dr. Jürgen Brockmöller
Referee: Prof. Dr. Leif Saager
Referee: Prof. Dr. Ulrich Sax
Referee: Prof. Dr. Lutz M. Kolbe
Referee: PD Dr. med. Claus Wolff-Menzler
Files in this item
Name: Dissertation_Caroline_Boenisch_ohneUnterschrift.pdf
Size: 7.33Mb
Format: PDF
Abstract
English
Evidence-based medicine is the conscientious, explicit, and judicious use of the current best scientific evidence to make informed decisions about the medical care of individual patients. The practice of evidence-based medicine involves the integration of three aspects: individual clinical expertise, the best available external evidence from systematic research, and patient needs. According to a Food and Drug Administration (FDA) white paper, Real World Data (RWD) can provide information complementary to data from Randomized Clinical Trials (RCTs). Such clinical RWD were extracted from the University Medical Center Göttingen’s primary and secondary clinical systems and stored within the Medical Data Integration Center (UMG-MeDIC). It should be emphasized that the clinical data were not collected primarily for research purposes and are therefore of inconsistent quality. Methodologically, requirements for a Medical Data Integration Center were developed, and the Data Integration Center was set up using open source software. Within the UMG-MeDIC, data were extracted from the clinical systems, preprocessed, and loaded into the Data Integration Center. The aim of this work was to make the data available to researchers. For this purpose, a metadata crosswalk was created to provide researchers with in-depth information about the UMG-MeDIC data pool in the form of metadata. The crosswalk revealed that none of the selected data formats met the requirements of the UMG-MeDIC and that a self-developed intermediate format was needed. This intermediate format is the basis for providing researchers with metadata in the most commonly used data formats. Metadata that reflect the quality of the data are of particular interest and importance. Based on the requirements of the UMG-MeDIC, quality criteria were developed and evaluated to determine the quality of the data in the UMG-MeDIC data pool.
Researchers are now able to search and filter clinical data via a graphical user interface developed within this work and can download the resulting metadata in any data format. These metadata also include further information regarding the quality of the data.
Keywords: metadata; data warehouse; data quality; clinical data; real world data