• Int J Med Inform · Jun 2017

    A method for cohort selection of cardiovascular disease records from an electronic health record system.

    • Maria Tereza Fernandes Abrahão, Moacyr Roberto Cuce Nobre (Clinical Epidemiology Team, Heart Institute (InCor) Clinical Hospital, Faculty of Medicine, University of Sao Paulo, Sao Paulo, Brazil), and Marco Antonio Gutierrez.
    • Program in Cardiology, Heart Institute (InCor) Clinical Hospital, Faculty of Medicine, University of Sao Paulo, Sao Paulo, Brazil. Electronic address: tereza.abrahao@usp.br.
    • Int J Med Inform. 2017 Jun 1; 102: 138-149.

    INTRODUCTION: An electronic healthcare record (EHR) system, when used by healthcare providers, improves the quality of care for patients and helps to lower costs. Information collected from manual or electronic health records can also be used for purposes not directly related to patient care delivery, in which case it is termed secondary use. EHR systems facilitate the collection of this secondary use data, which can be used for research purposes such as observational studies, taking advantage of improvements in the structuring and retrieval of patient information. However, several problems are common when conducting research with this kind of data: (i) over time, systems and data storage methods become obsolete; (ii) concerns about the data arise because it is used in a context removed from its original purpose; (iii) there are privacy concerns when sharing data about individual subjects; (iv) the partial availability of standard medical vocabularies and natural language processing tools for non-English languages limits information extraction from structured and unstructured data in EHR systems. A systematic approach is therefore needed to overcome these problems, one in which local data processing is performed prior to data sharing.

    METHOD: The proposed study describes a local processing method to extract cohorts of patients for observational studies in four steps: (1) data reorganization from an existing local logical schema into a common external schema over which information can be extracted; (2) cleaning of data, generation of the database profile and retrieval of indicators; (3) computation of derived variables from original variables; (4) application of study design parameters to transform longitudinal data into anonymized data sets ready for statistical analysis and sharing.
    Mapping from the local logical schema into the common external schema must be performed differently for each EHR and is not the subject of this work, but steps 2, 3 and 4 are common to all EHRs. The external schema accepts parameters that facilitate the extraction of different cohorts for different studies without having to change the extraction algorithms, and ensures that, given an immutable data set, the extraction is an idempotent process. Statistical analysis is part of the process, generating the results necessary for inclusion in reports. The generation of indicators allows the characteristics of the database to be described, highlighting study results. The extraction and statistical processing set is kept in a version-controlled repository and can be used at any time to reproduce results, allowing alterations to be verified and errors to be corrected. This methodology promotes the development of reproducible studies and allows potential research problems to be traced back to the extraction algorithms and statistical methods.

    RESULTS: This method was applied to an admissions database, SI3, from InCor-HCFMUSP, a tertiary referral hospital for cardiovascular disease in the city of São Paulo, as a source of secondary data with 1,116,848 patient records from 1999 to 2013. The cleaning process resulted in 313,894 patient records, and cohort selection yielded 27,698 patients under the following criteria: study period 2003-2013; gender male or female; age ≥ 18 years; at least 2 outpatient encounters; diagnosis of cardiovascular disease (ICD-10 codes I20-I25, I64-I70 and G45). An R script provided descriptive statistics of the extracted cohort.

    CONCLUSION: This method guarantees a reproducible cohort extraction for the use of secondary data in observational studies, with enough parameterization to support different study designs, and can be used on diverse data sources.
    Moreover, it allows observational electronic health record cohort research to be performed in a non-English language with limited internationally recognized medical vocabulary.

    Copyright © 2017 Elsevier B.V. All rights reserved.
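
    The cohort criteria quoted in the abstract (study period, gender, minimum age, minimum number of outpatient encounters, ICD-10 ranges) can be read as parameters handed to a fixed selection routine. The Python sketch below is only an illustration of that idea and is not the authors' code: the record types, field names and the `select_cohort` function are hypothetical, and it assumes that age is measured at the patient's first encounter inside the study period and that ICD-10 codes are matched on their three-character category, neither of which the abstract specifies.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StudyParams:
    start: date
    end: date
    sexes: tuple            # accepted values of the (hypothetical) sex field
    min_age: int
    min_outpatient: int
    icd10_ranges: tuple     # inclusive (low, high) three-character categories


@dataclass(frozen=True)
class Patient:              # hypothetical record layout
    birth: date
    sex: str


@dataclass(frozen=True)
class Encounter:            # hypothetical record layout
    patient_id: str
    when: date
    kind: str               # e.g. "outpatient" or "inpatient"
    icd10: str


# Criteria as listed in the abstract.
PARAMS = StudyParams(
    start=date(2003, 1, 1),
    end=date(2013, 12, 31),
    sexes=("M", "F"),
    min_age=18,
    min_outpatient=2,
    icd10_ranges=(("I20", "I25"), ("I64", "I70"), ("G45", "G45")),
)


def icd10_in_ranges(code, ranges):
    """True if the three-character category of `code` lies in any range."""
    category = code.strip().upper()[:3]
    return len(category) == 3 and any(lo <= category <= hi for lo, hi in ranges)


def age_on(birth, day):
    """Completed years between `birth` and `day`."""
    return day.year - birth.year - ((day.month, day.day) < (birth.month, birth.day))


def select_cohort(patients, encounters, params):
    """Return the sorted ids of patients meeting every criterion in `params`."""
    in_period = {}
    for enc in encounters:
        if params.start <= enc.when <= params.end:
            in_period.setdefault(enc.patient_id, []).append(enc)

    cohort = []
    for pid, encs in in_period.items():
        patient = patients[pid]
        first_visit = min(enc.when for enc in encs)  # assumed age reference date
        n_outpatient = sum(enc.kind == "outpatient" for enc in encs)
        has_cvd = any(icd10_in_ranges(enc.icd10, params.icd10_ranges) for enc in encs)
        if (patient.sex in params.sexes
                and age_on(patient.birth, first_visit) >= params.min_age
                and n_outpatient >= params.min_outpatient
                and has_cvd):
            cohort.append(pid)
    # Sorting makes the output independent of input order, so rerunning the
    # selection on the same data and parameters gives the same list.
    return sorted(cohort)
```

    Because the routine reads only its inputs and returns a sorted list, running it again on an unchanged data set with the same parameters yields the same cohort; a different study design would change `PARAMS` rather than the function.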
