Natural Language Processing – Finding the Missing Link for Oncologic Data, 2022
Abstract
Oncology like most medical specialties, is undergoing a data revolution at the center of which lie vast and growing amounts of clinical data in unstructured, semi-structured and structed formats. Artificial intelligence approaches are widely employed in research endeavors in an attempt to harness electronic medical records data to advance patient outcomes. The use of clinical oncologic data, although collected on large scale, particularly with the increased implementation of electronic medical records, remains limited due to missing, incorrect or manually entered data in registries and the lack of resource allocation to data curation in real world settings. Natural Language Processing (NLP) may provide an avenue to extract data from electronic medical records and as a result has grown considerably in medicine to be employed for documentation, outcome analysis, phenotyping and clinical trial eligibility. Barriers to NLP persist with inability to aggregate findings across studies due to use of different methods and significant heterogeneity at all levels with important parameters such as patient comorbidities and performance status lacking implementation in AI approaches. The goal of this review is to provide an updated overview of natural language processing (NLP) and the current state of its application in oncology for clinicians and researchers that wish to implement NLP to augment registries and/or advance research projects.
Copyright (c) 2022 Andra Krauze, Kevin Camphausen
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright © by the authors; licensee Research Lake International Inc., Canada. This open-access article is distributed under the terms of the Creative Commons Attribution Non-Commercial License (CC BY-NC) (http://creative-commons.org/licenses/by-nc/4.0/).