ORIGINAL ARTICLE
ChatGPT diagnostic evaluation in the analysis of discharge notes for dengue cases
Evaluación diagnóstica de ChatGPT en análisis de notas de egreso por casos de dengue
Karla Cecilia Vélez Rodríguez ¹, https://orcid.org/0009-0000-5837-7220
Héctor Daniel Magallón Mendoza 1*, https://orcid.org/0009-0005-5355-5936
Yonathan Estrada Rodríguez 2, https://orcid.org/0000-0001-9161-6545
¹ Benito Juárez García Welfare Universities, Quechultenango Campus, Mexico.
² University of Medical Sciences of Matanzas. Dr. Juan Guiteras Gener Faculty of Medical Sciences. Matanzas, Cuba.
* Corresponding author: gausswerner816@gmail.com
Received: 10/10/2024
Accepted: 25/05/2025
How to cite this article: Vélez-Rodríguez KC, Magallón-Mendoza HD, Estrada-Rodríguez Y. ChatGPT diagnostic evaluation in the analysis of discharge notes for dengue cases. MedEst. [Internet]. 2025 [cited access date]; 5:e348. Available in: https://revmedest.sld.cu/index.php/medest/article/view/348
ABSTRACT
Introduction: Research is an intellectual and experimental process that allows for the study and identification of specific events. Currently, artificial intelligence (AI) stands out as a revolutionary tool that provides new perspectives, especially in the healthcare field, offering advances in classification, speed, and accuracy in research.
Objective: To evaluate the operational utility and diagnostic efficacy of ChatGPT in processing hospital discharge notes for dengue cases.
Material and method: An observational, descriptive analysis was conducted from August to September 2024. Fifty medical discharge notes were selected from a total of 400 dengue cases recorded between January 2023 and June 2024. The notes were transcribed into a digital format, compiled into a single document containing the filtered information, and entered into the ChatGPT interface for analysis.
Results: Significant efficiency and speed were observed in the analysis of discharge notes due to the specificity of the instructions during the manual training and preparation of the AI. This allowed for the automation of the recognition of key medical criteria patterns in a single analysis.
Conclusions: The use of ChatGPT proved to be an efficient resource that allowed for faster and more effective analysis of medical discharge notes.
Keywords: Dengue cases; ChatGPT; Deep learning; Artificial intelligence; Natural language processing
RESUMEN
Introducción: la investigación es un proceso intelectual y
experimental que permite estudiar e identificar eventos específicos.
Actualmente, la inteligencia artificial (IA) se destaca como una herramienta
revolucionaria que aporta nuevas perspectivas, especialmente en el área de la
salud, ofreciendo avances en la clasificación, velocidad y precisión en las
investigaciones.
Objetivo: evaluar la utilidad operativa y la eficacia diagnóstica de
ChatGPT en el procesamiento de notas de egreso hospitalario correspondientes a
casos de dengue.
Material y método: se realizó un análisis observacional y descriptivo
agosto a septiembre del año 2024 donde 50 notas médicas de egreso, fueron seleccionadas
de un total de 400 casos de dengue del periodo de tiempo de enero del 2023 a
junio del 2024. El análisis se realizó a través de las notas plasmadas en un
formato virtual donde previamente preparadas se juntaron en un documento con la
información filtrada. Y se colocaron en la interfaz del modelo de ChatGPT para
su análisis.
Resultados: se observó una significativa eficiencia y velocidad del análisis de las notas de egreso, debido a la especificidad de las instrucciones durante el entrenamiento y preparación manual de la IA. Esto permitió automatizar el reconocimiento de los patrones de criterios médicos clave en un solo análisis.
Conclusiones: el uso de ChatGPT demostró ser un recurso eficiente que
permitió analizar las notas de egreso médico de forma más rápida y efectiva.
Palabras clave: Casos de dengue; ChatGPT; Deep learning; Inteligencia artificial; Procesamiento de lenguaje natural
INTRODUCTION
Worldwide, research constitutes a fundamental component of the scientific process. By its very nature, this experimentation is carried out in controlled and measurable settings specifically designed for such purposes. Various research methods are employed, the main objective of which is to explain specific phenomena or events. Among these methods, there are specialized tools that allow for measurements, predictions, systematic reasoning, and classification. (1)
Artificial intelligence (AI) is a revolutionary tool in the scientific and technological fields, thanks to its ability to perform complex reasoning based on instructions using natural language processing (NLP). Among current models, ChatGPT stands out for its practicality and efficiency. It is a large-scale language model (LLM) that uses deep neural networks, giving it superior reasoning capabilities compared to other existing language models. (1)
In the Americas, the implementation of ChatGPT—developed by OpenAI (San Francisco, California)—has generated a significant impact, particularly due to its free version and accessible interface. This system, which functions as a chatbot or virtual assistant with highly humanized natural language capabilities, has captured the massive attention of frequent web browser users. (2)
It is worth noting that, unlike other regions with more advanced developments in artificial intelligence, current use in the Americas remains relatively basic. However, this outlook does not rule out the possibility that, in the medium term, the continent could become an important benchmark for specialized research, thanks to the strategic adoption of these technological tools. (2)
In Mexico, the growing use of artificial intelligence is generating a transformative impact, particularly in the research field. This phenomenon manifests itself through two key dimensions: (2, 3)
1-) Technological development: The implementation of new AI tools is optimizing methodological processes in various scientific disciplines, while expanding its applications in non-academic sectors (industry, services, and government).
2-) Regulatory Framework: The country faces the challenge of establishing ethical protocols and moral standards that balance technological innovation with principles of social responsibility, data privacy, and digital equity.
Artificial intelligence platforms are revolutionizing the medical field through multiple clinical and administrative applications:
1-) Primary Care: (4)
• Optimization of the admission and initial triage process.
• Automated analysis of medical records.
• Assistance in medical history taking through natural language processing.
2-) Diagnosis and Treatment: (5)
• Advanced interpretation of vital data in real time.
• Support for diagnostic imaging (radiology, CT scans, MRIs).
• Generation of personalized treatment plans through predictive algorithms.
3-) Hospital Management: (6)
• Automation of administrative processes.
• Optimization of medical schedules and resources.
• Big data analysis for epidemiology and public health.
Although large-scale language models (LLMs) such as ChatGPT have demonstrated significant impact in various fields, their adoption as a continuous support tool in clinical settings remains limited and is often restricted to advanced information search functions (7, 8, 9). This underutilization poses a crucial challenge in medical education, where their responsible application, given the technical complexity and potential ethical risks, requires rigorous validation.
Based on the above, the following
scientific question is formulated: Can a general-purpose language model such as
ChatGPT simultaneously guarantee clinical validity and ethical compliance in the
analysis of medical documentation? To address this question, a study was
conducted to evaluate the operational utility and diagnostic efficacy of
ChatGPT in processing hospital discharge notes for dengue cases.
MATERIALS AND METHODS
An observational, descriptive study was conducted between August
and September 2024, using a systematic sample of 50 medical discharge notes
selected from a population of 400 confirmed dengue cases recorded between
January 2023 and June 2024. The medical records were obtained from the clinical
archives of the IMSS Bienestar Dr. Raymundo Abarca Alarcón High Specialty
Hospital in Mexico.
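As an illustration of how a systematic sample of 50 notes could be drawn from 400 cases, the following minimal sketch picks every k-th record after a random start. The text only states that the sample was systematic; the fixed interval (400 // 50 = 8), the random start, the seed, and the numeric case identifiers are assumptions made for the example, and the inclusion and exclusion criteria are not modeled here.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Pick every k-th item after a random start, with k = N // n."""
    if sample_size <= 0 or sample_size > len(population):
        raise ValueError("sample_size must be between 1 and len(population)")
    step = len(population) // sample_size        # 400 // 50 = 8
    start = random.Random(seed).randrange(step)  # random start in [0, step)
    return [population[start + i * step] for i in range(sample_size)]


# Hypothetical record identifiers standing in for the 400 dengue cases.
case_ids = list(range(1, 401))
sample = systematic_sample(case_ids, 50, seed=2024)
print(len(sample), sample[1] - sample[0])  # 50 8
```

A fixed interval spreads the selected notes evenly across the archive, which is the usual reason for preferring systematic over simple random sampling when records are stored in chronological order.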
The exclusion criteria for selecting these cases were: patients under 20 years of age, patients over 50 years of age, patients registered with possible dengue, "no dengue data," and other pathologies.
The inclusion criteria were: patients over 20 years of age, diagnosed with dengue, with dengue data, with signs of dengue, and probable dengue. Some specific cases were also included with keywords such as "with alarming data," "without alarming data," "critical dengue," "severe," and "non-severe."
The literature was searched in official sources such as UNESCO, Statista, and Computer Weekly, and in scientific databases and journals such as PubMed, SciELO, NCBI, ScienceDirect, ResearchGate, and Nature, to understand the use of AI in research.
Data processing with ChatGPT:
Structured prompts were designed that specified the role of the model (medical analysis assistant), required tasks (format and content review), and ethical restrictions (no diagnosis generation).
The discharge notes were standardized in Word format, faithfully replicating the original physical structure and essential clinical fields (23 variables analyzed).
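The structured prompts described above can be sketched as a small template builder. This is only an illustration: the exact wording used in the study is not reproduced in the text, so the function name `build_prompt`, the section labels, and the placeholder note are hypothetical; only the role, the two review tasks, and the restriction on generating diagnoses come from the description.

```python
def build_prompt(role, tasks, restrictions, note_text):
    """Assemble a structured prompt from a role, tasks, restrictions and one note."""
    lines = [f"Role: {role}", "Tasks:"]
    lines += [f"- {task}" for task in tasks]
    lines.append("Restrictions:")
    lines += [f"- {rule}" for rule in restrictions]
    lines += ["Discharge note:", note_text.strip()]
    return "\n".join(lines)


prompt = build_prompt(
    role="medical analysis assistant",
    tasks=["review the format of the note", "review the content of the note"],
    restrictions=["do not generate diagnoses"],
    note_text="<de-identified discharge note text goes here>",  # placeholder
)
print(prompt)
```

Keeping the role, tasks, and restrictions as separate arguments makes it easy to reuse the same instructions for every note and to change one element without rewriting the whole prompt.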
Variables analyzed:
1. Administrative data: name, medical record, admission/discharge dates.
2. Clinical parameters: hospital progress (vital signs, procedures), diagnoses (admission/discharge, ICD-10 coding), therapeutic plan (treatment, outpatient follow-up).
3. Quality indicators: internal consistency (e.g., diagnostic congruence) and completeness of required fields.
Validation Protocol: This was based on a three-phase verification: Structural Coherence (default format), Data Integrity (required fields), and Content Analysis (error patterns).
The evaluation criteria were based on the institution's clinical documentation guidelines and international medical record standards (ISO/TR 20514).
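The three-phase verification can be pictured as three independent checks applied to each note. The sketch below is an assumption-laden illustration, not the protocol used in the study: the field names are a hypothetical subset of the 23 variables, dates are assumed to be ISO-formatted strings, and the single content rule (discharge must not precede admission) is only one example of an internal-consistency check.

```python
from datetime import date

# Hypothetical subset of the 23 variables; the full field list is not given in the text.
REQUIRED_FIELDS = ("admission_date", "discharge_date",
                   "admission_diagnosis", "discharge_diagnosis", "treatment")


def check_structure(note):
    """Phase 1, structural coherence: every expected field is present."""
    return [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in note]


def check_integrity(note):
    """Phase 2, data integrity: fields that are present must not be blank."""
    return [f"empty field: {name}" for name in REQUIRED_FIELDS
            if name in note and not str(note[name]).strip()]


def check_content(note):
    """Phase 3, content analysis: example rule, discharge must not precede admission."""
    try:
        admitted = date.fromisoformat(note["admission_date"])
        discharged = date.fromisoformat(note["discharge_date"])
    except (KeyError, TypeError, ValueError):
        return []  # dates absent or not ISO strings: this rule cannot be evaluated
    if discharged < admitted:
        return ["inconsistency: discharge date precedes admission date"]
    return []


def validate(note):
    """Run the three phases and return the issues found in each."""
    return {"structure": check_structure(note),
            "integrity": check_integrity(note),
            "content": check_content(note)}


example_note = {
    "admission_date": "2024-01-10",
    "discharge_date": "2024-01-05",
    "admission_diagnosis": "dengue with warning signs",
    "discharge_diagnosis": "dengue with warning signs",
    "treatment": "",
}
report = validate(example_note)
print(report)
```

In this example the note passes the structural phase, fails the integrity phase because the treatment field is blank, and fails the content phase because the dates are inconsistent.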
However, greater weight was given to the clinical criteria concerning the patient's health; personal and hospitalization data were disregarded to protect patient privacy and confidentiality. After the analysis, ChatGPT was asked to generate an Excel macro that would display the total percentages for each medical discharge sheet in a bar graph. Errors were grouped into three classifications: "Absence of important information," "Inconsistencies," and "Illegible spelling."
These three classifications were used to compile the number of permissible and impermissible errors found in the analysis. Each discharge sheet was further classified as effective or ineffective, which made patterns across the medical discharge sheets visible. Impermissible errors were those in which the absence of information, inconsistencies, or illegible spelling interfered with important information in an area vital to the proper functioning of a discharge sheet; a sheet with such an error was directly considered ineffective.
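The classification logic can be sketched as follows, using the distinction drawn in Table 2: an error that affects essential clinical data is treated as impermissible, and a sheet with at least one impermissible error is treated as ineffective. The class name `NoteError`, the flag `affects_vital_area`, and the string labels are hypothetical names introduced for the illustration; in the study this judgment was made during the review, not by a fixed rule like this one.

```python
from dataclasses import dataclass

# The three error classifications named in the text (labels are illustrative).
ERROR_TYPES = ("missing_information", "inconsistency", "illegible_spelling")


@dataclass(frozen=True)
class NoteError:
    kind: str                 # one of ERROR_TYPES
    affects_vital_area: bool  # hypothetical flag set by the reviewer

    def __post_init__(self):
        if self.kind not in ERROR_TYPES:
            raise ValueError(f"unknown error type: {self.kind}")


def is_permissible(error):
    """An error is permissible when it does not touch essential clinical data."""
    return not error.affects_vital_area


def classify_sheet(errors):
    """A sheet is ineffective as soon as it contains one impermissible error."""
    if any(not is_permissible(e) for e in errors):
        return "ineffective"
    return "effective"


sheet_a = [NoteError("illegible_spelling", affects_vital_area=False)]
sheet_b = [NoteError("missing_information", affects_vital_area=True),
           NoteError("inconsistency", affects_vital_area=False)]
print(classify_sheet(sheet_a), classify_sheet(sheet_b))  # effective ineffective
```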
The study aimed to demonstrate the use of AI (ChatGPT) as an effective tool for analyzing errors and finding patterns across discharge sheets without compromising ethical and moral commitments regarding confidential patient data.
RESULTS
Table 1 shows the percentage of functionality preserved in
discharge sheets. The analysis revealed that only 25,3 % (13/50) of the notes
met the quality standards for progress summaries, while 74,7 % (37/50) had deficiencies
that compromised their usefulness for patient follow-up.
Table 1: Distribution of discharge sheets by percentage of effectiveness in functionality
Effectiveness of discharge notes | No. | %
Effective notes | 13 | 25,3 %
Ineffective notes | 37 | 74,7 %
Total | 50 | 100
Source: Analysis of discharge notes with ChatGPT
Table 2 classifies the errors found across all discharge notes. Every discharge note contained errors (192 in total), but the effective notes contained only permissible errors, which did not compromise the usefulness of the sheet. Permissible errors accounted for 46,3 % of the total and impermissible errors for 53,7 %.
Table 2: Distribution of errors by classification in medical discharge notes
Error category | % | n/N | Distinctive features | Potential clinical impact
Permissible errors | 46,3 | 89/192 | Minor omissions in non-critical data; unambiguous terminology variations | Low
Impermissible errors | 53,7 | 103/192 | Lack of essential clinical data; contradictions in clinical progress; illegible text | High
Source: Analysis of discharge notes with ChatGPT
Table 3 shows the classification of error types and their prevalence in medical discharge notes. The most frequent error type was the lack of important or relevant information, at 53,82 %, and inconsistencies in the medical note were the least prevalent in the typology of errors, at 21,42 %.
Table 3: Distribution by error category in medical discharge notes
Error category | Frequency (%) | n/N | Characteristic examples | Clinical severity
Lack of relevant information | 53,82 | 103/192 | Omission of alarm data; missing daily progress | Critical
Illegible spelling | 24,77 | 48/192 | Non-standard abbreviations; indecipherable handwriting | Medium
Clinical inconsistencies | 21,42 | 41/192 | Medication inconsistencies; diagnostic and outcome discrepancies | High
Source: Analysis of discharge notes with ChatGPT
The analysis also revealed that the periods with the highest numbers of dengue cases were December 2023 to January 2024 and May to June 2024, with 12 (24 %) and 10 (20 %) cases, respectively.
In Table 4, the analysis identified two epidemiological peaks: December 2023-January 2024 (24 % cases) and May-June 2024 (20 %), showing the typical seasonal pattern of dengue. These periods accounted for 58 % of the ineffective reports, revealing vulnerabilities in the documentation systems during outbreaks.
Table 4: Distribution of dengue cases by incidence by epidemiological period
Epidemiological period | Cases (n) | % | Average monthly rate | Associated factors *
December 2023 - January 2024 | 12 | 24 % | 6 cases/month | Seasonal rainfall; vector increase
May-June 2024 | 10 | 20 % | 5 cases/month | High temperatures; ambient humidity
Rest of the period | 28 | 56 % | 1,75 cases/month | Basal behavior
Source: Analysis of discharge notes with ChatGPT
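The shares in Table 4 follow directly from the case counts. As a minimal worked example, the sketch below recomputes the percentage of the 50 sampled cases in each period, the combined share of the two peaks, and the average monthly rate for the two peak periods. The two-month length of each peak is read from the period labels; the variable names are illustrative.

```python
# Case counts per period, as listed in Table 4.
period_cases = {
    "December 2023 - January 2024": 12,
    "May-June 2024": 10,
    "Rest of the period": 28,
}
# Length in months of the two peak periods, read from their labels.
peak_months = {"December 2023 - January 2024": 2, "May-June 2024": 2}

total_cases = sum(period_cases.values())  # 50
shares = {p: 100 * n / total_cases for p, n in period_cases.items()}
peak_share = sum(shares[p] for p in peak_months)
monthly_rates = {p: period_cases[p] / m for p, m in peak_months.items()}

for period, share in shares.items():
    print(f"{period}: {share:.0f} %")
print(f"Both peaks combined: {peak_share:.0f} %")  # 44 %
print(monthly_rates)  # 6.0 and 5.0 cases per month
```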
DISCUSSION
The study yielded results that met the objective and also revealed other relevant patterns in the performance of discharge sheets, which points to the potential of this technology as an advance for the field of medicine.
The finding of 25,3 % effective notes is consistent with the study by Alowais et al. (4), which reported rates of 22-28 % of adequate documentation in secondary care hospitals. The high proportion of ineffective notes (74,7 %) reflects a systemic problem already identified by the WHO in its 2022 report on the quality of medical records (3), which indicates that <30 % of institutions meet basic documentation standards. Of particular concern is the omission of progress data, a critical factor for continuity of care according to Singh et al. (14)
The distribution of 46,3 % permissible errors vs. 53,7 % impermissible errors corroborates the findings of Yang et al. (13) in the analysis of medical records using AI. It is noteworthy that impermissible errors exceed 50 %, a percentage consistently reported in clinical audit studies in Latin America, such as that of Bhagat et al. (6). The results of this study add to the evidence that clinical inconsistencies (21,42 %) represent an underestimated medicolegal risk, as noted by Macintyre et al. (9) in contexts with a high healthcare burden.
The lack of relevant information is consistent with the meta-analysis by Das et al. (11), which identifies this as the main document defect in 17 developing countries. The particular incidence of legibility problems exceeds the global average (18-20 %), possibly associated with the persistent use of handwritten formats, a situation that Ávila-Tomás et al. (10) identify as a key modifiable factor. The inconsistencies show similarity with data from Al Kuwaiti et al. (5) in hospitals with high staff turnover.
The epidemiological bimodality replicates the patterns described by PAHO in its 2023 report on dengue. The concentration of 44 % of cases in these periods partially explains the observed deterioration in documentation, a phenomenon quantified by Long et al. (17) as the "saturation effect" in emergencies. Our data support Kim et al.'s (18) proposal for the implementation of differentiated seasonal protocols.
The results support Rivera Valdivia's (21) warnings about the limitations of general LLMs in clinical settings, showing that, although useful for identifying error patterns (as demonstrated in the present study), they require human supervision for clinical interpretation. This reinforces Zeas et al.'s (20) position on the use of AI as an assistant, not a replacement, in medical audits. (15, 21, 16)
The study shares limitations recognized by Lanzagorta-Ortega et al. (8) in retrospective analyses with AI: dependence on the quality of the input data and the need for validation against clinical reference standards. Future research should incorporate, as suggested by Martins et al. (15), multicenter evaluation with more diverse samples.
Among the considerations of scientific, social, and economic importance, there is still debate about whether the recurrent use of this technology will have an impact on changing the concepts of "health" and "research" in the future, and whether its applications and ethical considerations in the medical field should be mediated by a strict protocol for the use of personal information in these technologies. The ability to leverage resources allocated to academic research, as well as the preparation of trainees for the optimal use of this great technology, will be essential to overcome these challenges. (19, 22, 23)
CONCLUSIONS
The use of artificial intelligence, particularly ChatGPT, proved to be a valuable tool for reviewing and analyzing medical discharge notes in dengue cases. Throughout the study, the model identified and detected errors in discharge notes without difficulty. This supports its effectiveness in the field of medical research, where AI is transforming clinical auditing. However, its true value emerges when it enhances, rather than replaces, expert medical judgment.
BIBLIOGRAPHIC REFERENCES
1- Vincent Quezada. ChatGPT y el futuro de la inteligencia artificial [Internet]. Oficinas de TechTarget Boston [cited 17/08/2024]. Available in: ChatGPT y el futuro de la inteligencia artificial | Computer Weekly
2- Statista Research Departament. Inteligencia artificial (IA) en América Latina y el Caribe - Datos estadísticos [Internet]. Departamento de Statista Estados Unidos [cited 17/08/2024] Available in: Inteligencia artificial (IA) en América Latina y el Caribe - Datos estadísticos | Statista
3- UNESCO. UNESCO presenta Reporte de Evaluación del Estadio de Preparación de Inteligencia Artificial de México [Internet]. Sede de la UNESCO Francia. [cited 17/08/2024] Available in: UNESCO presenta Reporte de Evaluación del Estadio de Preparación de Inteligencia Artificial de México | UNESCO
4- Alowais SA., Alghamdi SS., Alsuhebany N., Alqahtani T., Alshaya AI., Almohareb SN., Aldairem A., Alrashed M. et al. Revolucionando la salud: el papel de la inteligencia artificial en la práctica clínica. BMC educación médica [Internet] 2023 [cited 17/08/2024]; 23(1)689. Available in: https://pubmed.ncbi.nlm.nih.gov/37740191/
5- Al Kuwaiti A, Nazer K, Al-Reedy A, Al-Shehri S, Al-Muhanna A, Subbarayalu AV, Al Muhanna D, Al-Muhanna FA. Una revisión del papel de la inteligencia artificial en la atención médica. Rev Med Personalizada [Internet]. 2023 [cited 17/08/2024]; 13(6):951. Available in: https://doi.org/10.3390/jpm13060951
6- Bhagat M, Wankhede M, Kopawar M, Sananse P. La inteligencia artificial en la sanidad: Una revisión. Int J Sci Res Sci Eng Technol [Internet]. 2024 [cited 17/08/2024]; 11:133-138. Available in: https://doi.org/10.32628/IJSRSET24114107
7- Ahmad M, Abdallah S, Abbasi S, Abdallah A. Perspectivas de los estudiantes sobre la integración de la inteligencia artificial en los servicios sanitarios. Digit Health [Internet]. 2023 [cited 18/08/2024]; 9:205520762311740. Available in: https://doi.org/10.1177/20552076231174095
8- Lanzagorta-Ortega D, Carrillo-Pérez D, Carrillo-Esper R. Inteligencia artificial en medicina: presente y futuro. Gac Med Mex [Internet]. 2022 [cited 18/08/2024]; 158. Available in: https://doi.org/10.24875/GMM.M22000688
9- Macintyre MR, Cockerill RG, Mirza OF, Appel JM. Consideraciones éticas para el uso de la inteligencia artificial en la evaluación de la capacidad de decisión médica. Psychiatry Res [Internet]. 2023 [cited 18/08/2024]; 328:115466. Available in: https://doi.org/10.1016/j.psychres.2023.115466
10- Ávila-Tomás JF, Mayer-Pujadas MA, Quesada-Varela VJ. La inteligencia artificial y sus aplicaciones en medicina II: importancia actual y aplicaciones prácticas. Aten Primaria [Internet]. 2020 [cited 27/08/2024]; 53(1):81-88. Available in: https://doi.org/10.1016/j.aprim.2020.04.014
11- Das S, Nayak SP, Sahoo B, et al. Aprendizaje automático en el análisis de la atención medica: Una revisión del estado del arte. Arch Computat Methods Eng [Internet]. 2024 [cited 27/08/2024]; 31:3923–3962. Available in: https://doi.org/10.1007/s11831-024-10098-3
12- Sultana S, Hussain S, Hashmani M, Jafarian A, Zubair M. Una fusión de conjuntos híbridos de aprendizaje profundo para la clasificación de radiografías de tórax. Mundo Redes Neuronales [Internet]. 2021 [cited 27/08/2024]; 31(3):191-209. Available in: https://doi.org/10.14311/nnw.2021.31.010
13- Yang X, Chen A, PourNejatian N, Shin H, Smith K, Parisien C, Compas C, Martin C, Costa A, Flores M, Zhang Y, Magoc T, Harle C, Lipori G, Mitchell D, Hogan W, Shenkman E, Bian J, Wu Y. Un gran modelo de lenguaje para las historias clínicas electrónicas. npj Digit Med [Internet]. 2022 [cited 27/08/2024]; 5. Available in: https://doi.org/10.1038/s41746-022-00742-2
14- Singh B, Olds T, Brinsley J, et al. Revisión sistemática y meta-análisis de la efectividad de los chatbots en los comportamientos de estilo de vida. npj Digit Med [Internet]. 2023 [cited 27/08/2024]; 6:118. Available in: https://doi.org/10.1038/s41746-023-00856-1
15- Martins TGD, Schor P, Mendes LGA, Fowler S, Silva R. Uso de la inteligencia artificial en oftalmología: una revisión narrativa. Rev Med São Paulo [Internet]. 2022 [cited 07/09/2024]; 140(6):837–845. Available in: https://doi.org/10.1590/1516-3180.2021.0713.R1.22022022
16- Rguibi Z, Hajami A, Zitouni D. Aprendizaje profundo en imágenes médicas: una reseña. ResearchGate. [Internet]. 2022 [cited 07/09/2024]. Available in: https://doi.org/10.1201/9781003269793-15
17- Long P, Lu L, Chen Q, Chen Y, Li C, Luo X. Selección inteligente del modo de la cadena de suministro de atención médica: una investigación aplicada basada en inteligencia artificial. Front Public Health [Internet]. 2023 [cited 07/09/2024]; 11:1310016. Available in: https://doi.org/10.3389/fpubh.2023.1310016
18- Kim M, Sohn H, Choi S, Kim S. Requisitos para una inteligencia artificial fiable y su aplicación en la asistencia sanitaria. Healthc Inform Res [Internet]. 2023 [cited 07/09/2024]; 29(4):315-322. Available in: https://doi.org/10.4258/hir.2023.29.4.315
19- Krinkin K, Shichkina Y, Ignatyev A. La inteligencia híbrida coevolutiva es un concepto clave para la intelectualización mundial. Kybernetes [Internet]. 2022 [cited 18/09/2024]; 52(9):2907-2923. Available in: https://doi.org/10.1108/k-03-2022-0472
20- Zeas M, Paredes K, Gavilanes T. Uso de inteligencia artificial como soporte para el aprendizaje en las ciencias de la salud. Rev Imaginario Social [Internet]. 2024 [cited 18/09/2024]; 7. Available in: https://doi.org/10.59155/is.v7i2.180
21- Rivera Valdivia KC. Aplicación de la inteligencia artificial en la nutrición personalizada. Rev Investig [Internet]. 2022 [cited 18/09/2024]; 11(4):265-277. Available in: https://doi.org/10.26788/ri.v11i4.3990
22- Camino D, Clavijo B. La inteligencia artificial en la investigación y redacción de textos académicos. Espíritu Emprendedor TES [Internet]. 2024 [cited 18/09/2024]; 8:19-34. Available in: https://doi.org/10.33970/eetes.v8.n1.2024.369
23- Expósito Gallardo MC, Ávila Ávila R. Aplicaciones de la inteligencia artificial en la Medicina: perspectivas y problemas. ACIMED [Internet]. 2008 [cited 18/09/2024]; 17(5). Available in: http://scielo.sld.cu/scielo.php?script=sci_arttext&pid=S1024-94352008000500005&lng=es
CONFLICTS OF INTEREST
The authors declare no conflicts of interest in the design of this research.
FUNDING
The authors declare no funding for this research.
AUTHORSHIP CONTRIBUTIONS:
KCVR: Conceptualization, Research, Methodology, Project administration, Visualization, Validation, Supervision, Writing - review and editing.
HDMM: Conceptualization, Research, Methodology, Project administration, Visualization, Validation, Supervision, Writing - review and editing.
YER: Conceptualization, Research, Methodology, Project administration, Visualization, Validation, Supervision, Writing - review and editing.