JAMA Network Open
-
Randomized Controlled Trial
Large Language Model Influence on Diagnostic Reasoning: A Randomized Clinical Trial.
Why care about LLMs?
Large language models (LLMs) have revolutionised natural language processing and have inevitably found their way into healthcare. Their use in decision support and diagnosis has, however, shown mixed results, even as models and integrations rapidly improve.
Despite their shortcomings, LLMs cannot be ignored by doctors: growing cost and demand pressures in healthcare will continue to push LLM-based tools into clinical practice, even before robust clinical validation. Diagnostic errors are also common and costly, in both economic and patient-safety terms, which increases the allure of medical LLMs.
What did this study do?
This single-blinded randomised controlled trial included 50 physicians (26 attendings, 24 residents) from family medicine, internal medicine, and emergency medicine. Participants were randomised to use either GPT-4 (via ChatGPT Plus) plus conventional resources or conventional resources only, and were asked to complete up to six clinical diagnostic cases within 60 minutes.
Diagnostic performance was measured using validated standardised scoring of three elements: accuracy of generated differential diagnoses, ability to identify supporting and contradicting clinical findings, and the appropriateness of proposed next diagnostic steps.
(Interesting aside: the six selected vignettes came from a 1994 pool of 105 never-published real patient cases originally used in a landmark study on diagnostic systems. Because these cases have been kept private to preserve their future testing validity, they should be outside the LLM's training data.)
And they found?
The LLM alone performed significantly better than either physician group, scoring 16 percentage points higher than the control group (95% CI, 2-30 percentage points). Yet physicians with access to the LLM showed effectively no improvement over the conventional-resources-only group (76% vs 74% median diagnostic score, P = .60). Time spent per case did not differ significantly between groups.
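As a rough illustration of how to read the reported interval, the sketch below backs out an approximate standard error and two-sided p-value from the 16-point difference and its 95% CI of 2 to 30 percentage points. It assumes a symmetric, normal-approximation interval, which is an assumption made here for illustration and not necessarily the authors' statistical model; the function names are hypothetical, and the rounded CI endpoints make the result approximate.

```python
import math

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = 1.959964


def approx_p_from_ci(estimate, ci_low, ci_high, z_crit=Z_95):
    """Approximate SE, z-score, and two-sided p-value from a symmetric CI.

    Assumes the interval is estimate +/- z_crit * SE (normal approximation).
    This is an illustrative assumption, not the trial's actual analysis.
    """
    se = (ci_high - ci_low) / (2 * z_crit)   # half-width divided by z
    z = estimate / se                        # test statistic against zero
    p = math.erfc(abs(z) / math.sqrt(2))     # equals 2 * (1 - Phi(|z|))
    return se, z, p


def ci_excludes_zero(ci_low, ci_high):
    """A CI that excludes zero corresponds to significance at that level."""
    return ci_low > 0 or ci_high < 0


# LLM alone vs. control group: 16 points higher (95% CI, 2-30 points).
se, z, p = approx_p_from_ci(16, 2, 30)
print(f"SE ~ {se:.2f} points, z ~ {z:.2f}, p ~ {p:.3f}")
print("CI excludes zero:", ci_excludes_zero(2, 30))
```

Under these assumptions the interval excludes zero and the implied p-value is roughly 0.025, consistent with describing the LLM-alone difference as statistically significant, while the wide interval (2 to 30 points) shows how imprecisely that difference is estimated.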
"Access alone to LLMs will not improve overall physician diagnostic reasoning in practice. These findings are particularly relevant now that many health systems offer [HIPAA]–compliant chatbots ... often with no to minimal training..."
Bottom line
This study highlights the "implementation gap" between AI capability and clinical utility: even if AI tools are reliably and consistently accurate (a big 'if'), their mere availability will not automatically translate into improved clinical reasoning. Successful integration will require deliberate consideration of how to optimise human-AI collaboration in medical practice.
-
Randomized Controlled Trial
Comparison of 4-Factor Prothrombin Complex Concentrate With Frozen Plasma for Management of Hemorrhage During and After Cardiac Surgery: A Randomized Pilot Trial.
Approximately 15% of patients undergoing cardiac surgery receive frozen plasma (FP) for bleeding. Four-factor prothrombin complex concentrates (PCCs) have logistical and safety advantages over FP and may be a suitable alternative. ⋯ This randomized clinical trial found that the study protocols were feasible. Adequately powered randomized clinical trials are warranted to determine whether PCC is a suitable substitute for FP for mitigation of bleeding in cardiac surgery.
-
Randomized Controlled Trial
Effect of Early Treatment With Hydroxychloroquine or Lopinavir and Ritonavir on Risk of Hospitalization Among Patients With COVID-19: The TOGETHER Randomized Clinical Trial.
Data on the efficacy of hydroxychloroquine or lopinavir-ritonavir for the treatment of high-risk outpatients with COVID-19 in developing countries are needed. ⋯ In this randomized clinical trial, neither hydroxychloroquine nor lopinavir-ritonavir showed any significant benefit for decreasing COVID-19-associated hospitalization or other secondary clinical outcomes. This trial suggests that expedient clinical trials can be implemented in low-income settings even during the COVID-19 pandemic.
-
Randomized Controlled Trial
Effectiveness of Standard Local Anesthetic Bupivacaine and Liposomal Bupivacaine for Postoperative Pain Control in Patients Undergoing Truncal Incisions: A Randomized Clinical Trial.
Liposomal bupivacaine for pain relief is purported to last 3 days compared with 8 hours with standard bupivacaine. However, its effectiveness is unknown in truncal incisions for cardiothoracic or vascular operations. ⋯ In this randomized clinical trial involving truncal incisions for cardiovascular procedures, liposomal bupivacaine did not provide improved pain control and did not reduce adjunctive opioid use compared with conventional bupivacaine formulation over 3 postoperative days.
-
Randomized Controlled Trial
Effects of Tai Chi or Exercise on Sleep in Older Adults With Insomnia: A Randomized Clinical Trial.
Previous studies that have shown tai chi to improve sleep were mainly based on subjective assessments, which might have produced results confounded by self-reporting bias. ⋯ Conventional exercise and tai chi improved sleep, and the beneficial effects were sustained for 24 months, although the absolute improvements in sleep parameters were modest. Improvements in objective sleep parameters were not different between the tai chi and exercise groups, suggesting that tai chi can be an alternative approach for managing insomnia.