NCT07037940
Physician Response Evaluation With Contextual Insights vs. Standard Engines - Artificial Intelligence RAG vs LLM Clinical Decision Support
Comparing Clinical Key AI and GPT-4 for Diagnostic Reasoning and Management Decisions
- Status
- Completed
- Phase
- N/A
- Study type
- Interventional
- Enrollment
- 27 (actual)
- Sponsor
- Montefiore Medical Center · Academic / Other
- Sex
- All
- Age
- 25 Years
- Healthy volunteers
- Accepted
Summary
Clinical decision support tools powered by artificial intelligence are being rapidly integrated into medical practice. Two leading systems currently available to clinicians are OpenEvidence, which uses retrieval-augmented generation to access medical literature, and GPT-4, a large language model. While both tools show promise, their relative effectiveness in supporting clinical decision-making has not been directly compared. This study aims to evaluate how these tools influence diagnostic reasoning and management decisions among internal medicine physicians.
Detailed description
Internal medicine attendings and residents are invited to participate in a study investigating how physicians using a RAG-based LLM (OpenEvidence) perform compared to those using a standard general-purpose LLM (ChatGPT) on both diagnostic reasoning and complex management decisions. As AI tools increasingly enter clinical practice, evidence is needed about which approaches best support physician decision-making. This study will help determine whether specialized medical knowledge retrieval systems (OpenEvidence) provide advantages over general AI assistants (ChatGPT) when solving real clinical cases. Participants will complete one 90-minute Zoom session in which they solve clinical cases derived from real, de-identified patient encounters. Participants will be randomly assigned to use either OpenEvidence or ChatGPT, and all responses will be evaluated by blinded scorers using a validated rubric. Note that this exempted study will compare OpenEvidence (not Clinical Key AI) against ChatGPT, although the official study title suggests otherwise.
Conditions
Interventions
| Type | Name | Description |
|---|---|---|
| OTHER | OpenEvidence | Medical information platform which uses retrieval-augmented generation to access medical literature |
| OTHER | GPT-4 | A chatbot application that uses GPT-4, a large language model, to engage in conversational interactions with users |
Timeline
- Start date
- 2025-07-03
- Primary completion
- 2025-12-30
- Completion
- 2025-12-30
- First posted
- 2025-06-26
- Last updated
- 2026-04-09
Locations
2 sites across 1 country: United States
Source: ClinicalTrials.gov record NCT07037940. Inclusion in this directory is not an endorsement.