Diagnostic Accuracy and Pitfalls of Publicly Available Artificial Intelligence Models for Nail Disorders

Journal of Drugs in Dermatology (JDD), featuring "Diagnostic Accuracy and Pitfalls of Publicly Available Artificial Intelligence Models for Nail Disorders"

AI in Dermatology: Vision Language Models Show Limited Accuracy for Nail Disorders in JDD Study

In case you missed it, check out this article from the JDD. A recent study tested three widely available vision language models (ChatGPT-3.5, ChatGPT-4o, and Google Gemini) on 110 clinical images representing 11 common nail conditions, with each diagnosis confirmed by a board-certified dermatologist. For every image, each model provided its top three differential diagnoses and a confidence score from 1 to 10 for its primary choice.

Results were modest. Top-one accuracy (the correct diagnosis listed first) clustered around 31 to 34 percent, and top-three accuracy (the correct diagnosis anywhere among the three guesses) around 45 to 51 percent, with Gemini at about 34 percent top-one and 51 percent top-three. The models performed better on onychomycosis, green nail syndrome, and onychocryptosis, yet onychomycosis was also a frequent default incorrect guess across misclassified cases. Periungual warts were commonly missed by all models.
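For readers who want to see how top-one and top-three accuracy are typically calculated, here is a minimal Python sketch. It is not the study's code: the function name `top_k_accuracy` and the four example cases are hypothetical and chosen only to illustrate the arithmetic, so the resulting numbers do not reflect the published results.

```python
from typing import Sequence


def top_k_accuracy(
    truths: Sequence[str],
    predictions: Sequence[Sequence[str]],
    k: int,
) -> float:
    """Return the fraction of cases whose confirmed diagnosis appears
    among the first k entries of the model's ranked differential."""
    if len(truths) != len(predictions):
        raise ValueError("truths and predictions must have the same length")
    if not truths:
        raise ValueError("at least one case is required")
    hits = sum(
        1 for truth, ranked in zip(truths, predictions) if truth in ranked[:k]
    )
    return hits / len(truths)


# Hypothetical toy data for illustration only (NOT taken from the study).
truths = [
    "onychomycosis",
    "periungual wart",
    "green nail syndrome",
    "onychocryptosis",
]
predictions = [
    ["onychomycosis", "nail psoriasis", "onycholysis"],              # correct at rank 1
    ["onychomycosis", "nail psoriasis", "periungual wart"],          # correct at rank 3
    ["green nail syndrome", "onychomycosis", "subungual hematoma"],  # correct at rank 1
    ["onychomycosis", "paronychia", "nail psoriasis"],               # missed entirely
]

top1 = top_k_accuracy(truths, predictions, k=1)
top3 = top_k_accuracy(truths, predictions, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

In this toy example, top-three accuracy exceeds top-one accuracy because one case is only recovered by the third guess, which is the same kind of gap the study reports between its top-one and top-three figures.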

For practicing dermatologists and other dermatology healthcare professionals, the findings suggest that current vision language models (VLMs) may have a role as adjunctive pattern-recognition tools but are not reliable substitutes for clinical judgment in high-stakes nail disease diagnosis.

Read the full JDD article to review the images, methods, and detailed model outputs, and consider how these results might inform clinical use and AI stewardship in your practice.

Blog write-up assisted by AI
