Neck dissection in head and neck surgery: An assessment of ChatGPT performance
Artificial intelligence (AI) models such as Chat Generative Pre-trained Transformer (ChatGPT) are increasingly used to inform treatment-related decisions. Among otolaryngology subspecialties, there is a paucity of literature examining the role of ChatGPT in head and neck surgical oncology, and its utility in addressing questions about surgically relevant anatomy and lymphadenectomy procedures remains poorly understood. The primary objective of this pilot study was to determine the reliability of ChatGPT in answering neck dissection-related inquiries compared with expert head and neck surgical oncologists. Five neck dissection-related questions were presented to ChatGPT v3.5. Three fellowship-trained head and neck surgeons compared the AI-generated responses with those of an expert head and neck surgeon. Raters, blinded to the authorship of each response, evaluated the responses on a Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree). The median level of agreement among raters for the ChatGPT responses was 1.0 (interquartile range [IQR]: 1.0, 2.5; minimum = 1 and maximum = 4), whereas the median level of agreement among raters for the surgeon responses was 5.0 (IQR: 5.0, 5.0; minimum = 5 and maximum = 5). The Mann–Whitney U test comparing the level of agreement between ChatGPT and surgeon responses yielded p = 0.007. Raters showed minimal consistency when evaluating ChatGPT responses (intraclass correlation coefficient = 0.05; 95% confidence interval: 0.0–0.88), in contrast to the perfect agreement observed for the surgeon responses. In summary, ChatGPT is a promising tool for acquiring surgical knowledge. For neck dissection-related inquiries, however, ChatGPT-generated responses were less reliable than surgeon expertise. Further refinement of AI models is needed to strengthen the utility of ChatGPT in head and neck oncologic surgery.
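To make the group comparison described above concrete, the following is a minimal sketch of a two-sided Mann–Whitney U test on Likert ratings, written with the Python standard library only. The ratings in `chatgpt_ratings` and `surgeon_ratings` are hypothetical values invented for illustration and are not the study data. The sketch also assumes one rating per question in each group and an exact permutation p-value; the source does not state how ratings were aggregated or which variant of the test was used.

```python
from itertools import combinations
from statistics import median


def midranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def u_statistic(ranks, group_a_indices, n_b):
    """Smaller of the two U values for a given assignment of ranks to group A."""
    n_a = len(group_a_indices)
    rank_sum_a = sum(ranks[i] for i in group_a_indices)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return min(u_a, n_a * n_b - u_a)


def mann_whitney_exact(a, b):
    """Two-sided exact p-value obtained by enumerating all group relabelings.

    Only practical for small samples, since every way of choosing len(a)
    items from the pooled data is visited.
    """
    pooled = list(a) + list(b)
    ranks = midranks(pooled)
    n_a, n_b = len(a), len(b)
    observed = u_statistic(ranks, range(n_a), n_b)
    as_extreme = 0
    total = 0
    for indices in combinations(range(len(pooled)), n_a):
        total += 1
        if u_statistic(ranks, indices, n_b) <= observed:
            as_extreme += 1
    return observed, as_extreme / total


# Hypothetical Likert ratings (1 = strongly disagree, 5 = strongly agree).
# These are illustrative values only, NOT the ratings collected in the study.
chatgpt_ratings = [1, 1, 2, 3, 4]
surgeon_ratings = [5, 5, 5, 5, 5]

u_value, p_value = mann_whitney_exact(chatgpt_ratings, surgeon_ratings)
print("median (ChatGPT, hypothetical):", median(chatgpt_ratings))
print("median (surgeon, hypothetical):", median(surgeon_ratings))
print("U =", u_value, "two-sided exact p =", round(p_value, 4))
```

A rank-based test suits this setting because Likert ratings are ordinal, so only their ordering is used rather than the numeric distance between categories. For larger samples, an established statistics package would be a more practical choice than full enumeration.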

