Objective To evaluate the performance of large language models in answering questions about childhood asthma, assess the quality of the children's health information they provide, and identify their limitations to guide model improvement.
Methods Sixty common questions related to childhood asthma were formulated and posed to two large language models publicly available in China, Wenxin Yiyan and Zhipu Qingyan. Three pediatric asthma specialists assessed the quality of the models' responses under blinded conditions.
Results Wenxin Yiyan scored higher in terms of accuracy, understanding, reliability, and logicality; Zhipu Qingyan scored higher in terms of safety. Across the five dimensions, the large language models scored higher on understanding, reliability, and logicality, but relatively lower on accuracy and safety.
Conclusion Applying large language models to asthma education can provide useful reference information for children with asthma and their parents. However, current large language models still have limitations in accuracy and safety and require further improvement and optimization.