Abstract
OBJECTIVES: Recent innovations in Artificial Intelligence-based Large Language Models (LLMs) have demonstrated emergent abilities that allow advanced zero-shot performance on downstream tasks, where "zero-shot" refers to a model’s ability to perform tasks on which it has not been explicitly trained. This study used Social Media Listening (SML) to collect freely available, public-domain posts from social media sites in order to better understand pet owner attitudes toward, and experiences with, canine pruritus. Web scrapes often pick up posts that are irrelevant to the study’s focus on canine pruritus, which leads to inaccurate data analyses. Determining relevance through manual mark-up is typically laborious and frequently intractable because of the sheer volume of collected data. This motivates the use of a zero-shot classifier to filter for relevant posts automatically.
METHODS: SML posts from sources including, but not limited to, Reddit and X Corp. were collected using keywords chosen by veterinary dermatology experts for insights into canine pruritus. After data cleaning and duplicate removal, a domain expert manually marked up the collected posts as relevant (n=1560) or irrelevant (n=1477), for a total of n=3037. A descriptive text prompt defining the downstream task was provided to a GPT-3.5 LLM together with each collected post. The model classified each post in turn as either “relevant” or “irrelevant” according to the provided description. The GPT-3.5 hyperparameters were adjusted by setting the temperature to zero and the maximum number of output tokens to five, to limit the generated response and ease post-processing.
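The classification loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the prompt wording in `TASK_DESCRIPTION`, the helper names, and the `complete` callable (a stand-in for whatever client sends a prompt to GPT-3.5) are all hypothetical. Only the temperature of zero, the five-token output limit, and the two labels come from the text.

```python
from typing import Callable, Iterable, List

# Hypothetical prompt wording; the study's actual prompt is not reproduced here.
TASK_DESCRIPTION = (
    "Decide whether the social media post below is about pruritus in dogs. "
    "Answer with exactly one word: relevant or irrelevant."
)

# Characters trimmed from the short model reply before matching a label.
TRIM_CHARS = " \"'“”.\n"


def build_prompt(post: str) -> str:
    """Combine the task description with a single collected post."""
    return f"{TASK_DESCRIPTION}\n\nPost: {post}\nAnswer:"


def parse_label(raw: str) -> str:
    """Map a short model reply onto one of the two labels."""
    text = raw.lower().strip(TRIM_CHARS)
    # Check "irrelevant" first, because it contains the substring "relevant".
    if text.startswith("irrelevant"):
        return "irrelevant"
    if text.startswith("relevant"):
        return "relevant"
    return "unparsed"  # anything else is flagged for manual inspection


def classify_posts(
    posts: Iterable[str],
    complete: Callable[[str, float, int], str],
    temperature: float = 0.0,  # zero temperature, as stated in the text
    max_tokens: int = 5,       # five output tokens, as stated in the text
) -> List[str]:
    """Classify posts one at a time with a caller-supplied completion function."""
    labels = []
    for post in posts:
        raw = complete(build_prompt(post), temperature, max_tokens)
        labels.append(parse_label(raw))
    return labels
```

Passing the model call in as a function keeps the sketch independent of any particular client library, and the short output limit is what makes a simple prefix check sufficient for post-processing.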
RESULTS: Our method achieved a sensitivity of 0.73, a precision of 0.87 and an F1-score of 0.80 (1831 posts were classified as relevant and 1206 as irrelevant).
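For reference, the three reported metrics follow the standard definitions based on true positives (TP), false positives (FP) and false negatives (FN), with "relevant" as the positive class. The sketch below uses made-up counts purely for illustration; they are not the study's confusion matrix, which is not given here, and the function name is hypothetical.

```python
from typing import Dict


def relevance_metrics(tp: int, fp: int, fn: int) -> Dict[str, float]:
    """Compute sensitivity (recall), precision and F1 from confusion counts."""
    sensitivity = tp / (tp + fn)  # share of truly relevant posts that were found
    precision = tp / (tp + fp)    # share of posts flagged relevant that truly are
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # harmonic mean
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1}


# Illustrative counts only (assumed, not taken from the study).
example = relevance_metrics(tp=8, fp=2, fn=2)
```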
CONCLUSIONS: Although not specifically trained to determine the relevance of social media posts, LLMs exhibit effective zero-shot capabilities when classifying SML posts by relevance, thereby dramatically reducing the need for manual mark-up.