Abstract
This article discusses the main challenges and considerations involved in addressing stereotypes within Natural Language Processing (NLP) and proposes a set of guidelines and recommendations for their treatment in research and resource development. Growing interest in fairness, bias mitigation, and inclusivity has led to an increasing number of studies and datasets dealing with stereotypes, yet their conceptualization and operationalization remain highly heterogeneous across works. The aim of this article is therefore twofold: (1) to provide a concise yet comprehensive overview of existing annotation schemes, highlighting their key features and offering a comparative analysis, and (2) to propose a set of tentative guidelines and recommendations to foster clarity when working with stereotypes in NLP. Furthermore, as a case study, we conduct an annotation exercise on a subset of texts from the QUEEREOTYPES dataset, containing stereotypes targeting LGBTQIA+ people, using all labels proposed in prior work to assess their clarity, overlap, and practical usefulness.