[Supplementary data] Harvesting social media and using large language models to analyse online discourse: Developing methodology to explore the challenges faced by Sub-Saharan African women in livestock farming
Dataset   Open access


Georgina Tarrant, Taranpreet Singh Rai, Luke Boyden, Kennedy Mwacalimba, Raymond Tiernan, Peter Kimeli, Travis Lee Street, Alasdair James Charles Cook and Kevin Wells
Zenodo
24/08/2025

Abstract

Keywords: social listening, animal health, gender, livestock, Sub-Saharan Africa
docx
WiAf-SupplementaryMaterials-19Jun25 (275.47 kB)
Dataset

Supplementary materials to accompany our VeriXiv preprint "Harvesting social media and using large language models to analyse online discourse: Developing methodology to explore the challenges faced by Sub-Saharan African women in livestock farming".

Social media listening (SML) data were collected from X (formerly Twitter), blogs, forums, Reddit and Facebook pages using Pulsar Platform™. SML posts were collected if they met the bespoke keyword search criteria and originated in or mentioned one or more of ten Sub-Saharan African countries: Ethiopia, Ghana, Côte d'Ivoire (Ivory Coast), Kenya, Nigeria, Senegal, Tanzania, Uganda, Zambia and Zimbabwe. The searches were grouped into four themes: (1) Women in livestock and farming; (2) Challenges faced by women in livestock and farming; (3) Perceptions of disease health and control measures; (4) Women’s training, education and interventions in livestock. The data were scraped using Pulsar Platform.

Section 1: Supplementary data
Overview of topic modelling outputs by theme for all four themes. Each theme has sub-themes with titles and text descriptions. No direct quotes are used as these are provided in the preprint.

Section 2: Supplementary figures and tables
Further information on in-country demographics (gender and rural/urban populations) and internet penetration compared to population. Summary tables with the number of collected social media posts by country and totals for posts that mention one of the countries of interest. Author gender distribution for collected posts.

References
Reference list for data sources used.

CC BY-SA V4.0 Open Access
docx
WiAf-SupplementaryMaterials-27Aug25 (275.46 kB)
Dataset. Description identical to that of WiAf-SupplementaryMaterials-19Jun25 above.

CC BY-SA V4.0 Open Access
url
https://doi.org/10.5281/zenodo.16813625
Dataset. Description identical to that of WiAf-SupplementaryMaterials-19Jun25 above.

CC BY-SA V4.0 Restricted. Access may be granted on request.
