Geospatial Vaccine Misinformation Risk on Social Media: Online Insights from an English/Spanish Natural Language Processing (NLP) Analysis of Vaccine-related Tweets

Indiana University School of Public Health (Valdez); Texas A&M International University (Soto-Vásquez); Indiana University (Montenegro)
"This study's findings highlight that misinformation does not affect all populations equally."
Misinformation is known to affect norms, attitudes, and intentions to engage in healthy behaviours. Although misinformation studies are prevalent, many are limited by their ethnocentrism, defined here as an unbalanced focus on a specific culture, language, or idea. This may mean there are unexamined cultural assumptions within anti-misinformation campaigns that ultimately lead them to miss their stated goals. Thus, this study takes a comparative computational and linguistic approach by evaluating a substantive body of English/Spanish tweets to assess the prevalence of misinformation and its potential impact on informing beliefs about vaccines, regardless of the underlying disease (e.g., polio, influenza, monkeypox, COVID-19).
Data for this study were collected from the microblogging social networking website X, formerly known as Twitter, between August and October 2022. The study applies Natural Language Processing (NLP) to analyse a corpus of tweets about vaccines, broadly defined, for misinformation indicators. The researchers analysed English (N = 247,140) and Spanish (N = 104,445) tweets using Latent Dirichlet Allocation (LDA) topic models, with coherence scores calculated to assess model fit and a Mallet adjustment applied for topic optimisation. They used informal coding to name the computer-identified topics and to compare the scope and scale of misinformation between languages.
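To illustrate what a coherence score measures, here is a minimal sketch in plain Python. It rests on assumptions: the summary does not say which coherence measure the researchers used, so the UMass formulation is swapped in here as one common option, and the toy documents and word lists are invented for illustration rather than drawn from the study's data. The idea is that a topic's top words score higher when they tend to appear together in the same documents.

```python
import math

def umass_coherence(topic_words, documents):
    """UMass coherence for one topic.

    topic_words: top words of a topic, most probable first.
    documents:   list of tokenised documents (lists of words).
    Sums log((D(w_i, w_j) + 1) / D(w_j)) over pairs where w_j ranks above w_i,
    with D counting the documents that contain the given word(s).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    for i in range(1, len(topic_words)):
        for j in range(i):
            w_i, w_j = topic_words[i], topic_words[j]
            d_j = doc_freq(w_j)
            if d_j == 0:
                continue  # skip words that never occur in the corpus
            score += math.log((doc_freq(w_i, w_j) + 1) / d_j)
    return score

# Hypothetical toy corpus (NOT the study's tweets).
toy_docs = [
    ["vaccine", "safe", "effective", "booster"],
    ["vaccine", "booster", "clinic", "appointment"],
    ["vaccine", "safe", "booster"],
    ["mandate", "government", "protest"],
    ["government", "mandate", "vaccine"],
]

coherent_topic = ["vaccine", "booster", "safe"]   # words that co-occur often
mixed_topic = ["vaccine", "protest", "clinic"]    # words that rarely co-occur

coherent_score = umass_coherence(coherent_topic, toy_docs)
mixed_score = umass_coherence(mixed_topic, toy_docs)
```

Scores closer to zero indicate more coherent topics under this measure. In practice, a researcher might fit models with different numbers of topics and keep the one whose topics have the best average coherence.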
The NLP analysis of English and Spanish tweets related to vaccines identified various indications of misinformation and polarising content. However, further reviews of these tweets revealed that misinformation differed greatly in scope and themes between languages.
- What themes emerge from a composite collection of English-language tweets related to vaccines and vaccine access? The English tweet analysis revealed 12 topics embedded within the corpus. The researchers determined Topic 1 (Vaccine Promotion), Topic 2 (Vaccine Safety), and Topic 9 (Vaccine Choice) to contain responsible health messaging about vaccines. However, the remaining topics each contained indications of false or misleading content, which included capitalising on sensationalised reports (Topic 8, Myocarditis Outcome Research), promoting strong anti-vaccination stances (Topic 10, Vaccine Scrutiny; Topic 11, Anti-Vaccination), and outlining conspiracy theories related to mRNA vaccines across health concerns (Topic 12, Vaccine Conspiracies).
- What themes emerge from a composite collection of Spanish-language tweets related to vaccines and vaccine access? The Spanish tweet analysis revealed 14 topics embedded within the corpus. The researchers determined Topic 1 (Countering Vaccine Misinformation) and Topic 8 (Vaccine Promotion) to contain responsible health messaging promoting vaccine updates. They identified potential pockets of misinformation capitalising on access disparities (Topic 9, Vaccine Access Disparities) and emphasising adverse outcomes and reactions over vaccine benefits (Topic 5, Adverse Reactions; Topic 12, Negative Vaccine Side Effects; Topic 3, Vaccines Are a Joke). Two topics, Topic 1 (Countering Vaccine Misinformation) and Topic 2 (Vaccine Access Disparity), are conceptually dissimilar from the large cluster of remaining topics. However, these topics may generally refer to misinformation, suggesting high degrees of overlap in the types of misinformation embedded in the Spanish corpus.
- What do thematic differences between English and Spanish tweets show about how misinformation is contextualised in diverse cultures and languages? Both corpora contained overlapping misinformation, including expressions of uncertainty about the research guiding policy recommendations and statements in support of antivax movements. However, the Spanish data were positioned in a global context, where misinformation was directed at government equity and disparate vaccine distribution. In contrast, misinformation in the English corpus primarily capitalised on talking points specific to the United States (US), including accusations that vaccine mandates were a US partisan (political) ploy and fearmongering technique. For example, some Spanish tweets criticised the so-called pink tide governments in Latin America (e.g., Boric in Chile, Petro in Colombia, and Fernández in Argentina), whether for not buying enough vaccines, which relates to structural barriers (i.e., access to vaccines), or for forcing people to get vaccinated. These criticisms generally parallel accusations related to vaccine mandates in the US but manifest differently in online discourse.
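The Spanish-language results describe two topics as conceptually dissimilar from a large cluster of the rest. The summary does not say how that dissimilarity was measured. As a hedged illustration, one common way to compare topics is the Jensen-Shannon divergence between their word distributions. The vocabulary and probabilities below are hypothetical, chosen only to show the calculation.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    Returns 0.0 for identical distributions and 1.0 for distributions
    that share no probability mass.
    """
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Kullback-Leibler divergence, skipping zero-probability terms.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical topic-word probabilities over a five-word vocabulary:
# ["vaccine", "safe", "access", "government", "joke"]
topic_a = [0.40, 0.40, 0.10, 0.05, 0.05]   # safety-oriented topic
topic_b = [0.35, 0.45, 0.10, 0.05, 0.05]   # near-duplicate of topic_a
topic_c = [0.05, 0.05, 0.40, 0.45, 0.05]   # access/government-oriented topic

near = js_divergence(topic_a, topic_b)  # small: the topics overlap heavily
far = js_divergence(topic_a, topic_c)   # larger: the topics are dissimilar
```

Topics whose pairwise divergences are small would plot close together on an intertopic distance map, while a topic with large divergences from all others would sit apart from the cluster.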
Based on the findings, the researchers offer public health and medical professionals several culturally adaptive strategies to combat misinformation in addition to centering trust building:
- Turn from broad marketing campaigns and instead work closely with community and local healthcare professionals. This research shows that Spanish-language tweets have a higher element of mistrust in their topic modeling, which reflects legacies of governmental corruption and mismanagement throughout the Americas. Future research should identify some of the best practices that on-the-ground practitioners are using to build trust and connect them to national and state public health campaigns.
- Apply rigorous frameworks to translate anti-misinformation campaigns from English into second and third languages. Instead of direct translation, co-construction of campaigns with culturally sensitive language speakers may yield more positive results.
- Ask, during screening, where patients are going for health information, just as a healthcare provider might ask about recent surgeries or medication changes. This question may allow providers to follow up by suggesting better information practices, across all languages.
Thus, these findings support the conclusion that misinformation is a global issue. However, misinformation may vary depending on culture and language. As such, tailored strategies to combat misinformation in digital planes are strongly encouraged.
Social Science & Medicine 339 (2023) 116365. https://doi.org/10.1016/j.socscimed.2023.116365. Image credit: Freepik