Using Machine Learning to Compare Provaccine and Antivaccine Discourse Among the Public on Social Media: Algorithm Development Study

Michigan State University (Argyris, Tan, Wiseley); University of Northern British Columbia (Monu, Aarts, Jiang)
"The integrated analyses in this study help identify...why antivaccine communities demonstrate higher engagement..., density, and echo-chamberness... and how antivaccine advocates successfully dissuade the public from immunization despite opposition from their provaccine counterparts."
Exposure to antivaccine content on social media has been associated with delays in and refusal of vaccination. To date, it has not been fully explained how the antivaccine movement continues to engage the public and persuade it to refuse immunisation, despite provaccine advocates' counteracting efforts on social media. This study therefore compares the discursive topics that pro- and antivaccine advocates choose in their attempts to influence the public to accept or reject immunisation along the "engagement-persuasion spectrum", which starts with engaging the audience with the content and concludes with persuading the audience to accept the claims included in the content. The researchers develop a machine learning (ML)-based automatic classifier of pro- and antivaccine posts and an unsupervised clustering procedure for extracting discursive topics, which they hope will aid future researchers in assessing the effectiveness of public health campaigns on social media.
A foundational concept for the study is Robert M. Entman's message frames, which are persuasive techniques in which a speaker tries to predispose the audience to a one-sided view of an issue while downplaying other perspectives. Indeed, antivaccine advocates disproportionately emphasise safety concerns while downplaying the preventive benefits of vaccines. When used consistently, Entman's four frames - those that (i) define a specific problem, (ii) diagnose a cause of that problem, (iii) make a moral judgment regarding that problem, and/or (iv) suggest remedies to that problem - can induce behavioural and attitudinal changes among audiences.
In addition, antivaccine advocates on social media have shown more notable engagement patterns (e.g., "likes"/comments) than their provaccine counterparts. This higher user engagement has been attributed to a higher diversity of topics included in antivaccine rhetoric (e.g., distrust of the government and pharmaceutical companies, use of natural health and wellness strategies, emphasis on religion and morality, and advocacy for individual liberties).
The researchers used a multimethod approach to analyse discursive topics in the vaccine debate on public social media sites. The approach combined:
- large-scale balanced data collection from Twitter, i.e., 39,962 tweets: 11,103 provaccine, 8,169 antivaccine, and 20,691 neutral;
- the development of a supervised classification algorithm for categorising tweets into provaccine, antivaccine, and neutral groups; this algorithm has an accuracy rate above 90% and, by including the third category (neutral), screens out irrelevant and neutral tweets with an accuracy rate of 96.2%;
- the application of an unsupervised clustering algorithm (using K-means clustering) for operationalising and visualising the topics of vaccine debates discussed on both sides; and
- a multistep qualitative content analysis for identifying the prominent discursive topics and how vaccines are framed in these topics, using Entman's four framing dimensions.
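The supervised step above can be illustrated with a minimal three-class text classifier. This is only a sketch under stated assumptions: the summary does not say which model the researchers used, so a multinomial naive Bayes classifier with add-one smoothing is swapped in as a stand-in, and the toy tweets, the `tokenize`, `train_nb`, and `predict` helpers, and the label strings are all hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data; NOT taken from the study's 39,962-tweet corpus.
TOY_TWEETS = [
    ("vaccines protect children from measles", "provaccine"),
    ("get your flu shot to protect your community", "provaccine"),
    ("vaccine injuries are hidden by pharma", "antivaccine"),
    ("my body my choice no forced shots", "antivaccine"),
    ("the clinic opens at nine tomorrow", "neutral"),
    ("traffic is terrible near the stadium today", "neutral"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_nb(examples):
    """Count labels and per-label word frequencies for naive Bayes."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for `text`."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        score = math.log(label_counts[label] / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_nb(TOY_TWEETS)
print(predict(model, "vaccines protect children"))
```

Keeping "neutral" as an explicit third label, rather than forcing every tweet into one of two camps, is what lets a classifier of this shape screen out irrelevant posts before any topic analysis.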
The results indicate that antivaccine topics have greater "intertopic distinctiveness" (i.e., the degree to which the topics discussed are distinct from one another) than their provaccine counterparts, consistent with the greater topic diversity previously attributed to antivaccine rhetoric. This higher intertopic distinctiveness helps explain why antivaccine content is so engaging; such engagement "is the first step to inducing behavioral changes favorable to the topic..." However, there was no difference between the two groups in terms of intratopic consistency (wherein discourse surrounding a topic is internally consistent and coherent).
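To make the two measures concrete, the sketch below clusters points with a basic K-means loop and then computes one possible proxy for each measure. The assumptions are substantial: the summary does not give the study's exact formulas, so mean pairwise centroid distance (for intertopic distinctiveness) and mean distance to the own-cluster centroid (for intratopic consistency, where lower means more consistent) are illustrative choices, the 2-D points are hypothetical stand-ins for tweet embeddings, and the initial centroids are passed in by hand to keep the run deterministic.

```python
import math
from itertools import combinations


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def kmeans(points, init_centroids, iters=20):
    """Basic Lloyd iteration; returns (clusters, centroids)."""
    cents = [list(c) for c in init_centroids]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in cents]
        for p in points:
            # Assign each point to its nearest current centroid.
            j = min(range(len(cents)), key=lambda i: math.dist(p, cents[i]))
            clusters[j].append(p)
        new = [centroid(c) if c else cents[i] for i, c in enumerate(clusters)]
        if new == cents:  # assignments have stabilised
            break
        cents = new
    return clusters, cents


def mean_intertopic_distance(clusters):
    """Proxy for intertopic distinctiveness: mean centroid-to-centroid distance."""
    cents = [centroid(c) for c in clusters]
    pairs = list(combinations(cents, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def mean_intratopic_spread(clusters):
    """Proxy for intratopic consistency: mean distance to own centroid."""
    total, count = 0.0, 0
    for c in clusters:
        cen = centroid(c)
        for p in c:
            total += math.dist(p, cen)
            count += 1
    return total / count


# Hypothetical embeddings: two tight topics far apart vs. two tight topics close together.
FAR_TOPICS = [(0, 0), (0, 2), (10, 0), (10, 2)]
NEAR_TOPICS = [(0, 0), (0, 2), (3, 0), (3, 2)]

far_clusters, _ = kmeans(FAR_TOPICS, [(0, 0), (10, 0)])
near_clusters, _ = kmeans(NEAR_TOPICS, [(0, 0), (3, 0)])

print(mean_intertopic_distance(far_clusters), mean_intratopic_spread(far_clusters))
print(mean_intertopic_distance(near_clusters), mean_intratopic_spread(near_clusters))
```

The toy data are built so that the two sets differ in how far apart their topics sit while having identical within-topic spread, which mirrors the pattern reported: a difference in distinctiveness with no difference in consistency.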
In addition, seen through the lens of Entman's thinking, while provaccine advocates identify the cause of the problem, make moral judgments, and suggest remedies for the problem (three of the four frames), they do not clearly state what this problem is. "The absence of a clear problem statement limits their capacity to communicate the urgency of the matter at hand." In contrast, antivaccine advocates provide a compelling statement of the current problem (vaccine injuries) in addition to using Entman's three other frames.
Among the suggestions offered for future research: "The current literature on social media marketing has not yet reconciled the conflicting findings between the effectiveness of consistent messages and varied messages for engagement. A comparison between the two may be an opportunity for future researchers."
In conclusion, based on the results, the researchers attribute higher engagement among antivaccine advocates to the distinctiveness of the topics they discuss, and they ascribe the influence of the vaccine debate on uptake rates to the comprehensiveness of the message frames. "These results provide an explanation for the higher engagement among antivaccine advocates and emphasize the urgency of developing a clear problem statement for provaccine content to counteract decreasing immunization rates."
JMIR Public Health and Surveillance 2021;7(6):e23105. doi: 10.2196/23105. Image credit: JMIR