Dr. Jo Lukito of the University of Texas at Austin has made a life's work of using science to understand political language. She joins Jess for the third episode in our ongoing series about the science of democracy in the 2024 election cycle.
Transcript
It’s October 30, 1938. The CBS Radio Network broadcasts an adaptation of the H.G. Wells novel “The War of the Worlds,” directed and narrated by Orson Welles. The show was made famous by the panic of some listeners, who were said to have believed that New Jersey really was experiencing an invasion of hostile aliens from the planet Mars.
Words have always had power, and never has this been more true than in today's oversaturated media landscape. We need hard data and facts about the effects that words, and the forces wielding them, have on our elections, and on our democracy itself. Good news! We have a whole field of science for that.
I’m your host Jess Phoenix and this is…SCIENCE.
My guest today is Dr. Jo Lukito of the University of Texas at Austin. She's also serving on our new Election Science Task Force here at the Union of Concerned Scientists, which is working to strengthen the science behind evaluating and advancing free and fair elections. Jo directs the Media and Democracy Data Cooperative, and she's an expert in computational sociolinguistics. Her work caught my attention because it's tailor-made for understanding democracy in the age of alternative facts. In a time when politicians underscore that they will not compromise with the opposing party, and when the informal rules of political decorum have been trampled into something that would be absolutely unrecognizable to the politicians of yesteryear, we desperately need hard data and facts to help educate the public, scientists, and politicians on the language that is currently rocking the foundations of our democracy. Jo, I appreciate you being here with me. Would you start us off by explaining how you've blended programming with analyzing political language?
Dr. Lukito: As a child, I think I've always been really interested in politics. I remember when I was in elementary school, and they were covering the Bush/Gore election, and I was just fascinated by the media coverage of it, how it was discussed, how my parents were voting. And so, I've always had this interest in politics, but specifically how we talk about politics. How do citizens make sense of political issues, and how do politicians try to persuade us to vote for them? And so, when I decided to go to graduate school, I knew this was going to be a real central focus of mine. I was growing up very much in the internet age, and it was a time when a lot of politicians and a lot of citizens were starting to talk about politics online, on social media platforms. And as I started doing this work, I realized I would not be able to do it all by hand. I needed to find a way to scale it up. And so, I started looking into computational methods and programming languages, to see if there was a way that I could use these new tools to study larger and larger amounts of social media data, of news data, and of political text. And it ended up becoming a really useful way to look not just at individual messages, but at millions of messages, to understand these kinds of patterns of communication, and the kinds of narratives. And especially now, because of the number of platforms that we have on the internet, and because of how prominent the internet has become in all of our day-to-day lives, this kind of work, I think, has just become even more important. And I think a lot of scientists do the research that they do because there was some origin story, something that happened, right, that motivated them to ask questions. And for me, that happened to be related to political language. Especially as someone who is not just a first-generation academic and a first-generation researcher, but a first-generation American as well, I think it just became really important for me to understand how to communicate and talk about politics, especially in the age of alternative facts, as you've mentioned, and the rise of relying on digital platforms for information, some of which is good and some of which is bad. We really do desperately need these computational tools to help us sift and winnow things, because the reality is no individual person, researcher or otherwise, can look at 10 million tweets, right? And this is really what a computer is actually capable of doing.
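To make that "scaling up" concrete, here is a minimal sketch, not Dr. Lukito's actual pipeline, of what counting the most common terms across millions of posts can look like. It assumes a hypothetical posts.jsonl file with one JSON object per line and a "text" field, and streams the file so the whole dataset never has to fit in memory:

```python
import json
from collections import Counter

def top_terms(path, n=20):
    """Stream posts one line at a time and tally word frequencies."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            post = json.loads(line)  # one post per line (JSON Lines)
            # Lowercase whitespace split; real pipelines use proper tokenizers.
            counts.update(post["text"].lower().split())
    return counts.most_common(n)

if __name__ == "__main__":
    # "posts.jsonl" is a hypothetical stand-in for a collected dataset.
    for term, count in top_terms("posts.jsonl"):
        print(f"{term}\t{count}")
```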
Jess: What does data about how specific language spreads across different platforms allow policymakers to do to protect free and fair elections?
Dr. Lukito: As you alluded to, our media ecology now consists of tons of platforms. When I started my time on the internet, we maybe had, like, six or seven different social media platforms. I remember when Myspace was a thing, and Facebook was all the rage, and now we have over 20 really popular platforms, in various forms, right? We have platforms that are more video-based, like YouTube and TikTok, and we have platforms that are still more text-based, like the platform formerly known as Twitter, now X. And of course, plenty of other social media platforms have imitated that sort of structure. And what we're finding now is that not only are these platforms using different modal forms of communication, some video, some image, some text, but these platforms also have different types of communication strategies, right, based on different audiences. And so, I've become especially concerned with how, when there is harmful online content, whether in the form of misinformation, or calls for violence, or hate speech, it manifests on one platform, and at what point there is a spillover onto another platform. I do get really worried that when we do a lot of research on just the mainstream platforms, we miss out on potentially harmful communication and conversation.
So, to use a really concrete example, a lot of the organizing that happened prior to the insurrection on January 6th was not necessarily happening on mainstream platforms. It was happening on these alternative-tech, or smaller, platforms, like Telegram and Parler, and if we don't pay attention to them, there are really dire and serious offline consequences.
Jess: Talk to me about dis- and misinformation, because there is a difference between the two, and I think it would be really helpful for our audience to know what that distinction is.
Dr. Lukito: When I talk about dis- and misinformation, the one thing that is similar about both of them is that they both relate to false information. It's content that we can determine is factually inaccurate. Misinformation is when you spread that factually inaccurate information but believe it to be true. One common form of misinformation is the belief that vaccines cause autism, for example. If I genuinely believed that, and I shared it in a social media post, that would be misinformation. Disinformation is when I know it's false, but I share it anyway. And so, intent becomes a really key distinction between disinformation and misinformation.
Jess: It sounds like, for people who need mnemonics and stuff: disinformation is deliberate. Misinformation isn't.
Dr. Lukito: That's a really great way to phrase it. I totally agree. Because the deliberate intent to share false information is the core of disinformation, right. And oftentimes, you know, a really big question I get is, why do people spread disinformation? Like, if you know it's false, why would you share it? And it's important to recognize that there are foreign actors who spread disinformation with the intent of manipulating elections, and persuading people to believe in things that are genuinely not true. There are also individuals who lie about their personas online. There is this common phrase, "On the internet, nobody knows you're a dog." And that, I think, still rings true now. There are a lot of situations where people either try to lie about their identity, or, especially with the rise of things like deepfakes, there's a lot of content out there now where people intentionally try to make it look authentic, when in reality it's artificially generated, or fake, content. And so, this kind of concern about disinformation is more prevalent, I think, than people recognize.
Jess: Yeah. And it makes me wonder. Because like so much in U.S. history, a lot of things designed to manipulate the public have been targeted at specific groups. So, are you seeing that this disinformation in particular is disproportionately aimed at traditionally underrepresented or disadvantaged people, like communities of color or people from low-income households?
Dr. Lukito: Absolutely. And it's, you know, especially frustrating because these campaigns tend to disenfranchise voters from those underrepresented populations. We saw this as early as 2012 and 2016, and in 2016, probably the most famous example of disinformation in the United States came from Russia, from Russian troll armies. In that particular disinformation campaign, what we saw was that Russian disinformation actors were specifically targeting Black communities. They were impersonating Black Americans, pretending to be Black Americans, and producing disinformation specifically targeting those communities. In the lead-up to 2020, and now with 2024 on the horizon, we are also seeing a lot more disinformation produced in languages other than English, because what a lot of disinformation actors have realized is that America, and social media platforms in the U.S., do a much better job of tracking mis- and disinformation in English. We don't quite have the same resources for Spanish, or Mandarin Chinese, or many of the other languages that are spoken in the United States. And so, we are starting to see more disinformation content produced in those languages, and we're less prepared to deal with that, unfortunately.
And we're also seeing it a lot not just in text form, but in image and video form. My family is really active on WhatsApp, for example. That's a really popular platform, especially for diaspora communities, right, families that have moved from another country to the U.S. And on WhatsApp, probably the most prevalent form of misinformation that I see comes in the form of either videos or images, memes that family members are sharing with me, oftentimes believing them to be true. And it's much trickier, I think, to track those forms of text and language, because they're embedded into videos, they're embedded into images. And so, as a computational researcher, as a programmer, I need to do additional processing to extract that text, so that I can detect the misinformation in it.
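As an illustration of that additional processing step, here is a minimal sketch of pulling the text out of a shared image so it can be analyzed like any other post. It assumes the open-source Tesseract OCR engine is installed along with the pytesseract and Pillow packages, and meme.png is a hypothetical stand-in for a forwarded image:

```python
# Assumes: Tesseract OCR installed, plus `pip install pytesseract pillow`.
from PIL import Image
import pytesseract

def extract_text(image_path):
    """OCR an image so embedded claims can be checked like ordinary text."""
    return pytesseract.image_to_string(Image.open(image_path))

if __name__ == "__main__":
    print(extract_text("meme.png"))  # hypothetical forwarded meme
```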
Jess: So, of course, the multimillion-dollar question here. Can you tell us how to identify mis and disinformation? Because I know there are so many smart people who listen to this program, people who are scientists, and people who are super well-educated in other fields. We think we're smart. We think we're savvy. But stuff is changing. So, any pro tips here?
Dr. Lukito: In general, there's no silver-bullet way to detect misinformation and disinformation. And part of it is because information changes. The pandemic is a really good example. During that time, there was a point where we did not have a vaccine, right? A vaccine had not been developed yet. And so, if you shared information about there being a vaccine or a solution for COVID, that would have been misinformation at that time. However, when the vaccine came out, if someone shared a very similar message, of course that would no longer be misinformation. Misinformation and disinformation tend to have this kind of timely component. They're very context-based. For those reasons, it's kind of hard to have, like, one surefire way to detect them. However, there are a handful of different cues, particularly for telling if something is false content, and especially for identifying fake accounts. So, I'll start with the first thing, fake content. I think it's essential, when you see a piece of content that you think might be misinformation, that you verify it with another source of information. For example, one type of misinformation or disinformation we're seeing is fake screenshots of news stories. Someone will, like, make up a news story, slap the CNN logo on it, and then say, "Oh, look at what CNN produced." The easy way to identify that as misinformation is to go to CNN's website and see if they've actually covered that story. That sort of double-checking is really essential for detecting misinformation content, specifically. For the disinformation side of things, what I always encourage folks to look for are disinformation accounts. These are people who either lie about their identity, or are anonymous online and are spreading false content. And you can tell when an account is a disinformation account in a couple of different ways. Usually, its username is a mix of letters and numbers, as opposed to a genuine screen name. Oftentimes it will also have a generic image. Sometimes that image is of, like, a landscape or something like that. It tends to be a little bit vaguer. But if it is an image of a person, that image tends to be either AI-generated or lifted from somewhere else online. One way that you can identify those images as stolen is to do a reverse image search. And there are many times where we've done reverse image searches and realized, like, "Oh, they just pulled this random picture from Twitter, or Facebook, or some other social media platform."
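To make the letters-and-numbers cue concrete, here is a deliberately crude, hypothetical heuristic, far simpler than anything researchers actually deploy, that flags handles which look like a word with a long run of digits tacked on. A match is a weak signal, not proof of a fake account:

```python
import re

# Handles like "user84302917" fit the letters-then-digits pattern described
# above; a trailing run of four or more digits is only a weak signal.
SUSPICIOUS = re.compile(r"^[A-Za-z]+\d{4,}$")

def looks_suspicious(handle: str) -> bool:
    """Return True if the handle is letters followed by four or more digits."""
    return bool(SUSPICIOUS.match(handle))

if __name__ == "__main__":
    for h in ["jane_smith", "user84302917", "TodayInMiami", "patriot1776442"]:
        print(h, looks_suspicious(h))
```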
Jess: Wow. It is a little bit Brave New World-y, because we can't believe what our eyes and ears are telling us. I mean, I guess "just go cross-check everything" is the short answer.
Dr. Lukito: Yeah. You know, it's cross-checking everything, and realizing that we have to have the media literacy, right, to be able to recognize on our own when something is misinformation or disinformation. And I know that's a little daunting, but I do think I've seen improvements in that sort of media literacy work, especially over the last couple of years. Even just talking through how you make sure that the source of a story is a real news organization, a validated news organization. Those are tips that I have only seen brought into education in the last couple of years.
Jess: I was wondering, because you're at a university: are they starting to incorporate best practices about media literacy, basically teaching, like, "this is how you consume media if you want to get to the truth, or to objectivity"? Is that something that you're seeing evolve in academia?
Dr. Lukito: Yeah, absolutely. You know, I'm very fortunate in that I work in a journalism department. And so, many of my students are expected to watch the news or to read the news as part of their anticipated occupation. We are also starting to see media literacy brought into biology classes, and into many, many other disciplines, which I think is absolutely critical. We're also seeing a lot more courses and time dedicated to science communication, which is so absolutely critical to combating this misinformation. For example, a lot of misinformation is in the area of health, right? Health communication. To me, it's really important that scientists step up and correct this sort of misinformation in a data-driven, empirical way. But sometimes that information gets kind of lost in the technicalities. And so, increasingly, I've noticed that STEM fields and STEM departments have started to include science communication as an important part of their education practice, because if someone's gonna be a doctor, if someone's gonna be a biologist or a STEM specialist, they still need to be able to communicate that very important research to the public.
Jess: Going back to elections, though, I know that you kind of touched on it a little bit before, but one of the biggest stories from the last few election cycles did focus on Russian interference in U.S. elections. And that was via organized, engineered, online campaigns. So, I know you've worked on that and studied it. What are some of the key takeaways? Obviously, knowing that they were targeting Black people with a lot of their work, but what else can you tell us?
Dr. Lukito: One of the other things that is really important to recognize when it comes to Russian disinformation specifically is that they do a fair bit of surveillance before they actually conduct their disinformation campaign. And so, prior to the 2016 campaign, the Internet Research Agency, Russia's troll army, actually visited the United States, understood the political issues that we were grappling with, and then, when they were building these disinformation campaigns, they would exploit existing tensions within the country, in order to make their content more viral, in order to increase distrust in the media ecology, and in order to leverage the media ecology's attention-getting metrics. So, for example, one thing that Russia has historically done, and continued to do into 2020 and 2024, is to exploit news media specifically. That includes the creation of a bunch of fake news organizations. Russia had these, like, fake accounts with names like "Today in Miami" or "Today in Philadelphia," and they created hundreds of these fake news organizations that would then share a weird mix of sometimes true and sometimes false stories.
What we also saw was that Russian trolls were very often @-mentioning, especially on Twitter. They were @-mentioning reporters, and they were @-mentioning the social media accounts of these news organizations, so that their social media posts would hopefully be, like, picked up by a news organization. And we did, in 2016 and 2017, find that many, many U.S.-based news organizations were accidentally quoting Russian trolls. And when they did that, Russian trolls benefited, because they got more followers as a result of being quoted in a news story. You know, over time, journalists have increasingly started to use social media as a way to source information, as a way to get quotes from the public. And while I think that is inevitable, and can be really helpful for journalism, it also created this loophole that Russian trolls could exploit, and absolutely did.
Jess: I wanted to ask you, at this point in your career, how have you seen the data you've collected and analyzed help in the fight for, like, freer, fairer elections, and democracy, either here in the U.S. or abroad?
Dr. Lukito: Yeah. That's such a great question. So, when I was doing my original work on Russian trolls, and looking at the spread of that social media content into news media, I had an opportunity to talk to a lot of journalists in newsrooms about better practices for validating a tweet that they wanted to quote in a news story. I spoke with them a lot about how to retain that information as well, because when these Russian trolls were suspended, their content also got removed from the news story. A lot of journalists also noticed this when it came to Donald Trump: when he was suspended from Twitter, all of the embedded tweets broke across these news stories. And so, I was working a lot with journalists on how to make sure those things have long-standing survivability, and I've seen a lot of improvements in the way that journalists have reported on social media stories, and especially on stories where they're quoting social media users. The other area that gives me a lot of optimism is election information specifically, especially when it comes to voting. In 2016 and in 2020, I think it was much harder to have one or two, like, really concrete places where you could find out where your polling location is, or how to submit a mail-in ballot. And I think a lot of folks have realized that this is critical information for any citizen to have during an election. And so, I've seen this increase in not just media literacy, but political voting literacy, right? I think there are a lot more resources now for how to find your polling location, how to put in a mail-in ballot, compared to 5 or 10 years ago.
Jess: That makes sense too. The thing that always strikes me is that it seems like there's somebody using these tools, whether it's generative AI or just, basically, whisper campaigns, sharing this stuff. They're trying to get people not to vote, which kind of touches on the idea that your vote is your voice. And if they can silence you, then they don't need to accommodate you when they're making policy. So, this is critical.
Dr. Lukito: As you mentioned, not only is this critical, but it does tend to target underrepresented populations, you know, communities of color, low-income communities, communities that have very few polling locations, right? If you have to wait in line longer to cast your ballot, voting becomes a lot harder, right? And as a country, we should be doing more to encourage voting, not making it harder.
Jess: Hear, hear. Let's get on that. Seriously, though, that is so critical. And so, you said you would join UCS's Election Science Task Force.
Dr. Lukito: Yeah.
Jess: So, tell me how you think this task force can help us move the needle in the run-up to, and after, the 2024 election.
Dr. Lukito: Yeah. You know, when I first met the group of folks who were going to be a part of this task force, one of the things I was in awe of was the sheer number of disciplines that were included, as well as the number of civil society organizations, election commissions, and people who are actually on the ground doing ballot work and election work. Not only is it a remarkable group of folks, but it's such a diverse group of individuals in terms of their expertise. And they all share this unifying interest and goal of improving our elections, making sure citizens have the information that they need to be able to vote responsibly. I think that was one of the most exciting things that I saw when I was meeting with this group for the first time. This is also a group of researchers and scientists that is really data-driven, and empirically driven. And in elections, we don't always get that, right? A lot of times, the goal is more to persuade individuals with political language, as opposed to thinking through, with data, how we can best help citizens. And so, this task force has been thinking very carefully about what kind of data we need in order to help democracy, and to help our elections. They've been thinking about it very carefully, both in terms of what information is important prior to an election, right? That includes information about how to cast your ballot, and what candidates are saying, you know, like, explicitly. It also includes, though, research on what happens after the election. A lot of times in the past, we've tended to disregard that, right? Like, an election happens, and then it's over, and then we, like, move on to the next thing in our very busy lives. But if 2020 has shown us anything, it is that the period after the election is so critical, because not everyone's gonna necessarily believe that the election results were true. And so, the task force has also expressed a lot of interest in making sure the work we're doing before the election continues during and after the election, and I think that's so essential to our democracy.
Jess: We need that continuity. And as we saw in 2020, without a general acceptance of what reality is, you can have some pretty deadly results. So, there is a last question that I have for you, and it's a two-parter.
Dr. Lukito: Love it.
Jess: So, we are the Union of Concerned Scientists. So, Dr. Jo Lukito, why are you concerned?
Dr. Lukito: I am concerned because I am an American citizen, born and raised in the United States, and I care a lot about the democratic institutions in this country. And I am really concerned that they are under threat, not only by external actors, such as Russia, but also by our own citizens. I wanna make sure that this is a democracy that stays a democracy long after my lifetime, and long after my kids' lifetimes. And so, I think whatever we can do as citizens, and whatever we can do as scientists, to make sure that our democratic institutions remain, is just so critical.
Jess: Okay. So, now, the second part. What are you specifically going to do about that concern?
Dr. Lukito: Yeah. So, one of the ongoing projects that I'm working on is the collection of social media content for every federal candidate that is running for office. The project is called the Candidates and Social Handles Election Database, CASHED '24. And this is the first effort, with a couple of different research centers, as well as several civil society organizations, to collectively come together and try to compile all the social media handles of every candidate running for federal office in the United States in 2024. We think that this content is really important to citizens. We think citizens have a right to this information, and we think that citizens should know what their politicians, what their candidates, are saying, whether it's about the economy, whether it's about healthcare, or whether it's about election fraud. And so, we wanna make this database and this resource available to citizens, available to journalists and researchers, and we're working really hard to make sure that if this works for 2024, we can do it in future elections. Science is important in so many places in our lives, right? It touches us in so many ways, and I think, for as long as we have politics, we're going to need scientists to make sure the work is data-driven.
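To give a feel for what a single entry in such a database might hold, here is a purely illustrative record layout; it is not the actual CASHED '24 schema, and the candidate shown is hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class CandidateHandles:
    """Illustrative record linking a federal candidate to public accounts."""
    name: str
    office: str                       # e.g., "U.S. House" or "U.S. Senate"
    state: str
    party: str
    handles: dict[str, str] = field(default_factory=dict)  # platform -> handle

example = CandidateHandles(
    name="Jane Doe",                  # hypothetical candidate
    office="U.S. House",
    state="TX",
    party="Independent",
    handles={"x": "@janedoe", "tiktok": "@janedoeforcongress"},
)
print(example)
```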
OK, everyone, keep an eye out for those hostile disinformation invaders, and visit UCSUSA.ORG/RESOURCES to learn more about our Election Science Task Force.
Thanks again to Anthony Eyring and Omari Spears for production help. See ya later, Science Defenders!