
3 Words Mislead Online Regional Mood Analysis

You can tell a lot about people’s general state of mind from their social media feeds. Are they always tweeting about their biggest peeves? Or posting pics of particularly cute kitties? Well, in a similar fashion, researchers are turning to Twitter for clues about the overall happiness of entire geographic communities. What they’re finding is that regional variation in the use of common phrases produces predictions that don’t always reflect the local state of well-being. But removing just three specific terms from their analyses (good, love, and LOL) greatly improves the accuracy of the methods. Their work appears in the Proceedings of the National Academy of Sciences.

“We’re living in a crazy COVID-19 era, and now more than ever we’re using social media to adapt to a new normal and reach out to the friends and family that we can’t meet face to face.”

Kokil Jaidka studies computational linguistics at the National University of Singapore.

“But our words aren’t useful just to understand what we as individuals think and feel, they’re also useful clues about the community we live in.”

One of the simpler methods that many scientists use to parse the data involves correlating words with positive or negative emotions. But when those tallies are compared with phone surveys that assess regional well-being, Jaidka says they don’t paint an accurate picture of the local zeitgeist.

To find out why, Jaidka and her colleague Johannes Eichstaedt of Stanford University analyzed billions of tweets from around the United States. And they found that among the most frequently used terms on Twitter are LOL, love, and good.

“And they actually throw the analysis off. In fact, when we removed these three words alone, we managed to improve upon the simpler word-counting methods and obtain better, if not perfect, estimates of happiness.”
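
For the curious, here is roughly what a word-counting approach can look like in code. This is only a minimal sketch, not the researchers’ method: the tiny lexicon, the sample tweets, and the function names (`tokenize`, `region_score`) are all made up for illustration, and real emotion dictionaries are far larger. It shows how a handful of very frequent words can flip a region’s score, and how excluding them changes the picture.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: +1 for positive words, -1 for negative words.
# Illustrative only; these are not the dictionaries used in the study.
TOY_LEXICON = {
    "good": 1, "love": 1, "lol": 1, "great": 1, "fun": 1, "excited": 1,
    "sad": -1, "angry": -1, "awful": -1,
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def region_score(tweets, lexicon=TOY_LEXICON, exclude=()):
    """Net sentiment per token: (positive hits - negative hits) / all tokens.

    Words listed in `exclude` are ignored when tallying lexicon hits.
    """
    excluded = {word.lower() for word in exclude}
    counts = Counter(tok for tweet in tweets for tok in tokenize(tweet))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    net = sum(
        weight * counts[word]
        for word, weight in lexicon.items()
        if word not in excluded
    )
    return net / total


# Made-up tweets in which "lol" signals irony or annoyance, not happiness.
sample_tweets = [
    "lol this traffic is awful",
    "LOL so angry right now",
    "good grief, lol",
]

naive = region_score(sample_tweets)
filtered = region_score(sample_tweets, exclude=("good", "love", "lol"))
print(f"all lexicon words:   {naive:+.3f}")
print(f"without good/love/lol: {filtered:+.3f}")
```

In this toy case the naive tally comes out positive because of the repeated “lol,” while the filtered tally comes out negative, which is closer to what the tweets actually say.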

Why the disconnect? Well, Jaidka says one issue is:

“Internet language is really a different beast than regular spoken language. We’ve adapted words from the English vocabulary to mean different things in different situations.”

Take, for example, LOL.

“I’ve tweeted the word LOL to flirt, express irony, annoyance, and sometimes just pure surprise. When the methods for measuring LOL as a marker of happiness were created in the 1990s, it still meant laughing out loud.”

There are plenty of terms that are less misleading, says Eichstaedt.

“Our models tell us that words like excited, fun, great, opportunity, interesting, fantastic, and those are better words for measuring subjective well being, just looking at the data.”

[Kokil Jaidka et al., Estimating geographic subjective well-being from Twitter: A comparison of dictionary and data-driven language methods]

Being able to get an accurate read on the mood of the population is no laughing matter.

“That’s particularly important now in the time of COVID, where we’re expecting a mental health crisis, and we’re already seeing in survey data the largest diminishment in subjective well-being in 10 years at least, if not ever.”

No doubt we could all use more fantastic opportunities for great fun and excitement. Give or take the LOL.

—Karen Hopkin

(The above text is a transcript of this podcast)


