About The AV and EV Sentiment Indices

The Automated Vehicle and Electric Vehicle Sentiment Indices separately track the positive and negative sentiments expressed on Twitter regarding these two subjects. The indices are updated daily based on a sample of up to 5,000 tweets per day on each subject. The tweets are selected using a custom set of keywords. Only original tweets are included; retweets, replies, and quote tweets are excluded. In addition, identical or nearly identical tweets (a very common occurrence due to the number of automated bots on Twitter) are filtered out.
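The selection and de-duplication steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the tweet dictionary fields (`text`, `is_retweet`, `is_reply`, `is_quote`) are hypothetical, and "nearly identical" is approximated here by comparing tweets after stripping URLs, mentions, punctuation, and case, which may differ from the method actually used.

```python
import re

# Patterns for a rough text normalization (an assumed approach).
URL_OR_MENTION = re.compile(r"https?://\S+|@\w+")
NON_ALNUM = re.compile(r"[^a-z0-9\s]")


def normalize(text):
    """Lowercase, drop URLs/mentions/punctuation, and collapse whitespace."""
    text = URL_OR_MENTION.sub(" ", text.lower())
    text = NON_ALNUM.sub(" ", text)
    return " ".join(text.split())


def filter_tweets(tweets, limit=5000):
    """Keep original, non-duplicate tweets, up to `limit` per day.

    `tweets` is a list of dicts with hypothetical keys:
    "text", "is_retweet", "is_reply", "is_quote".
    """
    seen = set()
    kept = []
    for tweet in tweets:
        # Only original tweets are kept.
        if tweet.get("is_retweet") or tweet.get("is_reply") or tweet.get("is_quote"):
            continue
        key = normalize(tweet["text"])
        # Skip empty texts and texts already seen in normalized form.
        if not key or key in seen:
            continue
        seen.add(key)
        kept.append(tweet)
        if len(kept) >= limit:
            break
    return kept


sample = [
    {"text": "Self-driving cars are amazing! https://t.co/abc"},
    {"text": "self-driving cars are AMAZING!!! https://t.co/xyz @someone"},
    {"text": "RT this", "is_retweet": True},
    {"text": "EV charging is getting faster"},
]
kept = filter_tweets(sample)
```

In this sample, the second tweet normalizes to the same string as the first and is dropped, and the retweet is excluded, so two tweets remain.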

The resulting set of tweets is then analyzed using VADER, a popular open-source tool that was developed and optimized for sentiment analysis of social media posts such as tweets. Some minor additions tailored to the subject matter were made to the lexicon included in VADER. From each daily sample, two measures are computed. The Score, or average sentiment, is the simple average over all sampled tweets, where negative tweets are assigned a value of -1, neutral tweets 0, and positive tweets +1. The Score therefore ranges from -1 (every tweet negative) to +1 (every tweet positive). The positive/negative ratio discards the neutral tweets and is simply the ratio of positive to negative tweets. The complete breakdown, using all three categories, is displayed as a donut graph.
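As a rough sketch of how the two daily measures could be computed, the example below starts from per-tweet compound scores of the kind VADER produces (values between -1 and +1). The cutoffs of +0.05 and -0.05 for labeling a tweet positive or negative follow the convention commonly suggested for VADER; the thresholds actually used for the indices are not stated here, so treat them as an assumption. The function names are illustrative only.

```python
from collections import Counter


def classify(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to -1, 0, or +1 (assumed cutoffs)."""
    if compound >= threshold:
        return 1
    if compound <= -threshold:
        return -1
    return 0


def daily_measures(compound_scores, threshold=0.05):
    """Compute the Score and the positive/negative ratio for one day."""
    labels = [classify(c, threshold) for c in compound_scores]
    counts = Counter(labels)
    n = len(labels)
    # Score: simple average of the -1 / 0 / +1 labels.
    score = sum(labels) / n if n else 0.0
    # Ratio: neutral tweets are ignored; undefined if no negative tweets.
    ratio = counts[1] / counts[-1] if counts[-1] else None
    return {
        "score": score,
        "pos_neg_ratio": ratio,
        "positive": counts[1],
        "neutral": counts[0],
        "negative": counts[-1],
    }


example = daily_measures([0.6, 0.3, 0.0, -0.4, 0.02, 0.7])
```

For these six scores, three tweets are labeled positive, two neutral, and one negative, giving a Score of 2/6 (about 0.33) and a positive/negative ratio of 3.0. The three counts are the values a donut graph would display.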

In addition, the 100 most frequently occurring one- and two-word n-grams are determined for each set of tweets. These are displayed as word clouds. Using the word cloud data, a sample of the terms that appear in the current day's cloud but not in the cloud from seven days ago is listed under "What's Hot," while a sample of those that appeared in the cloud seven days ago but not in the current cloud is listed under "What's Not."
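The word cloud terms and the two comparison lists could be derived along the following lines. This is a simplified sketch under several assumptions: one- and two-word n-grams are counted together in a single top-100 list, the tokenizer is a basic regular expression, no stop-word removal is applied, and the full set differences are returned rather than a sample. The function names are hypothetical.

```python
import re
from collections import Counter


def top_ngrams(texts, top_n=100):
    """Return the most frequent one- and two-word n-grams across `texts`."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(tokens)  # one-word n-grams
        counts.update(" ".join(pair) for pair in zip(tokens, tokens[1:]))  # two-word
    return [gram for gram, _ in counts.most_common(top_n)]


def hot_and_not(today_terms, week_ago_terms):
    """Compare today's cloud terms with those from seven days earlier."""
    today, week_ago = set(today_terms), set(week_ago_terms)
    hot = sorted(today - week_ago)       # in today's cloud only
    not_hot = sorted(week_ago - today)   # in last week's cloud only
    return hot, not_hot


terms = top_ngrams(["electric car", "electric car charging"], top_n=3)
hot, not_hot = hot_and_not(
    ["tesla", "recall", "battery"],
    ["tesla", "battery", "charging"],
)
```

Here "recall" would be listed under "What's Hot" and "charging" under "What's Not," since each appears in only one of the two term lists.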

There have been a number of studies using sentiment analysis to examine attitudes concerning automated and electric vehicles, including several that looked at changes in sentiment caused by crashes. However, these have been limited to either a single snapshot or a short-term comparison across a couple of weeks. In addition, those studies tailored the search terms to a specific crash event, rather than measuring from a constant baseline across long time spans and multiple incidents. A key goal of this project is to provide daily tracking of sentiment over an extended period. This makes it possible to monitor trends and to measure both the effect that events such as automated vehicle crashes have on the sentiment scores and how quickly the scores recover. The first two months of data collection coincided with a significant incident involving a Tesla in self-driving mode, and the AV sentiment data showed a large, significant increase in the number of negative tweets concerning automated vehicles for about a week afterwards. More information can be found in a short write-up: Observing the Effect of a Crash on Twitter Sentiment: Early Results from Time Series Data.