Analyze Sentiment in Text

This example shows how to use the Valence Aware Dictionary and sEntiment Reasoner (VADER) algorithm for sentiment analysis.

The VADER algorithm uses a list of annotated words (the sentiment lexicon), where each word has a corresponding sentiment score. The VADER algorithm also utilizes word lists that modify the scores of proceeding words in the text:

  • Boosters – words or n-grams that boost the sentiment of proceeding tokens. For example, words like "absolutely" and "amazingly".

  • Dampeners – words or n-grams that dampen the sentiment of proceeding tokens. For example, words like "hardly" and "somewhat".

  • Negations – words that negate the sentiment of proceeding tokens. For example, words like "not" and "isn't".

To evaluate sentiment in text, use the vaderSentimentScores function.

Load Data

Extract the text data in the file weekendUpdates.xlsx using readtable. The file weekendUpdates.xlsx contains status updates containing the hashtags "#weekend" and "#vacation".

filename = "weekendUpdates.xlsx";
tbl = readtable(filename,'TextType','string');
head(tbl)
ans=8×2 table
    ID                                        TextData                                     
    __    _________________________________________________________________________________

    1     "Happy anniversary! ❤ Next stop: Paris! ✈ #vacation"                             
    2     "Haha, BBQ on the beach, engage smug mode! 😍 😎 ❤ 🎉 #vacation"                 
    3     "getting ready for Saturday night 🍕 #yum #weekend 😎"                           
    4     "Say it with me - I NEED A #VACATION!!! ☹"                                       
    5     "😎 Chilling 😎 at home for the first time in ages…This is the life! 👍 #weekend"
    6     "My last #weekend before the exam 😢 👎."                                        
    7     "can’t believe my #vacation is over 😢 so unfair"                                
    8     "Can’t wait for tennis this #weekend 🎾🍓🥂 😀"                                  

Create an array of tokenized documents from the text data and view the first few documents.

str = tbl.TextData;
documents = tokenizedDocument(str);
documents(1:5)
ans = 
  5×1 tokenizedDocument:

    11 tokens: Happy anniversary ! ❤ Next stop : Paris ! ✈ #vacation
    16 tokens: Haha , BBQ on the beach , engage smug mode ! 😍 😎 ❤ 🎉 #vacation
     9 tokens: getting ready for Saturday night 🍕 #yum #weekend 😎
    13 tokens: Say it with me - I NEED A #VACATION ! ! ! ☹
    19 tokens: 😎 Chilling 😎 at home for the first time in ages … This is the life ! 👍 #weekend

Evaluate Sentiment

Evaluate the sentiment of the tokenized documents using the vaderSentimentLexicon function. Scores close to 1 indicate positive sentiment, scores close to -1 indicate negative sentiment, and scores close to 0 indicate neutral sentiment.

compoundScores = vaderSentimentScores(documents);

View the scores of the first few documents.

compoundScores(1:5)
ans = 5×1

    0.4738
    0.9348
    0.6705
   -0.5067
    0.7345

Visualize the text with positive and negative sentiment in word clouds.

idx = compoundScores > 0;
strPositive = str(idx);
strNegative = str(~idx);

figure
subplot(1,2,1)
wordcloud(strPositive);
title("Positive Sentiment")

subplot(1,2,2)
wordcloud(strNegative);
title("Negative Sentiment")

See Also

| |

Related Topics