Mistake in script Dictionary-Based Text Analysis.Rmd #2

@Gutschlhofer

Thanks a lot for the tutorial, it helped me guide a friend who barely knows the basics of R to get started with text analysis!

While going through it, I found a mistake in the ordering of commands in "Dictionary-Based Text Analysis": the factor is created after top_20 is assigned, so top_20 never receives the frequency-ordered factor levels and the bars in the plot are not arranged by frequency. I thought you might want to change this:

#select only top words
top_20<-trump_tweet_top_words[1:20,]
#create factor variable to sort by frequency
trump_tweet_top_words$word <- factor(trump_tweet_top_words$word, levels = trump_tweet_top_words$word[order(trump_tweet_top_words$n,decreasing=TRUE)])
library(ggplot2)
ggplot(top_20, aes(x=word, y=n, fill=word))+
geom_bar(stat="identity")+
theme_minimal()+
theme(axis.text.x = element_text(angle = 90, hjust = 1))+
ylab("Number of Times Word Appears in Trump's Tweets")+
xlab("")+
guides(fill=FALSE)

to this:

#create factor variable to sort by frequency 
trump_tweet_top_words$word <- factor(trump_tweet_top_words$word, levels = trump_tweet_top_words$word[order(trump_tweet_top_words$n,decreasing=TRUE)]) 

#select only top words 
top_20<-trump_tweet_top_words[1:20,] 

# library(ggplot2) 
# ggplot...
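
To show why the order matters, here is a minimal sketch with made-up words and counts. The data frame `words` is a hypothetical stand-in for trump_tweet_top_words (assumed to be sorted by descending n, with word as a character column), and `top_before` / `top_after` are names I introduced for illustration. It compares the order in which a discrete axis would lay out the words, since ggplot2 sorts character values alphabetically but follows the level order of a factor:

```r
# Hypothetical stand-in for trump_tweet_top_words, sorted by descending count
words <- data.frame(
  word = c("great", "america", "people", "thank", "big"),
  n = c(50, 40, 30, 20, 10),
  stringsAsFactors = FALSE
)

# Original ordering: subset first, then convert the full table to a factor
top_before <- words[1:3, ]
words$word <- factor(words$word, levels = words$word[order(words$n, decreasing = TRUE)])

# Proposed ordering: subset only after the factor exists
top_after <- words[1:3, ]

is.factor(top_before$word)  # FALSE: the earlier copy is still character
is.factor(top_after$word)   # TRUE: the later copy carries the ordered levels

# Order in which a discrete axis would place the words
axis_before <- sort(unique(top_before$word))      # alphabetical
axis_after <- levels(droplevels(top_after$word))  # by frequency
```

The subset taken after the conversion still lists every word among its factor levels, but ggplot2's discrete scales drop unused levels by default, so only the selected words appear on the axis.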
