
lexicon is a collection of lexical hash tables, dictionaries, and word lists. The data prefixes help to categorize the data types:

| Prefix | Meaning |
|---|---|
| key_ | A `data.frame` with a lookup and return value |
| hash_ | A keyed `data.table` hash table |
| freq_ | A `data.table` of terms with frequencies |
| profanity_ | A profane words vector |
| pos_ | A part of speech vector |
| pos_df_ | A part of speech `data.frame` |
| sw_ | A stopword vector |

The package includes the following data sets:

| Data | Description |
|---|---|
| cliches | Common Cliches | 
| common_names | First Names (U.S.) | 
| constraining_loughran_mcdonald | Loughran-McDonald Constraining Words | 
| emojis_sentiment | Emoji Sentiment Data | 
| freq_first_names | Frequent U.S. First Names | 
| freq_last_names | Frequent U.S. Last Names | 
| function_words | Function Words | 
| grady_augmented | Augmented List of Grady Ward’s English Words and Mark Kantrowitz’s Names List | 
| hash_emojis | Emoji Description Lookup Table | 
| hash_emojis_identifier | Emoji Identifier Lookup Table | 
| hash_emoticons | Emoticons | 
| hash_grady_pos | Grady Ward’s Moby Parts of Speech | 
| hash_internet_slang | List of Internet Slang and Corresponding Meanings | 
| hash_lemmas | Lemmatization List | 
| hash_nrc_emotions | NRC Emotion Table | 
| hash_sentiment_emojis | Emoji Sentiment Polarity Lookup Table | 
| hash_sentiment_huliu | Hu Liu Polarity Lookup Table | 
| hash_sentiment_jockers | Jockers Sentiment Polarity Table | 
| hash_sentiment_jockers_rinker | Combined Jockers & Rinker Polarity Lookup Table | 
| hash_sentiment_loughran_mcdonald | Loughran-McDonald Polarity Table | 
| hash_sentiment_nrc | NRC Sentiment Polarity Table | 
| hash_sentiment_senticnet | Augmented SenticNet Polarity Table | 
| hash_sentiment_sentiword | Augmented Sentiword Polarity Table | 
| hash_sentiment_slangsd | SlangSD Sentiment Polarity Table | 
| hash_sentiment_socal_google | SO-CAL Google Polarity Table | 
| hash_valence_shifters | Valence Shifters | 
| key_contractions | Contraction Conversions | 
| key_corporate_social_responsibility | Nadra Pencle and Irina Malaescu’s Corporate Social Responsibility Dictionary | 
| key_grade | Grades Data Set | 
| key_rating | Ratings Data Set | 
| key_regressive_imagery | Colin Martindale’s English Regressive Imagery Dictionary | 
| key_sentiment_jockers | Jockers Sentiment Data Set | 
| modal_loughran_mcdonald | Loughran-McDonald Modal List | 
| nrc_emotions | NRC Emotions | 
| pos_action_verb | Action Word List | 
| pos_df_irregular_nouns | Irregular Nouns Word Dataframe | 
| pos_df_pronouns | Pronouns | 
| pos_interjections | Interjections | 
| pos_preposition | Preposition Words | 
| profanity_alvarez | Alejandro U. Alvarez’s List of Profane Words | 
| profanity_arr_bad | Stack Overflow user2592414’s List of Profane Words |
| profanity_banned | bannedwordlist.com’s List of Profane Words | 
| profanity_racist | Titus Wormer’s List of Racist Words | 
| profanity_zac_anger | Zac Anger’s List of Profane Words | 
| sw_dolch | Leveled Dolch List of 220 Common Words | 
| sw_fry_100 | Fry’s 100 Most Commonly Used English Words | 
| sw_fry_1000 | Fry’s 1000 Most Commonly Used English Words | 
| sw_fry_200 | Fry’s 200 Most Commonly Used English Words | 
| sw_fry_25 | Fry’s 25 Most Commonly Used English Words | 
| sw_jockers | Matthew Jockers’ Expanded Topic Modeling Stopword List |
| sw_loughran_mcdonald_long | Loughran-McDonald Long Stopword List | 
| sw_loughran_mcdonald_short | Loughran-McDonald Short Stopword List | 
| sw_lucene | Lucene Stopword List | 
| sw_mallet | MALLET Stopword List | 
| sw_python | Python Stopword List | 
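
As a rough illustration of how the prefixes map to usage, the sketch below looks up values in a `key_`-style `data.frame` and tests membership in an `sw_` word vector. The helper `lookup_key` and the `toy_key` data are hypothetical and not part of lexicon. The sketch also assumes that a `key_` data set stores the lookup value in its first column and the return value in its second. The packaged data are only touched if lexicon is installed.

```r
# Hypothetical helper (not part of lexicon): look up `terms` in the first
# column of a two-column key and return the matching second-column values.
# Assumption: column 1 is the lookup value, column 2 is the return value.
lookup_key <- function(key, terms) {
  key[[2]][match(terms, key[[1]])]
}

# Toy key with the same shape as a key_ data set (illustrative values only).
toy_key <- data.frame(
  contraction = c("can't", "won't"),
  expanded    = c("cannot", "will not"),
  stringsAsFactors = FALSE
)

# Terms without a match come back as NA.
print(lookup_key(toy_key, c("won't", "isn't")))

# The same pattern applied to the packaged data, if lexicon is installed.
if (requireNamespace("lexicon", quietly = TRUE)) {
  str(lexicon::key_contractions)           # inspect a key_ data.frame
  print(head(lexicon::sw_fry_100))         # first entries of an sw_ vector
  print("the" %in% lexicon::sw_fry_100)    # membership test on a word list
  print(lookup_key(lexicon::key_contractions, "can't"))
}
```

Inspecting a data set with `str()` before relying on its column order is a sensible first step, since the column layout assumed here is not guaranteed for every `key_` data set.
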

To download the development version of lexicon:

Download the zip ball or tar ball, decompress and run `R CMD INSTALL` on it, or use the pacman package to install the development version:

```r
if (!require("pacman")) install.packages("pacman")
pacman::p_load_gh("trinker/lexicon")
```

You are welcome to:
- submit suggestions and bug-reports at: https://github.com/trinker/lexicon/issues
- send a pull request on: https://github.com/trinker/lexicon/
- compose a friendly e-mail to: tyler.rinker@gmail.com