Stratifyd automatically generates and scores sentiment based on the corpus of documents processed. Sentiment polarities for N-grams range from -5 to +5 by default.
Stratifyd allows domain-specific lexicons and other custom lexicons to be uploaded and used instead of relying on Stratifyd’s AI-generated lexicon.
Why would you use a different lexicon?
Stratifyd's built-in sentiment lexicon is powerful out of the box, but it has drawbacks in certain analyses. Use a domain-specific lexicon when you need more accurate sentiment results.
- Certain industries or data analyses require sentiment scoring for specific terminology unique to their languages.
- For example, more accurate buy and sell signals can be obtained for a stock sentiment analysis when the lexicon used is specific to the corporation, the industry, and general financial markets slang.
- Custom lexicons may be desirable when testing for sentiment not related to explicit positivity vs. negativity – such as uncertain vs. constraining, strong modality vs. weak modality, complexity vs. simplicity, etc.
- Multiple lexicons can be applied to the same dataset as if they were a single, combined lexicon.
Custom Sentiment Lists
Sentiment lexicons are lists of words and/or phrases with assigned polarities. They are not limited to positive and negative sentiment. For example, the opposing concepts of uncertain and constraining can be tested by assigning polarities to words and phrases in your list that are deemed to be associated with either end of the given spectrum.
- Polarities can be any integer, positive or negative, but it is recommended to use a sliding scale of whole numbers when assigning polarities.
- Certain terms may be viewed as leaning more or less toward one end of the spectrum you choose to test.
- For example, in a financial stock market lexicon, the term “bullish” may be considered highly positive, earning it a polarity of +5, with its direct opposite “bearish” earning a -5 score.
- In the same lexicon, “trending upward” may be considered less positive for your analysis compared to “bullish”; therefore, the term is weighted with a +3 score rather than a +5 score.
- Sentiment lexicons must be formatted as comma-separated files (e.g. .txt or .csv).
- A description is optional as a third value after polarity. Descriptions are helpful when auditing lexicons for their accuracy and validity over time to remind users of the rationale for including a term or assigning a certain polarity to the term.
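As a rough illustration of the file layout described above, the following Python sketch writes a small lexicon file and checks each row before upload. The function names, the absence of a header row, and the sample descriptions are assumptions made for this example; the terms and polarities are taken from the stock market example above. Confirm the exact layout Stratifyd expects by previewing the file after upload.

```python
import csv

# Sample rows: term, polarity, optional description (descriptions are illustrative).
ROWS = [
    ("bullish", 5, "strongly positive market outlook"),
    ("trending upward", 3, "positive, but weaker than bullish"),
    ("bearish", -5, "strongly negative market outlook"),
]


def write_lexicon(path, rows):
    """Write rows as comma-separated lines: term, polarity[, description]."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for row in rows:
            writer.writerow(row)


def validate_lexicon(path):
    """Return a list of (line_number, problem) for rows that look malformed."""
    problems = []
    with open(path, newline="", encoding="utf-8") as f:
        for line_no, row in enumerate(csv.reader(f), start=1):
            # Each row needs a term and a polarity; the description is optional.
            if len(row) not in (2, 3):
                problems.append((line_no, "expected 2 or 3 values"))
                continue
            term, polarity = row[0].strip(), row[1].strip()
            if not term:
                problems.append((line_no, "empty term"))
            try:
                int(polarity)
            except ValueError:
                problems.append((line_no, "polarity is not an integer"))
    return problems
```

Calling `write_lexicon("finance_lexicon.csv", ROWS)` and then `validate_lexicon("finance_lexicon.csv")` should return an empty list when every row is well formed. Using the `csv` module rather than joining strings by hand keeps descriptions that contain commas properly quoted.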
Stratifyd has version control for advanced options like lexicons as well as taxonomies and stop word lists.
- The first import or creation of a lexicon is registered as V1.
- Any edit made in the lexicon creation/editing module creates a V2 version.
- Each time the lexicon is edited and re-saved, a newer version is registered.
- All versions are preserved within the system in case you want to revert to a previous version of the lexicon for analysis.
This feature is helpful when testing the effectiveness of adding and removing terms from the lexicon.
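When comparing two versions outside the platform, it can help to see exactly which terms were added, removed, or re-weighted. The sketch below is a hypothetical offline helper, not a Stratifyd feature; it assumes each version has been loaded into a Python dictionary mapping terms to polarities, and the term "sell-off" is made up for illustration.

```python
def diff_lexicons(old, new):
    """Compare two lexicon versions given as {term: polarity} dictionaries.

    Returns three dictionaries: terms added in `new`, terms removed from
    `old`, and terms whose polarity changed, as (old_polarity, new_polarity).
    """
    added = {t: p for t, p in new.items() if t not in old}
    removed = {t: p for t, p in old.items() if t not in new}
    changed = {
        t: (old[t], new[t])
        for t in old.keys() & new.keys()
        if old[t] != new[t]
    }
    return added, removed, changed


# Illustrative versions of the same lexicon.
v1 = {"bullish": 5, "bearish": -5, "trending upward": 5}
v2 = {"bullish": 5, "bearish": -5, "trending upward": 3, "sell-off": -4}

added, removed, changed = diff_lexicons(v1, v2)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

A summary like this makes it easier to attribute a change in sentiment results to a specific edit between versions.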
Importing Custom Lexicons
- Navigate to the advanced tab on the Stratifyd platform home page.
- Click on the sentiment module. From here, you can choose to create a new lexicon or edit an existing lexicon. You can also share any lexicon with team members by selecting the blue share icon on the lexicon’s module.
- Click on the icon in the bottom right corner of the screen to create a new lexicon.
- Give your lexicon a title.
- Select upload a file to import a comma-separated lexicon file.
- You can create the lexicon within this window by clicking the +Add… hyperlink and typing the terms, sentiment polarities, and optional descriptions one-by-one.
It is recommended to upload a lexicon because most effective lexicons contain several hundred to thousands of terms.
- After uploading the lexicon file, the terms, polarities, and descriptions should display for preview.
- Select the custom negation words button under Types.
- Type a comma-separated list of the negation words you want Stratifyd to recognize. When one of these words precedes a term in your lexicon, the term's sentiment polarity is flipped.
- Click save to save the lexicon to your collection of sentiment lexicons in Stratifyd.
Common negation words include “no, never, doesn’t, isn’t, shouldn’t, not, won’t”.
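To make the effect of negation words concrete, here is a minimal Python sketch of polarity flipping. It is not Stratifyd's implementation: the tokenizer, the rule that the negation word must sit immediately before the term, the summing of polarities, and the name `score_text` are all assumptions chosen for illustration.

```python
import re

# Terms and polarities from the stock market example; negation words from the list above.
LEXICON = {"bullish": 5, "trending upward": 3, "bearish": -5}
NEGATIONS = {"no", "never", "doesn't", "isn't", "shouldn't", "not", "won't"}


def score_text(text, lexicon=LEXICON, negations=NEGATIONS):
    """Sum the polarities of lexicon terms found in `text`, flipping the sign
    of any term whose immediately preceding word is a negation word."""
    # Normalize curly apostrophes so "doesn’t" matches "doesn't".
    tokens = re.findall(r"[a-z']+", text.lower().replace("’", "'"))
    max_len = max((len(term.split()) for term in lexicon), default=1)
    total = 0
    i = 0
    while i < len(tokens):
        matched = False
        # Prefer the longest lexicon phrase that starts at position i.
        for n in range(max_len, 0, -1):
            if i + n > len(tokens):
                continue
            phrase = " ".join(tokens[i:i + n])
            if phrase in lexicon:
                polarity = lexicon[phrase]
                if i > 0 and tokens[i - 1] in negations:
                    polarity = -polarity
                total += polarity
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return total


print(score_text("The market is bullish"))      # 5
print(score_text("The market is not bullish"))  # -5
```

Under these assumptions, "not bullish" scores -5 instead of +5, which is the behavior the custom negation list is meant to produce.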
How to Apply a Custom Lexicon to Your Analysis
- When importing or reprocessing data from a data connector, click on the advanced hyperlink drop-down.
- Select the second advanced option called sentiment.
- In the pop-up display, choose the lexicon and version number from your library that you want applied to the dataset.
- Click apply.
You can always view which lexicons are applied to a dataset by clicking the dataset’s name from the data manager in your dashboard.
Changing lexicons always requires the dataset to be reprocessed. It is recommended to make copies of the dataset so you can compare the results of different lexicons side by side in the same dashboard.