Stop words are words that carry little meaning in search queries and in natural language processing (NLP) text analytics, so they are usually filtered out as noise. The most common words in a language are often stop words, such as “the” and “be”, but there is no standard that stop word lists must follow. In fact, Stratifyd does not use stop word lists by default, in order to support phrase searching.
Any list of words can be chosen as stop words, depending on the purpose of your analysis. Stratifyd’s use of pointwise mutual information (PMI) limits the effect of typical would-be stop words, since their PMI scores are very low. Nevertheless, you have the option to upload a list of stop words to train the system for specific analyses.
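To see why PMI tends to push would-be stop words down the list, here is a minimal sketch in Python. It is an illustration only, not Stratifyd’s implementation: the function name bigram_pmi and the sample sentence are made up for this example, and the score is the textbook definition, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), estimated from raw counts.

```python
import math
from collections import Counter

def bigram_pmi(tokens):
    """Return {(w1, w2): PMI} for every adjacent word pair in tokens.

    Hypothetical helper for illustration; probabilities are plain
    relative frequencies with no smoothing.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = len(tokens) - 1
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi            # probability of the pair
        p_x = unigrams[w1] / n_uni     # probability of the first word
        p_y = unigrams[w2] / n_uni     # probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores

# Made-up sample text, not real customer data.
text = ("the battery life is great and the screen is great "
        "but the battery life of the phone is the problem")
scores = bigram_pmi(text.split())

# A pair of very frequent words scores lower than a meaningful phrase.
print(scores[("is", "the")] < scores[("battery", "life")])  # True
```

Because “is” and “the” each occur often on their own, seeing them side by side is not much more likely than chance, so the pair scores low. On a sample this small, pairs containing a rare word can still score high, which is one reason a custom stop word list remains useful.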
You can type a comma-separated list of stop words in the prompt’s text box, or you can upload a comma-separated file containing your desired stop words.
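If you want to check a list before uploading it, the sketch below shows one way to parse comma-separated stop words and preview their effect on a sentence. It assumes case-insensitive matching and uses hypothetical helper names (parse_stop_words, remove_stop_words); how Stratifyd itself parses and applies the list may differ.

```python
import csv
import io

def parse_stop_words(raw):
    """Split comma-separated text into a set of lowercase stop words.

    Hypothetical helper: trims whitespace, skips empty entries, and
    accepts words spread over several lines.
    """
    words = set()
    for row in csv.reader(io.StringIO(raw)):
        for cell in row:
            word = cell.strip().lower()
            if word:
                words.add(word)
    return words

def remove_stop_words(tokens, stop_words):
    """Drop every token whose lowercase form is in stop_words."""
    return [t for t in tokens if t.lower() not in stop_words]

# Made-up list with a trailing comma, a line break, and a duplicate.
stop_words = parse_stop_words("the, a, an,\nof, The")
filtered = remove_stop_words("The price of the phone".split(), stop_words)
print(filtered)  # ['price', 'phone']
```

For a file, read its contents into a string first and pass that string to parse_stop_words.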
Stop word lists also have version control, so you can test which lists work best for your analyses.
You can tune your analysis by adding stop words directly from your dashboard widgets.
We recommend using the bigram list in the Semantic Topics widget for tuning analyses. Simply go through the top terms in the list and strike out any bigrams that appear to be junk or unhelpful in your analysis.