Word processing tool

Find word lists for more languages here: https://github.com/hermitdave/FrequencyWords/tree/master/content/2018/en

Use this notebook to clean your word list of unusable or inappropriate words.

  • Drag your word list into the Colab folder and rename it to input.txt

  • Set the language in the Configuration Variables

  • Optionally, set an OpenAI API key to process words more accurately

  • Press "Run All"

  • Download "processed_words.txt", or "llm_filtered_processed_words.txt" if you used the LLM option

  • Follow this doc to train a model with your processed txt file
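
The steps above can be pictured with a minimal sketch of the non-LLM cleaning pass. This is not the notebook's actual code: the function names `clean_words` and `process_file`, the `blocklist` and `min_length` parameters, and the filtering rules (lowercase, letters only, no duplicates) are all assumptions made for illustration. It also assumes the input follows the FrequencyWords layout of one `word count` pair per line; only the file names input.txt and processed_words.txt come from the steps above.

```python
from pathlib import Path


def clean_words(lines, blocklist=frozenset(), min_length=2):
    """Return unique, lowercased, letters-only words in their original order.

    Hypothetical helper: the real notebook may apply different rules.
    """
    seen = set()
    cleaned = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # Assumed layout: "word count"; only the first token is used.
        word = parts[0].lower()
        if len(word) < min_length or not word.isalpha():
            continue  # drop very short tokens, numbers, and punctuation
        if word in blocklist or word in seen:
            continue  # drop unwanted words and duplicates
        seen.add(word)
        cleaned.append(word)
    return cleaned


def process_file(input_path="input.txt",
                 output_path="processed_words.txt",
                 blocklist=frozenset()):
    """Read the word list, clean it, and write one word per line."""
    lines = Path(input_path).read_text(encoding="utf-8").splitlines()
    words = clean_words(lines, blocklist)
    Path(output_path).write_text("\n".join(words) + "\n", encoding="utf-8")
    return words


if __name__ == "__main__":
    # Only runs the file step when input.txt is actually present.
    if Path("input.txt").exists():
        process_file()
```

Keeping the filter in a pure function separate from the file handling makes it easy to try different rules on a few sample lines before running the whole list.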
