Word processing tool
Find word lists for more languages here: https://github.com/hermitdave/FrequencyWords/tree/master/content/2018/en
Use this notebook to clean your word list of unusable or inappropriate words.
Drag your word list into the Colab folder and rename it to input.txt

Set the language in the Configuration Variables
Optionally set an OpenAI API key to filter words more accurately
Press "Run All"
Download "processed_words.txt", or "llm_filtered_processed_words.txt" if you used the LLM option
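
For readers who want a rough idea of what the non-LLM cleaning pass might look like, here is a minimal sketch. It is not the notebook's actual code: the names LANGUAGE, MIN_LENGTH, BLOCKLIST, clean_words and process_file are hypothetical, and it assumes each line of input.txt holds a word optionally followed by a frequency count, as in the FrequencyWords lists linked above.

```python
import re

# Hypothetical configuration variables; the notebook's real names may differ.
LANGUAGE = "en"
MIN_LENGTH = 3
BLOCKLIST = {"damn"}  # placeholder; supply your own list of inappropriate words

# Assumed per-language pattern for what counts as a usable word.
WORD_PATTERNS = {"en": re.compile(r"[a-z]+")}


def clean_words(lines, language=LANGUAGE, min_length=MIN_LENGTH, blocklist=BLOCKLIST):
    """Return unique, lowercase words that pass the basic filters, in input order."""
    pattern = WORD_PATTERNS[language]
    seen = set()
    cleaned = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word = parts[0].lower()  # ignore the optional frequency count
        if len(word) < min_length:
            continue  # too short to be useful
        if not pattern.fullmatch(word):
            continue  # contains digits, apostrophes or other symbols
        if word in blocklist or word in seen:
            continue  # inappropriate or duplicate
        seen.add(word)
        cleaned.append(word)
    return cleaned


def process_file(input_path="input.txt", output_path="processed_words.txt"):
    """Read the raw list, clean it, and write one word per line."""
    with open(input_path, encoding="utf-8") as f:
        words = clean_words(f)
    with open(output_path, "w", encoding="utf-8") as f:
        f.write("\n".join(words) + "\n")
    return len(words)
```

Calling process_file() with input.txt in the working directory would write the filtered list to processed_words.txt. The LLM option is not sketched here, since the source does not describe how the API is called.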

Follow this doc to train a model with your processed txt file