# Word processing tool

Find word lists for more languages here: <https://github.com/hermitdave/FrequencyWords/tree/master/content/2018/en>

To clean unusable or inappropriate words out of your word list, use this [notebook](https://colab.research.google.com/drive/1PHkvPC6DOVWJjF78QHmVTzhe8AChNm6F?usp=sharing)

* Drag your word list into the Colab files folder and rename it to input.txt

<figure><img src="https://2833866240-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FjXBqDJjxO8HgWsUBESnx%2Fuploads%2Fm0JwlVoj8ffU0FRoLQEo%2Fimage.png?alt=media&#x26;token=ce92430b-7181-47f2-9aa8-46f123d23550" alt=""><figcaption></figcaption></figure>

* Set the language in the Configuration Variables
* Optionally, set an OpenAI API key to process words more accurately
* Press "Run All"
* Download "processed\_words.txt", or "llm\_filtered\_processed\_words.txt" if you used the LLM option

<figure><img src="https://2833866240-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FjXBqDJjxO8HgWsUBESnx%2Fuploads%2FaFrtQSFEkxZkxaMful6T%2Fimage.png?alt=media&#x26;token=d64ac632-88a9-46e1-972a-79255bc022f5" alt=""><figcaption></figcaption></figure>

* Follow this [doc](https://docs.candy-smith.com/main/word-connect-game-toolkit/train-new-model-or-update) to train the model with your processed txt file
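
For a rough idea of what the cleaning step involves, or to sanity-check a list locally before uploading it, here is a minimal Python sketch. It is not the notebook's code. It assumes each input line starts with a word, optionally followed by a frequency count (as in "you 1222421"), and the names `clean_words`, `process_file`, `min_len`, `max_len` and `blocklist` are hypothetical, chosen only for illustration.

```python
import re

# Illustrative only: this is NOT the notebook's implementation.
# Matches strings made of letters only (Unicode-aware), so digits,
# apostrophes and underscores are rejected.
LETTERS_ONLY = re.compile(r"[^\W\d_]+")


def clean_words(lines, min_len=3, max_len=8, blocklist=frozenset()):
    """Return unique lowercase words that pass basic filters.

    Assumes each line looks like "word" or "word count"; only the
    first whitespace-separated token is used.
    """
    seen = set()
    result = []
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word = parts[0].lower()
        if word in seen or word in blocklist:
            continue  # skip duplicates and blocked words
        if not (min_len <= len(word) <= max_len):
            continue  # skip words that are too short or too long
        if not LETTERS_ONLY.fullmatch(word):
            continue  # skip tokens containing non-letter characters
        seen.add(word)
        result.append(word)
    return result


def process_file(src="input.txt", dst="processed_words.txt", **kwargs):
    """Read src, clean it, and write one word per line to dst."""
    with open(src, encoding="utf-8") as f:
        words = clean_words(f, **kwargs)
    with open(dst, "w", encoding="utf-8") as f:
        f.write("\n".join(words) + "\n")
    return len(words)
```

Calling `process_file()` with the defaults reads input.txt and writes processed\_words.txt in the current folder. The length limits and the blocklist are placeholders to adjust for your game and language.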
