
Creating an autocorrection application

While going through my private GitHub repositories, I found an autocorrection application. After reviewing the code, I decided to make it public. Let's see how it works.

1. The script reads a text file and extracts the words to create a dictionary.
2. The script constructs a set of unique words (vocabulary) and calculates the frequency of each word.
3. The probability of each word is calculated based on its frequency in the text.
4. For the input word, the Jaccard similarity between it and each word in the dictionary is calculated.
5. The script sorts words by similarity and probability to suggest the most likely corrections.
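
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code from the repository: it uses only the standard library instead of pandas, numpy and textdistance, and the names `build_model`, `bigrams`, `jaccard` and `suggest`, the choice of character bigrams, and the sample sentence are all assumptions made for the example.

```python
import re
from collections import Counter


def build_model(text):
    """Steps 1-3: extract words, build the vocabulary, compute probabilities."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)                      # frequency of each word
    total = sum(counts.values())
    probs = {w: c / total for w, c in counts.items()}
    return set(counts), probs                    # vocabulary, probabilities


def bigrams(word):
    """Set of character 2-grams in a word (assumed tokenization)."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def jaccard(a, b):
    """Step 4: intersection size divided by union size of the bigram sets."""
    x, y = bigrams(a), bigrams(b)
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def suggest(word, probs, top_n=3):
    """Step 5: rank candidates by similarity, then by probability."""
    word = word.lower()
    if word in probs:                            # assumption: known words are kept as-is
        return [word]
    ranked = sorted(probs, key=lambda w: (jaccard(word, w), probs[w]), reverse=True)
    return ranked[:top_n]


sample = "the quick brown fox jumps over the lazy dog the fox"
vocab, probs = build_model(sample)
print(suggest("foz", probs))
```

With this toy text, "fox" is ranked first for the input "foz" because it is the only word that shares a bigram with it. A real dictionary would be built from a much larger text file.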

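The Jaccard coefficient measures how similar two sets are: it is the size of their intersection divided by the size of their union, which is why it is also called Intersection over Union (IoU). It ranges from 0 (nothing in common) to 1 (identical sets). The same ratio is used for areas or volumes, for example when comparing bounding boxes; for words, the sets are typically made of characters or character n-grams.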
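
As a small worked example, the coefficient can be computed directly with Python sets. Here each word is treated simply as the set of its characters, which is an assumption for illustration; the textdistance library's own implementation may differ in detail (for example, in how repeated characters or n-grams are counted).

```python
def jaccard(a, b):
    """Jaccard coefficient of two iterables, compared as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(union)


# "night" and "nacht" share the characters n, h, t (3 of 7 distinct characters)
print(jaccard("night", "nacht"))
```

The result is 3/7, roughly 0.43: three shared characters out of seven distinct ones.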

The code for this project is available in the GitHub repository. It is written in Python and uses the pandas, numpy and textdistance libraries.
