KazLLM development in July: what the specialists of the Sustainable Innovation and Technology Foundation accomplished

In July 2024, work on the project to develop the Kazakh language model KazLLM focused on several key tasks.

Firstly, the specialists of the Sustainable Innovation and Technology Foundation expanded the dataset for the Kazakh LLM in order to train the model to correct spelling errors in the Kazakh language.

Secondly, the team expanded the Kazakh-language parallel corpus. This enables machine translation of texts into four languages: Kazakh, English, Russian and Turkish.

Thirdly, the specialists worked on significant improvements in automatic speech recognition (ASR) for the four languages mentioned above and on text-to-text translation.

Fourthly, a team of specialists tested different modes of a virtual avatar that can deliver educational lectures in Kazakh, including in real time.

The work carried out under the KazLLM development project was made possible with partial support from AstanaHub.
