In July 2024, work on KazLLM, the project to develop a Kazakh language model, focused on several key tasks.
First, specialists at the Sustainable Innovation and Technology Foundation expanded the dataset for the Kazakh LLM in order to train a spelling-correction model for the Kazakh language.
Second, the team expanded the parallel Kazakh-language corpus, which will enable machine translation of texts in four languages: Kazakh, English, Russian, and Turkish.
Third, the specialists worked on significantly improving automatic speech recognition (ASR) for these four languages, as well as text-to-text translation.
Fourth, the team tested different modes of a virtual avatar that can deliver educational lectures in Kazakh, including in real time.
This work on the KazLLM development project was made possible with partial support from AstanaHub.