AI Bootcamp: Data preparation process

How does the AI “understand” the text?
Every time you enter a query, the model travels the whole way from words to meaning.
Here is how it happens. 👇
1️⃣ Data loading - collecting and cleaning texts for analysis
2️⃣ Tokenization - splitting the text into small parts
3️⃣ Embeddings - turning words into numeric vectors
4️⃣ Token-to-ID conversion - each token gets its own number
5️⃣ Byte Pair Encoding (BPE) - splitting rare words into understandable fragments
6️⃣ Creating token embeddings - forming semantic links between words
All of this is the foundation on which ChatGPT, Gemini, Claude, and other models work.
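For a feel of what these steps look like in practice, here is a minimal Python sketch of a toy pipeline. It is an illustration under stated assumptions, not the bootcamp's actual material: the function names (`clean`, `tokenize`, `learn_bpe`, `build_vocab`, `make_embeddings`) and the sample sentence are hypothetical, BPE merges are learned before IDs are assigned, and random vectors stand in for embeddings that a real model would learn during training.

```python
import random
import re
from collections import Counter

def clean(text):
    """Data preparation: lowercase and keep only letters and spaces."""
    return re.sub(r"[^a-z ]+", " ", text.lower()).strip()

def tokenize(text):
    """Tokenization: split into words, then characters plus an end-of-word mark."""
    return [list(word) + ["</w>"] for word in text.split()]

def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def learn_bpe(words, num_merges):
    """BPE: repeatedly merge the most frequent adjacent pair of symbols."""
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for w in words:
            pairs.update(zip(w, w[1:]))
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        words = [merge_pair(w, best) for w in words]
    return words, merges

def build_vocab(words):
    """Token-to-ID conversion: give each distinct token its own number."""
    symbols = sorted({s for w in words for s in w})
    return {s: i for i, s in enumerate(symbols)}

def make_embeddings(vocab_size, dim, seed=0):
    """Embeddings: one vector per token ID (random here, learned in a real model)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]

text = "Low, lower, lowest! New, newer, newest."   # hypothetical sample text
words, merges = learn_bpe(tokenize(clean(text)), num_merges=6)
vocab = build_vocab(words)
ids = [vocab[s] for w in words for s in w]
table = make_embeddings(len(vocab), dim=4)
vectors = [table[i] for i in ids]

print("merges:", merges)
print("tokens:", words)
print("ids:   ", ids)
```

Printing `words` shows how frequent fragments become single tokens while rarer endings stay split into smaller pieces, which is the idea behind step 5.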
Date: October 22nd
Start: 19:00
Location: Aqtobe Hub, Aktobe, 52A Abilkayir Khan Avenue
-
Area
Artificial Intelligence and Machine Learning
-
Format
Offline
-
Region
Aktobe
-
Address
52A Abilkayir Khan Avenue, Aktobe 030019, Kazakhstan
-
Start date
Oct. 22, 2025, 7 p.m.
-
End date
Oct. 22, 2025, 11:59 p.m.