AI Datasets Platform in Kazakhstan

Accelerating AI Development Through Open Data
Access government datasets, open source data, and more to kickstart your AI projects

Artificial Intelligence Policy in Kazakhstan
Concept for the Development of Artificial Intelligence for 2024-2029

 


Public Administration Sector:

In Kazakhstan, central government bodies have registered more than 183 information systems that collect and process data for government needs.

Financial Sector and Business:
Banking institutions, cloud-based customer services, and telecommunications companies also actively generate big data, contributing to the development of AI in the country.


Participate in AI Development
Become a Partner

01

Provide Your Data for AI Development

Share your datasets to support new AI projects and contribute to the development of innovative solutions in key industries.

02

Become a Partner of the AI Ecosystem

Unlock opportunities for collaboration with leading startups and experts by providing data for the development of practical and promising solutions.

03

Increase the Value of Your Data

Engage with AI projects at early stages and see how your data becomes the foundation for new technologies that align with your strategic goals.

WORKING WITH OPEN DATA

Types of Licenses for Datasets

When working with datasets, it is important to understand the different licenses that govern their use. Below is a brief overview of common licenses for datasets, from the most open to the most restrictive.

    1. Public Domain (Public Domain Mark)
    You can dedicate your dataset to the public domain, relinquishing all rights. This isn’t technically a license but a dedication of your work for unrestricted public use.
    2. CC-0 (Creative Commons Public Domain Dedication)
    Similar to the public domain, this license allows you to formally waive your rights, enabling anyone to use your dataset without restrictions.
    3. PDDL (Open Data Commons Public Domain Dedication and License)
    Another public domain-style license, PDDL enables dataset owners to surrender rights even if local law does not directly support public domain dedication.
    4. CC-BY (Creative Commons Attribution 4.0 International)
    This open license allows users to share and adapt your dataset, provided they give proper credit to you.
    5. CDLA-Permissive-1.0 (Community Data License Agreement – Permissive)
    This permissive license allows users to use, modify, and share your dataset as long as they credit you, with no restrictions on the results derived from computational use.
    6. ODC-BY (Open Data Commons Attribution License)
    Similar to CC-BY, this license allows users to share and adapt your dataset, requiring them to credit the original creator.
    7. CC-BY-SA (Creative Commons Attribution-ShareAlike 4.0 International)
    This license allows users to share and adapt your dataset while requiring any modifications or adaptations to be distributed under the same license. This "viral" license may deter others from using your dataset due to these conditions.
    8. CDLA-Sharing-1.0 (Community Data License Agreement – Sharing)
    Designed with copyleft principles, this license allows users to share your dataset and their modifications under the same license, ensuring any adaptations remain open and credited.
    9. ODC-ODbL (Open Data Commons Open Database License)
    Like CC-BY-SA, this license allows sharing and adapting the dataset while requiring modifications to remain under the same license. It can be considered a viral license due to its share-alike obligations.
    10. CC BY-NC (Creative Commons Attribution-NonCommercial 4.0 International)
    Users can share and adapt your dataset, but they must give credit and refrain from using it for commercial purposes.
    11. CC BY-ND (Creative Commons Attribution-NoDerivatives 4.0 International)
    Users can share your dataset with proper credit but cannot make any modifications or transformations.
    12. CC BY-NC-SA (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International)
    This restrictive license allows sharing and adapting for non-commercial purposes only, while requiring any modifications to be licensed under the same terms.
    13. CC BY-NC-ND (Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International)
    One of the most restrictive licenses, it allows only unmodified sharing for non-commercial purposes with proper credit. No adaptations are permitted.
    14. Other License Options
    If the license you require isn’t listed, you can specify your own terms under “Other” when creating your dataset.
    15. No License Specified
    Without a license, your dataset cannot legally be used, shared, or adapted by others.
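
The overview above can be read as combinations of four conditions: attribution, share-alike, non-commercial, and no-derivatives. The sketch below is one possible way to encode that reading so a project can screen datasets before use. It is a simplified illustration based only on the summaries in this list, not on the full license texts. The names `LicenseTerms`, `LICENSES`, `is_use_allowed`, and `obligations` are hypothetical and not part of any platform API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LicenseTerms:
    """Simplified conditions, as summarized in the license overview."""
    attribution: bool = False     # users must credit the original creator
    share_alike: bool = False     # adaptations must keep the same license
    non_commercial: bool = False  # commercial use is not permitted
    no_derivatives: bool = False  # modified versions may not be shared


# Keys follow the identifiers used in the list above (assumed labels).
LICENSES = {
    "Public Domain": LicenseTerms(),
    "CC-0": LicenseTerms(),
    "PDDL": LicenseTerms(),
    "CC-BY": LicenseTerms(attribution=True),
    "CDLA-Permissive-1.0": LicenseTerms(attribution=True),
    "ODC-BY": LicenseTerms(attribution=True),
    "CC-BY-SA": LicenseTerms(attribution=True, share_alike=True),
    "CDLA-Sharing-1.0": LicenseTerms(attribution=True, share_alike=True),
    "ODC-ODbL": LicenseTerms(attribution=True, share_alike=True),
    "CC BY-NC": LicenseTerms(attribution=True, non_commercial=True),
    "CC BY-ND": LicenseTerms(attribution=True, no_derivatives=True),
    "CC BY-NC-SA": LicenseTerms(attribution=True, share_alike=True,
                                non_commercial=True),
    "CC BY-NC-ND": LicenseTerms(attribution=True, non_commercial=True,
                                no_derivatives=True),
}


def is_use_allowed(license_id: str, commercial: bool, modify: bool) -> bool:
    """Return True if the planned use fits the simplified terms."""
    terms = LICENSES.get(license_id)
    if terms is None:
        # Unknown or missing license: treated here as "not permitted".
        return False
    if commercial and terms.non_commercial:
        return False
    if modify and terms.no_derivatives:
        return False
    return True


def obligations(license_id: str) -> list[str]:
    """List what a user must do when sharing or adapting the dataset."""
    terms = LICENSES[license_id]
    duties = []
    if terms.attribution:
        duties.append("credit the original creator")
    if terms.share_alike:
        duties.append("release adaptations under the same license")
    return duties
```

For example, `is_use_allowed("CC BY-NC", commercial=True, modify=False)` returns `False`, while `obligations("CC-BY-SA")` lists both the attribution and the share-alike duty. Datasets under "Other" terms do not fit this table and need to be reviewed individually.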

Explore the Main Types of Data Used for AI Project Development

Spatial 3D Data

Datasets with three-dimensional models of objects, CAD files, depth maps, and motion data.


 

Audio-Speech Data

Datasets with audio recordings, annotated speech data, and audio clips in different accents and languages.


Computer Vision

Datasets of images and videos with annotations for object, face, and scene recognition.


Data Science

Datasets with business data, financial records, statistics, and time series.




Image Generation

Image datasets, image metadata, and datasets with samples for generating new visual content.


Natural Language Processing (NLP)

Text corpora, emotion-labeled datasets, contextual texts, and dialogue records.


Reinforcement Learning

Simulation datasets, game environments, and interaction scenarios for training models based on actions and feedback.


Video Data

Video archives with object and motion annotations, and data for tracking behavior and actions in dynamic scenes.


Other useful resources
Open sources of data and tools for AI research and development

Cloud Credits - Programs for Startups



Microsoft for Startups Founders Hub

Free access to OpenAI models, up to $150,000 in Azure credits


NVIDIA Inception Program

Special conditions for hardware and software products, cloud credits from NVIDIA partners



 

Google for Startups Cloud Program 

Up to $200,000 ($350,000 for AI startups) in cloud credits over 2 years


Yandex Cloud Boost

Astana Hub resident companies can receive a grant to use the Yandex Cloud platform, which has an availability zone in Kazakhstan.



 

DigitalOcean Hatch

DigitalOcean Hatch is DigitalOcean's global startup program that enables startups to grow and build in the cloud. Join Hatch to receive up to $5,000 in credits on DigitalOcean's cloud platform for 12 months.


Open Datasets for Research and Development

Cloud Solutions and Resources for Developers


Have questions or suggestions? Contact us
