China to explore new dataset trading models such as token-based trading
The National Data Administration recently issued a notice soliciting public comment on the "Implementation Plan for Promoting the Construction of High-Quality Industry Datasets (Draft for Comments)".
A high-quality industry dataset is a collection of industry data that has been collected and processed, can be used directly to develop and train artificial intelligence models, and can effectively improve the performance of models, agents, intelligent terminals, and other applications. The category covers both industry-general and industry-specific datasets.
The plan proposes that, by the end of 2028, China will have built a batch of high-quality industry datasets that cover key areas and have been verified in application, created a set of typical application scenarios for data-driven artificial intelligence innovation and development, cultivated a group of innovative data enterprises and professionals with leading advantages, and established a set of standards and tools for building high-quality industry datasets.
The plan states that, with a focus on the pre-training and reinforcement learning stages of artificial intelligence, China will continue to promote the construction of high-quality multimodal datasets spanning text, images, audio, and video. It calls for strengthening the construction of datasets such as knowledge bases, knowledge graphs, and ontologies for new forms of intelligent applications such as intelligent agents. To meet the development needs of embodied intelligence, it calls for accelerating the construction of real-machine interaction datasets for key scenarios such as physical interaction, environmental perception, and motion control. It also calls for actively planning the construction of cutting-edge datasets such as those for world models.
On innovating business models for high-quality industry datasets, the plan proposes a leap from selling basic data packages to offering application programming interface (API) calls, model-based solutions, and full-stack services. It also proposes exploring new dataset trading models such as token-based trading, and building a dataset value system in which value is quantified and priced in tokens. (Reporter Wang Shanshu)