(TibetanReview.net, Dec03’25) – The Chinese government is developing AI systems in the languages of ethnic minorities in the People’s Republic of China (PRC) to expand state surveillance and control, an Australian think tank warned on Dec 1. In this connection, China announced last month that it had launched a Tibetan large language model (LLM), which would be made official after being registered with regulatory authorities.

The Australian Strategic Policy Institute (ASPI) said in a report published Dec 1 that Beijing is developing LLM-based public opinion analysis systems for Korean, Uyghur, Tibetan and Mongolian. The goal, it said, is to increase the state’s ability to monitor and control communications in those languages across text, video and audio.

The report, titled “The party’s AI: How China’s new AI systems are reshaping human rights,” points to a government-backed laboratory at Minzu University of China as a key driver of the effort. It says China’s Ministry of Education established the National Key Laboratory of Ethnic Language Intelligent Analysis and Security Governance at the university for this purpose.

The lab’s website states it was set up to “maintain national stability and ethnic unity.” Its main research areas include developing LLMs for Korean, Uyghur, Tibetan and Mongolian to build public opinion analysis and online security systems for minority communities.

Researchers collect internet data from regions inhabited by ethnic minority groups, extracting meaning from text, audio, video and even emojis posted by users. Using this, the lab aims to build what it calls “internet public opinion monitoring and sentiment analysis technology,” and claims to have developed large-scale knowledge databases in more than 10 minority languages, reported koreajoongangdaily.joins.com Dec 3.

The PRC has sizeable minority populations, including 1.7 million ethnic Koreans, 12 million Uyghurs, 6 million Tibetans and 6 million Mongolians. Minority languages have long represented a blind spot for Chinese state surveillance, the ASPI report noted.

***

In the case of Tibet, a Tibetan large language model named SunshineGLM V1.0, the first Tibetan foundation model in the PRC with hundreds of billions of parameters, was launched on Nov 19 in the capital, Lhasa, China’s official Xinhua news agency reported Nov 24.

The report said the launch took place at an event at Tibet University, where Nyima Tashi, the chief scientist of the research team and a professor at the university, said the model was trained on around 28.8 billion tokens of high-quality Tibetan-language data.

The data were said to include a large-scale corpus of Tibetan sentences and texts, Chinese-Tibetan and Tibetan-English parallel corpora, and entries from Chinese-Tibetan bilingual dictionaries, covering fields such as news reporting, law, medicine, philosophy, education, culture, science and technology.

The report cited developers as saying SunshineGLM V1.0 could handle complex language structures and multi-domain knowledge. According to them, it demonstrates proficient semantic understanding of Tibetan, responds promptly to queries, produces clear and accurate content, and excels in areas including Tibetan text generation and machine translation.

The report said that as a foundation model, SunshineGLM V1.0 could be widely applied in the development of sector-specific models, such as in agriculture, tourism, education, Tibetan medicines and high-altitude healthcare.