ANHUI, East China — These days, Xu Li doesn’t worry when her elderly father takes off on one of his frequent solo trips to Europe. Although the native of provincial capital Hefei speaks little English, and his Chinese is so heavily inflected by his city’s local patois that other Chinese cannot always understand him, Xu trusts that, in an emergency, he can communicate using the dialect transcription and translation app on his phone. After all, she helped develop the program.
Xu is a researcher at iFlytek, a Chinese tech firm known for its AI-powered iFlytek Voice Input keyboard app. The app’s voice-to-text function lets users transcribe vocal recordings and dictate written messages. But to win over consumers and succeed in the Chinese market, it must do more than recognize and translate spoken Standard Mandarin, the country’s only official national language. It has to understand and process dialect, too.
That’s where Xu comes in. Despite decades of concerted government effort, tens of millions of Chinese, including Xu’s father, still speak little-to-no Standard Mandarin, instead communicating in one or more of the country’s thousands of distinct regional and local vernaculars. It’s Xu’s job to increase the accuracy of iFlytek’s input method — and the software’s potential scope — by using deep learning techniques to teach the program these dialects.
Xu Li demonstrates iFlytek’s translation tool in the company’s headquarters in Hefei, Anhui province, April 10, 2019. Qian Zhecheng/Sixth Tone
To facilitate her work, iFlytek set up the Dialect Protection Plan in 2017. The plan uses a network of volunteers across the country to record and upload dialect samples to a dedicated dialect-protection app run by the company, contributing everything from simple words and sentences to complex, practical examples of everyday language. The company doesn’t pay them, instead appealing to their pride and passion for their native tongues. Xu and her team then use the data to train their machine learning algorithms.
For years, linguistic activists have bemoaned the steady retreat of local dialects in the face of the growing cultural and social dominance of Standard Mandarin — which is amplified by heavy-handed government support, rapid urbanization, and the rise of new mass media technologies like radio and television. But Xu and her team claim they’ve found a novel solution to this problem. Rather than diminishing dialects, they believe new technology can protect them and keep them relevant, even in the face of globalization and longstanding attempts to establish a countrywide lingua franca in China.
The Standard Mandarin used on the Chinese mainland today dates back to 1956, when the nation’s new leaders launched a renewed campaign for linguistic unity. That year, in a directive supporting the campaign to popularize Standard Mandarin, then-Premier Zhou Enlai noted how China’s lack of a universally intelligible language was getting in the way of communication between far-flung communities and “slowing the construction of socialism.” He claimed that a national language would be crucial for China’s efforts to develop politically, economically, and culturally, as well as to enhance its national defense.
“It was inevitable that promoting Standard Mandarin would devastate dialects, but at the time, this was hard to foresee,” linguistics professor Tao Huan tells Sixth Tone. A specialist in Chinese dialects at Shanghai’s Fudan University, Tao adds that dialect loss is an acute, radical, and irreversible phenomenon — Standard Mandarin simply possesses too many natural advantages over dialects in business, government, and media communications. “Standard Mandarin will dominate the market, even if you don’t promote it,” he says.
A woman walks past posters of an exhibition about the city’s local dialect in Hohhot, Inner Mongolia Autonomous Region, May 23, 2019. Ding Genhou/VCG
China’s early leaders, however, did not seem to share Tao’s opinion. In addition to mandating that official business be conducted in Standard Mandarin, Zhou’s directive took aim at the education and entertainment industries. It asked provincial governors to provide targeted learning materials for speakers of local dialects, education authorities to produce new audio learning materials, and radio stations to begin incorporating Standard Mandarin into their programming.
Over the years, the campaign to promote Standard Mandarin grew in scale and scope. By 2001, the law required broadcasters to get official authorization from the national or provincial authorities to use dialect in their programming; it also banned the use of dialects for government business unless absolutely necessary. A three-year study that concluded the same year found that only 53% of Chinese could communicate in Standard Mandarin. Roughly 10 years later, in 2010, a study backed by the Ministry of Education (MOE) found that Standard Mandarin proficiency had spread to over 70% of the population, and in 2017 the MOE predicted this number would reach 80% by 2020.
Even if that target is met, however, it would still leave hundreds of millions of Chinese with an inadequate command of the country’s official language.
Xu thinks iFlytek can advance the goal of mutual intelligibility without forcing people to learn and use Standard Mandarin. “If you create a barrier-free environment for dialect speakers outside of the mainstream (Standard Mandarin) environment … people will feel more comfortable speaking dialect,” she says.
An iFlytek booth during a CES ASIA event in Shanghai, June 12, 2019. Gao Yuwen/VCG
But to pull this off, Xu’s team first needs to train its translation algorithm to understand the country’s diverse range of often highly distinct dialects, a task the Dialect Protection Plan is meant to tackle. “Current methods of dialect preservation rely heavily on investigators’ subjective senses and ‘mouth-to-ear’ (oral) collection, which are neither effective nor efficient,” Li Qiangjun, a vice-general manager at iFlytek, tells Sixth Tone in a written statement. According to Li, the plan grew out of a desire to accelerate data collection.
Whereas the company once relied entirely on teams of investigators and researchers to collect linguistic samples from remote, mountainous villages across the country, for the past year it has been able to supplement this data with recordings from over 380,000 volunteers. Not only has the new approach saved the development team money, it has also broadened their access to linguistic samples. Previously, researchers could only ask participants to read aloud from a prepared script. On the app, however, contributors can specify their dialect, upload recordings of anything they want, and translate their remarks into Standard Mandarin.
This pool of volunteer contributors has so far allowed iFlytek to build databases on 23 different dialects, from coastal Cantonese and Shanghainese to inland Sichuanese. Once these databases are in place, Xu can use them to power the company’s input system. She likens the system to a baby that still needs processed food: Researchers (or, in the case of data acquired via the Dialect Protection Plan, volunteers) must first tag the samples before they can be run through it. Typically, tagged data consists of a written line of dialect, its Standard Mandarin counterpart, and annotations pairing the corresponding elements of each phrase.
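iFlytek has not published the format of its tagged data. Purely as an illustration, a tagged sample of the kind described here could be sketched in Python as follows. The class name, field names, and the short Cantonese phrase are assumptions made for this sketch, not details supplied by the company.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TaggedSample:
    """Hypothetical record for one tagged dialect sample (not iFlytek's format)."""

    dialect: str                      # dialect name specified by the contributor
    dialect_tokens: List[str]         # the written line of dialect, split into units
    mandarin_tokens: List[str]        # its Standard Mandarin counterpart
    alignment: List[Tuple[int, int]]  # (dialect index, Mandarin index) pairs

    def is_valid(self) -> bool:
        """Check that every annotation points at an existing token on both sides."""
        return all(
            0 <= d < len(self.dialect_tokens) and 0 <= m < len(self.mandarin_tokens)
            for d, m in self.alignment
        )

    def pairs(self) -> List[Tuple[str, str]]:
        """Return the paired elements the annotations describe."""
        return [
            (self.dialect_tokens[d], self.mandarin_tokens[m])
            for d, m in self.alignment
        ]


# Illustrative example: Cantonese "我唔知" and Standard Mandarin "我不知道"
# ("I don't know"). The token boundaries are chosen by hand for this sketch.
sample = TaggedSample(
    dialect="Cantonese",
    dialect_tokens=["我", "唔", "知"],
    mandarin_tokens=["我", "不", "知道"],
    alignment=[(0, 0), (1, 1), (2, 2)],
)

if sample.is_valid():
    for dialect_unit, mandarin_unit in sample.pairs():
        print(dialect_unit, "->", mandarin_unit)
```

Storing the pairings as index pairs rather than copied strings keeps each annotation tied to a position in the sentence, so a word that appears twice is not confused with itself.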
If she can successfully train her algorithms to understand and translate dialects, Xu envisions a dialect-friendly digital age. Standard Mandarin was largely designed to overcome the difficulties in regional communication across a wide range of linguistic fault lines. Xu says she wants to achieve the same thing, but through an app that equips dialect-speakers with portable translators.
iFlytek isn’t the only Chinese company currently working on speech recognition and input technology — or the only one framing its efforts as dialect protection. An AI lab sponsored by Chinese e-commerce giant Alibaba launched a 100 million yuan ($14.9 million) “dialect protection program” of its own this March. The program aims to create a comprehensive database of Chinese dialects, which in turn will be used to improve Alibaba’s own smart voice technology.
One reason companies like iFlytek and Alibaba have started flying the flag of dialect protection is that there’s real demand for such programs among Chinese dialect-speakers hoping to protect their linguistic and cultural heritage. In 2010, residents of the southern city of Guangzhou organized a rally to support Cantonese-language programming. In addition to grassroots volunteers who use their free time to contribute linguistic samples to iFlytek, there are also famous dialect enthusiasts like TV host Wang Han, who in 2016 donated 5 million yuan to record and preserve his native Hunan dialect.
Linguistics scholar Tao acknowledges that technology like iFlytek’s is valuable, but he’s unsure if it can really save China’s dialects. “The tech has value in terms of researching language learning processes or (understanding) how computers study language through deep learning, but it might not have an obvious social impact,” Tao says.
The accuracy of iFlytek’s input method also remains inconsistent. In an interview, Xu claimed that the company’s software boasts an accuracy rate of 90% or above for Sichuanese, and 80% or above for eastern China’s Suzhou dialect, which is less closely related to Standard Mandarin.
When Xu demonstrates the app’s capabilities for Sixth Tone, the program has little problem recognizing her native Hefei dialect — which was among the first dialects her team worked on. However, when Sixth Tone later tests the program on the reporter’s native Suzhou dialect — the most recent addition to iFlytek’s database — its accuracy is well below the 80% claimed by the company.
But whether or not her app can live up to the hype, Xu hopes it can make a difference. “Our languages are all converging into the mainstream, meaning weak (dialects) will only get weaker,” Xu says. “If we don’t protect dialects, if they disappear, then after a few generations, no one will know them — or this period’s culture.”
Editor: Kilian O’Donnell.
(Header image: Wu Huiyuan and Ding Yining/Sixth Tone)