The leaked data also reveals close police surveillance of foreigners from the moment they arrive in the country.
By Jane Tang for RFA Mandarin
July 16, 2022
Ren, a U.S. citizen who has lived in China for decades, didn't realize she was the victim of what could be the biggest data breach in Chinese history until she got a call from RFA.
She held her breath as, one by one, her ID card number, date of birth, entry and exit information and home address were read out to her from the massive data leak from the Shanghai police computer system, and confirmed that they were all correct.
Ren was left reeling at the public exposure of so much of her personal information, but also with a sense of helplessness, feeling that there was little she could do about it.
"It feels so weird and creepy at the same time, as if all your personal information are just out there," she said. "I also think about my [COVID-19 test results], health code, everything related to me is tied to my passport number. Are they all public?"
"What can I do now? I can't change any information, that is my identity in China, and it was leaked from the government. It's annoying, alarming, but I just can't do anything about it," she said.
Security experts estimate that Ren's details, like those of around one billion other people, were exposed online as early as April 2021.
But it wasn't until June 30 that the leak came to the attention of the media, after a hacker forum user with the handle ChinaDan posted an offer to sell 23 TB of data from the Shanghai police department, including sensitive personal information on a billion people, for 10 bitcoin (around U.S. $200,000).
ChinaDan didn't specify how they came by the data, only that it was hosted on Alibaba Cloud.
But they uploaded three folders containing some 750,000 database entries by way of a sample for potential buyers, among which RFA found Ren's information.
As well as victims' names, photos, phone numbers, street addresses, ages, genders and ID numbers, the data files on offer included people's hometowns -- an important part of law enforcement and access to public services under the ruling Chinese Communist Party (CCP)'s household registration, or "hukou," system -- details of business trips, and even instructions left for delivery couriers.
Shortcomings on data protection
China is one of the few countries in the world to enforce total real-name registration requirements for online services, which critics say enables the authoritarian regime's sweeping surveillance of its people.
Now, the Shanghai leak has exposed massive shortcomings in the way the authorities protect all the data they hold on anyone living in China, not just Chinese nationals.
RFA found at least 55 other U.S. citizens in the third folder of people who had come to the attention of police, mostly because they didn't register with their local police station within 24 hours of arriving in China.
This has been a requirement for all foreigners arriving in the country since the Exit and Entry Administration Law took effect in 2013.
"If you want to live in China, you have no choice but to submit this information again and again," Ms. Ren told RFA.
A U.S. State Department spokesman told RFA in background comments that the department was aware of reports of a data leak in Shanghai, but declined to comment further due to privacy concerns.
The State Department's information page for China warns U.S. citizens that their movements will be monitored.
"Security personnel carefully watch foreign visitors and may place you under surveillance," it says.
It warns that hotel rooms, meeting rooms, offices, cars, taxis, telephones, internet usage, digital payments, and fax machines used by overseas nationals could all be monitored on site or remotely, while personal possessions in hotel rooms, including computers, may be searched without their consent or knowledge.
"Security personnel have been known to detain and deport U.S. citizens sending private electronic messages critical of the Chinese government," the page says.
Neither the Shanghai government, nor the police department, nor the municipal branch of the Cyberspace Administration had responded to requests for comment at the time of writing.
'Reliable population data'
The leaked folder of data relating to people who have come to the attention of the Shanghai police department includes cases of fraud, theft, domestic violence, child abuse and rape.
But it also includes details of two people who reposted or posted tweets relating to China's leaders on Twitter, getting around China's Great Firewall of internet censorship.
Both were shocked and worried when contacted by RFA, and had no idea their details had been leaked.
"How did you get this number?" one asked. "Who are you, and where did you get this [information]?" the other wanted to know.
RFA dialed some phone numbers from the samples at random, in a bid to confirm at least some of the details were correct.
Some numbers were no longer valid, while some people hung up the moment they heard about the data breach.
At least 10 verified to RFA that their information was correct.
One woman said she had been getting two or three calls a day from unknown numbers in the days immediately following the data breach.
Some internet users have been confirming the authenticity of the data using phone number and name searches via the Alipay digital payments system.
Chinese population statistics researcher Yi Fuxian of the University of Wisconsin-Madison said the Shanghai data sample posted by ChinaDan was highly dispersed and random, covering almost every county in China, and was in line with data from the 2010 census.
"It shows that the quality of the sampling is very high, and that overall, this is reliable population data," Yi said.
Online security experts told RFA that the leak came as no surprise to them, but were cautious about commenting further.
"These three sets of data are different from previous [leaks] because they contain police intelligence," a Hong Kong-based technology company founder surnamed Wong told RFA.
This means that the data could already have been sold off to a private buyer, before ChinaDan offered it for general sale.
"For hackers, the most valuable thing isn't to sell the data publicly, but [privately,] without the hacked platform knowing anything about it," he said. "Then they can use it to do something illegal and lucrative."
"Once that's happened, then the first layer of value has been used up ... and it will get cheaper and cheaper over time," Wong said.
The leak intelligence website LeakIX has found that data from the Shanghai police database had been exposed as early as April 2021.
A programmer using an ElasticSearch server to build a big data search system for the Shanghai Public Security Bureau backed up the data to Alibaba Cloud, but mistakenly set it up as a data visualization website, making all the information viewable or downloadable through Kibana.
Bob Diachenko, founder of cybersecurity research firm Security Discovery, has said via Twitter that his company was concerned about the exposure of this data set in April this year, and that no password was set until the database was hacked in June.
ChinaDan's "for sale" notice was in fact a ransom note.
"This data has been circulating for a long time, and now it has attracted attention because it has been sold on a forum used by many people," a data practitioner familiar with the industry in both China and the United States told RFA.
Alibaba Cloud won the bid for the Shanghai Public Security Bureau's "Smart Public Security Comprehensive Service Platform Construction Project" on July 15, 2019, with a budget of 22.53 million yuan (U.S. $3.3 million), which was to include the building of a portal and search function for the database.
In 2020, on CSDN, China's largest technical blogging platform for programmers, a user shared how to back up data to Alibaba Cloud, and in doing so, inadvertently leaked the access key to the Shanghai police server.
This isn't the first time a breach like this has happened.
In 2019, 90 million documents belonging to the Jiangsu provincial police department were exposed on a publicly accessible ElasticSearch server.
And at the end of 2020, a list with the personal details of 1.95 million CCP members from Shanghai was leaked online.
David Robinson, founder of the data security analysis agency "Internet 2.0," said such breaches aren't linked, but are the result of systemic, political issues.
"The major concern is how they publish leaked data with [indicators of people's identity] with no regard for privacy," he said. "A lot of the time this type of leak the data can be tampered with, have deleted sections or additions to the data."
Anyone inadvertently exposing access keys can be arrested and charged with "destroying computer information systems," so they are unlikely to report any security breaches to their employer, industry insiders told RFA.
'White hats' at risk
Meanwhile, "white hat" data security researchers face similar fears, meaning that vulnerabilities are unlikely to be identified, much less patched.
The shutdown of white hat hacker platform Wuyun.com in 2016, just six years after it started trying to get companies to pay more attention to cybersecurity, left the industry in disarray.
No reason for the shutdown was given at the time, but one suggestion is that Wuyun hackers may have exposed vulnerabilities in systems belonging to the CCP's outreach and influence arm, the United Front Work Department.
"If I find a loophole, I may contact software companies, developers or institutions in other countries, but you can't do that in China," a Chinese programmer surnamed Ma told RFA. "You have to submit it to the Cyberspace Administration first, and then to the state, but you don't know what they will do with it."
The central government has been strengthening controls over the management of security vulnerabilities.
The "Regulations on the Management of Online Security Vulnerabilities," jointly issued by China's Ministry of Industry and Information Technology, the Cyberspace Administration and the Ministry of Public Security in July 2021, state that when a security vulnerability is discovered, the information must be shared with the Ministry of Industry and Information Technology within two days.
In December 2021, China's Ministry of Industry and Information Technology punished Alibaba Cloud after it discovered an online security loophole linked to the U.S.-based Apache Software Foundation but failed to report it to the telecommunications authorities in a timely manner.
Instead, the Alibaba Cloud security team had first notified Apache.
And the "Measures for Security Assessment of Data Exports" published on July 7 require a national security review for any transfer of personal data involving more than 100,000 people.
Companies undergoing such reviews must show the purpose of the data transfer, the security measures being taken, and the laws and regulations of the destination country, which investigators then review based on the possibility of data breaches.
Meanwhile, all references to the data breach have been censored on Chinese social media, with blog posts about the breach quickly deleted.
"Most Chinese people are asking similar questions, and the ones that are censored and deleted the most are: Has my data been leaked? How much data do they have about me? Why isn't my personal information stored securely?" Charlie Smith, co-founder of the China-based internet censorship watch website GreatFire.org, told RFA.
As one social media user commented wryly after the news broke: "Data is leaked; everyone's running around naked. It's a lovely day on the Chinese internet."
Translated and edited by Luisetta Mudie.