Exclusive: Documents seen by Guardian show tech firms using information to build ‘Covid-19 datastore’
Technology firms are processing large volumes of confidential UK patient information in a data-mining operation that is part of the government’s response to the coronavirus outbreak, according to documents seen by the Guardian.
Palantir, the US big data firm founded by the rightwing billionaire Peter Thiel, is working with Faculty, a British artificial intelligence startup, to consolidate government databases and help ministers and officials respond to the pandemic.
Data is also being used by Faculty to build predictive computer models around the Covid-19 outbreak. One NHS document suggests that, two weeks ago, Faculty considered running a computer simulation to assess the impact of a policy of “targeted herd immunity”. Lawyers for Faculty said the proposed herd immunity simulation never took place.
NHSX, the digital transformation arm of the National Health Service that has contracted the tech companies to help build the “Covid-19 datastore”, said the technology would give ministers and officials “real-time information about health services, showing where demand is rising and where critical equipment needs to be deployed”.
“The companies involved do not control the data and are not permitted to use or share it for their own purposes,” a spokesperson said. Faculty’s lawyers said the firm only had access to aggregated or anonymised data via NHS systems.
The government had previously said it would use Faculty and Palantir in a Covid-19 data project. But the full scope of that operation, and the sensitive nature of patient-level data being used, is revealed in the documents seen by the Guardian.
One portion of the project involves giving leaders in the NHS, Cabinet Office and Downing Street a live feed of “aggregate” statistics on hospitalisations, availability of critical care beds, ventilator orders and oxygen supplies.
However, the documents also appear to show the project includes large volumes of data pertaining to individuals, including protected health information, Covid-19 test results, the contents of people’s calls to the NHS health advice line 111 and clinical information about those in intensive care.
While such data will be anonymised, it remains sensitive and confidential, and its use in a new centralised government database is likely to raise questions among privacy experts. A Whitehall source said they were alarmed at the “unprecedented” amounts of confidential health information being swept up in the project, which they said was progressing at alarming speed and with insufficient regard for privacy, ethics or data protection.
The documents also suggest that:
- While anonymised, confidential 111 information in the Covid-19 datastore may include people’s gender, postcode, symptoms, the mechanism through which any prescription was dispatched to them, and the precise time they ended the call.
- The project appears to be using a “pseudo NHS number” to cross-match large datasets, including a master patient index, an existing NHS resource that uses “social marketing data” to segment the British population into different “types” at household level.
- While not a current priority, phone location data could be used in the datastore after it was “offered” to the government by two private companies for help with contact tracing. The NHS declined to say which companies had offered the location data or how it would be used.
- Faculty’s proposed simulation of a policy described as “targeted herd immunity” was part of an NHSX and Faculty planning document considered around 23 March, more than a week after ministers insisted the controversial policy was no longer being contemplated.
Lawyers for Faculty suggested the proposed simulation was the result of entirely internal, preliminary discussions. The planning document listed potential analysis of the impact of “targeted herd immunity (only isolate most vulnerable parts of population)” alongside other possible government policies such as social distancing, school closures and household quarantines.
The document was considered by senior NHS officials more than a week after the health secretary, Matt Hancock, tried to draw a line under the government’s controversial flirtation with the strategy of herd immunity, which involves enough people contracting the disease to develop population-level resistance.
The NHS said the data in the Covid-19 datastore would remain under its control and be subject to severe restrictions under data protection legislation. “Strict data protection rules apply to everyone involved in helping in this critical task,” an NHSX spokesperson said.
However, the Guardian was able to see confidential documents used by Palantir, Faculty and NHSX officials to plan, develop and execute the Covid-19 datastore. It is unclear who was responsible for making the documents – which did not contain NHS patient data – accessible via an unrestricted portal.
The Whitehall source described the open accessibility of the documents as a “shocking data breach”. Palantir did not respond to repeated requests for comment. Faculty’s lawyers said there had been no breach of patient or other sensitive NHS data.
The involvement of private sector data scientists at the heart of the government’s Covid-19 response stems in part from a Downing Street “summit” on 11 March attended by executives from dozens of tech firms, chaired by the prime minister’s chief adviser, Dominic Cummings, an enthusiast of artificial intelligence and computer modelling.
Faculty, which had a pre-existing contract to build an artificial intelligence lab for the NHS, took on a leading role in the data response to the pandemic. It is run by Marc Warner, whose brother, Ben, was reportedly recruited to Downing Street by Cummings after running the Conservative party’s private election model.
Ben Warner, who used to be a principal at his brother’s AI company, is said to have worked closely with Cummings on the modelling programme used in the Vote Leave campaign to leave the European Union. Faculty’s lawyers said its NHS contract was the result of a tender process that was not influenced by Cummings.
Faculty said in a statement it was “enormously proud” of its work for the NHS, which it said was helping save lives. “Faculty is not dealing with personally identifiable information. Faculty is helping to develop dashboards, models and simulations to provide key central government decision-makers with a deeper level of information about the current and future coronavirus situation to help inform the response.”
Palantir’s role in the project involves integrating NHS datasets with the US company’s data-management platform, Foundry. Microsoft, Google and Amazon products are also being used on the datastore project, but staff at those companies are understood to be less directly involved.