Why the NHS loves Palantir
Palantir have been brought in to provide a federated data platform to the NHS. The NHS has a fragmented data landscape: data is broken into functional silos, split across hospitals, regions and departments.
Data collected by GPs is not connected to the wider NHS; it's stored in their own local systems. There are 200 different NHS trusts, each using different flavours of electronic health record (EHR) system. There are statistical outputs from hospitals (such as the number of Accident and Emergency visits) and many more sources of data.
It is the ultimate in data challenges.
How do you get all this into a data product that can inform decisions on where to recruit, identify trends and provide an accurate picture of healthcare? The NHS has a global advantage in that its health system is both large and centralised, unlike the US, where healthcare data is privatised.
The NHS have chosen to go with Palantir. It is a controversial choice given the involvement of contrarian billionaire Peter Thiel but, in defence of NHS procurement, Palantir are likely the best able to make progress.
Why would you go with Palantir?
Palantir emerged from the "PayPal Mafia", a group of former PayPal founders and employees who went on to start highly successful technology companies in the early 2000s.
They offer a full data engineering solution through their Foundry product. Their consultants are also widely regarded as among the best in data engineering.
The Palantir Foundry platform brings data pipelines, transformations, monitoring and quality control into a low-code environment. This makes it easy to build data products and harmonise data sets.
Data connectors are provided to ingest data by a variety of means: flat files, APIs, database transfers, CSV… This is all Foundry-specific and best-in-class.
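To make "harmonising" concrete, here is a minimal sketch in plain Python. It is not Foundry code. It assumes two hypothetical trusts that export the same A&E attendance information in different shapes (one CSV, one JSON); the field names, trust labels, figures and the `Attendance` schema are all invented for illustration.

```python
import csv
import io
import json
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Attendance:
    """Common (hypothetical) schema that every source is mapped onto."""
    trust: str
    day: date
    visits: int


# Hypothetical export from trust A: CSV with UK-style dates.
TRUST_A_CSV = "Date,AE_Visits\n01/03/2024,412\n02/03/2024,398\n"

# Hypothetical export from trust B: JSON with ISO dates and different field names.
TRUST_B_JSON = '[{"day": "2024-03-01", "attendances": 520}]'


def from_trust_a(raw: str) -> list[Attendance]:
    """Source-specific adapter: parse dd/mm/yyyy dates and rename fields."""
    out = []
    for row in csv.DictReader(io.StringIO(raw)):
        d, m, y = (int(part) for part in row["Date"].split("/"))
        out.append(Attendance("trust_a", date(y, m, d), int(row["AE_Visits"])))
    return out


def from_trust_b(raw: str) -> list[Attendance]:
    """Source-specific adapter: parse ISO dates and rename fields."""
    return [
        Attendance("trust_b", date.fromisoformat(r["day"]), int(r["attendances"]))
        for r in json.loads(raw)
    ]


def harmonise() -> list[Attendance]:
    """Combine every source into one list that shares a single schema."""
    return from_trust_a(TRUST_A_CSV) + from_trust_b(TRUST_B_JSON)


def total_visits_by_day(records: list[Attendance]) -> dict[date, int]:
    """Once harmonised, cross-trust questions become simple aggregations."""
    totals: dict[date, int] = {}
    for r in records:
        totals[r.day] = totals.get(r.day, 0) + r.visits
    return totals
```

Each new source needs its own adapter, which is why the integration effort grows with the number of trusts and systems involved.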
Also provided in the suite of products:
Ontology: bind data to Objects, Actions and Processes. It allows data assets to behave like a digital twin and exist in relationships with other assets.
Data lineage: track where data has originated from, where and how it was transformed, and where it is being used. This is an important capability to allow clients to trust data products.
Pipelines as code: transforms and pipelines are created and stored as code assets. This supports robust testing and version control.
Reporting: a Power BI equivalent specifically built to browse Palantir warehouses.
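The ideas of transforms stored as code and lineage tracking can be illustrated with a small sketch. This is plain Python, not the Foundry API: the `Dataset` class, the `transform` decorator and the sample rows are all hypothetical, and a real platform would persist this metadata rather than hold it in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A named table of rows plus a record of how it was produced."""
    name: str
    rows: list
    parents: list = field(default_factory=list)  # upstream Dataset objects
    produced_by: str = "source"                  # name of the transform used


def transform(output_name):
    """Decorator: wrap a plain function so its output remembers its inputs."""
    def wrap(fn):
        def run(*inputs: Dataset) -> Dataset:
            rows = fn(*(d.rows for d in inputs))
            return Dataset(output_name, rows, list(inputs), fn.__name__)
        return run
    return wrap


@transform("valid_visits")
def drop_negative_counts(rows):
    # A simple quality rule: visit counts can never be negative.
    return [r for r in rows if r["visits"] >= 0]


@transform("visits_by_trust")
def sum_by_trust(rows):
    totals = {}
    for r in rows:
        totals[r["trust"]] = totals.get(r["trust"], 0) + r["visits"]
    return [{"trust": t, "visits": v} for t, v in sorted(totals.items())]


def lineage(ds: Dataset) -> list:
    """Walk back from a dataset to its sources, listing each step."""
    steps = [f"{ds.name} <- {ds.produced_by}"]
    for parent in ds.parents:
        steps.extend(lineage(parent))
    return steps


raw = Dataset("raw_visits", [
    {"trust": "A", "visits": 10},
    {"trust": "A", "visits": -1},  # bad record, filtered out by the quality rule
    {"trust": "B", "visits": 7},
])
report = sum_by_trust(drop_negative_counts(raw))
```

Because each transform is an ordinary function, it can be unit tested and version controlled, and `lineage(report)` lists every step between the final output and the raw source.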
Lock in, of the most severe sort…
The main drawback with Palantir is the vendor lock-in. Everything is bespoke to the Palantir ecosystem: the database, the pipelines, the ETL processes, the connectors and the scheduling. It will be close to impossible to migrate away from Palantir should the NHS sour on the contract.
Integrating regional data silos comes at huge cost, and that cost is frequently underestimated. This will become another lock-in property that will keep Palantir in the NHS for the long run.
Palantir have FAANG-level engineering capabilities and pay salaries to match: 200k+, with stock options to boot. This means they can create complex systems beyond the capabilities of most software engineering teams.
While this provides great functionality, it means the NHS would need eye-wateringly expensive staff to maintain the platform itself. More likely, the NHS will be paying high consulting fees indefinitely.
The contract is already set at £330 million, plus a £120 million extension. This is likely to go up, as the scale of the data integration challenge is so vast.
UK patient data should be guarded with national-level security. While data will be hosted locally, there are concerns about what data sharing actually entails. Anonymised data is highly prized across the world. Given the unique position the NHS is in to collect this, the UK Government needs to be wary of what contracts and agreements are in place.
What does the rest of the field bring?
If someone asked Metaops tomorrow to build a similar system, our answer would be "of course!" But we would also be wary of the incredible scope of this project. A solution using a typical stack of open-source and off-the-shelf tools would include:
iPaaS and data integration solutions such as AWS Glue to ETL data from existing systems.
Data engineering teams that would be source-specific, i.e. teams to enable data pipelines from all the disparate sources.
Databricks experts in place to set up staging areas and configure transforms.
Data catalogues and Ontology, open to the market and ideally interchangeable.
PySpark expertise to provide transformations.
dbt to create the necessary views and exports on the transformed data.
Power BI as a warehouse browser.
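The staging, transform and view layers in the list above can be sketched in miniature. This uses Python's built-in `sqlite3` module purely as a stand-in for a real warehouse, with plain SQL standing in for the PySpark transforms and dbt models; the table names and figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a real warehouse

# Staging area: raw records landed as-is from two hypothetical trusts.
conn.execute(
    "CREATE TABLE staging_visits (trust TEXT, visit_date TEXT, visits INTEGER)"
)
conn.executemany(
    "INSERT INTO staging_visits VALUES (?, ?, ?)",
    [
        ("A", "2024-03-01", 412),
        ("A", "2024-03-02", 398),
        ("B", "2024-03-01", 520),
        ("B", "2024-03-01", None),  # incomplete record
    ],
)

# Transform step: keep only complete records.
conn.execute(
    "CREATE TABLE clean_visits AS "
    "SELECT trust, visit_date, visits FROM staging_visits "
    "WHERE visits IS NOT NULL"
)

# Export view: the shape a reporting tool would browse.
conn.execute(
    "CREATE VIEW daily_totals AS "
    "SELECT visit_date, SUM(visits) AS total_visits "
    "FROM clean_visits GROUP BY visit_date"
)

daily_totals = conn.execute(
    "SELECT visit_date, total_visits FROM daily_totals ORDER BY visit_date"
).fetchall()
```

Each layer here is ordinary SQL, so in principle any one of the tools in the stack could be swapped for an alternative without rewriting the others.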
What might go wrong
The major problem we see with the Palantir initiative is scale. The sheer heft of moving all data sources into a single warehouse may take decades to achieve. Given the nature of the NHS, this will require serious coordination and some incredible leadership.
While the project may show quick wins, organisational inertia could be the killer here. That should not diminish the potential for progress, however.
The second major barrier will be educating staff. Palantir configuration requires experienced professionals, and this is likely to be a major stumbling block in the transition to business as usual (BAU).
It's an exciting phase for NHS data engineering and the most important technical transformation for the UK government this decade.