Using machine learning to make healthcare services better, cheaper and safer


Most of the interest (and venture capital funding) in healthcare AI is currently focused on very clinical use cases – automatically interpreting CT scans or retinal photographs for example, or trying to make a diagnosis from patients’ symptoms. These are the types of uses of AI that feature in a typical doctor-patient consultation. But behind each consultation is the whole multi-trillion-dollar industry of healthcare: all the work, activity, people, and infrastructure that is perhaps less obvious to patients (and hence to the data scientists and engineers turning an eye to using AI in healthcare) but which actually makes up the bulk of healthcare activity and expenditure. Human resources, management, financial administration, logistics and supply chains, planning, laboratories, facilities, R&D, safety systems: all of these are data-rich aspects of the healthcare industry where AI systems could find many uses.

I have a particular interest in using data to understand and improve the quality and safety of healthcare, and I am struck by the wide range of uses that one particular approach to AI, machine learning, could have in the type of work I do. One of the challenges in making this happen is that the machine learning experts (by and large) know very little about the healthcare industry, and the healthcare experts (in turn) know very little about machine learning. Bridging this knowledge gap through collaboration is going to be key.

So what then are the types of problems in healthcare quality that could be addressed by machine learning? Here I outline, in an admittedly extremely broad and simplistic sense, the main types of problems that machine learning algorithms can be used to solve, and how they could be used to make the industry of healthcare better, cheaper and safer.


Classification problems are those where the goal is to sort data into groups or categories. Examples include systems that help self-driving cars detect and avoid pedestrians, or that automatically classify photographs according to subject matter (“Pictures of cats and dogs”). In healthcare quality, classification could be used for:

  • Classifying hospitals into categories of performance or service provision, to generate hospital quality ratings or scorecards
  • Classifying patients into different categories based on diagnostic or procedure codes, or measures of healthcare utilization and cost (such as length of stay). These classifications are widely used as the currency in healthcare payment and reimbursement systems
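To make the idea of classification a little more concrete, here is a minimal sketch of a nearest-neighbour classifier that assigns a patient to a resource group based on the labelled patients they most resemble. Everything in it is an assumption for illustration: the two features (age and number of coded diagnoses), the "low" and "high" labels, and the data are hypothetical, and real payment groupings are far more elaborate than this.

```python
from collections import Counter
from math import dist

# Hypothetical training data: (age in years, number of coded diagnoses) -> resource group.
TRAINING = [
    ((34, 1), "low"),
    ((41, 2), "low"),
    ((29, 1), "low"),
    ((72, 6), "high"),
    ((80, 5), "high"),
    ((67, 7), "high"),
]


def classify(patient, training=TRAINING, k=3):
    """Return the most common label among the k nearest training examples."""
    # Features are used unscaled here for simplicity; a real system would
    # normally rescale them so that no single feature dominates the distance.
    nearest = sorted(training, key=lambda item: dist(item[0], patient))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(classify((75, 4)))
print(classify((38, 2)))
```

The point is only that the categories are learned from labelled examples rather than written down as rules in advance.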

Regression problems are those where the goal is to make predictions based on an existing set of data. Examples include prediction systems used in finance (e.g. financial forecasting, fraud detection) and retail (e.g. more efficient logistics by predicting demand). In healthcare quality, regression could be used for:

  • Predicting the effects of a service reorganisation or a quality improvement intervention (e.g. What will happen if we introduce this new patient referral pathway?)
  • Predicting patient outcomes for prognostication, providing better information for shared decision making or planning future health and social care needs
  • Estimating case mix adjusted outcomes such as survival rates after cancer or rates of surgical complications. These case mix adjusted outcomes are often used to compare the quality of hospitals
  • Predicting counterfactuals (what would have happened if the intervention had not taken place) as part of the evaluation of service reorganization or improvement interventions
  • Predicting variation in demand for healthcare services
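As a rough illustration of the last item in this list, the sketch below fits a straight line to a short series of weekly attendances and extends it one week ahead. The attendance figures are made up, and a straight-line trend is the simplest possible model; real demand forecasting would also need to handle seasonality, day-of-week effects and uncertainty.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: week number and emergency department attendances.
weeks = [1, 2, 3, 4, 5, 6]
attendances = [510, 522, 530, 541, 555, 560]

intercept, slope = fit_line(weeks, attendances)
forecast_week_7 = intercept + slope * 7

print(f"Estimated extra attendances per week: {slope:.1f}")
print(f"Forecast for week 7: {forecast_week_7:.0f}")
```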

Clustering problems are those where the goal is to identify data points that are similar to each other. For example, clustering algorithms are widely used in recommender systems in online retail (“Customers who bought this item also bought these….”) and in entertainment platforms such as Netflix and Spotify. In healthcare quality, clustering could be used for:

  • Identifying inequalities in care provision and quality, according to time (e.g. the weekend effect), place (e.g. geographical disparities) and person (e.g. inequalities)
  • Estimating the associations between processes of care and patient outcomes. These types of analyses are widely done as part of epidemiological or health services research studies and are useful in generating hypotheses for randomized controlled trials
  • Grouping together similar healthcare providers to enable more representative benchmarking and comparisons (e.g. between hospitals or between surgeons)
  • Identifying subgroups of patients with unexpectedly poor outcomes. This could help in detecting safety problems
  • Detecting significant patterns in time series data (anomaly detection; also a regression-type problem). Time series tools such as run charts and the various flavours of Statistical Process Control chart are some of the most frequently used tools in healthcare quality improvement
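To show what grouping similar providers might look like in practice, here is a minimal sketch of k-means, one common clustering method, applied to a handful of hospitals. The hospitals, the two features (bed numbers and annual admissions in thousands) and the starting centroids are all hypothetical, and I have fixed the starting centroids by hand so that the result is repeatable; real implementations usually pick them at random and rescale the features first.

```python
from math import dist


def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids to cluster means."""
    for _ in range(iterations):
        clusters = assign(points, centroids)
        centroids = [
            # An empty cluster keeps its previous centroid.
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, assign(points, centroids)


# Hypothetical hospitals: (number of beds, annual admissions in thousands).
hospitals = [(150, 12), (180, 15), (200, 14), (800, 70), (950, 85), (870, 78)]

centroids, clusters = k_means(hospitals, centroids=[hospitals[0], hospitals[3]])
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

Each resulting group could then serve as a peer group for benchmarking, so that small hospitals are compared with other small hospitals rather than with large teaching centres.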

Dimensionality reduction is the process of identifying the most significant variables (“features” in the language of ML) in datasets with lots of variables. These methods can be used to help summarise complex datasets. In healthcare quality, they could help to:

  • Devise and select metrics to measure the quality and safety of healthcare systems
  • Extract relevant information from electronic healthcare record systems with large numbers of data items
  • Design datasets for programmes to measure the quality and safety of healthcare (e.g. clinical registries and audits)
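One very simple way to shortlist candidate metrics is to rank them by how strongly each one is correlated with an outcome of interest. The sketch below does this with a hand-written Pearson correlation. Note that this is a basic filter-style feature ranking rather than a full dimensionality reduction method such as principal component analysis, and that the metric names and numbers are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical hospital-level data: one value per hospital for each candidate metric.
candidates = {
    "nurse_to_patient_ratio": [0.20, 0.25, 0.30, 0.35, 0.40],
    "car_park_spaces": [300, 120, 450, 200, 310],
    "median_wait_hours": [6.0, 5.5, 4.0, 3.5, 2.0],
}
readmission_rate = [0.18, 0.16, 0.13, 0.12, 0.09]

# Rank metrics by the absolute strength of their correlation with the outcome.
ranked = sorted(
    candidates,
    key=lambda name: abs(pearson(candidates[name], readmission_rate)),
    reverse=True,
)

for name in ranked:
    print(name, round(pearson(candidates[name], readmission_rate), 2))
```

A ranking like this only flags variables that move together with the outcome; it says nothing about cause and effect, so it is a starting point for choosing metrics rather than an answer.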


This is a very high-level and simplistic look at the types of ML methods available – underneath this extremely broad (and arguably oversimplistic) classification is a whole ecosystem of different methods and families of ML algorithms. The other key ingredient here is, of course, data – without training data these algorithms are merely concepts. Healthcare is full of data, but using it for machine learning is going to throw up all sorts of technical and ethical challenges. More on this another time…




Artificial Intelligence & Machine Learning in Healthcare (the first of many posts)


AI is HOT right now. A lot of money and (so far, mostly human) brains are being put into developing ways of using AI in all sorts of industries: finance, logistics, manufacturing – it’s actually quite hard to find an area of human endeavour which someone, somewhere is not trying to build an AI system for.

One of the areas where AI has been generating the most interest is healthcare. Indeed, Channel 4 News did a nice piece about AI in medicine this week, featuring the likes of Google Deepmind and a lot of visual imagery of glowing numbers cascading down hospital curtains.

Graphic designers: still stuck in the Matrix

There are a lot of people in healthcare who aren’t sure what AI is and what it might mean for their everyday work. I’m not an expert by any means, but I thought that a gentle introduction might be useful for all the doctors, nurses, managers, therapists, pharmacists (and all the many other people working in healthcare) who want to know a bit more about what this means.

Firstly, some terminology. “Artificial intelligence” is a general term for building computer systems to do the things that our human brains are good at: solving complex problems, recognising patterns, communicating through speech, making forecasts about the future and so on.

“Machine learning” is one example of a method used to build AI systems. The central idea here is actually quite intuitive (and the clue is in the name) – it’s all about learning. How are we able to drive a car, speak English, French or Japanese, create a beautiful work of art? These skills are not hard-coded into our brains, or activated in an instant when we reach certain ages: we learn them over time through interacting with and sharing data with the world around us. Machine learning takes the same principle and applies it to computers. Instead of programming computers to do things based on fixed rules (“If yes in English then output oui for French or はい for Japanese”) machine learning involves feeding in data and training the computer to do the thing we want it to do. If we do this well, then we end up with an algorithm (a set of instructions or actions) that does something useful – recognising faces in photos for example, or translating languages. Some of the most exciting recent advances in AI have come about from new techniques in machine learning.
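For readers who like to see things written down, here is a toy sketch of the difference between a fixed rule and a learned one. In the first function a programmer has chosen the cut-off; in the second, the cut-off is worked out from labelled examples. The "score", the labels and the data are entirely hypothetical, and real machine learning systems learn far more than a single number, but the principle is the same.

```python
def fixed_rule(score):
    """Hand-written rule: a programmer decided that 5 is the cut-off."""
    return score >= 5


def learn_threshold(examples):
    """Pick the cut-off that misclassifies the fewest labelled examples."""
    candidates = sorted({score for score, _ in examples})

    def errors(threshold):
        return sum((score >= threshold) != label for score, label in examples)

    return min(candidates, key=errors)


# Hypothetical training data: (score, did the case need urgent review?).
examples = [
    (1, False), (2, False), (3, False),
    (4, True), (6, True), (7, True), (8, True),
]

threshold = learn_threshold(examples)
print("Learned cut-off:", threshold)
print("Fixed rule on a score of 4:", fixed_rule(4))
print("Learned rule on a score of 4:", 4 >= threshold)
```

With different training data the learned cut-off would change without anyone rewriting the code, which is the sense in which the computer is being trained rather than programmed.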

The critical thing about machine learning is that you need data to train these algorithms: often a lot of it. Data is the fuel for making useful machine learning algorithms.

So what does this mean for healthcare? Well there are a few reasons why healthcare is fertile ground for AI, and machine learning in particular:

  1. Healthcare is full of complex problems and pattern recognition. The classic example here is the process of making a diagnosis, based on interpreting a ragbag mix of language data (symptoms) and quantitative data (lab tests and imaging). In the past humans were able to solve these types of problems much better than computers – but these are things that AI is getting increasingly good at
  2. Healthcare is full of data – from electronic healthcare records, to lab data, to administrative data, healthcare is heaving with data. Machine learning algorithms will be at home here
  3. Healthcare is expensive. Rich countries spend something like 10% of their GDP on healthcare – globally the amount of money spent on healthcare is probably north of $10 trillion per year. Most of this is spent on people (roughly 60% of the NHS budget is spent on staffing costs for example), since healthcare is mostly a people business. There is therefore a very strong economic drive to automate some of the things in healthcare that people currently do
  4. Healthcare has lots of scope to be, well…better. If we invented no new drugs or medical devices for the next 10 years, and just tried very hard to apply the things we know work, reduce the number of errors and mistakes, close inequalities in access and provision – basically improve the quality of existing healthcare – then we could achieve vast improvements in patient outcomes. Healthcare is a long way from being optimised, meaning that there is lots of room for new ways of doing things (such as AI) to make healthcare better

Making all this happen is, however, another matter. From a technical point of view, the role of AI in healthcare is still very limited, but is moving fast. Some areas of healthcare are going to be affected more quickly than others. Most of the biggest recent advances in machine learning have been in language processing and image recognition – I suspect the first machine medics are going to be graduating in radiology (…and working 24/7, for no pay). I also think that there are whole swathes of healthcare that are currently off the radar of the likes of Google Deepmind, but where the biggest gains could be made: managing and improving healthcare services for example.

The rise of AI in medicine is also going to raise all sorts of issues. What are the ethical implications of a world where algorithms are making medical decisions? What does this mean for legal liability and regulation? What new skills and knowledge will the people working in healthcare need? What do patients want and who is going to be in control? Who is going to own and profit from this? What are the implications for how we collect and use sensitive healthcare data? If data is so central to all this, do we need to be investing in collecting better data? I hope to explore these topics, and more, in future posts.