9/20/2023 9:57:17 AM | 13 minute read

Sex bias, racial bias and AI – An ESG issue for the life sciences and healthcare sectors

Get in touch

Nicholas Tyacke

Partner

Greg Bodulovic

Partner

Alexandra de Zwart

Senior Associate

Alex Horder

Senior Associate

Jordan Davis

It is increasingly important for organisations wishing to be good corporate citizens to have robust policies in place surrounding the environmental, social and governance (ESG) issues that impact them, and to actively uphold these policies in their operations and behaviours, both internally and externally. Organisations in the life sciences and healthcare sectors are no different, typically encountering a wide spectrum of ESG-related issues, including in relation to antibiotic resistance, equitable access to healthcare and medicines, manufacturing waste and contamination management, recyclable packaging, ethical supply chain management, modern slavery, anti-bribery and corruption, and diversity and inclusion. It is vital, as ESG considerations in industry become increasingly important, that organisations’ ESG policies are able to adapt to cater for new issues that arise.

With the exponential increase in the use of Artificial Intelligence (AI) driven technologies, organisations are now faced with a suite of new ESG-related issues connected to the responsible and ethical design, training, testing, deployment and use of AI in their business processes, which need to be addressed in their ESG policy framework. Significant use of AI is already being seen in the life sciences and healthcare spaces: as we recently reported, there have been a number of AI developments that have the ability to revolutionise the way we develop pharmaceuticals and medical devices, personalise patient care and treatment, take medical notes, answer medical questions, manage illness, and perform surgery. However, to fully realise the benefits of these advancements, there are also hurdles to overcome.

While the world at large grapples with the possible form and function of AI regulation, and the legal consequences of attempting to regulate such a nascent and rapidly evolving technology, a significant proportion of the global population is approaching AI with abundant caution,[1] particularly where it is being used in the life sciences and healthcare sectors. A recent survey by the Pew Research Center found that 60% of US adults surveyed said they would feel uncomfortable if their healthcare provider relied on AI for diagnosis and treatment recommendations, and 75% were concerned that healthcare providers will move too fast with AI, before fully understanding the patient-related risks.[2]

Accordingly, the use of AI in the life sciences and healthcare sectors presents some possible reputational and public perception issues, in addition to legal and regulatory considerations. Central to these issues is the concept of algorithmic bias in AI systems. Where AI is used in a way that produces decisions or predictions that are biased, this creates a significant risk for user organisations when observed through an ESG lens. This article explores the nature of that risk in the life sciences and healthcare sectors, where AI-driven decision making can have a real-life impact on the lives of patients.

As part of our series on the use of AI in the life sciences and healthcare sectors, in this article, we consider the specific issues of sex bias and racial bias in AI algorithms, and what life sciences and healthcare companies can do to mitigate algorithmic bias, ensure the responsible and fair use of AI, and honour their ESG commitments accordingly.

Training AI Systems

The ultimate goal of any AI system is to produce useful output, be that an answer to a medical question, identification of a potential drug candidate, discovery of a new drug compound, personalisation of a treatment plan, or diagnosis of a disease.

In order to obtain a useful result, an AI system must be trained to generate such output, using training data that may have been procured from a raft of different sources. Training data (such as pictures, text or scientific literature, depending on the capabilities of the relevant AI system) is fed into an AI system and a learning methodology is implemented in order to teach the system to be able to generate (with varying degrees of user guidance) the desired output.

However, the fatal flaw in this process is that, because an AI system and the quality of its output is generally only as good as the data it was trained on, an insufficient quantity of data used to train an algorithm will likely compromise that quality. Further, if there is a lack of ‘diversity’ in the data (where the data does not incorporate diverse demographics), this could lead to an AI system generating output that is biased. For example, an AI system that is designed to detect skin cancer, that was trained on data taken from previous scans of persons with lighter skin tones, could produce false positive or otherwise incorrect results in relation to persons with darker skin tones.

If the training data is reflective of the social and cultural biases of human society generally, including sex and racial bias, this will likely result in those biases being reproduced in the form of the relevant AI system’s output.

The Problem with Sex-Biased Training Data

As the body of medical data and research to date has historically been developed using the male anatomy and biology, medical literature, as a whole, is sex-biased. One study of medical textbooks revealed that images of male bodies were used three times as often as images of female bodies to illustrate “neutral” body parts (i.e. body parts that are present in both men and women).[3] Even today, women are under-represented in clinical trials (including for treatment of diseases that affect more women than men), with a variety of questionable reasons having been given for this, including that the variation in women’s hormones supposedly makes data too difficult to analyse and interpret (though, of course, if hormonal variation can affect results in a clinical trial, it is also likely to affect results for women receiving treatment in a real-world context, and thus study of those effects would seem to be both desirable and necessary to ensure an equal standard of care for men and women).[4] Regardless of the reason(s), the result is a dearth of medical and scientific data when it comes to women. Things are even worse when it comes to pregnant women, with Caroline Criado Perez lamenting that: “[b]ecause of their routine exclusion from clinical trials we lack solid data on how to treat pregnant women for pretty much anything”.[5]

As explained above, not only does the use of male-biased literature and data in the context of training an AI system risk the perpetuation of male-centric biases in AI outputs, it also has the potential to result in poorer health outcomes for women. A lack of gender diverse representation in training data poses direct risks to women’s health if diagnoses or treatments are developed based on overwhelmingly male data, but applied to women as if their bodies, and responses to disease or treatment, are the same as those of men, which we know is not the case.

Studies have shown that there are “sex differences in every tissue and organ system in the human body, as well as in the ‘prevalence, course and severity’ of the majority of common human diseases”,[6] including “the fundamental mechanical workings of the heart”,[7] lung capacity,[8] prevalence of autoimmune diseases,[9] antibody responses and adverse reactions to vaccines,[10] blood serum-biomarkers for autism,[11] immune cells used to convey pain signals,[12] how cells die following a stroke,[13] the expression of genes important in drug metabolism,[14] and in the presentation and outcome of diseases like Parkinson’s, stroke and brain ischaemia,[15] to name just a few examples. In addition, the timing of treatment during a person’s menstrual cycle has also been found to impact outcomes and/or appropriate dosages for a wide range of drugs, including antipsychotics, antidepressants, antihistamines, antibiotics, and heart medications.[16]

Despite all these significant differences, women represent a mere 22% of participants in phase one trials, and even animal studies tend to be carried out on majority male animals.[17] This lack of data about the female body already results in poorer health outcomes for women; the use of AI could worsen the current state of affairs unless AI systems are actively trained on (sex-disaggregated) data that meaningfully represents women.

Racial Bias in AI Systems can Lead to Unequal Health Outcomes

AI systems that have not been trained on data sets that reflect a sufficiently diverse range of racial and cultural demographics will also tend to generate racially biased outputs, which can contribute to unequal health outcomes for ethnic and racial minorities. This means that advances in AI may not necessarily translate into advances in life sciences and healthcare for Black, Hispanic, Indigenous, and other racial and cultural minorities and persons of colour. For example, a 2019 study found racial bias in an algorithm widely used to guide health decisions in the United States. The authors found that Black patients assigned a certain level of risk were overall sicker than white patients assigned the same risk level.[18] This was because the algorithm used healthcare cost, rather than illness, as an indicator of health needs and, because less money is generally spent caring for Black patients than white patients, the algorithm wrongly concluded that Black patients are healthier than equally sick white patients. Racial bias was similarly identified in an algorithm used by the US government to distribute COVID-19 financial aid to hospitals, with the result that communities with large Black populations received disproportionately low funding.[19] Racial bias has also been found in algorithms used to guide patient care regarding breast cancer, heart and kidney failure, chest surgery and obstetrics.[20]
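The proxy problem described above can be illustrated with a minimal sketch. The patient records, group labels and field names below are entirely hypothetical and are not drawn from the 2019 study; the sketch only assumes, as the study reported, that spending is lower for one group at a comparable level of illness. It shows how ranking patients by cost, rather than by a measure of illness, can change who is selected for additional care.

```python
# Hypothetical toy records: values are invented for illustration only.
patients = [
    {"id": "p1", "group": "group_a", "chronic_conditions": 2, "annual_cost": 9000},
    {"id": "p2", "group": "group_a", "chronic_conditions": 1, "annual_cost": 7000},
    {"id": "p3", "group": "group_b", "chronic_conditions": 4, "annual_cost": 6000},
    {"id": "p4", "group": "group_b", "chronic_conditions": 3, "annual_cost": 5000},
]


def select_top(records, field, k):
    """Return the ids of the k records with the highest value of `field`."""
    ranked = sorted(records, key=lambda r: r[field], reverse=True)
    return [r["id"] for r in ranked[:k]]


# Selecting on the proxy (cost) versus a direct measure of illness.
selected_by_cost = select_top(patients, "annual_cost", 2)
selected_by_illness = select_top(patients, "chronic_conditions", 2)

print("Selected using cost as the proxy:", selected_by_cost)
print("Selected using illness burden:   ", selected_by_illness)
```

In this toy data, the two sickest patients are not selected when cost is used as the ranking variable, because less has been spent on their care. Real systems are far more complex, but the same question applies: does the variable the system is trained to predict actually measure the need it is meant to serve?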

Just as women are under-represented in clinical trials, Black, Hispanic, Indigenous, and other racial minorities and persons of colour are also frequently under-represented, which may be due to poor discoverability and reachability, lack of participation, or ineligibility to participate;[21] this can result in a lack of relevant scientific data related to those classes of person. For instance, in the US, Black and Hispanic patients are generally under-represented in clinical trials and, for Black patients, participation has actually declined considerably in the last five years.[22] Participation also varies in relation to specific therapeutic areas, with Black patients under-represented in trials relating to oncology, hepatology, endocrinology and respiratory therapies.[23]

This can be problematic because the clinical trial data used to approve novel treatments, and the scientific literature used to train AI systems, are unlikely to consider diverse populations, which may lead (and has previously led) to poorer health outcomes for Black, Hispanic, Indigenous, and other racial minorities and persons of colour. For example, as we recently reported, the pulse oximeter, a device that uses light transmitted through tissue to ascertain oxygen concentration and help determine whether COVID-19 patients have developed hypoxemia and silent hypoxia, was found to be more likely to overestimate oxygen levels in people with darker skin (due to the higher level of melanin pigmentation in darker skin, which absorbs some light transmitted by the device), resulting in poorer disease identification and treatment.[24] This issue arose because clinical testing of the pulse oximeter involved largely white patients; it is therefore easy to see how an AI model trained on historical patient data may misdiagnose patients with darker skin tones. Evidently, if the data on which an AI system is trained is racially biased, this may cause racially biased output, leading to disparate health outcomes.

How Might an Organisation Resolve Algorithmic Bias?

It is of paramount importance that, where AI systems are used in a manner that could actually affect people and their health, the relevant systems are trained on a diverse data set (for example, one that is gender diverse, racially diverse, or disaggregated by sex and racial or ethnic background) that meaningfully represents a variety of demographics. Doing so will help minimise the potential for algorithmic bias in that system to generate adverse results in respect of certain demographics that are underrepresented (or unrepresented completely) in the relevant data set. For the life sciences and healthcare sectors, this could be difficult, as the development of more diverse data sets will rely largely on more studies being conducted on women and people who are Black, Hispanic, Indigenous, and of other racial minorities and persons of colour.

In June 2022, the Australian Therapeutic Goods Administration (TGA) published clinical evidence guidelines for medical devices and listed medicines. These guidelines raised, amongst other things, the need for evidence to be representative of, or capable of being reasonably extrapolated to, the general Australian population (for example, consideration as to whether data generated from a patient group is representative of the intended treatment population). Consideration of the diversity of participants in clinical trials has also been mandated in the US and Europe (for further information, see here). Although a long-term goal, this is potentially a useful approach because it proactively addresses the underlying cause of bias in the (training) data itself (i.e. in clinical trials), rather than focusing on and seeking to reactively minimise the downstream effects of biased AI output. In the meantime, however, organisations should take care in respect of data selection and AI system training to ensure, as far as is possible, that bias in AI algorithms is minimised and that the function of the relevant AI system does not further perpetuate and amplify sex or racial bias.

While this is a challenge, it is also an opportunity for life sciences and healthcare companies to be at the forefront in tackling the issue of algorithmic bias, through an ESG lens. Policies should be developed and new strategies implemented to address the bias risk that comes with AI use and mitigate associated legal risks that may come from, for example, AI producing biased or discriminatory outcomes, or persons relying on AI decisions/outputs to their detriment (for example, an erroneous determination that a patient with cancer is in remission). Organisations implementing a proactive approach to these issues also stand to gain the reputational advantage in their relevant market that comes from being seen to be conscious of, and to address, AI use issues.

Further, as the risks pertaining to algorithmic bias and explainability/transparency are likely going to be a key regulatory focus for most jurisdictions seeking to regulate AI in one way or another, organisations have much to gain by implementing good AI governance strategies early, in order to ‘get ahead of the curve’ in respect of the regulation that we know is coming.

We are already seeing major global life sciences companies begin to work towards resolving issues related to algorithmic bias; firstly by acknowledging the problem of biased training data, and then developing strategies and tools to reduce that bias and prevent its negative effects (as discussed in our previous article).

Organisations deploying AI-driven tools should also have a robust organisational framework in place for testing their AI systems for bias, and then seeking to mitigate or remove it, once identified. As an example approach, the Australian Federal Government’s recent discussion paper on the safe and responsible use of AI recommends that where developers cannot mitigate or remove unwanted bias in an AI system, they should either reconsider the appropriateness of deploying the AI system at all, or find supplementary data to ‘diversify’ the system’s training data.
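One common form such testing can take is comparing a system's error rates across demographic groups. The following is a minimal sketch only, not a method prescribed by the discussion paper or by any regulator. The function names, group labels and records are hypothetical, and a real assessment would use validated clinical data and metrics chosen with domain experts.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """Compute false negative and false positive rates for each group.

    Each record is a (group, actual, predicted) tuple, where `actual` and
    `predicted` are booleans (True means the condition is present / flagged).
    """
    counts = defaultdict(lambda: {"fn": 0, "fp": 0, "pos": 0, "neg": 0})
    for group, actual, predicted in records:
        c = counts[group]
        if actual:
            c["pos"] += 1
            if not predicted:
                c["fn"] += 1  # condition present but missed
        else:
            c["neg"] += 1
            if predicted:
                c["fp"] += 1  # condition absent but flagged
    return {
        group: {
            "false_negative_rate": c["fn"] / c["pos"] if c["pos"] else None,
            "false_positive_rate": c["fp"] / c["neg"] if c["neg"] else None,
        }
        for group, c in counts.items()
    }


def max_gap(rates, metric):
    """Largest difference in `metric` between any two groups."""
    values = [r[metric] for r in rates.values() if r[metric] is not None]
    return max(values) - min(values) if values else 0.0


# Hypothetical toy evaluation set: (group, actual, predicted).
records = [
    ("group_a", True, True), ("group_a", True, True),
    ("group_a", True, False), ("group_a", False, False),
    ("group_b", True, False), ("group_b", True, False),
    ("group_b", True, True), ("group_b", False, False),
]

rates = error_rates_by_group(records)
for group, r in sorted(rates.items()):
    print(group, r)
print("False negative rate gap:", max_gap(rates, "false_negative_rate"))
```

A large gap between groups does not by itself establish the cause of the disparity, but it signals that the system's training data and design warrant closer review before deployment.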

Further to this high-level recommendation, where an organisation is concerned about algorithmic bias and the consequences thereof, or indeed, identifies bias in its AI system’s output, there are a number of steps that may be taken to investigate, mitigate and (hopefully) resolve that bias. In a forthcoming article on this issue, we will provide practical and actionable guidance on how organisations can resolve algorithmic bias, and will consider the position regarding the use of AI systems purchased or licensed from third parties.

Looking Forward

Just as the focus of corporate ESG policies has had to shift in light of increasing and varying environmental concerns, so too do such policies and frameworks have to adapt to the evolutionary uptake of AI, particularly in the life sciences and healthcare sectors. While developing a robust and appropriate policy framework is a start, these measures must be meaningfully and proactively implemented alongside additional strategies and safeguards designed to deal with more significant issues pertaining to the use of AI within organisations, including algorithmic bias, as well as a raft of other potential issues. Together, this will assist in addressing the issues of sex and racial bias in AI systems’ training data and reducing the downstream consequences that biased AI output may have on organisations and their broader business ecosystem.

Life sciences and healthcare companies, as good corporate citizens, and with an increased focus on their role and impact on society by both regulators and the general public, should lead the way.

How can DLA Piper Help?

DLA Piper’s global, cross-functional team of 100+ lawyers, data scientists, programmers, coders and policymakers delivers technical solutions to our clients all over the world, on AI adoption, procurement, deployment, risk mitigation, monitoring and testing, and legal and regulatory compliance. We also offer a unique-to-market forensic data science capability, enabling us to help clients monetise and productise data, develop AI systems and algorithmic models in a legal and ethical manner, and conduct verification of AI systems to detect and mitigate algorithmic bias.

Because AI is not always industry-agnostic, our team also adopts a sector focus and has extensive experience in the life sciences and healthcare sectors; we’re helping industry leaders and global brand names stay ahead of the AI curve.

References:

[1] https://ai.uq.edu.au/files/6161/Trust%20in%20AI%20Global%20Report_WEB.pdf.

[2] https://www.pewresearch.org/science/2023/02/22/60-of-americans-would-be-uncomfortable-with-provider-relying-on-ai-in-their-own-health-care/.

[3] The Drugs Don’t Work – reference 4.

[4] Invisible Women (Vintage, 2020), p 200.

[5] Ibid.

[6] Invisible Women, p 198, citing Marts and Keitt (2004) and Karp, Natasha A. et al (2017), ‘Prevalence of sexual dimorphism in mammalian phenotypic traits’, Nature Communications, 8:15475.

[7] Invisible Women, p 198, citing Martha L. Blair (2007), ‘Sex-based differences in physiology: what should we teach in the medical curriculum?’, Advances in Physiology Education, 31, 23-5.

[8] Ibid.

[9] Invisible Women, p 198, citing https://www.washingtonpost.com/national/health-science/why-do-autoimmune-diseases-affect-women-more-often-than-men/2016/10/17/3e224db2-8429-11e6-ac72-a29979381495_story.html?utm_term=.acef157fc395.

[10] Invisible Women, p 198, citing https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4157517/.

[11] Invisible Women, p 198, citing https://docs.autismresearchcentre.com/papers/2010_Schwartz_SexSpecific_MolAut.pdf.