Invisible Women: "no inherent clinical utility"
AI discovering sex differences we didn't know existed, and dressing aids made for men
Hello GFPs! I have more news from the world of Great British Gaslighting. So the good news is that either Anneliese Dodds or someone in her office is subscribed to this newsletter because it prompted her to table this Parliamentary Question:
To ask the Secretary of State for Health and Social Care, what assessment he has made of the implications for his policies of reports that women are struggling to access HRT despite medical bodies reporting no supply issues.
The bad news is, the government has replied with pretty much the written form of this gif:
Sigh. Any more for any more?
Gender data gap of the week
Another pretty amazing one for you this week, GFPs — thankfully for all concerned it’s much better news than last week’s lecanemab analysis, which got a LOT of you justifiably furious.
This week we’re talking AI. And as many of you will know, AI has historically come in for quite a lot of criticism from me because many of the people developing AI have simply no understanding of sex and gender and as a result are often blithely, and literally, coding inequality into our future. Actually, they’re often making it worse. In Invisible Women I wrote about a phenomenon called the amplification effect. This is where a machine learning algorithm takes the bias in a dataset and makes it worse. A study I cited in the book tested this effect by training an algorithm on a dataset that had a known bias:
In the 2017 images study, pictures of cooking were over 33% more likely to involve women than men, but algorithms trained on this dataset connected pictures of kitchens with women 68% of the time. The paper also found that the higher the original bias, the stronger the amplification effect, which perhaps explains how the algorithm came to label a photo of a portly balding man standing in front of a stove as female. Kitchen > male pattern baldness. (IW, p.167)
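For the more code-minded GFPs, here’s a minimal sketch of how bias amplification is typically measured: compare how skewed the training data is with how skewed the model’s predictions are. The counts below are hypothetical, purely for illustration — this is not the 2017 paper’s actual code or data.

```python
# Illustrative sketch of measuring bias amplification.
# All counts are made up for the example, not taken from the 2017 study.

def bias_toward_women(woman_count: int, man_count: int) -> float:
    """Fraction of images of an activity that feature women."""
    return woman_count / (woman_count + man_count)

# Hypothetical training-set counts for the "cooking" label
dataset_bias = bias_toward_women(woman_count=660, man_count=340)

# Hypothetical counts in the trained model's predictions on test images
model_bias = bias_toward_women(woman_count=840, man_count=160)

# Amplification: how much further the model skews beyond its training data
amplification = model_bias - dataset_bias

print(f"dataset bias:  {dataset_bias:.2f}")
print(f"model bias:    {model_bias:.2f}")
print(f"amplification: {amplification:+.2f}")
```

When the amplification value is positive, the model isn’t just reflecting the skew in its training data — it’s exaggerating it, which is exactly the pattern the study found.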
This should be a concern for everyone, because AI is rapidly being introduced into pretty much every arena you care to think of, from HR to criminal justice. But the area I’ve always been most concerned about when it comes to the introduction of AI is healthcare, because we know that the data gap here is already killing women. We do not have the capacity for AI to come in and make things worse. And yet, that is exactly what seems to be happening.
Shortly after Invisible Women was published, I wrote in TIME magazine about an algorithm that was heralded around the UK media at the end of 2019 as being able to “predict heart attacks 5 years before they occur.” Well, default male heart attacks, maybe. Female heart attacks, not so much, given they provided hardly any sex-disaggregated data and made no mention of sex-sensitive risk factors like smoking, diabetes, blood pressure, pregnancy, irregular periods, early periods; the list goes on.
In season one of the Visible Women podcast we did an episode on AI in healthcare specifically and we spoke to Irene Chen, a postdoc at the Microsoft Research Lab in New England whose focus is on machine learning for equitable healthcare. Irene told us about a study she had done, training an algorithm on 3 commonly used chest x-ray datasets. She wanted to see how the algorithm performed on different groups, like ethnicity, age and sex. Here’s a clip of Irene talking about the study:
TL;DL (that’s too long didn’t listen OBV), Irene found that the algorithm, trained, remember, on 3 very commonly used chest x-ray datasets, was systematically under-diagnosing women. And the worst part was that, before her study was published, no one had even been thinking about sex differences in performance. All the papers published up till that point had been raving about the radiologist-level performance of the algorithms.
This is typical of the AI space. It’s not so much that developers hate women and want us to die as that they simply ARE NOT THINKING ABOUT SEX. Not this kind of sex anyway.
My research for the podcast left me feeling a bit gloomy: all the papers I read left me with the strong impression that fixing the (amplified) bias problem in AI was REALLY hard, given the lack of good data on female bodies. But then Irene told us about some really cool research she was involved in. Here she is talking about it:
Basically, Irene’s approach to AI is that instead of using it to replicate what we are already doing with doctors, we could — and should — instead be using it to fill in the gaps that doctors and researchers have left open for decades. Irene has chosen to do this with domestic violence (the algorithm she developed was able to spot a victim on average 3 years in advance of them entering a violence prevention programme); others have been working on algorithms that could diagnose endometriosis — as I explained in Invisible Women, this is a debilitating disease that affects 10% of women, but takes an average of seven years to diagnose.
Another algorithm I mentioned briefly in the podcast sifted through thousands of papers across 11 different diseases to quantify the extent of the data gap in these research areas. This is hugely important because at the moment this work of measuring the data gap is done by hand, sporadically; if we were able to automate it, and therefore do it cheaply and regularly, we would be able to draw on a much better evidence base to quantify the extent of the problem.
OK, this is all very well, CCP, you say very reasonably, but none of this is new content. What’s the goss?
Invisible Women is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.
WELL. The reason I’ve written this long preamble is that this week I’m talking about an intriguing paper that was sent to me by GFP Helen Lewis (check out her newsletter here). This paper described an algorithm that was able to predict sex from the eye’s retinal fundus. And what’s kind of wild about that is that…we didn’t know there was a sex difference. We don’t know what it is that the algorithm is spotting that’s enabling it to discern sex from these images of eyes.
The sex difference was discovered kind of by accident in another paper a few years ago that was looking at how retinal scans could be used to help predict cardiovascular risk. This paper stated that “our results show strong gender differences in the fundus photographs and may help guide basic research investigating the anatomical or physiological differences between male and female eyes,” and yes, they said gender when they meant sex, but I’m cutting them some slack here because at least they disaggregated their data and recognised that this finding was potentially important.
Anyway, the more recent paper that Helen sent me looked more specifically into this sex difference and confirmed the earlier paper’s findings: there really is a sex difference in the retinal fundus, we just don’t know what it is yet.
Here, the physiologic cause and effect relationships are not readily apparent to domain experts. Predicting gender from fundus photos, previously inconceivable to those who spent their careers looking at retinas, also withstood external validation on an independent dataset of patients with different baseline demographics. (Source)
These researchers are also using gender when they mean sex (although they also use sex later on in the paper 🤷♀️), and I’m less inclined to forgive them, as while this paper is undeniably interesting, the authors are at pains to point out not once, but TWICE, how unlikely it is that this sex difference is really in any way interesting beyond its function in demonstrating how deep neural networks can “derive previously hidden patterns in large volumes of data.”
The sex difference is first “not likely to be clinically useful,” and then later on really emphatically disavowed:
While our deep learning model was specifically designed for the task of sex prediction, we emphasize that this task has no inherent clinical utility. Instead, we aimed to demonstrate that AutoML could classify these images independent of salient retinal features being known to domain experts, that is, retina specialists cannot readily perform this task.
Yes, we’re back to sex, clearly sex and gender are interchangeable.
Irrespective of the narrow-mindedness of these researchers, who can think of no possible reason anyone might like to investigate these sex differences further (I mean, for example, if we’re using retinas to predict cardiovascular risk, and there is a sex difference in retinas and in heart attack risk factors… I dunno, I guess I feel like this is something worth at least looking at???), this is a really exciting example of the potential for AI to do the work we haven’t bothered to do when it comes to the medical data gap for women. And another example of how we should be thinking of AI not as a replacement for humans doing the work we’re already doing, but as an effective way to do all the work we’ve been ignoring for decades.
Looked at this way, AI could be a total game-changer — for the better.
Default male of the week
GFPs, this week I’ve been looking after my mum, who needs a hip replacement op. She was meant to have it on Friday and I was planning on being here to take her to hospital and to stay with her after the op until she’s recuperated. On Thursday night, the hospital called to cancel, which was awful news as she’s suddenly deteriorated so rapidly over the past couple of weeks. A few weeks ago she was in pain, but still having a pretty normal life. Now she can’t get dressed on her own.
I’ve been helping her, but we are also trying to find a whole load of mechanical aids, as obviously she would prefer to be able to do this all herself.
One of the major issues is putting her socks and pants on. The pants we have solved with a grabber (technical term) and the socks we hoped to solve with another device that looks like this:
My mum’s sock, meanwhile, looks like this:
It literally does not fit on the sock dressing aid BECAUSE OF COURSE IT BL**DY DOESN’T, the thing is ENORMOUS! ARGH. I would shame the manufacturer, but looking online they ALL seem to be this ludicrous size — if anyone knows of a female-friendly sock dressing aid PLEASE do get in touch!
Christmas bonus default male of the week
As any GFP knows, I get pretty furious whenever people act like social infrastructure isn’t infrastructure. So you can imagine how I reacted when I saw this tweet:
For those who missed the memo, let’s give ourselves a little refresher from Invisible Women, shall we?
The term infrastructure is generally understood to mean the physical structures that underpin the functioning of a modern society: roads, railways, water pipes, power supplies. It doesn’t tend to include the public services that similarly underpin the functioning of a modern society, like child and elder care.
The Women’s Budget Group argues that it should. Because, like physical infrastructure, what the WBG calls social infrastructure ‘yields returns to the economy and society well into the future in the form of a better educated, healthier and better cared for population’. Arguably then, this exclusion of care services from the general concept of ‘infrastructure’ is just another unquestioned male bias in how we structure our economy.
[…] In the UK [investment in social infrastructure] would generate up to 1.5 million jobs, compared to 750,000 for an equivalent investment in construction. In the US, an investment of 2% of GDP in the caring industries ‘would create nearly 13 million new jobs, compared to the 7.5 million jobs that would be created by investing 2% of GDP in the construction sector’. And, because the care sector is (currently) a female-dominated industry, many of these new jobs would go to women – remember that increasing female employment drives GDP.
The WBG found that investing 2% of GDP in public care services in the UK, US, Germany and Australia ‘would create almost as many jobs for men as investing in construction industries [...] but would create up to four times as many jobs for women’. In the US, where two-thirds of newly created care jobs would go to women compared to only one-third of newly created construction-sector jobs, this investment would increase women’s employment rate by up to eight points, reducing the gender employment gap by half. In the UK the investment would reduce the gender employment gap by a quarter. (IW, pp.248-50)
It goes on, but if you want more you’ll have to buy the book, ha! Maybe send it to the Minister as a Christmas present…
I’ll be honest, GFPs, I hadn’t been intending to write about this this week, as I’d already made my plans for this week’s newsletter; I thought I might save it and write it up properly for next year. But then I read this newsletter by Meg Conley…
…and it made me so furious I couldn’t help myself. Read her post and tell me again how childcare isn’t infrastructure.
Job of the week
Another amazing job opening for a Senior Public Affairs Officer from my FAVOURITE NGO, The Women’s Budget Group. Long-time GFPs will remember that this UK-based organisation regularly carries out gender analysis of UK budgets and policy — work that the government should be doing but fails to do because, well, you know why. Working for them would be a dream come true for any GFP! Application deadline is 9th Jan so get your skates on!
Poppy pic of the week
That’s it! Until next time, my dear GFPs…which now I think of it will probably not be for a while, because Christmas and then I’ll be away. So have a wonderful Christmas and New Year and see you on the other side! xoxoxo