Pride in data!

The blog for the project has been quieter than we might have wanted this Pride Month. This is partly due to work and life getting in the way.

The other, bigger challenge, which we have to be honest about (and our funders are well aware of this!), is data access. The issue is that sexual identity/orientation data is considered special category data under GDPR. This scares people a lot, as the headline message from GDPR is that it is not legal to process this data.

This is completely understandable. In many EU member states, LGBTQ+ people still face persecution and legal discrimination. The sharing of such data could put people’s lives at risk. Further, someone’s sexual identity is deeply personal – indeed, from my experience, it is usually heterosexuals that have the greatest concerns about their sexual identity being shared (“how dare you presume I’m not heterosexual!”).

However, GDPR does have clear guidelines for when such data can be legally processed, under Article 9 – when there is a “substantial public interest”. Transferred into UK law, this means such data can be processed where it promotes “Equality of opportunity or treatment” and the “Safeguarding of economic well-being of certain individuals”, which pretty much defines the aims of this research project. Research is also treated differently within GDPR.

Despite this, we have found it extremely difficult to access data. A lot of the barriers were known and scheduled into the project, such as staff going on appropriate training to manage the data securely. As you will have noticed in our previous posts, we have had access to the UK Household Longitudinal Study restricted dataset, which includes the LGB variable, and have had a lot of success analysing this.

We also want to analyse the Scottish Household Survey, to understand whether experiences in Scotland are different at all; the Family Resources Survey, because it has rich data on welfare benefits, income and material deprivation; and the Wealth and Assets Survey, because it is one of the best sources of data on wealth in the world, and the only one we have in the UK.

One of these datasets we now cannot access, because the physical barriers put in place make it impossible for team members to carry out the analysis. The reason given for this is that this is data about “vulnerable people”. To change the tone of this blog to the personal – as a gay man writing this, this made me angry. It misunderstands the GDPR protections, and assumes a level of vulnerability in the population which, ironically, we just do not have the analysed data to understand. It also effectively blames LGB people for being vulnerable, treating that as the reason their data cannot be shared, rather than framing the issue as the fact that we live in a homophobic and biphobic society, which puts these people at risk if their data is shared.

On one of the other datasets, we realised we can identify same-sex couples in the unrestricted dataset, although some of these people might actually say they are heterosexual. This has already, interestingly, revealed that women-women couples seem to be particularly disadvantaged. We did not know this before, and it is really important for tackling inequality in society.

Going back to my own identity as a gay man, I think there is an ethics issue here which is being overwhelmed by the ethics and legal issues of GDPR and thus overlooked. We do need to be very careful about how we collect sexual identity data (and even more so data on gender identity), as Kevin Guyan does an excellent job of explaining.

However, I would argue that if you are collecting this very sensitive data, but are then putting up such barriers to its analysis that it cannot be analysed by research organisations, then this is an ethical issue. I would argue that it is an ethical issue that should be treated with the same level of concern as the risks of disclosure. As a sexual minority, it is an emotional act to reveal this to another person – this is why we call it “coming out”. You are asking people to trust your organisation, to do something very emotional – tick a box saying “I’m very different to the vast majority of society and have spent my life experiencing discrimination” – and then leaving that story under lock-and-key.

I wonder if this is partly because of the way GDPR is framed. All GDPR training starts by describing the astronomically large fines that can be imposed for breaking it. Better training will then go on to describe how that’s unlikely to happen, and that the Information Commissioner’s Office will work with data handlers regarding breaches rather than immediately punish them.

The fear of the fine, though, frames people’s thinking about special category data. Yet the data can only be collected in the first place if there is “substantial public interest”, so this must have been considered. If this substantial public interest exists, then data holders need to allow the data to be processed.

As LGBTQ+ people, I believe we have a role in this. It will be heterosexuals making these decisions on our behalf – the data shows we’re only about three per cent of the population, so there’ll only be a small number of us in the large teams collecting and managing these datasets. It is heterosexuals who are telling us we are vulnerable and this means our data must remain locked away. I think we need to publicly campaign, and make very clear, that you can only collect this data if you let people use it responsibly.

Survey data is a tiny little story about someone, which we then aggregate up to tell stories about society. Pride is all about sharing our stories; shouting our stories; making our stories visible so that cis-het society can understand the oppression we face every day (read the replies). We need to be just as proud of our data in surveys, and allow it to shout about the inequality we face. So, we need to fight to get our data analysed. Pride is a protest. But cis-het allies, we’re also tired of fighting. Please make it easier for us to be seen and heard.
