The harms of hidden research – Part II

Last week I wrote about the need for transparency in inequalities research – how hidden research both reduces the truthfulness of its claims and works against the collective nature of social science. This week I want to finish off my argument and deal with the objections to transparent social science, and in particular respond to Kat Smith’s wonderful warning. As she wrote on OpenDemocracy, when it comes to tobacco you can argue that ‘freedom of information reduces transparency’ rather than increases it.

Despite this, I think that all inequalities researchers should be as transparent as possible, and that the users of this research should demand transparency. Here’s why.

Practical objections to transparency

1. Why would it help anyone to be transparent?

The most common argument against transparency is that researchers have to put a huge amount of effort into collecting data – why would they bother doing this if they then had to share it with everyone else?  (This is the response that Firebaugh found when floating compulsory transparency for the American Sociological Review in the 80s, and Abbott agrees – in the SM&R debates).

But transparency doesn’t mean removing the incentives to collect data – data collectors always have the first publication(s) from their data, so the headlines are theirs. Moreover, sharing data may even make your research more high-impact.  Gary King has argued that articles that share data are cited twice as often as those that don’t. Andrew Abbott‘s SM&R piece points out that this claim is over-stated – using transparently available data! – and that the main drivers of citations are wider than data sharing. Still, there is a plausible argument that sharing data increases your public profile, gets your name known to lots of people, and has a greater number of people paying close attention to your research as a starting point for their own research.

I would go further than this though. Fundamentally, I think this argument is selfishness, pure and simple.  And while selfishness is understandable for individuals, I can see no reason why we (as a scientific community) would tolerate it. Social science is a collective enterprise, which depends on people working together rather than the work of heroic individuals. It’s also (primarily) publicly-funded, the justification for which is that it helps society. This is why, for example, the main social science research council in the UK, the ESRC, requires any data collection it funds to be shared ‘to the maximum extent possible’. To me, reaping the rewards of data is a non-argument.

2. Where do we find the time to be transparent?

The other main complaint is that sharing data and code is time-consuming (see Firebaugh). To which I can only say – yes…


…if you follow good-practice principles in your research, you can have data + code ready to share and actually save time! Scott Long‘s wonderful book sets out the principles for good data management, and personally I think that any analyses that you can’t follow when you come back to them in two years shouldn’t be published anyway.

So rather than extra effort, it’s an incentive for us to do our analyses properly in the first place.

3. Do I have permission to be transparent?

One caveat with transparency is that sometimes we don’t have permission to be transparent. I primarily do analyses using other people’s data – and so it’s their decision to share the data, rather than mine. So for example, I use the influential Whitehall II cohort of civil servants in two of my PhD chapters, but anyone wanting to replicate this will have to go to a committee that guards access to the data. And sometimes people didn’t ask research participants if they could share data at the time. So there are limits to what we can do here.

A related point is whether it’s possible to share data/code and maintain confidentiality to research participants. Huge amounts have been written about this, and the main consensus seems to be that most data can be shared, but it’s sensible to restrict access to trustworthy individuals (in the UK, this basically means anyone working at a university), and to get people to agree to terms and conditions in using the data – on pain of losing their right to do so in future. For some data though (e.g. date of birth, local area identifiers), access probably needs to be restricted yet further – so for example, some analyses I’m doing on British Social Attitudes probably won’t be replicable unless other researchers approach the data holders, NatCen. Still, I can make the process of obtaining data as transparent as possible to make replication easier.

Kat Smith and Heather Lanthorn mentioned the need to look at qualitative research in a comment on last week’s post, and I should mention it here. I think all of the same principles apply to qualitative research (the UK Data Archive has a ‘Qualidata’ section) – it’s just that the confidentiality issues are harder. Personally I ask all qualitative interview participants for permission to share anonymised data, and I anonymise my interviews at the point of transcription, so it’s relatively straightforward to make these available when I publish them. But in some circumstances – particularly interviewing elites who are more easily identifiable (which I’ve done), or those such as drug dealers or benefit cheats who would be uncomfortable with sharing – there are inevitable limits to how far data can be shared. And this is fine.

Political objections to transparency

Finally, to Kat’s main points. I don’t agree that consent is implicitly  restricted to people on the side of what we think of as ‘good’ – surely the point is that the data are being collected to find the truth, rather than to demonstrate what the people taking part want to show! So if any bona fide researcher wants to do research, then they should be able to access the data, even if they’re working for the tobacco industry, the BNP, revolutionary communists (like some of my colleagues) or whatever.

It’s a trickier  issue around the political consequences of transparency. I recommend reading Kat’s comment or post – this shows how Big Tobacco use transparency against the public interest, by (i) creating endless hassles for researchers they don’t like through freedom of information requests; and (ii) re-analysing analyses they don’t like until they get the result they want, and using these to create doubt in public debate.  At the same time, they refuse to share any of their own data – and Kat argues this creates an unfair power imbalance.

While this is a critical point to raise, I don’t agree that the solution is to keep research secret [ADDED: not that this is what Kat is suggesting!  See her comment underneath this post]:

  • Re time-wasting – this is only a problem because researchers aren’t transparent to begin with, and hidden research conflicts with freedom of information requirements. If we get in the habit of being transparent, then there’s nothing for tobacco companies or other lobby groups to request.
  • Re manufacturing doubt – this reflects a wider problem about the relationship between commercial interests and science. If we don’t have the right institutions in place to try and get truth into public debate, then we get a mess of claim and counter-claim, all driven by opposing political interests. The right response is to say, ‘these people have no place in scientific debate’. If we get this wrong, then even without transparency we have what I’ve elsewhere called  the ‘evidence game’ – all the rhetoric of truth, with none of the substance.

This is a short answer to a large question. But I think it doesn’t get us anywhere to sabotage science, as a defensive reaction to aggressive corporate interests. By doing this, we undermine the credibility of science in the public domain (as the ‘Climategate’ scandal showed) – and it is this credibility we need if we are to fight off the attempts to distort science. There’s nothing wrong with making (anonymised) data available to tobacco industry-paid researchers. The problem is in listening to their results.

For all these objections, transparency can and does work.  The American Economic Review from 2004 stated that they “publish papers only if the data used in the analysis are clearly and precisely documented and are readily available to any researcher for purposes of replication” (taken from Freese), and they have since been followed by Econometrica, Journal of Political Economy, and Review of Economic Studies.  The Center for Global Development are managing it.

I will try to prompt every organisation and colleague I come into contact with to do the same. And whether you’re a researcher or a user of research, I think you should expect transparency too – because the objections don’t stand up (unless confidentiality is an issue); because hidden research undermines the collective nature of science; and because research is simply less credible without it.

The famous political scientist Gary King argued recently in Science that “when we teach we should explain that data sharing and replication is an integral part of the scientific process. Students need to understand that one of the biggest contributions they or anyone is likely to be able to make is through data sharing”.  I agree, and I hope you’ll follow.

5 responses to “The harms of hidden research – Part II”

  1. Incidentally – Kat, when I have more time I’m going to come back to this and post other thoughts about your article underneath, I’m aware that this is a bit of a short response to a considerable problem!

  2. Great post again, Ben, thanks but just to be clear (for anyone reading this who doesn’t also read my comments / OpenDemocracy piece!), I’m not against transparency in research, despite issues I’ve raised – I generally agree that it’s a good thing and I try to be as transparent as possible in my own research. I’m worried your post implies that I think we should keep research secret, at least where commercial interests are involved and I definitely don’t – I just think there are some complexities we need to debate and think through. As you say, it may not be possible to grant access to data where this wasn’t requested from research participants at the time of collection (although I agree that this doesn’t mean we shouldn’t seek permission in future research, which I have committed to doing myself). In addition, I think we need to come up with better responses to the issue of the power imbalance created by requirements for publicly funded data to be publicly available, whilst privately funded data are not. In effect, I’m arguing that we need more transparency, not less – we need ways of ensuring commercial research is equally transparent.

    I’m not sure what you mean by getting ‘the right institutions in place to try and get truth into public debate’. First, who would pay for these institutions and who is campaigning for this? I haven’t seen much evidence that this is happening and we need to think about what else we can do if we don’t have these institutions (or enough of them).

    Second, I think this is V hard to monitor, e.g. we know the tobacco industry often promotes its claims via academics, front groups, think tanks, etc so it may not be clear to those receiving the information (journalists, policymakers, etc) that tobacco industry interests were involved in the research or the claims based on the research. These issues are almost certainly not unique to tobacco. Two things that could be done on this front would be: (1) to require all those involved in lobbying (think tanks, consultants and NGOs) to be fully transparent about their funding sources in order to be able to access policymakers or submit consultation responses, etc (there have been lots of debates about this in the EU as you probably know) and (2) as we’ve both mentioned, for journals to require researchers to make all relevant data publicly available prior to publication, no matter who funded it and to be open about funding.

    Third, interpreting policy-relevant research data is not simply about ‘truth’, it is also about values and (whilst I realise this is a slightly different point) I think we need to be more up front about this. For example, there is good evidence that tobacco tax increases are one of the most effective means of reducing tobacco consumption, and that they are more effective amongst poorer groups. Yet there is also good evidence that some people are unable to give up (and that living in more difficult circumstances makes smoking cessation harder). For poorer smokers who continue to smoke, tobacco tax increases are therefore regressive (they will spend more of their income on tobacco and therefore have less for other things). Consequently, it is possible to use evidence (and commitments to health inequalities) to support claims for and against tobacco tax increases (this conflict is evident in that the Marmot Review argues against them whilst a recent IARC review argues for them, both on the basis of evidence). I think one of the downfalls of evidence-based policy was the suggestion that it could free policymakers from value-based, ethical and ideological decision-making. Whilst we can use research to try to get at ‘truth’, we are increasingly being encouraged (as researchers) to go beyond merely describing research findings and to explain what we think the implications of research findings are for policy. Even if you think the process of interpreting research for academic purposes can be objective, value-free and apolitical (I don’t, at least not for an overtly political and ethical field such as health inequalities, so transparency is therefore particularly important), the process of interpreting research findings for policy/action is inevitably informed by our values as well as the particular research and policy paradigms within which we are working. I therefore think the task of monitoring ‘truth in public debate’ may be more complex than your post suggests. 
Not, again, that this is an argument against transparency (or efforts to monitor dodgy research) in any way! Just some issues that I think warrant more consideration. So again, looking forward to your further posts on this topic…

    • One of the guarantees of blogging is that you never quite say what you mean, because you have to say it so quickly – so apologies for giving people the wrong impression about your position [I’ve now put a note above clarifying this].

      I agree on the need to ensure private research is more transparent – and if journals require privately-funded research data to be deposited, then this is a definite step forward. (Particularly in combination with the various schemes of pre-registration of clinical trials, so that journals could request related outcomes to be in the data source, to avoid cherry-picking results). And transparency in funding sources and lobbying would also be helpful, although we have to be careful in assuming that transparency is enough in some cases, as people are very bad at using information on conflict of interest (as this paper shows).

      I also completely agree with your third point about making clear that ‘evidence-based policy’ doesn’t determine courses of action – I really like the example, and the way you put this!

      As for the point about what I mean by ‘the right institutions’ for getting truth into public debate – this is something that would take a book rather than a blog post… By ‘institutions’, I mean the way that institutions like ‘universities’, ‘learned societies’, ‘the media’ etc. work, rather than a particular body. But we should definitely chat about this at some point.

  3. Dear Ben, Kat and all
    What a great debate! As one person who has been interviewed by Kat years ago, I am trying to remember what embarrassing disclosures might emerge if her tapes became publicly available. I most clearly remember her records of my tendency to hammer on the table a lot. One reaction I am having to all this, including what the LSE Impact Blog is doing, is that the approach that I took to what used to be called the relationship between research and policy takes a rather different angle on it all. In particular that you cannot regard the spheres of research, policy and practice as running along separate tracks so that a ‘relationship’ or an ‘impact’ occurs as a definable crossing-point. Rather these 3 processes are interwoven all the time. What your discussion has made me think is that the ethical way forward is to understand how the ‘lash-up’ (as Bruno Latour would call it) in each case we look at actually works. What is in it for whom? What are the material interests involved? The smoking case is right on the nose but what I argue is that ALL ‘evidence’ is produced in a way that is not totally dissimilar. As also is the appearance of ‘no evidence’ or ‘insufficient evidence’.
    Keep on blogging

    • Thanks for the comment below – I particularly like the quotes that mention the hammering on the table!

      I think the inter-relationship of research, policy and practice is a really interesting point, and there are definite insights from Latour that are helpful in this (even if there are some bits of Actor-Network Theory that I’m never going to swallow). At the same time, I think that bringing the idea of ‘truth’ back in is also important. Science CAN be more truthful than competing institutions (that’s its main point), even after accepting that it isn’t value-free and that research is not easily separable from politics, BUT there are practices that we can (sociologically) observe in the practice of science that are unhelpful.

      Anyway, I was talking to Kat about this the other day, and I’m sure I’ll come back to you (and the blog) on this – and I’m very much enjoying these discussions in the meantime.
