Creative Destruction

October 20, 2006

Substantive Criticisms of the Lancet Report: Part 1

Filed under: Current Events, Debate, Iraq, War — Robert @ 6:12 am

Well, I should be going to bed, but I’m not tired. I can think of nothing better than statistical nitpicking to put me to sleep, so herewith is the first annual Lancet Skeptical Review and Somnolence Soliloquy.

There are two documents in play here. The first one is entitled “The Human Cost of the War in Iraq” and subtitled “A Mortality Study, 2002-2006”. That document can be viewed in the original here. The second document is a companion article which provides some more detail on the study and which can be viewed here. I shall refer to these documents as “the study” and “the article”, respectively.

Let me begin with a quick disclaimer. I am not a trained statistician; any numerical analysis which crawls its way into this post should be viewed with a skeptical eye and read broadly and generally. I am skeptical towards this article’s conclusions on grounds of its inconsistency with the other things that I know, but this post is not about that inconsistency, and is instead a list of what valid critiques I can come up with against the study and the article. I have skimmed the IBC press release slamming the study, and have glimpsed other criticisms, but have not done any extensive reading in the “opposition research”.

Some of the following criticisms may seem trivial. I have not made an attempt to pick every possible nit, but I have listed each flaw or criticism that I can find in the interest of completeness and thoroughness.

1. My first criticism comes in the first sentence of the first paragraph of the study, which states that 600,000 people have been killed “in the violence of the war that began with the U.S. invasion in March 2003”. This criticism is not statistical, but historical and editorial. The war did not begin in March 2003; the war began in Kuwait on August 2, 1990, when Saddam Hussein invaded his neighbor. We do not speak of World War II as beginning on D-Day, or when Operation Torch put Allied troops back into the continental mass in 1942. This may seem a minor quibble, but it is revelatory of an authorial mindset that the war is blamed on the United States, and not on the original aggressor.

2. Later on the first page, the study states “The survey also reflects growing sectarian violence, a steep rise in deaths by gunshots, and very high mortality among young men.” These are all facially plausible claims, but only the second and third are actually supported by the study. The study goes on to assert “growing sectarian violence”, “sectarian violence”, “sectarian animosity” and “sectarian lines”, again as bare assertions. These assertions of sectarianism are plausible from what I know, but the authors appear to be attempting to rest the “fact” of sectarianism upon the study’s foundation. No such finding is supported, however.

3. In the Introduction (p. 4), the study authors assert “Such methods [passive data collection such as morgue reports] can provide important information on the types of fatal injuries and trends. It is not possible, however, to use these methods to estimate the burden of conflict on an entire population. Only population-based survey methods can estimate deaths for an entire country.” This is flatly untrue. Survey methods are in most circumstances the best method for estimating a systemic variable like countrywide deaths, but it is trivial to reach reasonably strong conclusions concerning deaths using counting methods. Demographers do not do this very often, because survey methods are really very powerful. But they could do so if they needed to, and in fact they did so quite extensively before the development of the statistical knowledge that permits us to use survey methods. The survey authors here appear to be attempting to bolster the strength of their work by denying any validity to alternative methods. Those other methods, however, function – and the study authors, if they are competent statisticians, know that they function.

4. In the Introduction, the authors claim that 2.5% of Iraq’s population has been killed since the invasion. The casualty figure they use, 654,965, would thus indicate a total Iraqi population of 26,198,600 people. However, the chart on page 5 detailing the population figures as they were used to assign clusters has a total Iraqi population of 27,072,200 people. With that population total, the percentage ought to be 2.4%. Either they are misreporting the figure, or they are using a different population total for their conclusions versus their starting point.
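
For anyone who wants to check this arithmetic, here is a minimal sketch using only the figures quoted above (the 654,965 casualty estimate and the 27,072,200 chart total; the “implied” population is simply back-calculated from the 2.5% claim):

    # Back-of-the-envelope check of the population figures discussed above.
    excess_deaths = 654_965
    claimed_share = 0.025          # the 2.5% figure from the Introduction
    chart_population = 27_072_200  # total from the cluster-assignment chart (p. 5)

    implied_population = excess_deaths / claimed_share
    share_of_chart_total = excess_deaths / chart_population

    print(f"Population implied by the 2.5% claim: {implied_population:,.0f}")   # ~26,198,600
    print(f"Share of the chart population:        {share_of_chart_total:.1%}")  # ~2.4%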

5. On page 5, the authors note that “For ethical reasons, no names were written down, and no incentives were provided to participate.” While it is indeed ethical to refrain from providing incentives, it is difficult to see the ethical merit of making it impossible to verify or check the study results. That information must ethically remain confidential, but in order to validate a demographic study, it must be possible for other researchers to recompile data. This is a major lapse. It may be justified by the security situation, but given the seeming eagerness to participate in the study on the part of the Iraqi people, it seems unlikely that cooperation could not have been elicited even while following standard demographic survey protocols. The survey work is not reproducible.

The lack of name recording, even informally by the survey takers, also opens up a major area of uncertainty. Without recording names, it is impossible to reliably check for duplicate reporting. Household statuses in war zones are not always fixed and immutable. It is entirely possible that the death of a relative who lived in more than one household over the course of the occupation was reported twice or more. This is made even more likely considering that the surveyors went literally house to house in the cluster area; in Iraq, as in many places in the world, it is quite common to see brothers and cousins living in proximity. The magnitude of this effect could be quite small or it could be very substantial, and we will never know because the surveyors did not keep records of the names.
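
To make the double-counting worry concrete, here is a purely illustrative sketch. All records and numbers are hypothetical, and it assumes, for the sake of argument, that only coarse attributes such as sex and cause of death were retained, with no unique identifier:

    import random

    random.seed(1)

    # Hypothetical anonymized records of the kind a no-names survey might keep.
    CAUSES = ["gunshot", "air strike", "car bomb", "other"]

    def make_record():
        return (random.choice(["m", "f"]), random.choice(CAUSES))

    deaths = [make_record() for _ in range(200)]   # 200 distinct deaths
    duplicates = random.sample(deaths, 10)         # 10 also reported by a second household
    reported = deaths + duplicates

    # Exact matching on the recorded attributes cannot separate a genuine
    # duplicate from two unrelated deaths of the same sex and cause:
    print(len(reported), "reports collapse to", len(set(reported)), "distinct keys")

With so few attributes recorded, any “check by hand for double counting” either finds spurious matches everywhere or finds nothing it can act on.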

6. Also on page 5, it is noted that 92% of respondents who reported a death were able to produce a death certificate. This is not a priori impossible but it does seem like a high value considering the condition of the country’s health and governmental infrastructure over the period in question. The central bureaucracy is reported by the study authors as capturing only a miserable one-third of the death certificate information in peacetime, yet the local versions of that same bureaucracy managed to achieve an essentially 100% rating on ensuring that every dead body went through the proper government protocol. This is again not impossible, but there does seem to be a disconnect between these two observations.

7. On page 7, the post-occupation non-violent death rate for the country, as indicated by the current survey reports, is calculated by the study authors as being essentially the same as during the pre-occupation period, with a deteriorating trend beginning to show itself. The authors hypothesize that “this may represent the beginning of a trend toward increasing deaths from deterioration in the health services and stagnation in efforts to improve environmental health in Iraq.” This seems unlikely; it would seem much more reasonable that those infrastructure components would deteriorate rapidly following the invasion and then either slowly recover as coalition troops and Iraqi government agencies restored capacity, or stay at a low level if insurgent activity was sufficient to eradicate any gains made. This is a small but potentially significant indicator that the survey sample used by the authors does not jibe with the overall population of the country.

8. On page 10, the authors compare this study with the earlier 2004 study and find that the surveys indicate similar results. The authors report “That these two surveys were carried out in different locations and two years apart from each other yet yielded results that were very similar to each other, is strong validation of both surveys.” To describe it politely, this is wishful thinking. That the two surveys yielded similar results is a strong validation that the surveys have similar methodology, execution, and sample, and nothing more than that. A smashed barometer will give the same wrong reading a hundred days in a row; this indicates nothing about the weather and everything about the barometer. This is not the only instance of the study authors hyping the strength and quality of their results without providing foundation for the assertion.
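
The barometer point can be illustrated with a toy simulation. All numbers here are invented for illustration and have nothing to do with the actual surveys: two surveys that share the same systematic bias will agree closely with each other while both missing the true value, so mutual agreement speaks only to sampling error, not to shared design problems.

    import random

    random.seed(0)

    TRUE_RATE = 10.0     # hypothetical true death rate (per 1,000 per year)
    SHARED_BIAS = 1.5    # hypothetical bias common to both surveys' design
    N_CLUSTERS = 50

    def run_survey():
        # Each cluster's observed rate scatters around the *biased* rate.
        samples = [random.gauss(TRUE_RATE * SHARED_BIAS, 2.0) for _ in range(N_CLUSTERS)]
        return sum(samples) / len(samples)

    survey_a, survey_b = run_survey(), run_survey()
    print(f"Survey A estimate: {survey_a:.1f}")   # roughly 15
    print(f"Survey B estimate: {survey_b:.1f}")   # roughly 15
    print(f"True rate:         {TRUE_RATE:.1f}")  # 10
    # The two estimates agree with each other, yet both are ~50% too high.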

I will hopefully post Part 2 of this on Friday, covering the article itself, which contains some fairly serious problems. Thanks for reading thus far. Comments are welcome. (Update: Part 2 posted.)

25 Comments »

  1. I think you need to clarify your objections. In particular, you need to explain the “then what?” question.

    A good example is this:

    4. In the Introduction, the authors claim that 2.5% of Iraq’s population has been killed since the invasion. The casualty figure they use, 654,965, would thus indicate a total Iraqi population of 26,198,600 people. However, the chart on page 5 detailing the population figures as they were used to assign clusters has a total Iraqi population of 27,072,200 people. With that population total, the percentage ought to be 2.4%. Either they are misreporting the figure, or they are using a different population total for their conclusions versus their starting point.

    Say you are right.

    Then what?

    And so on.

    I say this because you are engaging in a particularly common and illogical attack. You need to maintain the choice between the Lancet and something ELSE, not just Lancet/not-Lancet. Without presenting an alternate theory or set of numbers, your challenge is problematic: you need a higher level of scrutiny, because you need to show the Lancet is unreliable.

    Comment by Sailorman — October 20, 2006 @ 3:28 pm | Reply

  2. I disagree, Sailorman. It is not necessary to provide an alternative to the Lancet report in order to say that the Lancet report is unreliable, unless, of course, you are explicitly arguing that the death rate is much less (or more), in which case you should post how you came to that conclusion. But it is possible that there are no good ways to calculate mortality in Iraq, or at least such a calculation has not yet been done.

    Comment by Glaivester — October 20, 2006 @ 5:01 pm | Reply

  3. OK, let me rephrase that. If you are going to list factors which one posits affect the reliability of the numbers, it is important to also explain HOW they would affect the numbers.

    Comment by Sailorman — October 20, 2006 @ 5:06 pm | Reply

  4. I’m not yet presenting any conclusions on reliability. In the case you mention, I am simply showing that the study authors are using two different population figures without explanation or attribution.

    Comment by Robert — October 20, 2006 @ 6:11 pm | Reply

  5. Still on vacation with limited internet access, so I don’t have time to do a long reply. I wanted to point one thing out, though. On page 6, the study authors say where the 2.5% mortality projection comes from:

    Mortality projections were applied to the 2004 mid-year population estimates (26,112,353) of the surveyed areas (which exclude the governorates of Muthanna and Dahuk, which had been omitted through misattribution [9]) to establish the mortality projections.

    654,965 is 2.5% of 26,112,353.

    Comment by Ampersand — October 21, 2006 @ 10:30 am | Reply

  6. (I should point out that while I do not know how reliable the Lancet report is per se, I have no reason to think that the figure of 655,000 excess deaths is unreliable).

    Comment by Glaivester — October 21, 2006 @ 1:24 pm | Reply

  7. Sorry, what I mean is I have no reason to think that the figure is incorrect.

    That is to say, I do not know whether the methodology used was good, but I do not find the numbers themselves to be impossible.

    Comment by Glaivester — October 21, 2006 @ 2:03 pm | Reply

  8. Bob:

    The lack of name recording, even informally by the survey takers, also opens up a major area of uncertainty. Without recording names, it is impossible to reliably check for duplicate reporting. Household statuses in war zones are not always fixed and immutable. It is entirely possible that the death of a relative who lived in more than one household over the course of the occupation was reported twice or more. This is made even more likely considering that the surveyors went literally house to house in the cluster area; in Iraq, as in many places in the world, it is quite common to see brothers and cousins living in proximity.

    One of the survey’s authors, quoted in The Washington Post (and requoted on Deltoid):

    Double counting of deaths was a risk we were concerned with. We went through each record by hand to look for this, and did not find any double counting in this survey. The survey team were experienced in community surveys, so they knew to avoid this potential trap.

    Comment by Ampersand — October 21, 2006 @ 2:42 pm | Reply

  9. Each record of what? They didn’t write down anyone’s name.

    Comment by Robert — October 21, 2006 @ 3:03 pm | Reply

  10. House #1 Male, age 23, Died Oct 2005 w/dc
    House #2 Female, age 16, Died Dec 2004 wo/dc
    House #3 Male, age 8, Died July 2006 w/dc

    Of course, without recording their names, there’s no way to tell them apart.

    Comment by Daran — October 21, 2006 @ 4:17 pm | Reply

  11. From the survey:

    “For ethical reasons, no names were written down, and no incentives were provided to participate. The survey listed current household members by sex, asked about births, deaths, and migrations into and out of the household since 1 January 2002…Deaths were recorded only if the person dying had lived in the household continuously for three months before the event. In cases of death, additional questions were asked in order to establish the cause and circumstances of deaths (while considering family sensitivities).”

    I don’t see “age” in there, nor is there any indication either way that they recorded the specifics of individual death certificate possession. Nor is it apparent that the precise date of death was recorded – only causes and circumstances. So it’s more like:

    House #1, male, died from Coalition bombing
    House #2, male, died from Coalition bombing
    House #3, female, died from gunshot wound

    Comment by Robert — October 21, 2006 @ 7:56 pm | Reply

  12. I don’t see “age” in there, nor is there any indication either way that they recorded the specifics of individual death certificate possession.

    The “indication” that they did record these things is that they produced statistics concerning these parameters.

    Comment by Daran — October 22, 2006 @ 9:04 am | Reply

  13. That’s circular logic, Daran. And essentially yet another area of the study where those seeking validation or external checks are told “oh, we checked that and it’s fine – trust us”. The whole point of a scientific survey is that we don’t have to trust them – we can check their work.

    They can’t both refrain from collecting demographic data for ethical reasons, AND use the demographic data they’ve collected to do quality control on their work. I think it’s most likely that they collected some data and it was enough, in their view anyway, to permit validation checks – but I don’t see that reflected in the documents they produced, and I want to know why.

    Comment by Robert — October 22, 2006 @ 1:33 pm | Reply

  14. That’s circular logic, Daran. And essentially yet another area of the study where those seeking validation or external checks are told “oh, we checked that and it’s fine – trust us”. The whole point of a scientific survey is that we don’t have to trust them – we can check their work.

    Robert, if they didn’t record information about age, when killed, and whether or not a death certificate was shown, then there’s a much bigger problem with the study than merely not being able to recognise double-counting. The much bigger problem is that the study is totally fraudulent. Because without those data, they could not produce any numbers at all.

    Is that your allegation? That the survey was 100% fraudulent? If so, then say so, and be prepared to defend any lawsuit the authors might wish to mount.

    Any study might be fraudulent. But if we’re going to reject studies on the grounds that they might be fraudulent when there is no evidence that they are, then we might as well just reject science altogether.

    Even if they had said “we recorded age and date of death” you would still have to trust that they were telling the truth. Most studies do not publish the raw data, but researchers are usually prepared to make them available to other legitimate researchers.

    They can’t both refrain from collecting demographic data for ethical reasons, AND use the demographic data they’ve collected to do quality control on their work.

    They didn’t say they refrained from collecting demographic data. They said they refrained from collecting names. My guess is that they also refrained from collecting addresses. There is an obvious ethical issue here. If these data fell into the hands of sectarian militias, the families could become targets. Other demographic information does not suffer from this problem.

    I think it’s most likely that they collected some data and it was enough, in their view anyway, to permit validation checks – but I don’t see that reflected in the documents they produced, and I want to know why.

    Probably they realised that no matter how much detail they put into their published paper, they would never be able to satisfy rightwingers desperate for excuses to dismiss their results.

    Comment by Daran — October 22, 2006 @ 4:38 pm | Reply

  15. Since I haven’t dismissed their results, your criticism there is off base.

    It isn’t complex to lay out what demographic information they collected, and they gave some indication of it. That information seems to be in conflict with what they also claim to have done in the way of validation. It is hardly out of line to expect a scientific study’s authors to be able to provide consistent information about the study metadata.

    Nor is it whacked to wonder how they are validating, when the information most useful for validation was explicitly not collected. The authors were very clear – they’re SURE that they didn’t collect a SINGLE duplicate name. And yet they also didn’t record the only unique piece of information that would provide an obvious check on that. I find this curious, and I want more information on what they did.

    If that makes me a right-wing lunatic, then fine, I’m a right-wing lunatic.

    Comment by Robert — October 22, 2006 @ 6:31 pm | Reply

  16. Criticisms of Robert’s criticisms…

    1. The first is a pedantic and possibly obnoxious one, but…calling your post “substantive criticisms” of the Lancet report rather suggests the opposite, at least to me. Sort of like a used car salesman calling himself “Honest Bill”. Why not just call it criticisms of the Lancet report and let the reader decide if the criticisms are substantive or not? And while we’re at it, the Lancet is the publisher, not the author, and this was not an invited publication. In short, the Lancet was not the prime mover and shouldn’t be considered primarily responsible for the paper. It would be more proper to call it the Burnham report or the Roberts report. Yeah, I know, no one does that, but it would.

    2. Robert is not an epidemiologist and is making his criticisms as an intelligent lay person. Fair enough, and I would never claim that just because someone is an “expert” that their opinion is always right. However, a number of experts were interviewed as to their opinion of the article, and the results of these interviews were published in Nature, 19 October, 2006. (Not a free article, unfortunately.) They had several criticisms of the article, but nonetheless concluded that none of these criticisms invalidated the results and that the results were clearly the most reliable available. Thus, the consensus among experts in the field seems to be that the report is not flawless but reasonably reliable.

    3. When the Iraq War started is something of a semantic question. The VA considers the 1990 and 2003 Gulf Wars to be separate wars, but I’m willing to agree that this war has its roots in the previous conflict. Of course, that war had its roots in Glaspie’s statement that the US had no interest in Iraqi-Kuwaiti conflicts, former US support for Hussein, etc.

    4. Passive data collection is always incomplete, even under the best of circumstances–which these are not. Burnham et al cite a study in Bosnia suggesting that passive data collection there identified 20% of deaths as the best yet obtained. It is not reasonable to claim that any passive data collection method would be more accurate than the methods the Roberts group used.

    5. Household statuses in war zones are not always fixed and immutable.
    This is certainly true, but the authors did not use data from people who had lived in their current location for less than three months, so it is probably not particularly relevant.

    6. I also have problems with the finding that the non-violent death rate has not changed. I think that this suggests an undercount of all deaths. Certainly the Iraqi public health system was already in bad shape before the war between bombings, sanctions, and Hussein’s management of the country, but it seems unlikely that there was no increase in the death rate when the sewage, electrical, and transportation systems were essentially destroyed. The decaying trend does not bother me since almost no clinics or hospitals have been rebuilt, according to various media reports. Also a large percentage of Iraq is in a state of civil war, which can’t be good for the public health. I also note that violent deaths account for relatively few of the deaths in women and children under 15. Yet the death rate has increased. Deaths from pregnancy and childbirth and neonatal deaths perhaps? I’d like to see the authors break out that data. However, that doesn’t take away from the main finding, and the methodology being used does tend to undercount, so an undercount is not particularly surprising.

    I’ll stop now because this is clearly too long a comment.

    Comment by Dianne — October 23, 2006 @ 4:18 pm | Reply

    Of course, that war had its roots in Glaspie’s statement that the US had no interest in Iraqi-Kuwaiti conflicts, former US support for Hussein, etc.

    Right. And domestic violence has its roots in the way she didn’t make dinner right, or looked at a man for two seconds too long.

    Nothing to do with the guy being a violent thug!

    Comment by Robert — October 23, 2006 @ 6:16 pm | Reply

  18. FYI Dianne – my intention in saying “substantive” criticisms was to signal that I will not be opining about the political affiliations or partisan aims of the study authors, if any. I will not be discussing when it was released, or referencing other people’s opinions. I will only go on what the study itself says.

    Comment by Robert — October 23, 2006 @ 7:40 pm | Reply

  19. Nothing to do with the guy being a violent thug!

    And the guy being a violent thug. And maybe his need to have a war to keep his population under control. And probably how well he slept the night before. But if ambassadors have no influence over the actions of foreign leaders, what do we have them for?

    Comment by Dianne — October 24, 2006 @ 10:24 am | Reply

  20. I will not be discussing when it was released,

    Not under the authors’ control. It is released whenever the journal has space for it.

    Comment by Dianne — October 24, 2006 @ 10:26 am | Reply

  21. All of the criticisms are pretty insignificant.

    1. A style point. And, more importantly, they are looking at post-U.S. invasion deaths, not all conflict deaths. No one disputes that the first conflict had gone cold until March 2003.

    2. The sectarian conclusion would be problematic if it weren’t written by people who had access to all the interviewers. But one infers from the repeated statements that the descriptions of the reported incidents described sectarian deaths. Also, violent death is violent death, and sectarian violence has, from a variety of sources, been shown to be the primary type of violent activity in Iraq.

    3. It may be possible to do that kind of study somewhere, sometime. The implication and unstated premise of the cited sentence is that it is not possible to do that right now in Iraq.

    4. The difference is not material. It likely involves rounding issues.

    5. One suspects that the ethical reason was to not put the subjects at risk of harm if the investigators’ data were taken. Not harming your subjects is a basic ethical requirement. Also, individual survey respondent data is almost always kept strictly confidential. Many U.S. surveys don’t even ask the respondent’s name.

    6. Apples and oranges. Only a third of death certificates issued were reported to the central government in peacetime. The notion is that many death certificates might be issued but not reported statistically. In the same way, while notaries are supposed to keep a record in a book of every real estate deed they notarize, many fail to do so, but that does not mean that the notarization of the deeds is invalid. The report also explains at some length why people might go to great lengths to obtain death certificates even now.

    7. There is strong corroboration of the notion that infrastructure is deteriorating in the form of records on the number of hours per day that electricity service is available. This has gradually fallen since March 2003 and is currently at a low, much lower than at most points in the previous three years.

    8. There are two kinds of problems you can have in a survey: design issues and sampling errors. Looking at similar surveys does not catch design issues except trivial ones (e.g. differences due to minor differences in question form). But looking at similar surveys does serve as a check against sampling error (e.g. getting an unusually high or low number of deaths due to dumb luck from the particular locations or families you ended up interviewing). The “margin of error” that you see with any commercial survey is a measure of theoretically predicted sampling error. If multiple similar studies reach the same result, you can learn that you don’t have a sampling-error-related outlier.

    Comment by ohwilleke — October 24, 2006 @ 1:25 pm | Reply

  22. sectarian violence has, from a variety of sources been shown to be the primary type of violent activity in Iraq.

    Other than coalition violence.

    Comment by Daran — October 24, 2006 @ 1:45 pm | Reply

  23. Did you read the study, Daran? Coalition violence as a fraction of the violence in Iraq is at an all-time low.

    Comment by Robert — October 24, 2006 @ 1:54 pm | Reply

  24. According to the study, violence attributed to the coalition accounts for 16% of all deaths between June 2005 and June 2006, violence attributed to other sources accounts for 19%, and violence of unknown origin for 27%; the remaining 39% is non-violent.

    It’s a fair bet that the overwhelming majority of violence from “other sources” is sectarian, but I don’t see that we can conclude from these figures that it is the primary type if coalition violence is included.

    Comment by Daran — October 24, 2006 @ 2:32 pm | Reply

  25. […] The article is much briefer than the study, which I examined here. So this review will also, theoretically, be briefer (cheers from the gallery). In fact, I only found three issues. However, one of them is potentially damaging to the study’s methodological choices (although I lack the mathematical skills to make a determination of that point), another casts direct doubt on the reliability of the authors’ reporting, and the third makes it clear that the study’s sampling method was not, in fact, random. These are major issues, in other words. […]

    Pingback by Substantive Criticisms of the Lancet Report: Part 2 « Creative Destruction — October 27, 2006 @ 10:06 pm | Reply

