Jeffrey Lewis

Two DPRK Nuclear Tests in 2010?

By now, you have undoubtedly seen press reports claiming that North Korea may have conducted a pair of clandestine nuclear tests in April and May 2010.  The reports are based on a forthcoming paper by a well-known Swedish radiochemist, Lars-Erik De Geer.

I don’t buy it. At least not yet.

Look, I would be the first person to jump at the possibility that the CTBTO’s IMS detected a well-hidden nuclear test. I am one of the few cranks out there who believe the DPRK may explore boosted fission weapons, which De Geer believes accounts for the pair of alleged tests.  But, as I told Nature’s Geoff Brumfiel, the paper “doesn’t feel right to me.” (Science & Global Security has made an advance copy available to me; the issue will be published in March.)

What follows is my best accounting of what I see as some methodological problems with a very interesting, but ultimately unpersuasive paper.

Let’s get a bunch of stuff out of the way first. De Geer is a well-respected Swedish radiochemist with strong ties to the CTBTO.  He’s also a pretty nice guy and has been generous in sharing a bunch of radiochemistry on the Chinese atmospheric nuclear testing program with me.  He’s not a bad sort, even if there are a lot of people in Vienna wondering why he just published this paper without workshopping it a bit at the VIC.

The paper was also peer-reviewed.  Although I believe some of the problems I will outline ought to have been raised in peer review, it seems plausible that one or more peer-reviewers were so focused on the very difficult radiochemistry calculations that they didn’t step back and think about the paper in context.  I don’t know anything about radiochemistry, so it’s easy for me to think about the paper in context.  That’s all I have.

Questionable Methodology

My concerns about the paper are simple to explain.  The paper relies on radionuclide monitoring to detect a nuclear explosion, but the general view among experts has been that radionuclide monitoring is imprecise enough that it should only be used to screen events. So, for example, if there is a seismic event, then the presence of xenon or other fission products might help persuade states to seek a special inspection.  But it doesn’t work the other way around. That is why, for example, the South Korean government cited the lack of seismic activity as a reason to dismiss the xenon measurements when they were initially reported in 2010.

De Geer revisited the 2010 debate and found two interesting sorts of data: xenon measurements at a national radionuclide monitoring site near Geojin (South Korea) and an IMS site near Takasaki (Japan), and barium/lanthanum measurements at CTBTO IMS sites near Ussuriysk in Russia and Okinawa in Japan.  (Only lanthanum was detected at Ussuriysk.)  All these measurements occurred between 13 and 18 May 2010.

De Geer, in general terms, makes two arguments — one relating to analysis of xenon isotope ratios at Geojin and Takasaki, the other relating to the presence of fission products barium/lanthanum at Ussuriysk and Okinawa.

My understanding, based on conversations with radiochemists and a review of the pertinent literature, is that the backgrounds for xenon releases are so bad (and getting worse) that atmospheric mixing essentially eliminates the possibility of using isotopic ratios to discriminate among xenon sources. Japan and South Korea have large numbers of nuclear reactors, so the background should be quite poor. Even the most encouraging results (studies in 2006 and 2010 that list Martin Kalinowski as the lead author) indicate that it is not possible to distinguish xenon from an explosion from xenon released by a load of fresh fuel that has been exposed for only a few days.  As we will see, there is a plausible scenario for a fresh fuel load at the same time.

I am also uncomfortable with how De Geer approached the task of modeling the xenon ratios.  De Geer clearly modeled a hypothesis of a single test — but the isotopic ratios indicated rejection of his hypothesis.  So he then postulated a second test, placed it in the same chamber to explain the unusual xenon ratio, and adjusted the time between tests to produce the correct xenon cocktail for release.

There is no a priori reason to assume North Korea would conduct a pair of tests separated by a month in the same chamber (previous DPRK tests branched off a main tunnel into separate chambers) other than that it just happens to fit the data.  As one colleague noted, this is rather less like Occam’s Razor than Occam’s Toothbrush.

Finally, De Geer places enormous confidence in atmospheric transport modeling — using weather data to infer the location and source term of the radionuclides.   De Geer was a coauthor on a paper claiming that the CTBTO station in Yellowknife, Canada, had detected xenon from the 2006 DPRK test.  There is some discussion within the technical community about whether it is possible to exclude other sources, including the relatively nearby medical isotope production center at Chalk River. (There are many sources of xenon, including routine reactor operations and the  production of medical isotopes.  Chalk River is a massive producer of medical isotopes and some experts think the xenon detected at Yellowknife might have been from the 2006 DPRK test, Chalk River or some combination of both.)

De Geer’s observation that the station at Okinawa detected the fission product barium is intriguing.  (Lanthanum alone is not — a spike in Germany in 2004 turned out to be from a military contamination exercise.)  Taken together the barium/lanthanum readings at Okinawa and Ussuriysk  do seem to indicate fission.

If the reading at Okinawa is not a false positive, then something interesting happened.  That appears to be one reason why Frank von Hippel, who is quoted skeptically in the Nature article, notes that there must have been some sort of fission explosion.

Modeling Alternative Hypotheses

My colleague Ferenc Dalnoki-Veress and I are currently working to formulate and test a series of alternative hypotheses.  The most promising candidate so far is Japan’s fast breeder reactor at Monju, which began operations with a fresh load of fuel on May 6.  Shortly thereafter, on Thursday and Friday, there were a number of alarms (reports differ about how many and what type) that seem to indicate problems with the fuel and leaks of radioactive gas.

Japanese authorities reassured the public that these were false alarms, but perhaps they were mistaken.  Monju suffered a serious accident in 1995 that Japanese officials attempted to cover up.  The resulting scandal kept Monju shuttered for nearly fifteen years, until May 6, 2010.  The pressure on certain Japanese officials not to admit further problems must have been immense.  As it was, Japanese officials delayed announcing the false alarms and were issued a verbal reprimand.  What if the alarms weren’t false?

I am not saying this is what happened.  Ferenc and I are going to model this and other scenarios.  Perhaps, at the end of everything, a DPRK test will still be the most likely source.  But the existence of a plausible scenario that would be very difficult to distinguish from a nuclear explosion — a fresh load of unusual fuel exposed for only a few days — that was not examined in De Geer’s paper suggests that perhaps it would have been best to delay publication.

Ferenc and I are going to start churning through a series of questions.  Once the paper is released, you are invited to participate! Our work is focusing on three questions:

1.  Modeling a series of leaks from Monju that might account for the fission products and the xenon, as well as continuing to develop other plausible hypotheses such as radioisotope production at the DPRK’s IRT-2000 reactor.

2. Attempting to recreate De Geer’s atmospheric transport model with different software and data packages to try to gauge the uncertainty in the modeling.

3. Determining how much freedom De Geer permitted himself by allowing two tests in a single chamber separated by a month.  With tests separated by anywhere from 1 day to 1 year, is there any xenon outcome one couldn’t engineer?
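To make the third question a little more concrete, here is a toy sketch of how the gap between two releases changes a mixed xenon activity ratio. Everything in it is an assumption for illustration: the starting activities are hypothetical placeholders, the half-lives are rounded, both releases are treated as identical and equally weighted, and it ignores ingrowth from iodine precursors, fractionation, and atmospheric transport. It is not De Geer’s model, only a way to see that the gap acts as a tunable knob.

```python
import math

# Approximate half-lives in days (rounded values; an assumption for illustration).
HALF_LIFE_DAYS = {"Xe-131m": 11.9, "Xe-133": 5.25, "Xe-133m": 2.2}

# Hypothetical starting activities (arbitrary units) for each release.
# Placeholders only, NOT values taken from De Geer's paper.
START_ACTIVITY = {"Xe-131m": 1.0, "Xe-133": 100.0, "Xe-133m": 10.0}


def decayed(isotope, days):
    """Activity of one isotope after simple exponential decay for `days`."""
    decay_constant = math.log(2) / HALF_LIFE_DAYS[isotope]
    return START_ACTIVITY[isotope] * math.exp(-decay_constant * days)


def mixed_ratio(numerator, denominator, gap_days, age_days=2.0):
    """Activity ratio when two identical releases are mixed together.

    The second release is `age_days` old at sampling time; the first
    release is older than the second by `gap_days`.
    """
    def total(isotope):
        return decayed(isotope, age_days + gap_days) + decayed(isotope, age_days)

    return total(numerator) / total(denominator)


if __name__ == "__main__":
    # Scan the gap across the range mentioned in question 3 (1 day to 1 year).
    for gap in (0, 1, 7, 30, 90, 365):
        ratio = mixed_ratio("Xe-133", "Xe-131m", gap)
        print(f"gap = {gap:3d} days -> Xe-133/Xe-131m = {ratio:.1f}")
```

In this toy version the ratio dips as the older release contributes relatively more of the long-lived isotope, then recovers as the older release decays away. Letting the relative size of the two releases float as well would widen the range of reachable ratios, which is exactly the worry.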

The overall goal is to try to assign some sort of confidence judgement to the hypothesis of a pair of DPRK tests in a single chamber, relative to other explanations.  Nuclear testing may turn out to be the most likely explanation.  But policy types should not take this at face value just yet.

Why Didn’t the USG Reach the Same Conclusion?

I should say, in closing, that I am also worried about publication bias.  Shortly after the xenon detection at Geojin, the ROK dismissed the possibility of a North Korean test on the basis of a lack of any seismic data.  The United States looked into the issue as well and also dismissed a North Korean test, though on what grounds I do not know. Of course, no one publishes negative results and, in this case, there is good reason official inquiries were conducted on a classified basis.  Still, I would like to understand why other competent radiochemists reached a different conclusion than De Geer.  Perhaps De Geer’s work is better, but perhaps it is also simply an artifact of his very carefully engineered scenario and choice of modeling tools.

As a policy analyst, rather than a technical expert, I can’t referee debates about atmospheric transport modeling or the analysis of xenon isotope ratios.  But a policy analyst should be sensitive to areas where technical experts disagree about the confidence of certain tools and models.   We can observe that there are significant uncertainties in the data and tools brought to bear on this problem.  De Geer concludes “The probability … that a low-yield underground nuclear explosion was carried out on 11 May 2010, or possibly, the day before, is significant.”  I think our task now is to ask “Significant compared to what?”

Sorry to poor Josh Pollack for my stealing of his inspired image choice when this controversy first appeared in 2010.


  1. joshua

    Always happy to lend an image choice.

  2. John Hallam

    Seems to be some putting of two and two together to make five.

    Just as with the ‘was India AQ Khan’s fourth customer?’ question, the possibility is there, but not the certainty.

    Lack of a seismic signature is, I think, a real problem for a test hypothesis.

    John Hallam

    • joshua

      Hmm. I resemble that remark.

      It’s good to express uncertainty when you don’t know for sure…

  3. Stephen Young

    Josh, I’m surprised. Surely you’ve learned that if you want to get ahead in DC, you can’t be uncertain about anything. Well, at least I think that’s the case, but I’m not actually sure.

  4. Jeffrey

    The lack of comments is disappointing. This is a big, messy question where almost any answer is possible. Why all the silence?

  5. Cheryl Rofer

    Very well, I’ll try to supply something substantive. I almost commented at Nuclear Diner when I saw the first reports, but then I decided to wait for the paper. I was also a bit confused on what they were talking about because of the lack of seismic evidence.

    But you’ve given almost enough* information to make sense of this, and I tend to agree that the lack of a seismic signature undercuts the other claims. And using a chamber twice seems questionable.

    I’m trying to think of an alternative explanation for production of fusion-related isotopes. I’m not particularly expert on xenon isotopes. Laser fusion? Probably a device would be simpler.

    Not very substantive, hey?

    *Will feel more confident when I can see the paper.

    • Pavel

      There is a way to get a copy – the contact information is on the S&GS site

      Regardless of the merits of the paper, I find it really surprising that people insist that the lack of seismic signature is a deal breaker. I’ve seen this argument a few times already and I am really puzzled.

    • Jeffrey

      I am not sure I would characterize it as a “deal breaker” but bad backgrounds, the complexity of atmospheric transport models and the possibility of atmospheric mixing from other sources mean using radionuclide monitoring for detection may result in a significant number of false alarms. It seems more promising for screening ambiguous events.

      That said, certain fission products are attention getting. So, I wouldn’t say it is a deal-breaker. But I also think De Geer’s paper demonstrates the downsides of using radionuclides in isolation. Other data is necessary.

    • Pavel

      I’m talking about statements like “the lack of a seismic signature undercuts the other claims.” That seems quite a bit different from what you are saying – that the complexity of radionuclide analysis and a lot of factors that have to be taken into account undercut the claim. That may or may not be a valid point in this particular situation. But I’m not sure it is correct (or helpful) to say that “the lack of seismic signature” undercuts anything.

    • Cheryl Rofer

      I wouldn’t (and didn’t) call it a deal breaker.

      Perhaps “the lack of a seismic signature undercuts other claims” is too strong. I agree that that lack doesn’t directly invalidate other claims, but I didn’t say that either.

      As Jeffrey says, certain fission products are attention getting. And what’s the seismic detection limit? 50 kT?

    • John Schilling

      The seismic detection limit is closer to 0.05 kT than to 50 kT. Depending on the local geology, and the extent to which the party doing the testing can mask the seismic signature, you can get an order of magnitude or so variation either way. That still doesn’t leave you much room to hide an actual nuclear explosion, and there are lots of things other than explosions that produce fission-product isotopes.

      So, attention-getting yes, earth-shattering no.

    • Jeffrey

      De Geer sets the yield at the 50-ton detection threshold, then allows himself up to 200 tons to account for decoupling.

      His threshold estimate is from this nifty little paper:

      Tormod Kvaerna, Frode Ringdal, Ulf Baadshaug, North Korea’s Nuclear Test: The Capability for Seismic Monitoring of the North Korean Test Site, Seismological Research Letters, Vol. 78, No. 5. (1 September 2007), pp. 487-497.

  6. Spruce

    Well, I think that the lack of comments is mostly explained by one thing: the De Geer paper isn’t out yet (at least I’m not aware of it). It’s hard to give any kind of meaningful comment on either the paper or this post, the bulk of which is a critique of that paper, when one hasn’t read the original paper. Without seeing it, there’s little to comment on except for the usual light banter.

  7. Davey

    De Geer concludes “The probability … that a low-yield underground nuclear explosion was carried out on 11 May 2010, or possibly, the day before, is significant.”

    The wording contradicts the conclusion. If the probability WAS significant, he should have written, “The probability … that a low-yield underground nuclear explosion was carried out on 11 May 2010, or possibly, the day before, is 0.7”

    Them’s weasel-words.

  8. kme

    For me, possibly the most interesting part of this story is the notion that the CTBTO could find itself in the business of revealing civil nuclear accident coverups.

    • Jeffrey

      We are really hoping, I might add, that this turns out not to be the case. After Monju in 1995, one guy committed suicide. I don’t need anything quite that heavy in my professional life.