Jeffrey Lewis

B61 Mod 12 Revisited

Question: What’s a national lab to do when the GAO releases a report warning that a major effort, the B61 Life Extension Program, is headed for trouble?

Answer: Write an article about how essential the B61 LEP is to “21st Century Deterrence” with a seriously creepy patch that shows a Genie going bowling.

I can’t make this stuff up.

Actually the article, 21st Century Deterrence by Dan Borovina and Michael Port, isn’t that bad. I had a longer post in mind about how the authors, by focusing on the role of LEPs in improving recruitment and retention, are perpetuating a myth that harms stockpile stewardship without actually improving morale — but for now I just want to focus on that damned car analogy.

You know the one — the habit of labbies the world over to compare nuclear weapons under a test ban to cars you aren’t allowed to start. I went so far as to propose a bilateral US-Russia T-BAM: Treaty Banning Automotive Metaphors.

I hate that analogy because cars don’t f’ing explode when you start them.  (Unless you are in the casino business.)

The United States has never deployed an explosively tested warhead to the stockpile because an explosively tested warhead, by definition, has already exploded.  What the labs exploded were weapons like the ones in the stockpile, which raises uncomfortable questions about (1) how designers infer confidence in a stockpile even with testing and (2) what it means to be like a warhead in the stockpile.

As a National Academies committee explained in 2002, “only a small fraction of the warhead designs subjected to nuclear testing were identical to the warhead designs that actually entered the stockpile. … [T]he number of tests was too small to provide a statistical basis for confidence, and it did not allow coverage, for each design, of the range of stockpile-to-target sequence (STS) conditions called for in the military specifications.”

The proper analogy for nuclear testing was always: “Here is a Jaguar. You can’t start it until you need it. But, don’t worry, we started a similar engine on a stand back at the factory to validate the design principles before beginning mass production. Happy Off-Roading!” You would never drive a car model that had been tested like a nuclear weapon design, either under a test ban or in the heyday of explosive testing. Because cars aren’t like nuclear weapons, or vice-versa.

Again, unless you work in the casino business.  Or have a nickname like “Ace” or “Lefty”.

Ok, here is the clip you wanted to see:


  1. John Bragg (History)

    I really doubt that anyone anywhere in the world doubts that American nuclear weapons are well designed and will explode violently if called upon to do so.

    Except maybe for some American weaponeers.

    Stupid question–how many American nuclear tests did not succeed? Or am I asking a stupid question/asking a good question stupidly?

    • John Schilling (History)

      You ask a good question, for which a complete answer may not be possible from the unclassified record. But by rough count, of 262 atmospheric tests for which a nuclear yield was expected, 12 were complete failures and 5 saw only the primary of a thermonuclear weapon fire.

      Extrapolating to the enduring US stockpile, it is somewhat plausible that one of the half-dozen or so warhead designs is prone to misfires, but not that all of them are. And each leg of the triad has at least two warhead designs available. I would hope that this is sufficient to establish the overall credibility of the US nuclear deterrent. I would not go so far as to say that no one anywhere in the world doubts this; I’ve actually encountered e.g. someone who seems to sincerely believe that the US failure to respond in kind to the 1998 Indian tests proves that the US arsenal doesn’t work and India is thus the new superpower of the 21st century. But I don’t believe people like this are particularly numerous or influential.

      My major concern in this matter is that, should pure deterrence fail (and I’m not optimistic that it will hold over decades), we will have to fall back on intrawar deterrence and hope that a very limited use of nuclear weapons on our part – maybe a single bomb – will suffice to end the crisis. If that bomb fails, the fact that we have thousands more that work just fine will be small consolation. The objective, hypothetically, is to end the conflict with one bomb, not many, and one conspicuous fizzle at the wrong time could seriously impair the necessary credibility.

      The B-61 is of particular interest in this context, as the combination of reduced yield options and manned delivery system may make it the weapon of choice for such limited strikes. It would be nice if we could be really certain it will work if needed. Which is not to be confused with carte blanche for anyone proposing B-61 related stockpile stewardship activities.

    • Jeffrey (History)

      Please keep in mind, as I noted in the text, that most nuclear weapons tests were “focused on the development of new designs” and “only a small fraction of the warhead designs subjected to nuclear testing were identical to the warhead designs that actually entered the stockpile …” It is much more accurate to think about nuclear testing as a process for validating design concepts and not, repeat not, for establishing statistical confidence in the reliability of the stockpile.

      One cannot infer any sort of statistical confidence in the stockpile from the US testing record.

      Rather, confidence is generated by the understanding of nuclear weapons phenomenology, generated largely on the basis of data collected during testing. It is possible, in the absence of testing, for our confidence in certain designs to increase as new analytic capabilities allow better understanding of past test data. So, for example, Dick Garwin has argued that the stockpile stewardship program “has provided the basis over time for increasing, not decreasing, confidence in the performance of these legacy weapons.”

      One thing I find curious: The statistical likelihood of failure in delivery vehicles, like ballistic missiles, is much, much greater than that for warheads. (As a rule of thumb, I usually see missiles treated as .9 and warheads as ONE or 1.) But the Jon Kyls of the world never run around screaming about how the United States must increase delivery system reliability, even though that is where the largest gains to reliability, and presumably deterrence, are to be found.

    • John Schilling (History)

      I think it is not unreasonable to infer, from the test record, that if the US nuclear weapons design community believes a particular arrangement of nuclear gadgetry will produce a (thermo)nuclear explosion there is on the order of a 17/262 probability that they will turn out to be wrong, and in the absence of anything better if we have to wager on the probability of the first B-61 dropped in anger going “bang”, 93.5% would be a reasonable guess. A better guess would require dozens of test shots of the actual B-61 physics package, which as you note we don’t have and aren’t going to get.

      93.5% is probably good enough; I’d be willing to consider credible plans to boost it to 98-99%, but I’m not sure how one could practically do that.
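For readers who want to check the arithmetic, here is a minimal Python sketch. It treats all 262 atmospheric shots as interchangeable trials, which is the comment's own simplifying assumption, and adds a Wilson score interval to show how loose the estimate is. The choice of interval and the function names are illustrative assumptions, not anything taken from the thread.

```python
import math

def point_estimate(failures: int, trials: int) -> float:
    """Fraction of trials that succeeded."""
    return 1 - failures / trials

def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple:
    """Approximate two-sided interval for a binomial proportion (Wilson score).

    z = 1.96 corresponds to roughly 95% coverage.
    """
    p = successes / trials
    denom = 1 + z ** 2 / trials
    centre = (p + z ** 2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    return centre - half, centre + half

# Counts quoted in the thread: 12 complete failures + 5 primary-only = 17 of 262.
estimate = point_estimate(17, 262)
low, high = wilson_interval(262 - 17, 262)
print(f"point estimate: {estimate:.3f}")
print(f"approx. 95% interval: {low:.3f} to {high:.3f}")
```

The point estimate reproduces the 93.5% figure; the interval is a reminder that even 262 pooled shots leave several percentage points of slack either way.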

      W/re delivery systems, I think 90% might be optimistic. But delivery failures aren’t necessarily as damaging, to deterrence or warfighting if it comes to that. Mostly, if you press the button and the missile doesn’t fly, you just press the next button. Even an exploding booster or failed stage separation can be handled in about the same way. Warhead failures are more obvious to the target/audience, and there’s more of a delay involved in a reshoot.

    • John Schilling (History)

      And to clarify on my previous comment, “test failure” is not exactly the same thing as “nothing happens”. So my a priori estimate of the probable outcome of B-61 warshot #1 might be, 2% complete dud, 2% unboosted primary yield of ~0.3 kt, 2% boosted primary yield of ~10 kt, 93.5% full design yield, and 0.5% wild card.

      A more complete analysis of the available (albeit perhaps classified) data might allow refining that somewhat. But the engineer in me balks at going too far beyond what the test data will support – if you want to claim 99% reliability, I want to see data from a hundred tests.
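The "hundred tests" intuition can be made concrete with a short sketch. It assumes independent, identical shots with no failures at all, which is an idealization, and asks how many clean shots are needed before a claimed reliability is supported at a chosen confidence level. The function names are hypothetical.

```python
import math

def zero_failure_tests_needed(reliability: float, confidence: float) -> int:
    """Smallest n such that a true reliability below `reliability`
    would produce n successes in a row with probability <= 1 - confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(reliability))

def confidence_from_clean_run(reliability: float, n: int) -> float:
    """Confidence that reliability is at least `reliability` after n clean shots."""
    return 1 - reliability ** n

needed_95 = zero_failure_tests_needed(0.99, 0.95)
conf_100 = confidence_from_clean_run(0.99, 100)
print(needed_95, round(conf_100, 3))
```

Under these assumptions, one hundred clean shots support a 99% claim at only about 63% confidence, and roughly three hundred would be needed for 95% confidence, so the rule of thumb in the comment is, if anything, generous.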

  2. Andrew Tubbiolo (History)

    Is that logo ‘official’? If so which operators of the Panavia Tornado are admitting a nuclear capability? The UK, yes, however Germany, Italy, and the Saudis also use the craft. Or am I reading too much into this?

    • Jeffrey (History)

      German and Italian Tornado aircraft are dual-capable aircraft available to carry US B61s. You are not reading too much into the patch. That is exactly what it means.

  3. Bert (History)

    I’ve tried (within the complex) to advocate for a wedding cake analogy, which I feel is much more appropriate than the car analogy, but a) it takes too long to explain, b) it only works if you believe that wedding cake should taste good, and c) engineers really like cars and physicists act like they really like cars when they hang out with engineers.

    A good baker knows how to make a good cake–they have made and tested cakes, they understand their ingredients, they follow a process, they can reasonably predict the expiration date of the cake. But the cake that goes to the wedding has not been tested by the baker. Wedding cakes also often explode on the faces of the bride and groom (but taste isn’t a real factor right then, is it).

    While testing does not establish the reliability of the delivered cake, it does make a significant impact on the confidence that the baker has in his product. Without testing, small changes to the recipe don’t look like a concern to most of us, but could seriously degrade the confidence of the baker, no matter how much he licks the batter.

    • Jeffrey (History)

      That is a much better analogy!

      I can’t get over what a better analogy that is.

    • Cameron (History)

      I just tried to come up with an analogy that brings the cake metaphor into a more technological field, and every time I did I had to admit “nope, the engineers would have tested that.”

      Now however you’ve left me really wanting cake.

    • kme (History)

      Another analogy might be packing parachutes.

      You can test parachutes packed by the method you’re using, but the packing of any one parachute used for real has never been tested.

  4. Alan Tomlinson (History)

    John Schilling wrote, “It would be nice if we could be really certain it will work if needed.”

    No, it would not. I, for one, would be amazingly happy if, when called upon, all American nuclear weapons failed to function. I appreciate that there are those who feel differently, but I think that the desire to use nuclear weapons under any conditions is a sign of short-sighted thinking.


    Alan Tomlinson

  5. John Bragg (History)

    One thing I’ve learned today is that, even back in the day when we were having nuclear tests all the time, we didn’t take a new-model or experimental model warhead, lower it into the shaft, and push the button.

    So, if I understand today’s lesson correctly, the right comeback to “We’ve never tested a 25-year old B-61!” is “We never tested a brand-new B-61, either, dingus.”

    Perhaps one should leave off the last part when talking to Congress.

    • John Schilling (History)

      I suspect that if we were to go through the complete (i.e. classified) archives of the underground nuclear tests, we would find at least one B-61 physics package, possibly even a complete B-61 bomb. Not enough to provide real statistical confidence. Mostly, we are using the hundreds of mostly-successful nuclear tests in general to provide statistical confidence of the nuclear weapons design process in general, and maybe a single proof shot to prove that we didn’t completely screw up this time.

      Per the new and improved analogy, the baker has made several hundred cakes, only a dozen or so of which were inedible and a few more decidedly subpar. He’s used the current recipe exactly once, and it tasted wonderful. Now there’s a whole batch of cakes in the refrigerator, baked by the same baker following the same recipe as best he can, but none of them have been tasted yet.

  6. Tom (History)

    I am wondering if this emblem was designed to subtly convey a message to the audience, like how futile it is to even try to put this symbolic “Genie” back in the bottle…

  7. yousaf (History)

    I wrote this in regard to RRW but some of the arguments apply to the LEP also:

    “Consider Chinese or Russian nuclear weapons. Do we know their technical reliability numbers? No. Yet, we still take them very seriously.”

    • shaheen (History)

      Yousaf: that’s only one half of the picture. The US president’s (or British PM, Chinese Premier…) self-confidence in the reliability of “his” stockpile is a non-trivial aspect of deterrence management.

    • yousaf (History)

      If that is the case (though I don’t buy its alleged importance), then I suggest concentrating on improving the reliability of the delivery system — as I clearly mention in my article:

      “the overall reliability of the weapon system is dominated by the intercontinental ballistic missile delivery system–of 2,160 test launches, approximately 15 percent resulted in some type of delivery system failure that would have prevented the warhead from reaching its target.”

      And (in the case of RRW) one has to make a deterrence tradeoff between fielding an untested new warhead versus a legacy one that, in some form, went “bang!” once upon a time. e.g.

      “Would you fly on an airliner that had never had a test flight, even though its aerodynamics may be well understood? So why would you — or more importantly our enemies — believe untested new weapons would work better than the tested ones we have?

      On the other hand, if the proposed RRWs are eventually tested, it will be more difficult to stop other adversarial nations from doing the same. Either way, the RRW program is detrimental to U.S. security vis-à-vis proliferation and deterrence calculus.”

  8. Daryl Press (History)

    Three thoughts:

    1) Regarding the first half of the National Academies quote from 2002: “only a small fraction of the warhead designs subjected to nuclear testing were identical to the warhead designs that actually entered the stockpile.”

    Just to be clear: what this says is that most designs that were tested were never fielded. It doesn’t say that most designs that were fielded were never tested. A couple of the comments suggest that readers may have jumbled those two very different statements.

    2) Re: the second part of the National Academies quote: “[T]he number of tests was too small to provide a statistical basis for confidence, and it did not allow coverage, for each design, of the range of stockpile-to-target sequence (STS) conditions called for in the military specifications.”

    The first half of that sentence seems wrong. Even a small number of tests of a given design — say 4 or 11? — can be the basis for valid statistical inferences about reliability. On the basis of single-digit numbers of tests, one cannot precisely pinpoint the actual “reliability” rate (eg, perhaps defined as the actual percent that will detonate with 90%+ of expected yield).

    But one **can** draw very important bounds as to what that percent might be, even with small numbers. Geoff Forden did something like this a few years ago by asking: given a number of observed tests (say, 6 for 6), what is the lowest actual warhead reliability that is consistent with that testing record (meaning, what is the lowest reliability rate that would, 95% or 98% of the time, produce 6 for 6). So small numbers of tests can yield reasonable, statistically valid inferences — if you’re careful about what you’re asking, and how you set up the problem.
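One common way to set up the bound described above is sketched here. It assumes every observed test succeeded and that tests are independent, then finds the reliability at which such a perfect record would occur only (1 - confidence) of the time. This is my reading of the approach, not a reconstruction of the original analysis, and the function name is illustrative.

```python
def reliability_lower_bound(consecutive_successes: int, confidence: float) -> float:
    """One-sided lower bound on reliability after an all-success record.

    Solves r ** n = 1 - confidence for r, where n is the number of successes.
    """
    return (1 - confidence) ** (1 / consecutive_successes)

for conf in (0.95, 0.98):
    bound = reliability_lower_bound(6, conf)
    print(f"6 for 6 at {conf:.0%} confidence: reliability >= {bound:.3f}")
```

With six successes in six tests the bound comes out near 0.61 at 95% confidence and near 0.52 at 98%, which shows both points at once: small samples do support a valid inference, and the inference is a wide one.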

    3) Most importantly. Jeff’s initial post is poking fun at the argument that says: you wouldn’t buy a car that was never tested. He points out that even full testing is like testing the drive train at the factory, rather than starting the engine of the assembled car. True. But wouldn’t you rather buy a car whose drive train was tested at the design stage than one that wasn’t, especially if you can’t fire up the engine on the lot?

    Operationally-realistic tests >> less-realistic tests >> no tests at all

    • yousaf (History)

      And there are some fundamental unaddressed issues in deviating from tested designs.

      See, e.g., the National Academies’ study on margins and uncertainties (QMU) quoted below: it pointed out it is not clear “how close” to the tested design one needs to be to make a viable RRW warhead (in the absence of testing of course), thus the notion that new warheads could be “based upon tested designs” in the new N.P.R. is an unclear formulation:

      “Finding 4-2. Any certifiable RRW weapons design will have to be “close” to the archival underground nuclear test base, while meeting reasonable criteria for adequate margin.

      The design and certification of new nuclear weapons that are sufficiently “close” to particular legacy designs could, in principle, be accomplished without nuclear tests, based on the existing nuclear test archive, on new experiments with no nuclear yield, and on modeling and simulation tools supported by a QMU methodology more mature than at present.

      For a certifiable RRW, the design labs will have to make the case that a new design is “close enough” to tested designs. The case would depend on establishing that the design is based on well-understood principles of nuclear warhead physics and engineering, that the design is related in key ways to designs that were successful in archived historical nuclear testing, and that any gaps between the knowledge of physics and engineering and the archival underground nuclear test base are bridged by experiments. Interpolation is highly preferable to extrapolation.

      “Recommendation 4-2. The design laboratories should lay out in detail their arguments for the relevance and closeness of archival underground tests to any proposed RRW design.

      These laboratories should investigate methodologies for helping address the problem of quantifying closeness.”

      ”How to transparently define and quantify “closely related” is a difficult issue to which the labs should devote sufficient effort. “Close enough” depends on the direction of the change as well as the magnitude—the direction should be away from “cliffs,” and expert designer judgment must go into assessing “close enough.” Prior warhead anomalies and their “fixes” should be used to validate the definition of “close enough.” The goal is to increase the critical margins while controlling the uncertainties so that M/U ratios are greater than 3 or so. The margins and cliffs here are intentionally spoken of in the plural because there are multiple failure modes, and increasing one margin might decrease another—for example, increased Pu mass might endanger one-point safety, so all must be considered together. A primary lying between two successfully tested designs (i.e., interpolated rather than extrapolated) can provide additional confidence. The design and certification of new nuclear weapons that are sufficiently “close” to particular legacy designs could, in principle, be accomplished without nuclear tests, based on the existing nuclear test archive, on new experiments with no nuclear yield, and on modeling and simulation tools supported by a more mature QMU methodology.

      It must be noted, however, that there is no commonly accepted quantification of closeness in the laboratories. While closeness will always have a substantial qualitative component based on expert judgment, a quantitative measure is clearly needed. This is not a trivial problem.”
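The margin-to-uncertainty bookkeeping in the quoted passage can be illustrated with a toy sketch. The gate names and numbers below are arbitrary placeholders with no connection to any real design; the only things taken from the quote are the ratio M/U, the rough threshold of 3, and the point that every failure mode has to be checked together.

```python
from typing import Mapping, Tuple

def mu_ratio(margin: float, uncertainty: float) -> float:
    """Margin divided by uncertainty for a single performance gate."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    return margin / uncertainty

def weakest_gate(gates: Mapping[str, Tuple[float, float]]) -> Tuple[str, float]:
    """Return the gate with the smallest M/U ratio and that ratio."""
    ratios = {name: mu_ratio(m, u) for name, (m, u) in gates.items()}
    name = min(ratios, key=ratios.get)
    return name, ratios[name]

# Placeholder (margin, uncertainty) pairs in arbitrary units, purely illustrative.
example_gates = {"gate_a": (9.0, 2.0), "gate_b": (4.0, 2.0)}
name, ratio = weakest_gate(example_gates)
print(name, ratio, "meets threshold" if ratio > 3 else "below threshold")
```

Reporting the weakest gate rather than an average reflects the passage's warning that raising one margin can lower another.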

    • Jeffrey (History)


      I think you have the wrong picture of the historical role of testing — which is why I find the car analogy so pernicious. It perpetuates an inaccurate view of the role of testing in building stockpile confidence. Note the wedding cake analogy, which I think gets much closer to the thing. The purpose of testing was to validate design concepts and approaches. These tests of the nuclear explosive package would have been heavily instrumented and need not have resembled a weapon in its operational configuration.

      Once a nuclear weapon type was accepted into the stockpile, there was no testing of production units. (Only in the last decade of testing did the laboratories begin devoting one test annually to a stockpiled system.) That is it: a single test of a given warhead type. (Depending on how you count it, there were either 13 or 17 confidence tests of production units, plus explosive tests of components.)

      I know it “seems” like it must be wrong, but Holdren, Agnew, Garwin, Jeanloz, Panofsky, Sack, etc, weren’t mistaken about the role of testing. The other helpful resource is Stockpile Surveillance: Past and Future, September 1995.

      You know what is also true, but “seems” like it should be wrong? That the resulting NEPs were assigned confidence of 1.0 — sometimes I hear folks say Oh En Ee. A good idea in recent years has been the quantification of margins and uncertainties, which attempts to quantify the margins and uncertainties associated with certain performance “gates” during the explosion of a nuclear weapon.

      On (3), no one prefers designs without testing pedigrees. Even RRW advocates tried to minimize deviation from design heritage. But the value of the test pedigree is not that we would ever bother to test a stockpiled weapon as a whole, but rather that testing of concepts and components produced a rich catalogue of data that helps us understand the phenomenology of nuclear explosions. It is that deep knowledge that is the enduring value of testing and the key to sustaining confidence in the stockpile. I certainly don’t mean to say that how we assess nuclear weapons is better or worse than how we assess car performance, merely that the two are fundamentally incomparable. The car analogy obscures more than it reveals, impoverishing important debates about nuclear weapons like the CTBT and the RRW.

    • Allen Thomson (History)

      FWIW, the space launch community finds itself asking similar questions about reliability and, for reasons I don’t fully understand, seems to have fallen into the habit of using something called a “first level Bayesian estimate.” Not being a statistician, I have no idea how soundly based this practice is. But using it on the six-for-six example says that one should assign a probability that #7 will succeed of (6+1)/(6+2) or 87%. I’ve never seen error bars/uncertainties associated with such estimates and don’t know whether the underlying theory can generate them.

      “First level Bayesian estimate of mean predicted probability of success for next launch attempt [is] (k+1)/(n+2) where k is the number of successful events and n is the number of trials.”
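The quoted formula, (k+1)/(n+2), is the mean of the posterior you get from a uniform prior on the success probability and a binomial record (Laplace's rule of succession). That same posterior is a Beta(k+1, n-k+1) distribution, so it does come with a spread. A minimal sketch, assuming that uniform prior; whether the launch community uses exactly this model is not something the thread establishes.

```python
import math

def rule_of_succession(k: int, n: int) -> float:
    """Posterior mean success probability after k successes in n trials."""
    return (k + 1) / (n + 2)

def posterior_std(k: int, n: int) -> float:
    """Standard deviation of the Beta(k + 1, n - k + 1) posterior."""
    a, b = k + 1, n - k + 1
    return math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))

mean = rule_of_succession(6, 6)
spread = posterior_std(6, 6)
print(f"mean {mean:.3f}, standard deviation {spread:.3f}")
```

For the six-for-six case the mean is 0.875 with a standard deviation of about 0.11, so the estimate carries an uncertainty of roughly ten percentage points.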

    • yousaf (History)

      For space launch, reliability is a much more important issue since those rockets are meant to be used.

      Reliability is not as big a deal in matters of deterrence, where the nuclear weapons are meant to look scary and not be used. For example, whether China’s warheads are 87% or 97% reliable does not substantially affect the US’ or Russia’s deterrence calculus. I doubt this reliability is even known or factored into US calculations: It is likely that we assume Russian and Chinese warheads are 100% reliable. I am fairly certain they do the same.

      A good unclassified summary is:

      The footnotes are especially interesting. e.g. a warhead that explodes with a yield 10% different from its design yield may be considered “unreliable” in internal USG docs, even though it may properly annihilate a city.

  9. bradley laing (History)

    I assume that a yield 10% *more* than the design yield is also called “Unreliable.” Because plutonium is so expensive that warheads should not carry more than is needed?

    • yousaf (History)

      Would not be surprising if it was so designated, though Pu mass is not the only thing affecting yield. (And Pu expense is not a major factor.)

      For more on this definitional issue see:


      R. L. Bierbaum et al., “DOE Nuclear Weapon Reliability Definition: History, Description, and Implementation,” Sandia National Laboratories, April 1999, available at:


      “Defining reliable”, Bulletin of the Atomic Scientists, March, 2001 by Stephen Schwartz

    • Daryl Press (History)

      I think Bradley must be correct. Though the primary concerns about “too high” yield are probably not simply waste of plutonium, but rather:

      * for bombs, the survivability of the aircraft that released the weapon
      * fratricide of other, incoming bombs or warheads
      * and in some circumstances, a desire to minimize collateral damage (eg, destroy airfield X, but minimize damage to nearby city)

  10. Daryl Press (History)


    I really disagree with the notion that reliability doesn’t matter much because these are weapons that are “meant to look scary and not be used.”

    As you point out correctly, for *deterrence* all that matters (with respect to reliability) is adversaries’ perception of reliability. And that’s probably not too difficult, as you say.

    But — politically correct or not — we deceive ourselves if we ignore that these weapons have TWO roles: (1) deterrence, and (2) military operations if deterrence fails.

    The political purpose behind #2 might be (a) reestablishing deterrence, or (b) disarming an adversary through counterforce, or (c) retribution, or (d) ??? , but mission #2 is real, and reliability matters for it.


  11. yousaf (History)

    I quite clearly said that in matters of deterrence, reliability of the nuclear warheads is not all that relevant and I stand by, and further underscore, that viewpoint.

    1. In the new N.P.R. our leaders have spelled out what the fundamental role of nuclear weapons is — and it is not military warfighting:

    “The fundamental role of U.S. nuclear weapons, which will continue as long as nuclear weapons exist, is to deter nuclear attack on the United States, our allies, and partners.”

    2. As I mentioned above, reliability is very narrowly defined — with a yield that is just 10% off from the design yield being tagged as unreliable even though a 300 kT versus a 440 kT nuclear warhead would make for an equally bad day for a person tasked with measuring this yield within a few km of ground zero.

    Further, nuclear targeting practice puts more than a few warheads on any militarily important target.

    And military planners have traditionally undervalued the effects of fire in comparison to blast.

    3. And as I mentioned in my Bulletin piece above: “Perhaps warhead reliability would be an issue worthy of serious discussion if the current warheads were found to be critically flawed. But from 1958 to 1996, the Stockpile Evaluation Program sampled nearly 14,000 weapons; of these, only about 1.3 percent were found to have failures that would have prevented them from operating as intended.” ref: General Accounting Office, “Nuclear Weapons: Improvements Needed to the DOE’s Stockpile Surveillance Program,” GAO/RCED-96-216, 1996.

    4. If deterrence fails and you are in a situation that you really needed 1000 nuclear weapons — and not “just” 989 to “win” — the “win” would not be a win.

    5. If you are really concerned with weapons’ reliability then I suggest concentrating on improving the reliability of the delivery system — as I clearly mention in my Bulletin article:

    “the overall reliability of the weapon system is dominated by the intercontinental ballistic missile delivery system–of 2,160 test launches, approximately 15 percent resulted in some type of delivery system failure that would have prevented the warhead from reaching its target.”

    Almost all arguments in favor of increased reliability of nuclear warheads are fatuous, misleading and/or irrelevant — both to deterrence and warfighting.
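The comparison between points 3 and 5 can be put in numbers with a simple series model, in which the weapon system works only if both the delivery system and the warhead work. This is a sketch under two loud assumptions: the stages fail independently, and the 1.3 percent surveillance finding and the 15 percent launch failure rate quoted above can stand in for per-shot failure probabilities.

```python
def system_reliability(*stage_reliabilities: float) -> float:
    """Reliability of stages in series: every stage must work."""
    result = 1.0
    for r in stage_reliabilities:
        result *= r
    return result

delivery = 1 - 0.15    # from the quoted test-launch failure rate
warhead = 1 - 0.013    # from the quoted surveillance sample

baseline = system_reliability(delivery, warhead)
gain_if_warhead_perfect = system_reliability(delivery, 1.0) - baseline
gain_if_delivery_perfect = system_reliability(1.0, warhead) - baseline

print(f"baseline: {baseline:.3f}")
print(f"gain from a perfect warhead:  {gain_if_warhead_perfect:.3f}")
print(f"gain from a perfect delivery: {gain_if_delivery_perfect:.3f}")
```

On these figures, eliminating every warhead failure would add about one percentage point to system reliability, while eliminating every delivery failure would add about fifteen.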

  12. Daryl Press (History)


    * For the mission of deterrence, you and I agree that the reliability of warheads or delivery systems is not much of an issue at the current time, for exactly the reasons you suggested. I also think you’re probably right, that there is probably more to be gained in overall reliability from delivery systems rather than warheads.

    * BUT — your post encourages an unfortunately widespread fiction that “ability to deter” is the only metric against which we should evaluate U.S. nuclear capabilities. You know as well as I do that current policy tasks STRATCOM with providing nuclear capabilities to achieve (at least) FOUR objectives: (i) deter adversaries, (ii) assure allies, (iii) defeat adversaries, and (iv) encourage stability.

    On a public board like this, where people with varying levels of expertise go for information, we should clarify current policy (and then critique it as we feel appropriate). Suggesting that reliability is not much of an issue because the “fundamental role” of U.S. nuclear weapons is deterrence may repeat NPR language, but it is not clarifying, because whatever “fundamental” means in this context, in reality there are at least 4 real roles for US nuclear weapons, and reliability matters for at least one of them.

    * Similarly, your point #2 — about the “bad day” that would occur for a person subjected to either 300kt or 400 kt — muddies things further. For the purpose of busting cities, big yield variations don’t matter too much. But the “defeat” mission would probably involve attacks on hardened targets, and substantial variations in yield do matter. Yes, we would probably put more than one warhead on an important target — but the actual number of warheads required, the yield selected, the height of burst chosen, etc… all depend on having fairly accurate pre-attack estimates of weapon system performance — ie, yield, reliability, accuracy, HOB uncertainty, etc…

    I gather that you don’t like current policy and would prefer that the word “fundamental” were changed to “only” in the NPR (although even then, the retaliatory missions would be rolled into “deterrence”). But the reality — as I know you understand — is that current policy requires nukes to do more than simply deter, and hence weapon system performance matters.


    • yousaf (History)

      No, my post does not encourage the “fiction” that “ability to deter” is the only metric against which we should evaluate U.S. nuclear capabilities.

      My post makes clear that ability to deter is the FUNDAMENTAL metric by which to judge the nuclear weapon systems, as also made clear by the US Government’s Nuclear Posture Review.

      My post also makes extremely clear that if you are interested in increasing the reliability of the weapon system for warfighting then the very first thing to do is to address the reliability of the delivery systems.

      The fact that the warheads are >98% reliable (with high confidence) is sufficient to:

      (i) deter adversaries, (ii) assure allies, (iii) defeat adversaries, and (iv) encourage stability.

      It ain’t broke, and doesn’t need fixing:

      I reiterate, almost all arguments in favor of increased reliability of nuclear warheads are fatuous, misleading and/or irrelevant — both to deterrence and warfighting.

      Any people with “varying levels of expertise” confused about the issue are welcome to write me at ybutt -at- and I can help explain in plainer English why this is a non-issue.

  13. yousaf (History)

    The B61 LEP (mod 12) — the subject of this post — is an especially bad idea since it is in conflict with US policy that “Life Extension Programs…will not support new military missions or provide for new military capabilities.”

    See Hans’ issue brief from FAS:


    “According to this policy stated in the Obama administration’s Nuclear Posture Review (NPR), the B61-12 cannot have new or greater military capabilities compared with the weapons it replaces.

    Yet a new report published by the U.S. Government Accountability Office (GAO) reveals that the new bomb will have new characteristics that will increase the targeting capability of the nuclear weapons deployed in Europe.

    It is important at this point to underscore that the official motivation for the new capabilities does not appear to be improved nuclear targeting against Russia or other potential adversaries. Nonetheless, that will be the effect.”


    “The beauty of the B61-12 program is that it avoids a controversial decision to develop a new low-yield nuclear warhead but achieves many of the PLYWD mission goals by combining the existing lower-yield options of the B61 (down to only 0.3 kt) with the increased accuracy…”


    “The B61 LEP appears to be much more than a simple life-extension of an existing warhead but an upgrade that will also increase military capabilities to hold targets at risk with less collateral damage.

    It is perhaps not surprising that the nuclear laboratories and nuclear warfighters will try to use warhead life-extension programs to increase military capabilities of nuclear weapons. But it is disappointing that the White House and Congress so far have not objected.

    The NPR clearly states that, “Life Extension Programs…will not…provide for new military capabilities.” I’m sure we will hear officials argue that the B61 LEP doesn’t provide new military capabilities because it doesn’t increase the warhead yield beyond the maximum of the existing four types.

    But this narrow interpretation misses the point. Mixing precision with lower-yield options that reduce collateral damage in nuclear strikes were precisely the scenarios that triggered opposition to PLYWD and mini-nukes proposal in the 1990s. Warplanners and adversaries could see such nuclear weapons as more useable allowing some targets that previously would not have been attacked because of too much collateral damage to be attacked anyway. This could lead to a broadening of the nuclear bomber mission, open new facilities to nuclear targeting, reinvigorate a planning culture that sees nuclear weapons as useable, and potentially lower the nuclear threshold in a conflict.

    Such concerns ought to be shared by the Obama administration, which has pledged to reduce the role of nuclear weapons and work to prevent that nuclear weapons are ever used. The pledge to reduce the role of nuclear weapons has received widespread international support but will fall flat if one of the administration’s first acts is to increase the capability of nuclear weapons.

    How Russia and NATO allies will react remain to be seen, but increasing NATO’s nuclear capabilities at a time when the United States is trying to engage Russia in talks about limiting non-strategic nuclear weapon seems counterproductive.”


    “The administration should also direct that the portion of the B61-12s that are earmarked for deployment in Europe be deployed without the new guidance tail kit but retain the accuracy of the existing weapons currently deployed in Europe. Otherwise the B61-12 should not be deployed in Europe.

    Finally, the administration’s ongoing nuclear targeting review should narrow the role of nuclear weapons to prevent that numerical reductions become a justification for increasing the capabilities of the remaining weapons. The new guidance must depart from the “warfighting” mentality that still colors nuclear war planning and is so vividly illustrated by the precision low-yield options offered by the B61-12.”

  14. George William Herbert (History)

    A little late to the party, but…

    It’s instructive to consider the different basic failure types that could credibly happen.

    The three most likely categories are construction flaw (a particular warhead unit is off-nominal enough in some undetected way that it fails to operate properly), design flaw (an inherent problem with the design), and untested operating condition flaw (something about the conditions tested doesn’t match the actual operating conditions, and the unit fails in a way that may be fleet-common if the units are employed in similar manners).

    It’s not widely discussed, but I have heard that there were indeed single unit flaws during testing (and the various in-service quality programs have found low rates of same).

    There have been design flaws, such as tritium cross section goofs and various safety flaws, explosive aging, etc.

    There have also been operating conditions goofs, such as the W80 cold-soak problem (which turned out to be common back to the parent B61 IHE model’s design, and so is at least somewhat specifically relevant to the discussion at hand…).

    What worries me – seriously worries me – is the last category. I suspect that we know enough now to reliably do medium-lightweight weapons designs with a very, very high likelihood of avoiding inherent design flaws. But we got as far as having produced quite a number of W80-1 units (and who knows how many B61s before them…) before cold-soak testing one and finding out that TATB doesn’t do cold that well.

    I would hope that (before that W80 failure, or if not, then certainly after it) ICBM warheads have been “test fired” with vacuum, Gs, cold soak, and then a reentry heat soak simulation leading up to a firing, to determine whether there’s a similar type of failure mode lurking there.

  15. bobbymike (History)

    Let’s just start building new warheads and testing them and then replacing the aging stockpile.

    OK, that isn’t going to happen, but an unrepentant Cold Warrior can dream – throwing in a new ICBM design to keep the industrial base warm would be nice as well.