Information Security:

National Nuclear Security Administration Needs to Improve Contingency Planning for Its Classified Supercomputing Operations

GAO-11-67: Published: Dec 9, 2010. Publicly Released: Dec 9, 2010.

Contact:

Gregory C. Wilshusen
(202) 512-6244
contact@gao.gov

Office of Public Affairs
(202) 512-4800
youngc1@gao.gov

In the absence of underground nuclear weapons testing, the National Nuclear Security Administration (NNSA) relies on supercomputing operations at its three weapons laboratories to simulate the effects of changes to current weapons systems, calculate the confidence of future untested systems, and ensure military requirements are met. GAO was requested to assess the extent to which (1) NNSA has implemented contingency and disaster recovery planning and testing for its classified supercomputing systems, (2) the laboratories are able to share supercomputing capacity for recovery operations, and (3) NNSA tracks the costs for contingency and disaster recovery planning for supercomputing assets. To do this work, GAO examined contingency and disaster recovery planning policies and activities, and analyzed both the classified supercomputing capabilities at the weapons laboratories and NNSA budgetary data.

All three NNSA weapons laboratories--Los Alamos, Sandia, and Lawrence Livermore--have implemented some components of a contingency planning and disaster recovery program. NNSA, however, has not provided effective oversight to ensure that the laboratories have comprehensive and effective contingency and disaster recovery planning and testing. Further, due to lack of planning and analysis by NNSA and the laboratories, the impact of a system outage is unclear. Only one of the three laboratories--Los Alamos--had conducted a business impact analysis to assess the criticality of resources and acceptable outage time frames; yet, NNSA and all three laboratories consider the consequence associated with the loss of system availability to be low impact and do not consider the classified supercomputers to be mission critical. Nonetheless, NNSA classified supercomputing capabilities serve as a computational surrogate to nuclear weapons testing and are used to address other areas of national security.

Despite the absence of business impact analyses, all laboratories had key components of a contingency planning program in place. However, shortcomings existed. For example, all laboratories had backup processes in place and had developed contingency plans, but the plans were not comprehensive. Specifically, one plan did not address the supercomputing operations, and none of the plans had been tested at the time of GAO's review. In addition, the laboratories addressed disaster recovery to a limited extent, but not specifically for the supercomputers. These shortcomings existed, at least in part, because NNSA's component organizations, including the Office of the Chief Information Officer, were unclear about their roles and responsibilities for providing oversight in the laboratories' implementation of contingency and disaster recovery planning. Until the agency fully implements a contingency and disaster recovery planning program for its weapons laboratories, it has limited assurance that vital information can be recovered and made available to meet national security priorities and requirements.

Although the laboratories have the technological capability to share supercomputing capacity across all three weapons laboratories, barriers exist that could impede recovery operations. For example, the laboratories do not know the minimum supercomputing capacity needed to meet program requirements, such as simulating the effects of changes to weapons systems, should a disruption occur. In addition, the laboratories have not tested the technological capability to share the capacity on an on-demand basis for recovery operations. Without having an understanding of capacity needs and subsequent testing, the laboratories have little assurance that they could effectively share capacity if needed.

Although NNSA obligated approximately $1.7 billion to help implement its classified supercomputing program from fiscal years 2007 through 2009, the agency has not tracked costs for contingency and disaster recovery planning and is uncertain of actual funds that were spent toward these efforts.

GAO recommends, among other things, that NNSA clearly define roles and responsibilities for its component organizations in providing oversight for contingency and disaster recovery planning for the classified supercomputing environment. NNSA agreed with most of GAO's recommendations, but did not concur with recommendations relating to capacity planning and cost tracking.

Recommendations for Executive Action

  1. Status: Open

    Comments: According to NNSA, it will leverage current Business Impact Analysis efforts underway at Lawrence Livermore, Los Alamos and Sandia to perform a national level impact analysis.

    Recommendation: To improve the effectiveness of contingency and disaster recovery planning for NNSA's classified supercomputing capabilities, the Administrator of NNSA should, where not already implemented, direct the weapons laboratories to develop business impact analyses that, among other things, (1) identify and prioritize critical systems, data, and supporting resources; (2) identify allowable outage times and impacts for classified supercomputing capabilities; and (3) identify recovery priorities and strategies.

    Agency Affected: Department of Energy: National Nuclear Security Administration

  2. Status: Open

    Comments: According to NNSA, plans will be developed based on business impact analyses.

    Recommendation: To improve the effectiveness of contingency and disaster recovery planning for NNSA's classified supercomputing capabilities, the Administrator of NNSA should, where not already implemented, direct the weapons laboratories to develop and implement comprehensive contingency and disaster recovery plans for all classified supercomputing systems that identify how each weapons laboratory's classified supercomputing capabilities will be recovered following service disruptions.

    Agency Affected: Department of Energy: National Nuclear Security Administration

  3. Status: Open

    Comments: According to NNSA, it will implement contingency and disaster recovery plans that include contingency plan testing.

    Recommendation: To improve the effectiveness of contingency and disaster recovery planning for NNSA's classified supercomputing capabilities, the Administrator of NNSA should, where not already implemented, direct the weapons laboratories to conduct contingency and disaster recovery plan testing.

    Agency Affected: Department of Energy: National Nuclear Security Administration

  4. Status: Open

    Comments: According to NNSA, it will develop contingency and disaster recovery plans that include testing the three weapons laboratories' ability to share classified capacity supercomputers.

    Recommendation: To improve the effectiveness of contingency and disaster recovery planning for NNSA's classified supercomputing capabilities, the Administrator of NNSA should, where not already implemented, direct the weapons laboratories to test the three weapons laboratories' ability to share "on-demand" classified supercomputing capacity to ensure this capability will work in the event of unexpected service disruptions.

    Agency Affected: Department of Energy: National Nuclear Security Administration

  5. Status: Open

    Comments: According to NNSA, it will adapt and apply procedures that are routinely being used for prioritizing workload in capability computing campaigns for use in contingencies and disasters.

    Recommendation: The Administrator of NNSA should document an agencywide means for reprioritizing the workload across NNSA's classified supercomputing systems should a disruption occur.

    Agency Affected: Department of Energy: National Nuclear Security Administration

  6. Status: Open

    Comments: According to NNSA, it will clearly define oversight responsibilities through business impact assessments and development and implementation of contingency and disaster recovery plans.

    Recommendation: The Administrator of NNSA should clearly define the oversight responsibilities of the NNSA Advanced Simulation and Computing (ASC) program office and the NNSA Office of the Chief Information Officer, as they relate to contingency and disaster recovery planning for NNSA's classified supercomputing operations.

    Agency Affected: Department of Energy: National Nuclear Security Administration

  7. Status: Open

    Comments: No actions planned.

    Recommendation: The Administrator of NNSA should identify, assess, and communicate the minimum classified supercomputing capacity needed to meet Stockpile Stewardship requirements in the event of a service disruption.

    Agency Affected: Department of Energy: National Nuclear Security Administration

  8. Status: Open

    Comments: No actions planned.

    Recommendation: The Administrator of NNSA should develop, document, and implement a process that identifies and tracks expenditures for contingency and disaster recovery planning for NNSA's classified supercomputing assets.

    Agency Affected: Department of Energy: National Nuclear Security Administration

  9. Status: Open

    Comments: No actions planned.

    Recommendation: The Administrator of NNSA should develop and document the total anticipated costs for contingency and disaster recovery planning of NNSA's classified supercomputing assets, including the replacement costs for these assets.

    Agency Affected: Department of Energy: National Nuclear Security Administration

 
