
Computer Sciences Corporation; HP Enterprise Services, LLC; Harris IT Services Corporation; Booz Allen Hamilton, Inc.

B-408694.7, B-408694.8, B-408694.9, B-408694.10, B-408694.11  Nov 03, 2014

Highlights

Computer Sciences Corporation (CSC), of Falls Church, Virginia; HP Enterprise Services, LLC, of Plano, Texas; Harris IT Services Corporation, of Herndon, Virginia; and Booz Allen Hamilton, Inc., of McLean, Virginia, protest the award of multiple contracts by the Department of the Air Force under request for proposals (RFP) No. FA8771-09-R-0020, for the acquisition of a wide array of information technology services and products. The protesters generally assert that the agency misevaluated the proposals and made unreasonable source selection decisions. HP additionally alleges that the agency conducted inadequate and unequal discussions.

We sustain the protests.


DOCUMENT FOR PUBLIC RELEASE
The decision issued on the date below was subject to a GAO Protective Order. This redacted version has been approved for public release.

Decision

Matter of: Computer Sciences Corporation; HP Enterprise Services, LLC; Harris IT Services Corporation; Booz Allen Hamilton, Inc.

File: B-408694.7; B-408694.8; B-408694.9; B-408694.10; B-408694.11

Date: November 3, 2014

Kelly E. Buroker, Esq., Eric J. Marcotte, Esq., Kirsten W. Konar, Esq., and Kyle E. Gilbertson, Esq., Vedder Price, P.C., and Carl J. Peckinpaugh, Esq., for Computer Sciences Corporation; Richard J. Conway, Esq., Andrew E. Smith, Esq., Joseph R. Berger, Esq., Michael J. Slattery, Esq., Dickstein Shapiro LLP, for HP Enterprise Services, LLC; Lori Ann Lange, Esq., and Nick R. Hoogstraten, Esq., Peckar & Abramson, P.C., for Harris IT Services Corporation; and Katherine S. Nucci, Esq., Timothy Sullivan, Esq., Scott F. Lane, Esq., Jayna M. Rust, Esq., and Jeffrey S. Newman, Esq., Thompson Coburn LLP, for Booz Allen Hamilton, Inc., the protesters.
William A. Shook, Esq., The Law Firm of William A. Shook, for InfoReliance Corporation; Jonathan D. Shaffer, Esq., John S. Pachter, Esq., Mary Pat Buckenmeyer, Esq., and Zachary D. Prince, Esq., Smith Pachter McWhorter PLC, for General Dynamics Information Technology; Scott E. Pickens, Esq., and William M. Jack, Esq., Barnes & Thornburg LLP, for SRA International, Inc.; Gerard F. Doyle, Esq., and Ron R. Hutchinson, Esq., Doyle & Bachman LLP, for L-3 National Security Solutions; David W. Burgett, Esq., and Nicole D. Picard, Esq., Hogan Lovells US LLP, for IBM Corporation; Joseph P. Hornyak, Esq., and Gregory R. Hallmark, Esq., Holland & Knight LLP, for Raytheon Company; and Jonathan J. Frankel, Esq., John P. Janecek, Esq., Deborah Raviv, Esq., and Steven D. Tibbets, Esq., Steese, Evans & Frankel, P.C., and William J. Colwell, Esq., and Linda T. Maramba, Esq., for Northrop Grumman Systems Corporation, the intervenors.
Michael G. McCormack, Esq., Lt. Col. James H. Kennedy III, Capt. Adam N. Olsen, and Wilbert W. Edgerton, Esq., Department of the Air Force, for the agency.
Eric M. Ransom, Esq., and Edward Goldstein, Esq., Office of the General Counsel, GAO, participated in the preparation of the decision.

DIGEST

1. Agency’s cost realism analysis is not supported where evaluation record is devoid of any analysis of the sufficiency of the offerors’ proposed labor categories and labor hours, or meaningful explanation concerning the agency’s basis for accepting the proposed labor hours as realistic to complete the work.

2. Protest challenging agency’s technical evaluation is sustained where the record does not establish that the evaluation considered the appropriateness and reasonableness of the offerors’ labor mixes and labor hours, as required by the solicitation.

3. Protest that past performance evaluation was unreasonable is sustained where agency used a methodology to determine performance confidence ratings that significantly overemphasized relevancy-related criteria over quality, and produced misleading results.

4. Agency’s tradeoff decision considering past performance and cost/price was unreasonable where the past performance evaluation methodology produced misleading results, and where the source selection was in part based on considerations not set forth in the solicitation’s best value award criteria.

5. Protest that agency failed to conduct meaningful discussions concerning price reasonableness is denied where the source selection decision did not consider the protester unawardable due to price; protest that agency conducted unequal discussions is denied where post-discussion exchanges with one offeror did not permit modification or revision of that offeror’s proposal.

DECISION

Computer Sciences Corporation (CSC), of Falls Church, Virginia; HP Enterprise Services, LLC, of Plano, Texas; Harris IT Services Corporation, of Herndon, Virginia; and Booz Allen Hamilton, Inc., of McLean, Virginia, protest the award of multiple contracts by the Department of the Air Force under request for proposals (RFP) No. FA8771-09-R-0020, for the acquisition of a wide array of information technology services and products. The protesters generally assert that the agency misevaluated the proposals and made unreasonable source selection decisions. HP additionally alleges that the agency conducted inadequate and unequal discussions.

We sustain the protests.

BACKGROUND

The Air Force issued the RFP on November 4, 2010, for the acquisition of network operations, infrastructure and service oriented architecture information and transformation services and solutions for the Air Force and other Department of Defense agencies at locations worldwide. This acquisition is commonly referred to as Network Centric Solutions-2 (NETCENTS-2). The RFP anticipated the award of multiple indefinite-delivery/indefinite-quantity contracts to include cost reimbursable elements, fixed-price elements, and labor hour elements. Agency Report (AR), Tab 7, RFP, at 2-24. The RFP contemplated a 3-year base period, and up to four 1-year option periods, during which the agency could award one or more task orders to the successful contractors. Id. at 33. Each awarded contract was to have a minimum guaranteed value of $2,500; the maximum value of the contracts is $960 million. Id. at 27.

The agency was to award contracts to technically acceptable offerors on a “best value” basis utilizing a tradeoff procedure weighing performance confidence and price, where performance confidence was significantly more important than price. Id. at 175. Under this evaluation scheme, the agency would first evaluate proposals for technical acceptability on a pass/fail basis. The agency would then perform a performance confidence assessment and cost/price evaluation. Id. at 175-76. Finally, the agency would consider the total evaluated price (TEP) of each proposal, and make award decisions based on an integrated assessment of performance confidence and cost/price. Id. at 175. The agency anticipated awarding between six and nine contracts, but reserved the right to make more, fewer, or no awards. Id. at 176.

Concerning the performance confidence assessment, the RFP required the agency to perform a detailed and in-depth past performance assessment for each offeror. Id. at 178. This evaluation was to consist of an integrated analysis of recency, relevancy, and quality of past work, and was to focus on the quality of prior efforts in five areas: systems sustainment, web service development, management, cost, and small business participation. Id. at 165, 178. For past performance to be considered recent, the RFP provided that the effort must be ongoing, or have been performed within a period from three years prior to the issuance of the RFP through November 4, 2011; that is, November 4, 2007 through November 4, 2011.

Regarding relevance, the RFP established that the agency would evaluate each past performance reference with respect to the five performance confidence criteria areas set forth above (systems sustainment, web service development, management, cost, and small business participation). Id. at 179-82. Since each criteria area also included several subcriteria, the RFP, in total, set forth 27 subcriteria for the agency to consider in determining a past effort’s relevance, identified as “[c]riteria considered for magnitude and complexity.” Id. at 181. Ultimately, the relevance of each past effort was rated for each of the five criteria areas on a scale of highly relevant, relevant, somewhat relevant, not relevant, or not applicable, based on similarity of magnitude and complexity in the subcriteria areas. Id. at 180.

Once an offeror’s past performance reference was determined to be recent and relevant, the RFP required the agency to perform a quality assessment, rating the quality of the prior effort as exceptional, very good, satisfactory, marginal, unsatisfactory, or not applicable. Id. at 183. Each offeror was then to be assigned an overall performance confidence assessment rating of substantial confidence, satisfactory confidence, limited confidence, no confidence, or unknown confidence. Id. at 179. In addition to the considerations set forth above, the RFP provided, generally, that more recent and relevant past performance would have a greater impact in the evaluation, as would references demonstrating a “sustained track record,” references that covered more of the relevancy criteria areas, and references performed at worldwide locations. Id. at 178-79.

For cost/price, the RFP provided that each offeror’s cost/price proposal would be evaluated on the basis of a TEP composed of the sum of three elements: a fixed price sample task order, a cost-reimbursable sample task order, and a submission concerning total labor-hour price. Id. at 183. For evaluation purposes, the RFP established that the agency would consider the reasonableness of all three elements of the TEP, and the realism of the offerors’ proposed costs for the cost reimbursable sample task order. Id. at 184.

The agency received 21 proposals in response to the RFP. After the initial technical evaluation, the agency conducted two rounds of discussions from January to December of 2012, with final proposal revisions (FPR) received from all 21 offerors in January 2013. During review of the FPRs, the agency noted inconsistencies in the relevancy ratings of certain offerors. In response, the agency began a third round of discussions in March 2013. Second FPRs were received from the offerors on April 29.

On the basis of the second FPRs, 20 of the 21 offerors were evaluated as technically acceptable and were considered for award. AR, Tab 139, Source Selection Decision Document (SSDD) at 4. On July 29, the offerors were informed of the selection of six offerors for awards under the RFP.

Immediately thereafter, six unsuccessful offerors filed bid protests with our Office. In response to these bid protests, the agency notified our Office of its intent to take corrective action to revisit certain past performance ratings and make a new source selection decision. We then dismissed the protests as academic. Computer Sciences Corporation, B-408694.5, Sept. 5, 2013; General Dynamics Information Technology, Inc., B-408694, Sept. 9, 2013; Booz Allen Hamilton, Inc., B-408694.2, Sept. 10, 2013; HP Enterprise Services, LLC, B-408694.3, Sept. 10, 2013; Northrop Grumman Systems Corporation, B-408694.4, Sept. 11, 2013; InfoReliance Corporation, B-408694.6, Sept. 11, 2013.

After completion of the reevaluation, the 20 technically acceptable offerors were rated as follows:

Offeror | Performance Confidence | TEP Ranking | TEP
Lockheed Martin Information Systems & Global Solutions | Substantial | 1 | $34,805,775.69
Jacobs Technology/TYBRIN | Substantial | 2 | $37,177,033.96
Harris IT Services Corporation | Satisfactory | 3 | $37,395,833.85
SRA International | Substantial | 4 | $39,580,034.63
Raytheon | Substantial | 5 | $42,330,710.54
L-3 National Security Solutions | Substantial | 6 | $42,664,964.42
Computer Sciences Corporation | Satisfactory | 7 | $48,309,098.77
InfoReliance Corporation | Substantial | 8 | $48,664,422.90
CACI-ISS | Substantial | 9 | $49,424,393.92
Northrop Grumman Systems Corp | Substantial | 10 | $49,653,635.60
General Dynamics Information Technology | Substantial | 11 | $51,113,278.49
IBM U.S. Federal | Substantial | 12 | $51,385,794.05
Booz Allen Hamilton | Substantial | 13 | $52,343,988.87
BAE Systems Information Solutions | Satisfactory | 14 | $56,395,199.97
QinetiQ North America | Satisfactory | 15 | $58,726,013.10
ManTech Systems Engineering Corp | Satisfactory | 16 | $63,465,465.12
Hewlett Packard Enterprise Services | Substantial | 17 | $65,053,167.41
Leidos | Substantial | 18 | $76,619,358.56
Dynamics Research Corporation | Satisfactory | 19 | $81,207,242.10
Accenture Federal Services | Substantial | 20 | $110,088,252.00


AR, Tab 139, SSDD, at 12. Each of the offerors’ TEPs was determined to be reasonable, realistic, and balanced.

On the basis of these evaluation results, the source selection authority (SSA) decided to make awards to the 10 lowest-priced offerors with substantial confidence ratings: Lockheed Martin Information Systems & Global Solutions; Jacobs Technology/TYBRIN; SRA International; Raytheon; L-3 National Security Solutions; InfoReliance Corporation; CACI-ISS; Northrop Grumman Systems Corp.; General Dynamics Information Technology; and IBM U.S. Federal.

The unsuccessful offerors were informed of the SSA’s award decisions on July 14, 2014. Debriefings were completed on July 23. These protests followed.

DISCUSSION

Collectively, the protesters allege that the agency erred in its evaluation of the proposals in each of the three evaluation areas: technical, performance confidence, and cost/price. More specifically, the protesters assert that the agency failed to conduct an adequate realism analysis of the cost reimbursable sample task order, failed to evaluate the offerors’ cost-reimbursable labor hours for technical acceptability, and failed to reasonably evaluate the offerors’ past performance references in a manner consistent with the terms of the RFP. The protesters further assert that the agency based its source selection decision on considerations not set forth in the RFP, which were in fact inconsistent with the tradeoff process described in the RFP. HP also asserts that the agency conducted inadequate and unequal discussions.

Cost Realism Evaluation

Concerning cost realism, HP and Booz Allen each allege that the agency’s cost realism analysis was inadequate because the record demonstrates that the agency’s cost/price team never analyzed whether the number of hours proposed by each offeror was realistic from a cost standpoint. As explained below, we agree that the agency’s explanation of its cost realism evaluation fails to demonstrate any analysis of the realism of the widely varying labor hours proposed by the offerors.

When an agency evaluates proposals for the award of a cost-reimbursement contract, an offeror’s proposed estimated cost of contract performance is not considered controlling since, regardless of the costs proposed by the offeror, the government is bound to pay the contractor its actual and allowable costs. Magellan Health Servs., B-298912, Jan. 5, 2007, 2007 CPD ¶ 81 at 13; Metro Machine Corp., B-295744, B-295744.2, Apr. 21, 2005, 2005 CPD ¶ 112 at 9; Federal Acquisition Regulation (FAR) § 16.301. As a consequence, a cost realism analysis must be performed by the agency to determine the extent to which an offeror’s proposed costs represent what the contract costs are likely to be under the offeror’s unique technical approach, assuming reasonable economy and efficiency. FAR §§ 15.305(a)(1), 15.404-1(d)(1), (2); The Futures Group Int’l, B-281274.2, Mar. 3, 1999, 2000 CPD ¶ 147 at 3.

Our Office will review an agency’s cost realism analysis, when it has been protested, to determine whether it is reasonably based and not arbitrary. The Warner/Osborn/G&T Joint Venture, B-256641.2, Aug. 23, 1994, 94-2 CPD ¶ 76 at 5. In this regard, it is not sufficient for an agency simply to verify that an offeror has provided all required cost information; rather, the agency is required to take reasonable, documented steps to assess what costs are likely to be incurred under each offeror’s technical approach, and to explain the basis for a conclusion that the proposed costs are realistic for the work to be performed. Nat’l City Bank of Indiana, B-287608.3, Aug. 7, 2002, 2002 CPD ¶ 190 at 10-11; FAR § 15.404-1(d)(2).

Here, the agency contends that it conducted a thorough and appropriate cost realism analysis. According to the contracting officer, the cost/price team assessed the sample task order and labor-hour prices “by conducting price analysis, FAR 15.404-1(b)(2)(i), to determine prices fair, realistic and reasonable.” Contracting Officer’s Statement, Booz Allen Protest, at 21. In this regard, the cost/price team explains that realism was determined by the following process:

(1) Determined adequate price competition by comparing the proposed prices received in response to the solicitation.

(2) Reviewed other than certified cost or pricing data for the Cost Reimbursable Sample Task Order . . . .

(3) Completed cost realism analysis of the individual elements of cost . . . . The labor categories and proposed hours were determined to be realistic if the technical evaluation found the offeror’s proposal to be technically acceptable; assuming the technical and pricing volumes were consistent.

(4) Performed a comparison of the direct and fully burdened rates with published price lists to determine if the proposed prices were unrealistically low.

AR, Tab 135, Price Competition Memorandum, at 18.

Concerning the results of the analysis, the contracting officer first explains that because all of the offerors had been found technically acceptable, each offeror had “a good grasp of the labor categories, wage rates and knowledge of skills required to complete the work.” Contracting Officer’s Statement, Booz Allen Protest, at 23. The cost/price team also compared each offeror’s proposed hours from the technical proposal to the hours set forth in the cost proposal, “to ensure consistency, realism and reasonableness.” Id. The contracting officer explains that this analysis showed that all offerors’ hours were consistent between the technical and price proposals, with no discrepancies.

The contracting officer also presented several tables and charts from the price evaluation in support of the cost/price team’s analysis. The first of these tables compiled the labor hours with the proposed prices:


[TABLE DELETED]


AR, Tab 145, Price Evaluation, at Excel Tab 3, Price/Hours Table.[1] The contracting officer maintains that “[t]hese numbers indicate fairly uniform pricing was achieved through a mix of labor hours and price,” and that “where an offeror had a low [average price per hour], it was offset by proposing more hours.” Contracting Officer’s Statement, Booz Allen Protest, at 25.

Additionally, the cost/price team used a breakdown of historical labor hour data set forth in the RFP to create a baseline reflecting the historical minimum and maximum number of hours required to complete the tasks comprising the cost reimbursable sample task order. AR, Tab 7, RFP, at 210. This baseline reflected an estimated minimum of 451,000 hours and a maximum of 549,000 hours to complete the work. The cost/price team incorporated these values, along with the offerors’ proposed hours, into another chart, as follows:


[CHART DELETED]


AR, Tab 145, Price Evaluation, at Excel Tab 3, Cost Baseline Chart.[2] The contracting officer asserts that the cost/price team determined that this chart “indicated all offerors proposed labor hours that were consistent with the [cost reimbursable sample task order] baseline.” Contracting Officer’s Statement, Booz Allen Protest, at 28.

Notwithstanding the agency’s explanations, the record does not demonstrate any analysis of the offerors’ proposed labor hours for the cost reimbursable sample task order to determine whether they were realistic for the work to be performed. See FAR § 15.404-1(d)(1). As an initial matter, the agency’s use of “price analysis”--including determining adequate price competition and conducting a comparison of prices received--was insufficient, as price analysis does not relate to cost realism. Rather, price analysis techniques under FAR § 15.404-1(b)(2)(i) are for the purpose of establishing a fair and reasonable price, while the techniques for cost realism analysis--for the purpose of determining whether proposed costs are too low--are set forth under FAR § 15.404-1(d)(1).

Moreover, the data relied upon by the cost/price team and contracting officer simply do not support the proposition that the offerors provided “fairly uniform pricing” or proposed hours consistent with the cost reimbursable sample task order baseline, nor do they demonstrate an assessment of cost realism. Rather, the chart reflects that the offerors’ proposed labor hours ranged from a low of 325,545 hours to a high of 655,732.58 hours, a variation of over 100 percent, which is not adequately explained by the contemporaneous record. Further, the data show that 7 of the 20 technically acceptable offerors proposed hours that were below the minimum baseline calculated by the agency. In fact, two of the awardees, [DELETED] and [DELETED], proposed hours significantly lower than the minimum historical baseline, with [DELETED]’s proposed hours approximately 28 percent below the minimum. No analysis of this variation from the agency’s baseline is apparent in the record.

Additionally, the Price/Hours Table, supra, does not support the contracting officer’s assertion that “where an offeror had a low [average price per hour], it was offset by proposing more hours.” Contracting Officer’s Statement, Booz Allen Protest, at 25. While the offeror that proposed the highest labor hours ([DELETED]) did have the lowest average price per hour ($[DELETED]), the three offerors that proposed the lowest labor hours ([DELETED]) also had very low average prices per hour ($[DELETED], respectively). Further, the offeror with the highest average price per hour ([DELETED], $[DELETED]) proposed labor hours very near the center of the labor hours distribution. In sum, the agency’s reliance on the above tables and chart, without any analysis or explanation of how the cost/price team determined that the proposed hours were adequate to complete the work in accordance with the offerors’ unique technical approaches, does not support a conclusion that the proposed labor hours were realistic.

Conceding that the cost/price team never independently analyzed whether the labor hours were realistic, the agency explains that the proposed hours were considered realistic where the proposals had been found technically acceptable, indicating “a good grasp of the labor categories, wage rates and knowledge of skills required to complete the work.” Contracting Officer’s Statement, Booz Allen Protest, at 23. However, as pointed out by both HP and Booz Allen, the record of the agency’s pass/fail technical acceptability evaluation does not reflect any analysis of whether the proposed hours were realistic for the work to be performed. Instead, it is apparent that the technical evaluator interpreted evaluation criteria relating to identification of proposed hours for the cost reimbursable sample task order as merely requiring the agency to verify that the labor hours, descriptions and qualifications were provided in the technical proposal, and that the hours set forth in the technical proposal matched those set forth in the cost proposal.

For example, concerning [DELETED], the record shows that the technical evaluator initially had a concern because [DELETED] had not listed the qualifications of the proposed personnel, and because the labor hours listed in the technical proposal did not match the price/cost proposal hours. The agency considered the concerns resolved during discussions, after [DELETED] provided the qualifications of the personnel, and modified the hours set forth in the technical proposal to match the cost proposal. The technical evaluation did not, however, contain any discussion whatsoever of whether the proposed personnel, qualifications, and proposed hours were realistic or appropriate for the work to be performed. Instead, the final technical evaluation consists of no more than a list of the proposed hours and labor categories, with the statement confirming that the labor hours matched the cost proposal--despite the fact that [DELETED]’s proposed hours were [DELETED] hours below the minimum historical baseline for the work. AR, Tab 135, Price Competition Memorandum, at 189-91, 194-98.

The evaluation record is simply devoid of any independent assessment of whether the offerors’ proposed labor hours, skill mix, and labor mix were sufficient to successfully perform the requirements of the cost reimbursable sample task order. Accordingly, we agree with HP and Booz Allen that the agency failed to conduct a reasonable cost realism analysis as required by the RFP and the FAR, and we sustain the protests on these grounds.[3]

Technical Acceptability Evaluation

HP further asserts, based on the same facts discussed above, that the agency did not conduct a reasonable evaluation of technical acceptability in accordance with the terms of the RFP. Specifically, HP contends that these facts show that the agency did not consider whether the offerors’ technical responses to the cost reimbursable sample task order demonstrated “an understanding of the work to be performed . . . an appropriate mix of labor categories, reasonableness of labor hours, and an understanding of the knowledge, skills, abilities, and products needed to meet the requirements.” AR, Tab 7, RFP, at 177.

The evaluation of technical proposals is a matter within the discretion of the contracting agency, since the agency is responsible for defining its needs and the best method for accommodating them. SRA Int’l, Inc., B-408624, B-408624.2, Nov. 25, 2013, 2013 CPD ¶ 275 at 4. In reviewing an agency’s evaluation, we will not reevaluate technical proposals, but instead will examine the agency’s evaluation to ensure that it was reasonable and consistent with the solicitation’s stated evaluation criteria and with procurement statutes and regulations. Id. However, in order to facilitate our examination, contracting agencies are required to adequately document their evaluation results, and sufficiently support the findings on which award determinations are made. Savvee Consulting, Inc., B-408416, B-408416.2, Sept. 18, 2013, 2013 CPD ¶ 231 at 7. In this case, we cannot conclude that the agency’s evaluation record supports the conclusion that the offerors provided an appropriate mix of labor categories, and reasonable labor hours.

As described above, our review of the agency’s technical evaluation reveals that for the cost reimbursable sample task order, the agency evaluator apparently interpreted the evaluation criteria as merely requiring the agency to verify that the proposed labor hours, descriptions and qualifications were listed in the offerors’ technical proposals, and that the hours set forth in the technical proposals matched those set forth in the cost proposals. This mere confirmation that the offerors provided the required information in their proposals, and that the technical and cost proposals were consistent, is no substitute for an evaluation of whether the information provided demonstrated an understanding of the work to be performed, an appropriate labor mix, and reasonable labor hours, as required by the RFP. Where the record does not demonstrate that the agency conducted the required analysis, we sustain the protest.

Performance Confidence Evaluation

The four protesters next present a variety of allegations concerning the elements of the agency’s performance confidence evaluation. Generally, the protesters assert that the performance confidence analysis was unreasonable and inconsistent with the RFP where the agency overwhelmingly emphasized the recency, relevance, and location of past performance references over the quality of their performance. More specifically, CSC and Booz Allen argue that the agency’s recency assessment was arbitrary and unreasonable where several of the protesters’ past performance references were discounted as “less recent,” while awardees’ similarly recent past performance references were considered “more recent.” Additionally, all four protesters object to the agency’s process of assessing “concerns” against their records of past performance--which were used in the SSDD as the principal differentiator between offerors with the same overall performance confidence rating.[4]

As a general matter, the evaluation of an offeror’s past performance is within the discretion of the contracting agency, and we will not substitute our judgment for reasonably based past performance ratings. Clean Harbors Envtl. Servs., Inc., B-296176.2, Dec. 9, 2005, 2005 CPD ¶ 222 at 3. However, we will question an agency’s evaluation conclusions where they are unreasonable or undocumented. OSI Collection Servs., Inc., B-286597, B-286597.2, Jan. 17, 2001, 2001 CPD ¶ 18 at 6. The critical question is whether the evaluation was conducted fairly, reasonably, and in accordance with the solicitation’s evaluation scheme, and whether it was based on relevant information sufficient to make a reasonable determination of the offeror’s past performance. Id. Here, we agree with the protesters that the agency’s performance confidence evaluation was unreasonable and inconsistent with the terms of the RFP.

Overall, our consideration of the record demonstrates that the agency’s evaluation was inconsistent with the RFP’s evaluation criteria concerning the performance confidence assessment. In this respect, the RFP provided that the performance confidence assessment was to “evaluate past performance through an integrated analysis of recency, relevancy, and the quality of work performed,” “focusing on the quality of the work performed (as described in M3.6 and Table 3) and the relevancy (as described in M3.5 and Table 2) to the acquisition.” AR, Tab 7, RFP, at 165, 178.

However, the five-step evaluation process instituted by the agency, as discussed in greater detail below, resulted in a misleading view of the actual past performance records of the various awardees by substantially elevating the importance of “more recent” performance within the recency window, relevance as defined by references’ application to the 27 relevance subcriteria areas, and worldwide performance. While these elements were set forth in the RFP in the context of establishing “greater weight” to be afforded to past performance references, the agency’s evaluation process elevated their importance to minimum threshold criteria for consideration of an offeror’s references in various elements of the evaluation, without respect to the reference’s assigned relevance (i.e., relevant/highly relevant) or quality. Additionally, because the agency only considered whether an offeror’s references were “favorable” in determining whether the references would count towards earning a substantial confidence rating, the evaluation process led the agency to misunderstand the specific quality distinctions between the offerors.

Overall, by severely minimizing the impact of quality, and even the overall relevancy of past performance references, in favor of the mechanical application of the agency’s “more recent,” sustained, worldwide performance and subcriteria thresholds, the agency overlooked meaningful differentiators, such as performance quality, while emphasizing misleading differentiators, such as “concerns.” This evaluation process produced misleading results, and was unreasonable. We sustain the protests on this basis, as discussed in more detail below.

Five-Step Performance Confidence Evaluation Process

The agency explains that its performance confidence assessment group (PCAG) conducted its evaluation process in five steps. First, the agency considered whether each past performance reference was recent and sustained. This evaluation considered (1) whether the reference was performed within the recency evaluation window (November 4, 2007 – November 4, 2011), (2) whether it was more recent or less recent within that window, and (3) whether the reference demonstrated sustained performance of more than one year within the window. See Contracting Officer’s Statement, Booz Allen Protest, at 39-40. If the reference was both “more recent” within the window and sustained, the reference was considered to have “greater impact.”[5]

Second, the agency considered relevancy by assessing whether each reference addressed each of the 27 subcriteria areas of the five performance confidence areas (systems sustainment, web service development, management, cost, and small business participation). Id. at 35-36. The result of this process was that each reference was assigned a relevancy rating for each of the five confidence criteria areas--that is, a reference could be considered highly relevant for systems sustainment, but only somewhat relevant for web service development.

Third, the agency considered quality by rating the quality of each reference under each of the five confidence areas, based on quality assessment questionnaires and data from the Contractor Performance Assessment Reporting System (CPARS). Id. at 38. If the quality ratings under a confidence criteria area were “positive” (apparently satisfactory or better), the reference was considered “favorable” for that confidence area.[6]

Fourth, the agency assigned each offeror its overall performance confidence rating. As relevant here, in order to receive the highest performance confidence rating of substantial confidence, an offeror was required to demonstrate, in each of the five confidence criteria areas, two or more past performance references that were (1) “greater impact” (“more recent” and sustained for more than one year), (2) relevant or highly relevant, (3) favorable, and (4) performed at worldwide locations. Id. at 40. If an offeror failed to demonstrate two references meeting this threshold for any of the five criteria areas, that offeror was limited to a satisfactory confidence rating.

Fifth, after assigning the overall performance confidence ratings, the agency returned to the 27 relevancy subcriteria, and reanalyzed the offerors’ references to determine whether each offeror had provided two references meeting the “substantial confidence” thresholds--as set forth above--for each of the 27 subcriteria.[7] Where an offeror’s proposal did not demonstrate two or more of these references for a subcriteria, that subcriteria area was labeled as a “concern.” Id. at 41. In the SSDD, these subcriteria areas were discussed as areas in which the PCAG questioned the offeror’s ability to perform.[8] The “concerns” also served as the basis for the agency’s ranking between offerors that shared the same overall performance confidence assessment rating.[9]

Recency Assessment

CSC and Booz Allen each assert that the agency’s assessment of recency was arbitrary and unreasonable, because several of the protesters’ references, performed at least in significant part within the recency window set forth in the RFP, were considered “less recent” and therefore “less impactful,” while similarly recent references provided by the awardees were considered “more recent” and “more impactful.” The protesters argue that this error was significantly prejudicial where the “less recent” determination prevented the references from being considered in the agency’s analysis of whether an offeror warranted a substantial confidence rating, or in the assessment of “concerns,” without regard to the overall relevance or the quality of the reference. The protesters argue that their references were sustained, relevant, and high quality, but were not given appropriate consideration in the agency’s analysis.

As a specific example, CSC points to three of its own references, each of which started prior to the recency window and ended in late 2009 or early 2010, as follows:

Offeror | Reference | Start | End | Sustained
CSC | CSC, Reference 1 | 11/2004 | 12/2009 | 2.15 Years
CSC | CSC, Reference 9 | 03/2005 | 03/2010 | 2.40 Years
CSC | CSC, Reference 10 | 03/2007 | 05/2010 | 2.57 Years


AR, Tab 137, SSDD Briefing, at 63. Although performed substantially within the recency window of November 4, 2007 through November 4, 2011, these references were each considered to be “less recent” and were not further considered for the purposes of determining whether CSC warranted a substantial confidence rating, or for the purpose of assessing “concerns” against CSC’s record of past performance. In contrast, CSC points to two references, provided by awardees, which demonstrate similarly recent end dates but were considered “more recent” by the agency:

Offeror | Reference | Start | End | Sustained
[DELETED] | [DELETED], Reference 9 | 07/2009 | 06/2010 | 1.00 Years
[DELETED] | [DELETED], Reference 10 | 02/2009 | 08/2010 | 1.58 Years


Id. at 140, 85. CSC asserts that the agency treated these offerors unequally, because there is no substantive difference between the end dates of CSC’s references, versus those of [DELETED] and [DELETED], which would rationally support permitting the awardees’ references to contribute to a substantial confidence rating and to resolve “concerns,” while preventing CSC’s references from being considered for either purpose.[10]

Concerning Booz Allen, four of Booz Allen’s past performance references were considered “less recent.” Booz Allen alleges that two of these four references ended in the latter half of the recency evaluation window (both ending in December 2009), and should have been considered “more recent.” Booz Allen asserts that if consideration had been given to these two references in the “concerns” evaluation, Booz Allen’s evaluation would have reflected zero concerns, rather than [DELETED] concerns. Booz Allen Comments at 32. Booz Allen further alleges that the recency criteria, as applied, were unreasonable where four of its references were eliminated from consideration in the “concerns” evaluation, despite three of those references demonstrating highly relevant past performance in four of the five confidence criteria areas, exceptional quality ratings, and worldwide performance.

The contracting officer responds that the recency evaluation was not unreasonable or unequal because the agency selected a common cut-off date for more recent versus less recent past performance, and applied that cut-off date to all offerors’ references. Although the common cut-off date is not apparent from the record, the contracting officer asserts that the agency selected June of 2010 as the threshold for demonstrating “more recent” performance, with references ending during or subsequent to that time considered more recent and therefore “greater impact.” Supplemental Contracting Officer’s Statement, CSC, at 4; Contracting Officer’s Statement, Booz Allen, at 39. The contracting officer contends that the application of this cut-off date explains the determination that the protesters’ references were “less recent,” while the highlighted awardee references were “more recent.”

As an initial matter, we afford little weight to the contracting officer’s assertion that the PCAG applied a common cut-off date of June 2010 to all offerors’ past performance references. We will generally consider post-protest explanations that provide a detailed rationale for contemporaneous conclusions and simply fill in previously unrecorded details so long as those explanations are credible and consistent with the contemporaneous record. ITT Fed. Servs. Int’l Corp., B-283307, B-283307.2, Nov. 3, 1999, 99-2 CPD ¶ 76 at 6.

In this case, the contracting officer’s statement does not meet the above standard, since it is contradicted by evidence present in the contemporaneous record. Specifically, the contemporaneous record contains meeting notes from the SSDD Briefing which quote a discussion of the recency analysis. The discussion is transcribed as follows:

Question--[SSA]: Was there a numerical cut-off for determining recency, or for determining a citation to be more recent?
Response--[PCAG Chairperson]: Approximately one year was used for recent.
[SSET Chairperson]: For more recent we looked for citations towards the end of the window.
[PCAG Chairperson]: There wasn’t any specific cutoff though.

AR, Tab 138, SSDD Briefing Notes, at 3. Where the contemporaneous record does not support the assertion that the PCAG utilized a common cutoff date of June 2010 to determine “more recent” past performance, we accord the contracting officer’s assertion little weight.

Absent the application of the cut-off date of June 2010, which is not supported by the contemporaneous record, there is no discernable basis for why CSC’s reference ending in May 2010 was materially “less impactful” than the awardee [DELETED]’s reference ending in June 2010. In this regard we see no reasonable explanation for why [DELETED]’s reference was viewed as supporting a substantial confidence rating and resolving “concerns,” while CSC’s reference was excluded from consideration for those purposes, notwithstanding the fact that CSC’s reference was superior to [DELETED]’s reference in all aspects of the past performance evaluation other than recency.

In this connection, CSC’s reference was evaluated as relevant in one past performance area and highly relevant in three areas, with exceptional quality, while [DELETED]’s reference was evaluated as relevant in two areas and highly relevant in two areas, with very good quality. AR, Tab 137, SSDD Briefing, at 64-67, 141-44. CSC’s reference was also more sustained than [DELETED]’s reference, and was performed at worldwide locations, while [DELETED]’s reference was performed only within the continental United States. Id. at 63, 140. Nonetheless, because [DELETED]’s reference contract ended one month more recently, only [DELETED]’s reference was “higher impact” in the agency’s analysis. We cannot conclude that the agency had a reasonable basis to support this result.

In the final analysis, even to the extent the PCAG had utilized a common cutoff date, the agency’s approach to considering “more recent” versus “less recent” past performance was not reasonable. In this regard, we conclude that it was not reasonable to wholly exclude an offeror’s references from consideration in assigning a substantial confidence rating, or in resolving “concerns,” based on minor differences in recency without consideration of whether less recent references demonstrated highly relevant, exceptional quality, and/or worldwide past performance. Accordingly, we sustain the protests.

Assessment of “Concerns”

All four protesters allege that the agency’s assignment and application of “concerns” in the performance confidence assessment was significantly flawed. First, CSC, HP, and Booz Allen argue that the agency unreasonably relied on “concerns” to differentiate the quality of offerors assessed the same overall performance confidence rating where the “concerns” evaluation process did not meaningfully consider, and in fact misrepresented, relative quality. Second, Booz Allen and Harris allege that the assessed “concerns” could not reasonably stand in for or represent an offeror’s ability to perform elements of the work, and wholly ignored the offerors’ substantive records of past performance set forth in their “lower impact” references. We agree with the protesters on both issues.[11]

First, as explained above, in order to determine areas of concern, the agency conducted a reevaluation of the offerors’ past performance references against the 27 performance confidence relevancy subcriteria, looking for two references meeting the agency’s substantial confidence thresholds for each subcriteria. For each subcriteria in which an offeror did not provide two references meeting the thresholds, the offeror was assigned a concern. In the SSDD briefings and SSDD, these concerns were used as the primary differentiator between offerors with equivalent overall performance confidence ratings. For example, in comparing the offerors with substantial confidence ratings, the SSDD reads as follows:

Moreover, not all offerors with a Substantial confidence assessment are the same. For example, Booz Allen was rated Substantial despite having [DELETED] areas of concern (where the offeror did not address the [DELETED] criteria with at least two citations with greater impact at worldwide locations) while [DELETED] was assessed as substantial with [DELETED] areas of concern.

AR, Tab 139, SSDD, at 14. Further, it is apparent that the SSA used the differences in the number of “concerns” as reflective of a difference in the relative quality of the offerors’ past performance. For example, the SSA concluded, based on the number of “concerns,” that “[DELETED].” AR, Tab 138, SSDD Briefing Notes, at 12.

However, because the substantial confidence thresholds established by the agency only considered whether an offeror’s past performance was “favorable,” and discarded references for reasons unrelated to quality, an offeror’s number of “concerns” did not actually indicate whether the offeror had a record of higher quality past performance. As a consequence of this disconnect, it is clear that many of the SSDD’s conclusions concerning the relative merits of the offerors’ past performance quality were unsupported.

For example, regarding Booz Allen, an analysis of the actual past performance quality ratings assigned to Booz Allen’s references, versus those of [DELETED] and [DELETED], demonstrates that, notwithstanding its [DELETED] “concerns”, Booz Allen’s overall past performance quality was roughly equal to [DELETED]’s, and equal to or slightly higher than [DELETED]’s. See AR, Tab 137, SSDD Briefing at 39 (Booz Allen), 116 ([DELETED]), 149 ([DELETED]). Accordingly, the agency’s use of the assessed “concerns” to differentiate between the past performance quality of offerors with equal performance confidence ratings was misguided where the “concerns” did not reflect on the relative quality ratings of the offerors’ past performance. Nor can we conclude that the “concerns” served as a reasonable proxy for an integrated analysis of the underlying factors contributing to the offerors’ overall performance confidence assessments. Accordingly, we sustain the protests.

Second, we agree with the protesters that it was unreasonable for the agency to interpret the assessed “concerns” as reflecting on the offeror’s ability to perform in the subcriteria areas. As noted by the agency, all 20 of the offerors considered for award under the RFP were considered technically acceptable. Further, to the extent the agency viewed the “concerns” as reflecting gaps in the offerors’ past performance concerning the subcriteria areas, this conclusion was also unreasonable because the agency’s process for assessing concerns wholly discarded substantial evidence of relevant, high quality past performance contained in references that did not meet the agency’s supplemental “greater impact” and performance location thresholds.

As examples, again turning to Booz Allen’s evaluation, the agency’s performance confidence methodology led it to conclude that there were “concerns” regarding Booz Allen’s proposal under the “[DELETED]” and “[DELETED]” subcriteria of the system sustainment performance confidence criteria area. As addressed above, these “concerns” were assessed because Booz Allen’s evaluation did not demonstrate two or more references addressing the subcriteria areas that were (1) “greater impact” (“more recent” and sustained for more than one year), (2) relevant or highly relevant, (3) favorable, and (4) performed at worldwide locations. Based on Booz Allen’s lack of qualifying references, the agency recorded that “the PCAG had concerns about [Booz Allen’s] ability to perform” in the [DELETED] and [DELETED] areas, among other areas. AR, Tab 137, SSDD Briefing, at 40.

However, the underlying record of the agency’s evaluation shows that Booz Allen in fact provided four past performance references that addressed the [DELETED] subcriteria. Of these four references, three were rated highly relevant to the system sustainment criteria area with exceptional performance quality, while the fourth reference was rated relevant in the criteria area with very good quality. AR, Tab 137, SSDD Briefing, at 41-42. Additionally, two of the highly relevant/exceptional references and the relevant/very good reference were sustained, and all four of the references were performed at worldwide locations. Id. However, because these references were not “more recent” as defined by the agency, they did not meet the agency’s threshold for consideration and were discarded, resulting in the agency’s conclusion that there was a concern about Booz Allen’s ability to perform in the [DELETED] subcriteria area.[12]

Similarly, the record shows that Booz Allen’s proposal provided six past performance references addressing the [DELETED] subcriteria area. All six references were relevant or highly relevant and exhibited very good or exceptional quality. In addition, two of the six references were both recent and sustained as defined by the agency. Despite this substantial record of past performance in the subcriteria area, the agency concluded that there was a “concern” in this area because, of the two references that were recent and sustained, only one reference was for work performed at worldwide locations. Accordingly, because Booz Allen did not demonstrate two or more references meeting all agency thresholds for consideration, the SSA concluded that there were “concerns about Booz Allen’s ability to perform” in the area of [DELETED]. AR, Tab 139, SSDD, at 6.

We cannot find the agency’s “concerns” assessment process, or the SSA’s conclusions, to be reasonable where Booz Allen in fact presented an expansive history of relevant, high quality performance in subcriteria areas that were flagged as “concerns.” The record confirms that in the assessment of “concerns,” the agency disregarded any references that did not meet all of the agency’s “greater impact” thresholds, despite the fact that the references met the requirements for consideration set forth in the RFP, and demonstrated relevant, high quality prior work in the areas.[13]

The fact that an offeror’s references addressing a subcriteria area were of “lesser impact” in the agency’s view did not permit the agency to entirely disregard the contents of the references in the assessment of “concerns.” The agency’s requirement that an offeror demonstrate at least two “greater impact,” worldwide references in each subcriteria area to avoid the assessment of a “concern,” without regard to any combination of less recent, less sustained, or non-worldwide references, was arbitrary and did not reflect a reasonable, integrated assessment of the offeror’s past performance in accordance with the RFP’s performance confidence assessment criteria. We sustain each of the four protests on this basis.

Source Selection Tradeoff Decisions

Based on the errors already identified in the evaluation process, the agency’s tradeoff decision cannot stand. However, in addition, HP alleges that the agency departed from the RFP’s stated tradeoff criteria by emphasizing price over performance confidence, and improperly based the award decisions on the awardee’s collective “coverage” of the 27 relevance subcriteria areas. Our review of the SSDD reveals that HP’s allegations are correct.

In this connection the SSDD describes the SSA’s selection of the ten awardees as follows:

I am selecting ten offers for award, through the trade-off described below, because we had to go from the lowest priced to [DELETED] to get all the areas of past performance covered with enough offerors to ensure there would be robust competition on task orders. (For purposes of this contract, I believe robust competition means there will be at least four offerors with past performance in the specific criteria areas relevant to an individual task order being solicited at any time in the future.) I added [DELETED] as the tenth awardee because of the marginal price difference for the substantive value added by its past performance record.

AR, Tab 139, SSDD, at 13. Comparison with the SSDD Briefing notes confirms that the award decisions were made in consideration of how many of the lowest-priced substantial confidence offerors were required for “coverage” of the 27 relevance subcriteria areas. Specifically, after reaching consensus on award to the five lowest-priced, substantial confidence offerors, the PCAG Chairperson reasoned as follows:

In order to ensure adequate task order competition, if we start with the top five offerors, how far into the field of offerors do we have to go to get more criteria covered? Using this method, in price order, we have [DELETED] that got awards before, the next Substantials are [DELETED]. We have to go to [DELETED] for Web Service Lifecycle Management, Web Service Information Security, Web Service System Performance, and Web Service Data Exposure. We have to go to [DELETED] for System Sustainment Documentation and Testing, System Sustainment Help Desk, and Management Subcontractors. We have to go to [DELETED] for Web Service System Performance, Web Service Data Exposure, and Web Service Preparing Data. We also have to go to [DELETED] for Management Retention.

AR, Tab 138, SSDD Briefing Notes, at 11. This rationale was then accepted by the SSA, who agreed that “we have to have at least 8 [awardees] because we want at least 4 who can compete for task orders in every criterion area.” Id. at 11-12. After further discussion, the SSA determined to “draw the line to include [DELETED], remove Harris and CSC, and make an award to the 10 low-priced Substantial Confidence offerors.”

This selection process does not appear to have been based on an integrated assessment of past performance in support of a tradeoff considering performance confidence and price. AR, Tab 7, RFP, at 175. While the SSDD reflects substantive consideration of past performance and price between certain offerors, ultimately, it is apparent that the number of awards and selection of the awardees was driven in part by determining which offerors collectively demonstrated coverage of the relevancy subcriteria areas, such that there would be at least four awardees without a “concern” for each subcriteria area. In effect, this process elevated the 27 relevance subcriteria--set forth in the RFP as “[c]riteria considered for magnitude and complexity,” Id. at 181--to de facto corporate experience criteria, considered to determine which offerors’ past experience was most valuable to the agency. This source selection approach was inconsistent with the terms of the RFP, and we sustain the protest on this basis.

Discussions

Finally, HP alleges that the agency’s discussions were not meaningful because the record reflects that the agency considered HP’s TEP unreasonable, but the agency did not inform HP of that fact during discussions. When an agency engages in discussions with an offeror, the discussions must be “meaningful,” that is, sufficiently detailed so as to lead an offeror into the areas of its proposal requiring amplification or revision in a manner to materially enhance the offeror’s potential for receiving the award. FAR § 15.306(d); Bank of Am., B-287608, B-287608.2, July 26, 2001, 2001 CPD ¶ 137 at 10-11. While an agency need not inform an offeror that its price is higher than that of its competitors, it must advise an offeror if its price is considered unreasonably high, or unawardable. Id.

Although the record contains at least two references supporting HP’s contention that the agency considered its price unreasonable, AR, Tab 138, SSDD Briefing Notes, at 8, 10, the record also reflects that the agency considered HP in its tradeoff analysis, notwithstanding its price. Thus, the SSDD does not reflect a conclusion that HP’s TEP was unreasonable or unawardable, and supports the contracting officer’s representation that HP’s TEP was not considered unreasonable.[14]

Additionally, HP asserts that the agency conducted unequal discussions where it allowed Raytheon to modify its past performance references subsequent to the first award decision, without allowing other offerors the opportunity to modify their proposals. We disagree that unequal discussions occurred. Rather, we agree with the agency that the exchanges with Raytheon did not constitute discussions, and that the agency, therefore, was not required to reopen discussions with all offerors.

In this connection, during its debriefing following the first award decision in this procurement, Raytheon identified an error in one of its past performance questionnaires. Specifically, Raytheon informed the agency that the questionnaire incorrectly listed the start date for the reference as 2010, when in fact the contract began in 2000. Although the start date was correctly listed in Raytheon’s proposal as 2000, the agency had evaluated the reference based on the start date listed in the questionnaire, which resulted in the reference being evaluated as “lesser impact.”

During the corrective action period following the protests of the first award decision, the agency reached out to the point of contact for this past performance questionnaire and asked for validation of the start date of the contract. The point of contact validated that the contract began in 2000. The PCAG then updated Raytheon’s performance confidence assessment to reflect the correct start date, as set forth in Raytheon’s proposal and validated by the reference point of contact. On reevaluation with the correct start date, the agency found that the reference was a “greater impact” reference, which allowed Raytheon to meet the agency’s threshold for a substantial confidence rating. Accordingly, Raytheon’s performance confidence assessment was upgraded from satisfactory to substantial.

Discussions occur when an agency communicates with an offeror for the purpose of obtaining information essential to determine the acceptability of a proposal, or provides the offeror with an opportunity to revise or modify its proposal in some material respect. FAR § 15.306(d). Here, Raytheon’s proposal was not modified or revised as a result of its exchanges with the agency during its debriefing. Rather, the exchanges caused the agency to review the information already provided in Raytheon’s proposal, and to validate the correct start date for Raytheon’s past performance reference. Therefore, this exchange did not constitute unequal discussions, and this basis of protest is denied.

RECOMMENDATION

We recommend that the agency perform and document a proper cost realism analysis, including a documented assessment of whether the offerors’ proposed labor categories and labor hours are realistic to complete the work in accordance with each offeror’s unique technical approach. We also recommend that the agency document a technical evaluation of whether the offerors demonstrated an “understanding of the work to be performed . . . an appropriate mix of labor categories, reasonableness of labor hours, and an understanding of the knowledge, skills, abilities, and products needed to meet the requirements,” as required by the RFP. AR, Tab 7, RFP, at 177. Finally, we recommend that the agency perform a new performance confidence assessment reflecting an integrated analysis of recency, relevancy, and quality of past work, giving reasonable consideration to the quality of prior efforts, and document a new source selection decision consistent with the RFP’s requirement to make awards on the basis of performance confidence and cost/price.

We also recommend that the protesters be reimbursed the reasonable costs of filing and pursuing the protest, including attorneys’ fees. Bid Protest Regulations, 4 C.F.R. § 21.8(d)(1) (2012). The protesters should submit their certified claims for costs, detailing the time expended and the costs incurred, directly to the contracting agency within 60 days after receipt of this decision.

The protest is sustained.

Susan A. Poling
General Counsel



[1] For clarity, we have modified the table to present the offerors in ascending order of proposed labor hours, to omit the technically unacceptable offeror, and to reflect the names of the offerors.

[2] We present this chart exactly as it is shown in the Price Evaluation and the Contracting Officer’s Statement, including use of the agency’s abbreviations for the names of the offerors, and including the technically unacceptable offeror.

[3] Booz Allen additionally challenged the agency’s realism analysis concerning the offerors’ proposed labor rates. In this regard, the agency explains that it first compared the labor rates to rates provided by the Bureau of Labor Statistics. Where rates significantly departed from the BLS rates, the agency compared the rates to commercial sources, such as payscale.com and salary.com. Booz Allen alleged that this approach was unreasonable where the commercial websites present only wide ranges of salaries, and where the agency did not specifically explain why departures from the BLS rates were realistic. We disagree with the protester. An agency’s cost realism analysis need not achieve scientific certainty; rather, the methodology employed must be reasonably adequate and provide a measure of confidence that the agency’s conclusions about the most probable costs under an offeror’s proposal are reasonable and realistic in view of the cost information reasonably available to the agency at the time of its evaluation. Serco Inc., B-407797.3, B-407797.4, Nov. 8, 2013, 2013 CPD ¶ 264 at 9. The agency’s realism evaluation of the offerors’ proposed labor rates demonstrates an independent assessment utilizing appropriate data. The protester’s disagreement with the results of the agency’s evaluation does not render the evaluation unreasonable.

[4] CSC, HP, and Booz Allen additionally challenge the agency’s characterization of their past performance records as lower quality in comparison to various awardees. We do not provide substantial analysis of these arguments because, due to other errors discussed in this decision, we recommend that the agency conduct a new performance confidence evaluation and tradeoff decision. However, we agree with CSC, HP, and Booz Allen that several statements in the second source selection briefing, concerning the comparative assessments of their past performance against the awardees, were unsupported by the record. See CSC Comments at 30-31 (SSA misunderstanding of the recency of CSC’s past performance references in determining that CSC’s “[DELETED].” AR, Tab 139, SSDD, at 19); HP Comments at 42-43 (incorrect determination that HP had fewer very good and excellent past performance quality reports in comparison to awardees. AR, Tab 138, Briefing Transcript, at 8); Booz Allen Comments at 42-44 (assessment that Booz Allen past performance was “poorer quality,” and that there was a “[DELETED],” AR, Tab 138, Briefing Transcript, at 6, 12, is not supported by underlying performance quality assessments).

[5] The record is somewhat inconsistent between the contracting officer’s statements, SSDD Briefing narratives, and SSDD Briefing charts, as to whether the “greater/lesser impact” label included assessment of relevancy as well as recency. We conclude that the most consistent indicator of the impact assessment is presented in the SSDD Briefing charts, which appear to base the assessment on recency (including sustainment) alone.

[6] The SSDD briefing narratives suggest that an offeror’s references were required to have a quality rating of very good or exceptional to support a substantial confidence rating. However, we cannot reconcile certain evaluation conclusions with that requirement. See, e.g., AR, Tab 137, SSDD Briefing, at 21 ([DELETED] evaluation under the web services criteria does not demonstrate two references meeting all criteria for a substantial confidence rating, unless satisfactory quality ratings were considered “favorable”). In any event, whether the agency’s threshold required a minimum of satisfactory or very good performance quality, we conclude that the threshold served to mask the individual quality ratings of the offerors in assigning certain offerors substantial confidence ratings.

[7] That is, this analysis looked for two or more references that were (1) “greater impact” (“more recent” and sustained for more than one year), (2) relevant or highly relevant, (3) favorable, and (4) performed at worldwide locations, for each of the 27 subcriteria.

[8] For example, based on the “concerns,” the SSA wrote that “the PCAG had concerns about [DELETED]’s ability to perform six of the twenty-seven criteria,” “the PCAG had concerns about [DELETED]’s ability to perform four of the twenty-seven criteria,” “the PCAG had concerns about [DELETED]’s ability to perform three of the twenty-seven criteria,” and “PCAG had concerns about [DELETED]’s ability to perform nine of the twenty-seven criteria.” AR, Tab 139, SSDD, at 6-8.

[9] For example, the SSA explained that “not all offerors with a Substantial confidence assessment are the same. For example, [DELETED] was rated Substantial despite having 9 areas of concern (where the offeror did not address the 9 criteria with at least two citations with greater impact at worldwide locations) while [DELETED] was assessed as substantial with no areas of concern.” AR, Tab 139, SSDD, at 14.

[10] CSC notes that its three past performance references met all other elements of the agency’s “substantial confidence” threshold. Accordingly, based on information clearly ascertainable from the agency’s record, if these three references had merely been considered “more recent” in the agency’s evaluation scheme, CSC’s rating would--essentially automatically--change from satisfactory confidence with [DELETED] concerns, to substantial confidence with [DELETED] concerns. CSC Comments at 9. CSC asserts that this significant change in its overall rating on the basis of only a few months’ difference in the recency of its references (within the RFP’s stated recency window) serves to demonstrate the unreasonable nature of the agency’s overall evaluation.

[11] HP also alleges that the entire “concerns” process was improperly duplicative, because the application of the offerors’ references to the 27 confidence subcriteria had already been considered in the relevance rating, which was considered within each offeror’s overall performance confidence rating. We agree. The RFP set forth the 27 subcriteria areas as guides for the agency’s determination of the relevance of offerors’ past performance references. Thus, the references’ coverage of the subcriteria areas was considered in assigning each reference’s overall relevance rating. When the agency then returned to the subcriteria areas for the purpose of assessing, essentially, each offeror’s corporate experience in those areas, the agency improperly “double-counted,” and greatly exaggerated the importance of the subcriteria areas in the evaluation. See J.A. Jones Management Services, Inc., B-254941.2, Mar. 16, 1994, 94-1 CPD ¶ 244 at 6.

[12] As discussed above, we conclude that the agency’s recency analysis was unreasonable. However, for the purposes of our discussion of the agency’s “concerns” evaluation, even if the agency had utilized a reasonable basis for evaluating “more recent” versus “less recent” past performance, the “concerns” evaluation remains unreasonable for failure to afford any consideration to references not meeting the agency’s various ratings thresholds.

[13] Although we present only selected examples from Booz Allen’s evaluation in this analysis, each offeror demonstrated that it was prejudiced by the agency’s evaluation in the same manner.

[14] Nonetheless, our review of the evaluation record strongly suggests that multiple other offerors may not have been seriously considered during the SSDD briefing or in the SSDD, due to the agency’s concerns regarding the reasonableness of the offerors’ TEPs, without regard to any level of past performance superiority. While we do not sustain the protest on this basis, we conclude that the agency may want to reconsider the reasonableness of the TEPs as a part of any reevaluation pursuant to this decision, and consider whether discussions are needed in this area.
