On the second day of April 2011, a group of test pilots and engineers boarded a prototype Gulfstream business jet for a routine test flight at Roswell, New Mexico. But as the aircraft lifted off the runway, it suddenly veered to the right, crashed into the ground, and burst into flames, destroying the test aircraft and killing all four people on board. The cause of the accident would prove to be anything but simple. Somewhere in the complex design process, Gulfstream had made assumptions about the aircraft's performance that turned out to be inaccurate, sending its test pilots on a mission impossible. The problem reached into the core of the company, whose demands to meet certification schedules pressured engineers into not following up on clues that their calculations were wrong. The resulting accident would provide important lessons not only for Gulfstream, but for anyone managing large corporate projects, about the dangers of creating a work environment where safety testing becomes a formality.

When a new aircraft carries passengers for the first time, the world sees it as a beginning: the start of a long-term service life, a new era for a company, or the latest trend in design. But for another group, a group often overlooked, the introduction of a new aircraft is not a beginning but an end, the culmination of thousands of challenging and sometimes dangerous working days that turned an idea into a truly functional and safe aircraft. These are the test pilots and flight test engineers - those who push prototype aircraft into the unknown, so that future pilots do not have to.

For an established manufacturer like Gulfstream Aerospace, creator of high-end business jets, the process of designing, building, and testing a new aircraft is a familiar (though lengthy) routine. There is a clearly defined list of hundreds of tasks to be completed, from determining the basic aerodynamic characteristics of the aircraft to perfecting the procedures that pilots will one day use regularly. But it may have been that very familiarity that led Gulfstream astray.

◊◊◊

In 2008, Gulfstream announced that it had begun work on the G650, a twin-engine business jet with seating for up to 19 passengers. Loosely based on Gulfstream's main product line dating back to the 1980s, the jet had undergone a major redesign to make it larger, faster, more capable, and more expensive than previous models. Early advertising promised that the aircraft would be ready by September 2011 and would be able to take off from runways as short as 6,000 feet, giving customers access to more airports than competing aircraft could reach.

Determining the minimum takeoff distance for an aircraft requires complex calculations about its takeoff performance, not only in normal operations, but also in the case of an engine failure. But to understand what Gulfstream was trying to achieve - and why it went wrong - an explanation of some important speeds is necessary. Warning: Many numbers lie ahead.

For the average pilot, the most important takeoff parameter is V1 or decision speed, the highest speed at which a pilot can reject takeoff and stop the aircraft on the runway under specified conditions.

The second most critical speed overall, and the most important in this specific case, is V2, or takeoff safety speed: the minimum speed that the aircraft must achieve by a height of 35 feet above the ground in the event of an engine failure. This speed ensures that the aircraft can safely climb on one engine while still being controllable.

According to federal regulations, the minimum takeoff distance for an aircraft is determined by the distance needed to accelerate to V1 and then stop; 115% of the distance needed to accelerate on all engines to V2 + 10 knots by a height of 35 feet; or the distance needed to reach V2 by a height of 35 feet after an engine failure one second before V1, whichever is most limiting. In the case of the Gulfstream G650, it was the third scenario that proved to be the deciding factor in its minimum takeoff distance. Therefore, Gulfstream wanted to ensure that V2 was slow enough for the aircraft to legally take off from a 6,000-foot runway, as it had promised to potential customers.
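The "whichever is most limiting" rule boils down to taking the longest of three candidate distances. Below is a minimal sketch of that comparison. The function name and the distances are hypothetical, chosen only for illustration, and are not G650 data.

```python
def required_takeoff_distance(accelerate_stop_ft, all_engine_to_35ft_ft, engine_out_to_35ft_ft):
    """Return (limiting case, distance in feet) for the three scenarios described above.

    All inputs are hypothetical distances in feet; the 115% factor comes from the text.
    """
    candidates = {
        "accelerate-stop": accelerate_stop_ft,
        "115% all-engine": 1.15 * all_engine_to_35ft_ft,
        "engine-out": engine_out_to_35ft_ft,
    }
    # The most limiting case is the one that needs the most runway.
    limiting_case = max(candidates, key=candidates.get)
    return limiting_case, candidates[limiting_case]


# Hypothetical numbers in which the engine-out case governs, as it did for the G650.
case, distance = required_takeoff_distance(5400.0, 4800.0, 5900.0)
print(case, distance)
```

In a case like this one, any increase in V2 lengthens the engine-out distance and therefore the published minimum runway length, which is why the chosen V2 mattered so much commercially.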

Most of the time, the V2 speed for a new aircraft is derived by applying a mathematical formula to a different speed, called VMU. VMU, or minimum unstick speed, is the slowest speed at which an aircraft can become airborne. Pull the nose back as far as it will go, and the speed at which the plane lifts off the runway is VMU, a parameter that is fixed for a given set of conditions and can be determined experimentally.

On previous Gulfstream business jets, engineers had found that their calculations produced a V2 speed lower than the minimum allowed by regulations. When those aircraft were designed, V2 was required to be no less than 1.2 times the stall speed, the speed below which the wing cannot generate enough lift to keep the plane airborne. Because these early Gulfstream models could safely climb on one engine at a speed lower than this regulatory minimum, the law, rather than the aircraft's performance, had become the limiting factor in determining the allowable minimum takeoff distance. Therefore, V2 was simply set equal to the regulatory minimum, and other takeoff parameters, including the necessary distance, were worked out backward from V2.

However, in the 1990s, the Federal Aviation Administration changed the regulatory minimum for V2 to 1.13 times the lowest speed at which the wing can support the weight of the aircraft (also known as VSR). It turns out that Gulfstream did not fully assess how this change would affect its strategy for deriving takeoff speeds on future aircraft.

When Gulfstream began work on the G650 in the late 2000s, it once again defined V2 as the regulatory minimum of 1.13 VSR. From this value, engineers derived VR, the speed at which a pilot will rotate the nose up, and VLOF, the speed at which the aircraft will lift off the runway during a normal takeoff, using a formula developed for its predecessor aircraft, the G550.

However, hidden in this process was a key assumption that they did not carefully consider: that despite working with a new aircraft and a new definition of the V2 rule, the V2 speed calculated directly from VMU would still be below the regulatory minimum. If someone had truly crunched the numbers, they would have discovered that the actual V2 speed for the G650 exceeded the minimum rather than falling below it. In fact, it was impossible to safely achieve a V2 speed of 1.13 VSR on this aircraft. This meant that if the pilots accelerated to the rotation speed (VR), pulled the nose up, and lifted off the runway to a height of 35 feet, the aircraft would always be moving faster than the V2 speed chosen by Gulfstream. A higher V2 speed in itself would not be a safety issue, but the company could not simply increase V2 without also increasing the required takeoff distance to meet legal requirements, thus breaking the promise to customers that the aircraft could take off from a 6,000-foot runway. Therefore, Gulfstream tasked its test pilots with finding ways to achieve a V2 speed of 1.13 VSR.

During the first takeoff tests, held in Roswell, New Mexico in late 2010, the Gulfstream flight test team collected data to determine the VMU for the G650. The responsibility for analyzing this data and reporting to the project management team fell to the first flight test engineer of the G650,* who was also responsible for overseeing each test session. However, due to his high workload, he had not completed the report on the VMU tests by the time management ordered the project to move on to the next phase, in which test pilots would verify the calculated takeoff speeds and develop the techniques that pilots should use in order to achieve them.

When this next test phase was conducted in the first quarter of 2011, the flight test team faced the issue mentioned above: whenever a test required them to climb out at V2 speed, they would exceed this speed by a large margin. Furthermore, frightening things began to happen if they tried to reduce speed by pulling harder during the rotation. Their work in Roswell throughout March and April would be to find a takeoff technique that allowed pilots to consistently achieve V2 without endangering the safety of the aircraft. No one told them that this was impossible.

*Although the name of the first flight test engineer is publicly available, his friends and family have requested that it not be used.

The first alarming incident occurred at the end of 2010, in a test designed to determine VMU with flaps set to 20 degrees. During takeoff, the pilot pitched up excessively, reaching a pitch angle closer to 13 degrees than the targeted 10 degrees. Immediately, the right wing dropped and the aircraft entered an uncommanded roll. The wing might have continued to drop until it struck the ground if not for the timely actions of the supervising pilot, who intervened to lower the nose and increase thrust, saving the flight.

After this event, the flight test team met informally and determined that the wing had dropped because the pilot pulled up too steeply during the rotation, causing lateral instability. The engineers agreed, and flight tests continued a few minutes later.

But the actual reason for the roll to the right ran much deeper than that. The roll occurred because the wing had stalled: its angle of attack relative to the airflow became too large and it stopped producing lift. The supervising pilot was able to recover because his nose-down input brought the angle of attack back below the critical point and reversed the stall.

The engineers did not realize that the roll was caused by a stall because the aircraft never reached the angle of attack at which they thought a stall would occur. At the core of the issue is the fact that the G650, like all other aircraft, stalls at a different angle of attack in ground effect than it does in free air.

Ground effect is simply the change in the behavior of airflow over the wing when the plane is close to the ground. Ground effect typically increases lift and decreases drag for a given angle of attack, but it also decreases the angle of attack at which the aircraft will stall. In wind tunnel tests conducted early in the development process, Gulfstream determined that the G650 in ground effect would stall at an angle of attack (henceforth, AOA) two degrees lower than in free air. After the VMU tests in 2010, the first flight test engineer performed some calculations and adjusted this estimated difference to 1.6 degrees. The aircraft's stall protection systems were then programmed with this figure, including the stick shaker stall warning and the pitch limit indicator displayed on the pilots' attitude indicators, which shows the highest safe pitch in that phase of flight. The final version of the aircraft would also include a computerized flight envelope protection system that would prevent the aircraft from stalling, but at the time of the 2010-2011 flight tests, it had not yet been installed.

However, the engineer's calculations were wrong. The reason was quite simple: he relied on academic sources that were themselves found to be erroneous. Some publications had declared, and the engineer evidently believed, that the maximum lift that could be generated by the wing - known as the maximum lift coefficient - was the same in ground effect as in free air, when in fact on many aircraft this value is lower in ground effect. This misunderstanding invalidated his math and led him to significantly overestimate the wing's stall AOA in ground effect, which was actually 3.5 degrees lower than the stall AOA in free air.

Believing that the stall AOA in ground effect was 13.1 degrees (1.6 degrees below the free-air stall AOA of 14.7 degrees), the engineers programmed the stall warnings to trigger at an AOA of 12.3 degrees when the aircraft was close to the ground, providing what they thought was an adequate warning margin. But in reality, the aircraft could stall at an AOA as low as 11.2 degrees while in ground effect, meaning the stall could occur before the stall warnings were activated. In the earlier flight test incident, the pilot exceeded the target pitch angle and surpassed the stall AOA in ground effect, causing the wing to stall without any warning.
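The arithmetic behind this missing margin is simple enough to lay out explicitly. The sketch below uses only the figures quoted above; the function and variable names are invented for illustration.

```python
FREE_AIR_STALL_AOA = 14.7   # degrees, stall AOA in free air
ASSUMED_REDUCTION = 1.6     # degrees, the engineer's ground effect estimate
ACTUAL_REDUCTION = 3.5      # degrees, the reduction later found to apply
SHAKER_AOA = 12.3           # degrees, where the stall warning was set


def warning_margin(free_air_stall, ground_effect_reduction, warning_aoa):
    """Degrees of AOA between the warning and the stall in ground effect.

    A negative result means the wing stalls before the warning activates.
    """
    stall_in_ground_effect = free_air_stall - ground_effect_reduction
    return round(stall_in_ground_effect - warning_aoa, 1)


assumed_margin = warning_margin(FREE_AIR_STALL_AOA, ASSUMED_REDUCTION, SHAKER_AOA)
actual_margin = warning_margin(FREE_AIR_STALL_AOA, ACTUAL_REDUCTION, SHAKER_AOA)
print(assumed_margin, actual_margin)
```

With the assumed reduction, the warning appears to come 0.8 degrees before the stall; with the actual reduction, the margin is negative 1.1 degrees, so the stall arrives first.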

Although the pilots managed to recover from this incident, they never fully understood what had happened, and a thorough investigation was not conducted. In 2011, the issue was still believed to stem from overly aggressive control inputs by the pilots.

These two separate miscalculations had now placed the flight test crew in an almost impossible predicament. The low V2 speed calculated by Gulfstream could only be achieved by pitching up during takeoff to an angle exceeding the aircraft's stall AOA in ground effect, and the pilots were unaware of it. The only question was when the other shoe would drop.

◊◊◊

In the weeks and days leading up to the fateful test flight on April 2nd, the flight test team continuously discussed strategies for finding a takeoff technique that allowed V2 to be safely and reliably achieved. In Birmingham, Alabama in February, the pilots attempted to rotate at a higher speed than usual, holding the pitch at nine degrees, and then abruptly increasing the pitch to 15 or 16 degrees immediately after liftoff.

The key to the strategy was as follows. Because the AOA equals the pitch angle while the plane is rolling on the ground, keeping the pitch at nine degrees during this phase ensured that the AOA would not reach 11-12 degrees, the region where the pilots knew roll control issues had arisen. Then, once the aircraft began to climb, they could significantly increase the pitch without increasing the AOA (because the aircraft's motion vector, and thus the airflow, also started to point upwards). As the aircraft climbed at a steep angle, they hoped to slow down enough to reach V2. Using this method, the test pilots managed to get within 4 knots of V2, but this was still outside the ±2 knot tolerance needed for a successful test. Furthermore, the pilots doubted that they could convince the FAA that this was a normal technique that ordinary pilots could routinely perform.
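The geometry behind this strategy can be written down in one line. As a simplified sketch, assuming still air and wings level, and ignoring the wing's incidence relative to the fuselage as the article does, the AOA is the pitch angle minus the angle of the flight path. The climb angle used below is a hypothetical value, not test data.

```python
def angle_of_attack(pitch_deg, flight_path_deg):
    """Simplified AOA: pitch attitude minus flight path angle, both in degrees."""
    return pitch_deg - flight_path_deg


# On the runway the flight path is level, so AOA equals pitch.
on_runway = angle_of_attack(9.0, 0.0)

# After liftoff, a hypothetical 6-degree climb angle lets the pitch rise
# to 15 degrees while the AOA stays where it was.
climbing = angle_of_attack(15.0, 6.0)
print(on_runway, climbing)
```

The catch is that the flight path cannot point upward until the aircraft has actually left the ground, so any extra pitch applied before liftoff goes directly into AOA.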

By April 2nd, the team had become convinced that this technique was too difficult and a new approach would be necessary. "I'm not going to do those jerky things, it's not working," test pilot in command Kent Crenshaw told his colleagues early that morning. "That's not how they're going to fly the plane, and I don't think the FAA will like it. It's a great flying machine, you shouldn't abuse it to fly." The consensus was that they needed a smooth, continuous control input, not this jerky approach, where they tried to hold the pitch at nine degrees to avoid roll issues, only to suddenly pitch up after liftoff.

Reviewing the strategy after another failed test, Crenshaw said that they were pausing at nine degrees because they were trying to capture that target, and that they were too focused on it, because in a real engine failure, pilots would not be looking at nine degrees, they would be looking at trying to get to V2. Agreeing with Crenshaw's assessment, the flight test team decided that on each test, they should hold the pitch at a little less than nine degrees until they could hold it no longer, in the process identifying the "sweet spot" at which they believed they could pull up in one continuous motion, avoiding roll control issues and slowing to V2 at the same time. Unfortunately, no such sweet spot actually existed.

Shortly after 9:00 a.m. that day, the flight crew taxied to the runway threshold at Roswell for a test named 7A2. At the controls were pilot in command Kent Crenshaw and second in command Vivan Ragusa. At their computer stations in the cabin were the first flight test engineer and the second flight test engineer, David McCollum. Five other engineers monitored flight parameters from a remote telemetry trailer next to the runway.

The plan for this test was to simulate an engine failure at V1 with the flaps set to 10 degrees. It was an operation they had performed many times before.

"Power's set," Ragusa said.

"Airspeed's alive, I got the handle," Crenshaw replied as the plane roared down the runway.

"Eighty knots," Ragusa called out.

Five seconds later, at a speed of 105 knots, Ragusa announced, "Chop," and reduced thrust on the right engine to idle. Eight seconds passed as the plane continued to accelerate. "Rotate," he said.

Pulling back with about 50 pounds of force, Crenshaw smoothly raised the nose toward nine degrees, hoping to lift off before the AOA entered the danger zone above 11 degrees. But he didn't quite make it: 4.4 seconds after rotation began, with the plane still on the ground, the AOA reached 11 degrees. Moments later, the plane lifted off the ground, and two tenths of a second after that, at an AOA of 11.2 degrees, the right wing stalled.

The moment the plane became airborne, it began to roll to the right, despite Crenshaw's efforts to steer left. Exclaiming that something was wrong, he applied more left aileron, but to no effect: ailerons are useless if the right wing isn't flying. As the AOA increased further, the stick shaker finally activated, prompting Crenshaw to pitch down until he was back below the pitch limit indicator on his display, believing this would resolve the stall. But it didn't. With the plane banked about 13 degrees to the right and barely above the ground, the right wing struck the runway, scraping along it in a shower of sparks.

"Whoa, whoa, whoa, whoa!" Ragusa exclaimed.

"Bank angle! Bank angle!" blared an automated warning.

"Power, power, power!" Crenshaw shouted. Ragusa had already pushed the right engine back to full power, but the plane still did not recover from the roll to the right.

Not sure why his recovery attempt had failed to resolve the problem, and desperate to keep the plane in the air, Crenshaw pulled back on the controls to climb. The AOA shot up past 22 degrees, the bank angle increased significantly, and the plane fell like a rock to the ground.

"No no no no!" Ragusa exclaimed.

"Bank angle! Bank angle!"

"Ah, sorry guys!" Crenshaw said. His words would be the last on the cockpit voice recorder. Fifteen seconds after the plane first became airborne, the right wing once again hit the ground, sending the plane off the right side of the runway. The fuselage slammed into the ground, destroying the landing gear, and the plane skidded on its belly through the desert, throwing up a cloud of dust and flames. The aircraft then careened across a taxiway and into a small concrete structure, rupturing fuel tanks and triggering a large explosion, before finally coming to a stop, surrounded by flames.

The impact was not particularly severe, and all the crew members initially survived. Although Crenshaw's legs were pinned in the wreckage, Ragusa, McCollum, and the first flight test engineer were able to get out of their seats and walk to the main cabin door. But before they could open it, intense heat and smoke overwhelmed them, and all four crew members died in the inferno. Other members of the team ran to the plane from the distant trailer, but the heat prevented them from reaching the door, and by the time the fire truck arrived four minutes later, it was clear that those on board could not have survived.

◊◊◊

The deaths of the four outstanding flight test team members shocked the entire flight test community and demanded a full investigation. That responsibility would fall to the National Transportation Safety Board, but the agency had very little experience with flight test accidents, and the learning curve for investigators would be steep. Only through wide-ranging cooperation with the surviving team members, who had a profound interest in understanding why their colleagues died, could the NTSB come to meaningful conclusions.

Through careful examination of the evidence and numerous simulations, the NTSB and its partners at Gulfstream finally determined that the manufacturer had applied an outdated procedure for calculating takeoff speeds that underestimated V2. At the same time, an incorrect assumption about the behavior of the plane led to an overestimate of the stall AOA in ground effect, causing the stall warnings to be set too high. The flight test team, tasked with achieving this unrealistically low V2 speed, ultimately exceeded the stall AOA without warning and was unable to recover before the right wing struck the ground. Although Crenshaw tried to recover, the miscalibrated warnings led him to believe he had reduced the pitch enough to escape the stall, when in fact he needed to reduce it further. With the information he had, there was no way he could have understood the situation in time to prevent the accident.

The NTSB was shocked to discover that two previous incidents of the same phenomenon, one of which could have led to a similar accident, were not properly investigated. Gulfstream did not have a protocol in place to stop testing after an unexpected event, such as an uncommanded roll. Instead, the team informally agreed on a plausible cause - one that was correct as far as it went, but that failed to grasp the true scale of the problem.

In fact, the consistent difficulty in achieving V2 should have served as a warning sign that something was amiss. If Gulfstream had calculated V2 directly from the experimentally determined VMU, they would have realized that the value they chose was too low, but no one did this. From the start, Gulfstream assumed that the G650 would behave like its predecessor, the G550, although this was not the case. Basic calculations could have shown that this was wrong. For example, on the G650, the difference between VLOF (liftoff speed) and the actual V2 speed is greater than on the G550. If Gulfstream had reversed the math and tried to derive the G650's VLOF from the desired V2 speed based on test data, instead of using formulas from the G550, they would have arrived at a VLOF lower than VMU, which is unattainable. Clearly the plane could not lift off at a speed lower than the minimum necessary to become airborne!
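A sanity check of this kind is easy to express. The sketch below is an illustration of the reasoning, not Gulfstream's or the NTSB's method: the function name is invented and every speed is a hypothetical number. It flags a speed schedule whose liftoff speed falls below the measured minimum unstick speed, or whose V2 falls below the regulatory minimum of 1.13 VSR.

```python
def check_speed_schedule(vmu_kt, vlof_kt, v2_kt, vsr_kt):
    """Return a list of inconsistencies in a proposed takeoff speed schedule.

    All speeds are in knots and are hypothetical inputs.
    """
    problems = []
    if vlof_kt < vmu_kt:
        problems.append("VLOF is below VMU: the aircraft cannot lift off that slowly")
    if v2_kt < 1.13 * vsr_kt:
        problems.append("V2 is below the regulatory minimum of 1.13 VSR")
    return problems


# A schedule that satisfies the regulation but implies an impossible liftoff speed.
impossible = check_speed_schedule(vmu_kt=125.0, vlof_kt=122.0, v2_kt=136.0, vsr_kt=120.0)

# The same aircraft with a liftoff speed above VMU.
consistent = check_speed_schedule(vmu_kt=125.0, vlof_kt=130.0, v2_kt=136.0, vsr_kt=120.0)
print(impossible, consistent)
```

The first schedule passes the regulatory check and still fails the physical one, which is the trap described above: a V2 can be legal on paper while implying a liftoff the aircraft cannot perform.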

A series of erroneous assumptions like these is a sign of a testing environment in which safety and quality control have slipped. As the NTSB interviewed more and more Gulfstream employees, the reasons began to become clear. From the start, company management pursued an aggressive testing schedule, which engineers and pilots thought was unreasonable. The management had a legitimate reason for this: if they could not get the plane certified by September 28, 2011, the five-year anniversary of the G650 type certificate application, the plane would be legally required to meet any new certification requirements introduced during that five-year period. This would cause further delays, preventing the plane from being delivered to customers on the promised date. If the company could not keep its promise, it would lose money, and at the end of the day, money is king.

The flight test personnel were dissatisfied with the schedule imposed on them by management. The chief flight test engineer told his supervisors that this tight schedule left no room for contingencies, but senior managers told him that this was a risk they were willing to take. The head of the G650 Flight Science Department admitted that the schedule could have been relaxed, but said he did not want to extend the deadline because management felt people would not work as hard. "We want to maintain a sense of urgency at Gulfstream to keep everything moving," he explained.

By March 2011, the FAA's Atlanta Aircraft Certification Office had become concerned that the program would not meet the September deadline. On March 31, the ACO wrote to Gulfstream: "More than once, FAA has expressed our concern about your overly aggressive schedule, and more than once you have 'unofficially' acknowledged that everything is slipping; however, the company's TIA schedule continues to reflect a pace that has been proven to be unrealistic." However, the FAA could not compel Gulfstream to modify its flight test schedule, as the agency only becomes directly involved when the plane graduates from flight testing to certification testing. And before Gulfstream could respond to the FAA's letter, the accident occurred.

In its final report, the NTSB clearly concluded that this schedule pressure was one of the main reasons the company failed to fully investigate the V2 issue, and it argued the point persuasively. To save time and effort, formulas and methods from previous models were applied to the G650 without checking whether they were appropriate. Then, faced with a looming deadline, the entire project team became so focused on achieving a goal that they did not step back to see whether the goal could be achieved. The inability to achieve the selected V2 speed was not something management wanted to think about: not only would it derail the certification schedule, but it would also endanger the promises made to customers about the plane's takeoff performance. As a result, although increasing V2 was the clearest solution to the V2 overshoot problem, this course of action was never seriously considered.

Contributing to these failures were a number of other organizational issues at Gulfstream. First, the tasks of team members were poorly defined and often differed from those described in flight test instructions. Initially, different engineers were supposed to supervise the tests and analyze the data, but over time, these two responsibilities merged, and by 2011, the first flight test engineer was responsible for both. This workload was too high for one person, and as a result, he had not completed his analysis of the data from the fall 2010 VMU tests by the time the next test cycle began in February 2011. If this data had been fully analyzed, the fundamental issues with the takeoff speeds could have been detected. The NTSB felt that it was inappropriate to continue takeoff performance testing before the speed schedule had been checked against VMU data from the 2010 tests.

This decision highlighted the dangerous lack of control gates in the Gulfstream G650 program. In project management, a control gate is a step that must be completed before the project can proceed, ensuring that no phase of the project is carried out before its prerequisites are met. In this case, the lack of a control gate allowed the project to move forward despite a dangerous lack of experimental knowledge about the plane's takeoff speeds.
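The idea of a control gate is simple enough to sketch in a few lines. The example below is only an illustration of the concept: the class, the function, and the task names are hypothetical, loosely modeled on the situation described above rather than on any actual Gulfstream process.

```python
from dataclasses import dataclass, field


@dataclass
class Phase:
    """A project phase and the prerequisites that gate its start."""
    name: str
    prerequisites: list = field(default_factory=list)


def gate_check(phase, completed_tasks):
    """Return (may_proceed, missing_prerequisites) for the given phase."""
    missing = [task for task in phase.prerequisites if task not in completed_tasks]
    return (not missing), missing


# Hypothetical gate: takeoff performance testing requires the VMU analysis.
takeoff_testing = Phase(
    name="takeoff performance testing",
    prerequisites=["VMU flight tests", "VMU data analysis report"],
)

# The flights were flown, but the analysis was never finished.
may_proceed, missing = gate_check(takeoff_testing, {"VMU flight tests"})
print(may_proceed, missing)
```

A gate like this does nothing clever. Its value lies in turning "we should probably finish that report" into a condition that schedule pressure cannot quietly waive.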

As a result of the tragedy in Roswell, Gulfstream made major changes to its management and organizational style, as well as to the G650 itself. Flight testing was halted until December 2011, during which time Gulfstream built a new computer model to help develop an appropriate rotation technique and evaluate its margin against the stall AOA, rather than having the test pilots figure this out in the actual plane. Gulfstream eventually settled on a V2 speed more than 15 knots higher than the one originally proposed, but was still able to meet the promised minimum runway length of 6,000 feet by increasing the maximum thrust generated by the engines. The company also developed better tools to detect anomalies in test data and stronger procedures to address these anomalies before testing could be resumed. Finally, the G650 was equipped with more advanced fire suppression systems, additional emergency exits were added, and Gulfstream implemented a policy of stationing airport firefighters near the runway whenever high-risk flight tests were conducted.

On September 7, 2012, the Gulfstream G650 finally received its type certification, nearly a year after the promised date. Despite the delay, the model was successful, selling over 400 examples by the end of 2020. But if Gulfstream had done its due diligence and abandoned its efforts to meet an unrealistic deadline, the delay would certainly have been shorter and the four men would still be alive.

◊◊◊

Flight testing always carries an element of danger, and it may always be so. But it is still the manufacturer's responsibility to minimize risks as much as possible, a responsibility that should never take second place to the company's goals. Test pilots are among the best of us, possessing great skills, excellent judgment, and exceptional courage. But in the end, they are also human beings with families, and in this day and age, no test pilot's spouse or child should have to hear that dreaded knock on the door and the words, "I'm sorry, but there has been an accident."

The tragedy of Gulfstream Flight 153 also holds lessons far beyond the aviation industry. The accident is a classic case of poor project management and counterproductive work attitudes. Forcing team members to work too hard for the sake of an unrealistic project schedule benefits neither the workers nor the company, and often leads to unsafe shortcuts, as Gulfstream found out the hard way. A project can be done quickly or it can be done well, but one cannot demand both. Gulfstream certainly learned that lesson - but will managers at other companies and in other industries take it to heart? Or will they continue to push their teams to the brink of disaster in pursuit of the almighty dollar? While it is hoped that this analysis will open some eyes, ultimately, only time will tell.
