The affected system sends flight hazards and real-time restrictions to pilots.
The ground stop and Federal Aviation Administration systems failures Wednesday morning that impacted thousands of flights across the U.S. appear to have been the result of a mistake that occurred during routine scheduled systems maintenance, according to a senior official briefed on the internal review.
An engineer “replaced one file with another” without realizing a mistake was being made, the official said. As the systems began showing problems and ultimately failed, FAA staff feverishly tried to figure out what had gone wrong. The engineer who made the error did not realize what had happened.
“It was an honest mistake that cost the country millions,” the official said.
Earlier Wednesday, the FAA said normal operations were “resuming gradually” after it ordered a nationwide pause on all domestic departures until 9 a.m. ET following a computer failure that delayed and canceled flights around the country.
“The ground stop has been lifted,” officials said at about 8:50 a.m. ET. “We continue to look into the cause of the initial problem[.]”
Departures were resuming at about 8:15 a.m. ET at two of the nation’s busiest hubs — Newark, New Jersey, and Atlanta — FAA officials said on Twitter, adding, “We expect departures to resume at other airports at 9 a.m. ET.”
The affected Notice to Air Missions, or NOTAM, system is responsible for sending out flight hazards and real-time restrictions to pilots, administration officials said earlier.
“The FAA is still working to fully restore the Notice to Air Missions system following an outage,” the FAA said in announcing the temporary grounding of all planes nationwide. “The FAA has ordered airlines to pause all domestic departures until 9 a.m. Eastern Time to allow the agency to validate the integrity of flight and safety information.”
Had the FAA’s new NOTAM system been in place, redundancies would likely have stopped the cascading failures. With the antiquated system in place, there was nothing to stop the outages, the official told ABC News.
“At this time, there is no evidence of a cyberattack. The FAA is working diligently to further pinpoint the causes of this issue and take all needed steps to prevent this kind of disruption from happening again,” the FAA said in a statement Wednesday night.
There were still more than 7,300 delays and 1,100 cancellations as of midday, according to tracking website FlightAware.
Failures likely due to ‘glitch’
Transportation Secretary Pete Buttigieg said a full investigation is necessary to prevent any future mishaps.
“When there’s an issue in the FAA that needs to be looked at, we’re gonna own it, same way we asked the airlines to own their companies and operations,” Buttigieg said during an appearance on CNN Wednesday.
Congressional hearings are expected, as is a possible acceleration of the system's replacement.
On what caused the system meltdown, Buttigieg said that overnight there “was an issue with irregularities in the messages that were going out” — though more needs to be learned on what led to the widespread failure.
“Now we have to understand how this could have happened in the first place. Why the usual redundancies that would stop it from being that disrupted, did not stop it from being disrupted this time, and what the original source of the errors or the corrupted files would have been,” he said.
A senior official briefed on the FAA computer problems told ABC News the software issue developed late last night and led to a “cascading” series of IT failures culminating in this morning’s disruption. As has been reported, the disruption is confined to the commercial side of aviation.
As of now, the assessment is that the failures are the result of a “glitch” and not something intentional. All possibilities are being examined to ensure that the FAA systems were not breached.
The FAA first reported the system failure on Tuesday, according to an internal memo from the Cybersecurity and Infrastructure Security Agency obtained by ABC News.
Notably, the FAA system that failed is overdue for replacement.
The official compared the current outage to the crisis that crippled Southwest Airlines during the holidays: antiquated software overdue for replacement inside a critical IT network. If one thing goes down, the system can become paralyzed.