l i n u x - u s e r s - g r o u p - o f - d a v i s
Page last updated:
2004 Sep 22 11:36

The following is an archive of a post made to our 'vox mailing list' by one of its subscribers.

[vox] [fwd] [svlug] mission critical computing and air safety

Here's a nice long post with some more details about the air traffic issue
last week (and his thoughts on the Windows conversions) from Rick Kwan over
at SVLUG.  Just thought it was interesting reading...

----- Forwarded message from Rick Kwan -----

Date: Wed, 22 Sep 2004 08:43:02 -0700
From: Rick Kwan
Subject: [svlug] mission critical computing and air safety
To: SVLUG <svlug@lists.svlug.org>

Synopsis:  the SoCal ATC radio outage was presumably due to technician
error, a year after an upgrade from UNIX to Windows 2000.  Slashdot has
an intense discussion full of dissection and speculation.  While it is
correctly pointed out that this was a failure of the application,
not the OS, some question how the FAA could allow Windows in a
safety-critical application.  My personal sense is that we will
see more, not less, of this sort of thing.

A three-hour outage of the air traffic control radio system in the
SoCal (Southern California) airspace on September 14, 2004, left
800 airplanes in the air without radio communication.  The FAA says
communication was re-routed to other centers, and the problem did
not present a danger to planes or passengers.  However, other
reports mention that controllers were clearly shaken as they
witnessed near-tragedies.

Initial reports cite human error.  A technician forgot to reboot
Windows 2000 before 49.7 days elapsed.  The OS was installed last year
as part of an upgrade of the FAA's Voice Switching and Control
System from UNIX to Windows 2000 Advanced Server.  This system is
being (or has been) rolled out to all 21 Air Route Traffic Control
Centers (ARTCCs) of the U.S. National Airspace System (NAS).
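
For what it's worth, 49.7 days is almost exactly how long it takes a
32-bit counter of milliseconds to wrap around.  The sketch below works
through that arithmetic.  It assumes the limit comes from such a counter,
which the reports cited here do not confirm, and the function names are
made up for illustration.

```python
# Sketch only: assumes the 49.7-day limit comes from a 32-bit counter of
# milliseconds (an assumption, not something the reports confirm).
MS_PER_DAY = 24 * 60 * 60 * 1000
COUNTER_BITS = 32  # hypothetical counter width


def days_until_wrap(bits=COUNTER_BITS):
    """Days before a millisecond counter of the given width wraps to zero."""
    return (2 ** bits) / MS_PER_DAY


def naive_elapsed_ms(start, now):
    """Plain subtraction; goes negative once the counter has wrapped."""
    return now - start


def wrap_safe_elapsed_ms(start, now, bits=COUNTER_BITS):
    """Modular subtraction; correct as long as at most one wrap occurred."""
    return (now - start) % (2 ** bits)


if __name__ == "__main__":
    print(f"counter wraps after {days_until_wrap():.1f} days")
    start = 2 ** 32 - 100   # reading taken 100 ms before the wrap
    now = 50                # reading taken 50 ms after the wrap
    print("naive elapsed:    ", naive_elapsed_ms(start, now))
    print("wrap-safe elapsed:", wrap_safe_elapsed_ms(start, now))
```

If something like this is what happened, a scheduled reboot merely resets
the counter before it wraps, which would explain why the maintenance
procedure existed at all.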

Yesterday, a heavy Slashdot discussion ensued.  Many expressed
shock that Windows 2000 was used in a safety critical system.
Rather than blame the technician, some say the FAA shouldn't have
let Windows 2000 be used in the first place.  Others say the FAA
is depending on the judgement of system contractors, in this case,
Harris Corporation.  Others correctly note that this was a problem
of the application, not the OS.  Yet others note that an application
problem should not require a server reboot.  (Now in fairness to
the FAA and other parties, perhaps radio communication is mission
critical, but not safety critical.  Where does one draw the line?)

Like many other areas of society, the NAS is becoming increasingly
dependent on commodity computing power.  This case "merely" involved
voice communication.  Imagine if it involved trajectories and

Frankly, not being inside Harris or the FAA, details still seem a
little murky to me.  But given Microsoft's recent Trustworthy
Computing Initiative, claims of five 9s, et cetera, I can imagine
how a Windows 2000 system crept in.  Someone must have done a
wonderful cost/benefit study to justify this and presented it to
FAA managers, who are aviation specialists, not computing folks.
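
To put "five 9s" next to a three-hour outage, here is a rough
back-of-the-envelope calculation.  It assumes a 365-day year and that the
three-hour outage was the only downtime in that year.  Both are
assumptions for illustration, not figures from the FAA or Harris.

```python
# Sketch only: the 365-day year and "one outage per year" are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(nines):
    """Yearly downtime budget for an availability of N nines (5 -> 99.999%)."""
    return MINUTES_PER_YEAR * 10 ** -nines


def availability(outage_minutes):
    """Fraction of the year the system was up, given total outage minutes."""
    return 1 - outage_minutes / MINUTES_PER_YEAR


if __name__ == "__main__":
    print(f"five 9s budget: {allowed_downtime_minutes(5):.2f} minutes/year")
    print(f"with a 3-hour outage: {availability(3 * 60):.5%} available")
```

Under those assumptions, one three-hour outage uses up many years' worth
of a five-9s downtime budget.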

Certainly, the FAA will do an in-depth investigation (although not
to the depth of Columbia).  But after identifying the causes, and
recommending and implementing revised maintenance procedures, I
expect more procurement of Windows-based servers.  This is not
because I don't trust the FAA; it's because that's what I see in
other industries.

For folks like SVLUG members, I expect that fundamental architecture
is a big issue.  In some measure, we operate on soundness of design
by white box inspection.  But government and other procurements
are usually a matter of requirements and getting a black box to
satisfy them.  That is frequently as it should be; therefore, I
don't see this changing anytime soon.

Which leads me to the conclusion... we're in for more of the same.
We were simply lucky this time that no one died.

--Rick Kwan

----- End forwarded message -----

LUGOD: Linux Users' Group of Davis
PO Box 2082, Davis, CA 95617
Contact Us

LUGOD is a 501(c)7 non-profit organization
based in Davis, California
and serving the Sacramento area.
"Linux" is a trademark of Linus Torvalds.