So… I’m minding my own business this morning, reading Teh Titters… and I come across a retweet from my friend and all around mensch @jessehirsch. He’d retweeted someone pointing to the Scientific American article from yesterday in which the otherwise acceptable Evgeny Kaspersky of Kaspersky Labs was quoted from an interview in Spiegel:
SPIEGEL: What kind of damage can a super virus like this inflict?
Kaspersky: Do you remember the total power outage in large parts of North America in August 2003? Today, I’m pretty sure that a virus triggered that catastrophe. And that was eight years ago.
Think about these points:
- What the heck is Scientific American doing here? I can tell that they are pushing the article by David Nicol but this is more than just a little sensationalism.
- Mr. Kaspersky – where is your brain? You don’t know, you weren’t there, you have effectively ZERO understanding of large scale grid operations in North America. Heck, I’d challenge you to find me a virus that runs on TruCluster. And when there are thousands of pages of facts available, your opinion, however well formed, doesn’t mean shit.
- The power of the re-tweet: Jesse is one of the few journalists working today – and in a private conversation after I went all snotty on him on twitter, he pointed out that he cannot fact-check everything he tweets. This is a damn good point. And I apologized. The difference between “forwarding-ism” and “journal-ism” is fine, delicate, and completely frakked up in today’s press-wires-are-news world.
For the record – since Mr. Kaspersky can’t seem to find it – here’s the real information, with real links.
- The Official NERC documentation of the August 2003 Blackout
- The Ontario IESO (Independent Electricity System Operator) documentation of the August 2003 Blackout
- The technical blackout presentation which walks through exactly what happened and when (ppt)
- The technical blackout report (pdf)
- And the ACTUAL REPORT that tells you what ACTUALLY HAPPENED (pdf)
Of course, you could also have a look at my presentation from Blackhat and DEF CON last year… SCADA and ICS for Security Experts: Avoiding Cyberdouchery …it’ll help you not look like Mr. Kaspersky looks right now.
So – how about instead of publishing some marketing screed – you drop the cyberdouchery and join the rest of us as we attempt to make our industry into something other than a bunch of tin-foil-hat wearing alarmists and get to the real work – which in your case would be an anti-virus product that detects viruses – you know, reliably, like all the time.
PS: It was the trees.
Believe it or not, the Intelligence Community (specifically the National Intelligence Officer for Science & Technology) pulled together some experts from within & outside the IC to analyze the 2003 blackout, & concluded that no skullduggery was involved, sinister forces were not at work, etc. I was with DOE Intelligence at the time & saw the briefing.
@Ralph Oh we believe it.
I was dealing with it here in Ontario (day job at the time) and laughing my ass off as the mayor of New York was telling CNN that it all started in Ottawa…yeah, no.
Later I provided briefings to several security agencies. It was a failure not a virus.
Kaspersky’s comments were baseless.
Kaspersky wasn’t necessarily saying that a virus had targeted the power system deliberately.
Actually, the explanation that I had previously heard was that a rather benign virus had, without being specifically targeted at power stations, knocked out computers that were necessary for power monitoring. That report may for all I know be false but it IS fully consistent with the Final NERC Report you cite. To suggest otherwise is very bad logic (see J. S. Mill on causation to go to a rather ancient source on the logic of causation).
“Shortly after 14:14, the alarm and logging system in the FE control room failed and was not restored until after the blackout. Loss of this critical control center function was a key factor in the loss of situational awareness of system conditions by the FE operators.”
The system was designed to deal with trees, but was now blind in *two* ways because regularly timed contingency analysis had never functioned correctly and this loss was largely ignored, so…
“At no time during the morning or early afternoon of August 14 did the FE operators indicate voltage problems or request any assistance from outside the FE control area for voltage support. FE did not report the loss of Eastlake Unit 5 to MISO. Further, MISO did not monitor system voltages; that responsibility was left to its member operating systems.”
That wasn’t the trees preventing FE operators from addressing the problems. Trees had fallen before.
“Analysis of the alarm problem performed by FE after the blackout suggests that the alarm processor essentially “stalled” while processing an alarm event. With the software unable to complete that alarm event and move to the next one, the alarm processor buffer filled and eventually overflowed.”
It could be that reports of an internal buffer overflow became distorted into a virus story and that was passed on to Kaspersky, or that what I and perhaps Kaspersky read was just made up – but I think you have to argue for that, the evidence you cite simply doesn’t contradict that virus report.
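The stall-then-overflow sequence in that quote can be sketched as a toy model. This is only an illustration under stated assumptions: the class name, the `malformed` flag, and the buffer capacity are all hypothetical, none of it is FE's actual EMS software, and the sketch simply drops events once the buffer is full rather than reproducing whatever the real overflow did. It says nothing about what caused the stall.

```python
from collections import deque


class AlarmProcessor:
    """Toy alarm processor with a bounded input buffer (hypothetical model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque()
        self.processed = []
        self.dropped = 0
        self.stalled = False

    def submit(self, event):
        """Queue an incoming alarm event; count it as lost if the buffer is full."""
        if len(self.buffer) >= self.capacity:
            self.dropped += 1
            return False
        self.buffer.append(event)
        return True

    def step(self):
        """Try to process the event at the head of the buffer."""
        if self.stalled or not self.buffer:
            return
        if self.buffer[0]["malformed"]:
            # Assumption: one event can never be completed, so the
            # processor never moves on to the next one.
            self.stalled = True
            return
        self.processed.append(self.buffer.popleft())


def simulate(n_events, capacity, bad_index):
    """Feed n_events through the processor; event number bad_index triggers the stall."""
    proc = AlarmProcessor(capacity)
    for i in range(n_events):
        proc.submit({"id": i, "malformed": i == bad_index})
        proc.step()
    return proc


if __name__ == "__main__":
    proc = simulate(n_events=20, capacity=5, bad_index=3)
    print(proc.stalled, len(proc.processed), len(proc.buffer), proc.dropped)
```

With these made-up numbers, three events are handled normally, the fourth stalls the processor, the buffer fills to its capacity of five, and the remaining twelve events are lost without anyone being told.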
“Although the FE alarm processor stopped functioning properly at 14:14, the computer support staff remained unaware of this failure until the second EMS server failed at 14:54, some 40 minutes later. Even at 14:54, the responding support staff understood only that all of the functions normally hosted by server H4 had failed, and did not realize that the alarm processor had failed 40 minutes earlier.”
Lots more went wrong and systems design was poor in several ways. Perhaps to the point where a quite ordinary virus whacking a couple of key computers at one location could in fact have been the tipping point.
Leading to speculation in this article:
“In Blaster and the August 14th Blackout, security expert Bruce Schneier makes the case that the Blaster virus played a more significant role in the huge power outage than the official reports are acknowledging.
This is where I think Blaster may have been involved. The report gives a specific timeline for the failures. At 14:14 EDT, the “alarm and logging software” at FirstEnergy’s control room failed. This alarm software “provided audible and visual indications when a significant piece of equipment changed from an acceptable to problematic condition.” Of course, no one knew that it failed.”
http://blog.germuska.com/2003/12/15/the-blaster-virus-and-the-august-14th-blackout/
The Blaster worm affected more than a million computers running Windows during the days after Aug. 11.
http://www.schneier.com/blog/archives/2008/06/did_the_chinese.html
One certainly presumes that rebooting the alarm processor systems as part of the recovery process (as was done) erased any evidence of just what might have caused the buffer overflow and “stall”, or whether there was a buffer overflow, in truth. Therefore the NERC Final Report didn’t address these critical details more completely for the simple reason that they couldn’t possibly be known. It’s called a “Final Report”, not a “Complete Report”, because our knowledge isn’t complete and, in this case I take it, couldn’t be. (Unfortunately, the NERC report isn’t entirely consistent – if you include the summation – either.)
You get to be “pretty sure” it wasn’t a virus, if you wish, and Kaspersky gets to be “pretty sure” it was, if he wishes.
As NERC describes it, so much was wrong with the system that you could be forgiven for thinking that an especially vigorous sneeze could have brought it down. The recommended changes just go on and on and on…
@Joe
I normally don’t even look at comments posted on stories that are nearly a year old, but you obviously put lots of effort into your response.
To be clear, I have a different perspective on the blackout than any of my learned colleagues you quote. Additionally, Liquidmatrix Bossman Dave Lewis has an even better perspective than I do.
During the time of the blackout, I was a contractor working at one of the affected System Operators and Dave was an employee at one of the affected System Operators.
As Dave noted above, plenty of agencies (most of whom had a serious interest in pointing the finger at “cyber”) have stated that it was not.
My response at this point is still:
– There was nothing cyber involved. Cyber doesn’t cause a physical breaker to trip – a short circuit does.
– FE screwed the pooch in many ways, but ultimately, the load shedding was due to poor vegetation management.
To believe that the system is LESS brittle than this is naive in the extreme. The power grid in the north east was never designed – it just sort of happened. Starting in the late 60’s / early 70’s the engineering resources which could’ve made a difference were not set on the task of increasing resilience but rather on increasing cost-effectiveness. There has been little material investment in the power grid in the interim – the “smart grid” investment has been largely directed at some demand side load control and revenue management rather than building a grid which can support the needs of the people and the economy.
There are ways to alleviate or ameliorate the “cyber” issues with the grid, but playing the self-serving hype game ain’t one of them.
Oh, and have a look at budgets for plain ole infrastructure maintenance (let alone improvement) compared to the spend on demand side load management and revenue management and your brain will explode.