Horror story: Qld Health datacentre disaster

On 20 May, a brief electricity brown-out struck a Queensland Health datacentre, starting a chain of incidents that resulted in serious outages of over 20 health applications.

The datacentre, located on the campus of Herston hospital, is believed to be one of three datacentres Queensland Health operates. It lost power for only a fraction of a second, when two flooded Energex transformers failed at around 5:00pm on that day, according to a source close to the incident. Uninterruptible power supplies kicked in to keep servers up.

However, the brown-out tripped the chilled water system, cutting chilled water to the hospital campus. As the chilled water system wasn't monitored, the datacentre support team didn't notice the loss. A datacentre employee came on site to check that everything was running but, satisfied that nothing was wrong, he left.

Only two of 10 air-conditioning units within the datacentre were able to use refrigerated gas if chilled water wasn't available, meaning that although the rest of the units were operating, they weren't cooling. The temperature in the datacentre began to rise.

Although people were called in to investigate the temperature rise, the chilled water problem wasn't found. Due to a DNS change made the day before the problems began, no messages were being sent to alert staff to server problems. Four hours after the brown-out, services began to suffer. On-call hospital staff were affected and complained. Soon after, a server shut down.

The whereabouts of the air-conditioning specialist who had been called in were unknown to many staff members, and he didn't answer his phone. It had taken the engineer three hours to arrive on site. Five hours after the systems failed, staff discovered that the chilled water pumps had not been operating, as more servers shut down with temperatures over 50 degrees. The problem was believed to be fixed.

Because the remote access system wasn't working, staff had to wait until they arrived at the datacentre before they could begin shutting down servers. When they arrived, they started to move systems over to an alternate datacentre, which in some cases caused brief user inconvenience. Some systems, however, could not be moved, since their servers had no ability to fail over and Queensland Health's architecture for virtual machines didn't allow them to be moved to a second datacentre.

The hospital's Cerner electronic medical record (patient administration) system was shut down by the hospital staff.

Six hours after the brown-out, the air conditioning was still not working. Although staff believed they had found the problem, more systems including iPharmacy shut down until 75 per cent of applications were down and the datacentre reached 45 degrees.

Eight hours after the brown-out, chilled water was finally brought back up. Nine hours after, the datacentre was back to normal and the services could be restored. By nine o'clock the morning after the brown-out, all services were restored.

Over the course of the problems, outages of 12 applications had a significant impact, with another 12 having a minor impact. Three years ago, the datacentre was forced to shut down for the same reasons. Afterwards, the team had been told it could not happen again.

When queried on the incident, Queensland Health acting CIO Ray Brown did not respond to a question about which facilities around the state were served by the downed applications. However, it is believed that Queensland Health's three datacentres provide services to multiple locations around the state.

He denied that there had been more than one incident over the past three years at the datacentre.

According to Brown, since several applications were relocated to the other datacentre, there was "minimal disruption" to services. "The majority of services impacted were available by 2:30am and all Queensland Health systems categorised as critical remained operational during this incident," he said.

"In the face of a severe weather event, the IT staff involved were outstanding in their response to minimise the impact of this incident. The ability of staff to physically attend the site was severely hampered by flooding in the area."

Lessons had been learned, according to Brown. Queensland Health was exploring options to remove reliance on chilled water. It also intended to replace the remote access system by the third quarter of this year. It is undertaking a review of management tools and is examining the crisis management plan.

Queensland Health has lost several chief information officers over the past several years. Long-time CIO Paul Summergreene had his contract terminated by the department in July 2008. Dr Richard Ashby filled his shoes for a short time, before leaving the chair vacant, with Brown currently leading the department's IT function in an acting capacity.

The news also comes as the Queensland Government flagged in the last state budget its intent to splurge hundreds of millions of dollars on health IT systems to support its e-health capability.
