One of the major concerns of global organizational operations is business continuity.
Because firms rely on their information systems to operate, an unexpected system shutdown will inevitably impair company operations or even stop them altogether. It is crucial for firms to provide a stable and reliable infrastructure for IT operations and to reduce the possibility of disruptions. Besides emergency backup power generation, a data center also needs to closely monitor its operation rooms to ensure the continuous functioning of its hosted computer environment.
The Uptime Institute in Santa Fe, New Mexico, defined four tiers of data center availability.
The downtime each tier tolerates over one year (525,600 minutes) is listed below:
Tier 1 (99.671%) status would allow 1,729.224 minutes of downtime
Tier 2 (99.741%) status would allow 1,361.304 minutes of downtime
Tier 3 (99.982%) status would allow 94.608 minutes of downtime
Tier 4 (99.995%) status would allow 26.28 minutes of downtime
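The downtime figures above follow directly from the availability percentages: the allowed downtime is the unavailable fraction of the year multiplied by 525,600 minutes. The short Python sketch below reproduces that arithmetic. The function and variable names are our own choices for illustration, not anything published by the Uptime Institute.

```python
# Minutes in a non-leap year: 365 days * 24 hours * 60 minutes = 525,600.
MINUTES_PER_YEAR = 365 * 24 * 60

# Availability percentages for each tier, as listed in this post.
TIER_AVAILABILITY = {
    "Tier 1": 99.671,
    "Tier 2": 99.741,
    "Tier 3": 99.982,
    "Tier 4": 99.995,
}


def allowed_downtime_minutes(availability_percent, period_minutes=MINUTES_PER_YEAR):
    """Return the downtime (in minutes) permitted at a given availability."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * period_minutes


if __name__ == "__main__":
    for tier, percent in TIER_AVAILABILITY.items():
        minutes = allowed_downtime_minutes(percent)
        # Rounded for display; floating-point results may differ in the last digits.
        print(f"{tier} ({percent}%): {minutes:.3f} minutes per year")
```

For a sense of scale, a single 16-hour outage is 960 minutes, which by itself exceeds the yearly allowance for Tier 3 and Tier 4 but not for Tier 1 or Tier 2.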
High temperature is one of the major causes of severe malfunction or damage in data centers. Many data centers, including those of some leading firms, have reported losses due to overheating. On March 14th, 2013, Microsoft’s outlook.com service endured a 16-hour outage caused by “a rapid and substantial temperature spike in the data center.” Wikipedia experienced similar trouble on March 24th, 2010. “Due to an overheating problem in our European data center, many of our servers turned off to protect themselves”, as reported by Wikimedia on its tech blog (http://blog.wikimedia.org/2010/03/24/global-outage-cooling-failure-and-dns/). Earlier in 2010, too much hot air in the operation room knocked Spotify offline when one of the big air conditioners didn’t start properly.
Microsoft’s lengthy downtime in 2013 was an unexpected side effect of a routine firmware update. It caused a lot of trouble for customers, who could not log into their Outlook and Hotmail accounts for 16 hours.
On the other hand, according to Domas Mituzas, a performance engineer at Wikipedia, the cost of downtime for the user-managed encyclopedia is so minimal that “the down time used to be [their] most profitable product”, because Wikipedia displays a donation appeal for additional servers when it is offline.
The losses suffered from shutdowns vary from firm to firm, but every firm needs safeguard processes and close monitoring in place to minimize the potential damage. Next week we will briefly discuss how to protect your data center from changing environmental conditions.
Tom Warren, “Microsoft blames overheating datacenter for 16-hour Outlook outage”, March 14, 2013. http://www.theverge.com/2013/3/14/4102720/outlook-outage-overheating-datacenter
Rich Miller, “Wikipedia’s Data Center Overheats”, March 25, 2010. http://www.datacenterknowledge.com/archives/2010/03/25/downtime-for-wikipedia-as-data-center-overheats/
Nicole Kobie, “Overheating London data centre takes Spotify offline”, February 22, 2010. http://www.itpro.co.uk/620752/overheating-london-data-centre-takes-spotify-offline
Ivory Wu, Sharp Semantic Scribe
Traveling from Beijing to Massachusetts, Ivory recently graduated with a BA from Wellesley College in Sociology and Economics. Scholastic Ivory has also studied at NYU Stern School of Business as well as MIT. She joins Temperature@lert as the Sharp Semantic Scribe, where she creates weekly blog posts and assists with marketing team projects. When Ivory is not working on her posts and her studies, she enjoys cooking and eating sweets, traveling and couch surfing (12 countries and counting), and fencing (She was the Women's Foil Champion in Beijing at 15!). For this active blogger, Ivory's favorite temperature is 72°F because it's the perfect temperature for outdoor jogging.