What causes data center network outages?

2013-03-26 12:01:41

The film Lost in Thailand broke Chinese box office records at the end of 2012. In the IT sector, meanwhile, a steady run of data center failures has shaken users' confidence again and again. One can only hope that data center security does not stage a “Lost in Thailand” of its own.
Cloud computing is touted as the savior of IT in this era, and it seems every service can be moved to the “cloud”. Yet many of the companies that were first to take the plunge have found that early adopters are often the most exposed. In recent years cloud outages have come one after another, leaving the industry alarmed.

People are gradually returning to a more rational view and seeing cloud computing for what it is. However lofty the vision, a cloud service ultimately has to run in data centers and move from one data center to another, and that process still depends on people, computers, networks, power and storage working together. Errors and vulnerabilities along the way can hardly be avoided, and natural disasters and human mistakes add to the risk. Anyone adopting cloud services therefore needs to be prepared for outages and to have a fallback solution ready.
What follows is a review of the causes behind a series of network outages from 2009 to 2012. It suggests that even if failures are inevitable, additional safeguards can keep the likelihood and scope of a security incident small.
Outage cause one: system failures
Typical event 1: Amazon AWS Christmas Eve outage
Cause of failure: Elastic Load Balancing service failure
On December 24, 2012, Christmas Eve, Amazon gave its customers little peace. A failure at Amazon AWS's US-East-1 data center in the eastern United States interrupted its Elastic Load Balancing service, affecting Netflix, Heroku and other sites. Heroku had also been hit by earlier AWS failures in the eastern United States. By coincidence, Amazon's own Amazon Prime Instant Video, a Netflix competitor, was not affected by this failure.
The December 24 interruption was not the first AWS outage, and it will certainly not be the last.
On October 22, 2012, Amazon's AWS services in northern Virginia also went down, for a similar reason. Well-known large sites including Reddit and Pinterest were affected. The outage hit Elastic Beanstalk first, followed by the Elastic Beanstalk console, Relational Database Service, ElastiCache, EC2 (Elastic Compute Cloud) and CloudSearch. The incident led many people to argue that Amazon should upgrade the infrastructure of its northern Virginia data center.
On April 22, 2011, servers across Amazon's cloud data center went down in what is considered the most serious security incident in the history of Amazon's cloud computing. The downtime at Amazon's cloud computing center in northern Virginia affected sites including the question-and-answer service Quora, the news service Reddit, HootSuite and the location service Foursquare. In its official report, Amazon attributed the incident to flaws in the design of the EC2 (Amazon Elastic Compute Cloud) system and said it would keep repairing these known flaws to make EC2 more competitive.
In January 2010, almost 68,000 Salesforce.com users experienced at least one hour of downtime. A “system error” in Salesforce.com's own data center briefly took down all services, including backups. The incident also exposed a lock-in strategy that Salesforce.com would rather not publicize: its PaaS platform, Force.com, cannot be used outside Salesforce.com, so whenever Salesforce.com has a problem, Force.com has one too. If such an outage lasted a long time, the situation would become very difficult.

Outage cause two: natural disasters
Typical event 1: Amazon's Dublin data center outage
Cause of failure: lightning struck a transformer near the Dublin data center
On August 6, 2011, a lightning strike in Dublin, Ireland, caused outages and extensive downtime at the cloud data centers that Amazon and Microsoft operate in Europe. The lightning hit a transformer near the Dublin data center and caused it to explode. The explosion started a fire that temporarily interrupted utility services, bringing down the entire data center.
The data center was Amazon's only one in Europe, so EC2 customers had no other data center to fall back on during the incident. Sites running on Amazon's EC2 cloud platform were interrupted for up to two days.
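To put an interruption of this length in perspective, it can be converted into an availability figure. The sketch below is only illustrative arithmetic: the function name and the one-year measurement window are assumptions, and the 48 hours simply stand in for the "up to two days" mentioned above.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def availability_percent(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Share of the period, in percent, during which the service was up."""
    if not 0 <= downtime_hours <= period_hours:
        raise ValueError("downtime must be between 0 and the period length")
    return 100.0 * (1 - downtime_hours / period_hours)


# Hypothetical example: a single 48-hour outage measured over one year.
two_day_outage = availability_percent(48)
print(f"{two_day_outage:.2f}%")  # 99.45%
```

On these assumptions, one two-day outage alone leaves a site at roughly 99.45% availability for the year.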
Typical event 2: Calgary data center fire
Cause of failure: data center fire
On July 11, 2012, a fire broke out at the Calgary, Alberta data center of Canadian communications provider Shaw Communications Inc., delaying hundreds of surgeries at local hospitals. Because the data center supported emergency management services, the fire also affected the main backup systems behind key public services. The incident was a wake-up call for a number of government agencies: they must ensure timely recovery and failover systems, backed by a disaster management plan.
Typical event 3: Hurricane Sandy strikes data centers
Cause of failure: storm and flooding at data centers
On October 29, 2012, Hurricane Sandy struck data centers in New York and New Jersey. The effects included flooding in lower Manhattan, the shutdown of some facilities, and disrupted operations at data centers in the surrounding area. Sandy's impact went well beyond a single interruption and brought unprecedented disaster to the data center industry in the affected areas. Diesel became the lifeline of the recovery effort: with backup generators carrying the load across the whole region, special measures were needed to keep them supplied with fuel. As attention gradually shifts to post-disaster reconstruction, the siting, engineering and disaster recovery of data centers deserve an extended discussion, one that may last for months or even years.

Outage cause three: human factors
Typical event 1: Hosting.com service interruption
Cause of failure: a service provider performed breaker operations in the wrong sequence, shutting down the UPS
On July 28, 2012, Hosting.com suffered an outage. Human error is often considered one of the leading causes of data center downtime, and the July incident at Hosting.com, which interrupted service for 1,100 customers, is an example. The outage occurred during preventive maintenance on the UPS system at the company's data center in Newark, Delaware. Hosting.com said that the service provider carried out the breaker operations in the wrong order, which shut down the UPS, and that this was one of the key factors behind the outage in the data center room.
Typical event 2: the “5.19” network outage
Cause of failure: a bug in client software made internet terminals issue frequent domain name resolution requests, congesting DNS
At 21:50 on May 19, 2009, users in six provinces (Jiangsu, Anhui, Guangxi, Hainan, Gansu and Zhejiang) reported that websites were slow or inaccessible. After an investigation by the relevant units, the Ministry of Industry and Information Technology announced that the six-province outage stemmed from a defect in client software released by a domestic company: when that company's authoritative domain name servers became abnormal, terminals with the software installed issued frequent DNS requests. The resulting DNS congestion left large numbers of users with slow or failed page loads.
DNSPod, one of the well-known domestic domain name resolution providers, serves a number of well-known sites. An attack paralyzed six of DNSPod's DNS servers, which directly knocked out domain name resolution for several internet services, including Storm (Baofeng). The resulting network congestion left large numbers of users unable to get online normally. The Ministry noted that the incident showed domain name resolution has become a weak link in network security and that all parties should strengthen the security of their DNS services.
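The report does not describe how the client software retried its lookups. A common way to keep clients from flooding a struggling DNS service is capped exponential backoff with random jitter. The following is a minimal sketch of that general technique, not the fix actually applied in this incident; the function names, retry count and delay limits are all assumptions chosen for illustration.

```python
import random
import socket
import time


def backoff_ceilings(max_retries=5, base=1.0, cap=60.0):
    """Upper bound, in seconds, of the wait before each retry (doubles, then caps)."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def resolve_with_backoff(hostname, resolver=socket.gethostbyname,
                         max_retries=5, base=1.0, cap=60.0, sleep=time.sleep):
    """Resolve hostname, waiting a random, growing interval between failed attempts."""
    try:
        return resolver(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        pass
    for ceiling in backoff_ceilings(max_retries, base, cap):
        # Full jitter: pick a random wait below the ceiling so that many
        # clients do not retry in lockstep against a struggling server.
        sleep(random.uniform(0, ceiling))
        try:
            return resolver(hostname)
        except OSError:
            continue
    raise OSError(f"could not resolve {hostname} after {max_retries} retries")
```

With the default settings a client makes at most six attempts, and the wait before each retry is drawn at random below 1, 2, 4, 8 and 16 seconds respectively, instead of retrying immediately in a tight loop.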
Summary
Companies adopt cloud services largely because they expect them to be more convenient and cost-effective. But if those gains come at the expense of security, few company heads would accept the trade. The steady stream of cloud outages has made cloud security a real concern.
For now, the problem can be approached from several angles. Enterprise customers that use cloud services should back up their cloud data regularly and keep a second solution on hand in case it is needed. Cloud service providers, since outages of one kind or another are inevitable, must work out a response plan that minimizes their users' losses and improves the efficiency with which outage events are handled.
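The idea of keeping a regular backup and a second solution can be sketched as a simple fallback read: try the primary cloud first, then the backup copy. This is only an illustration under stated assumptions. The function `fetch_with_fallback`, the provider names and the in-memory backup are hypothetical and do not correspond to any particular vendor's API.

```python
def fetch_with_fallback(key, providers):
    """Try each (name, fetch) pair in order; return (name, value) from the first that answers."""
    errors = {}
    for name, fetch in providers:
        try:
            return name, fetch(key)
        except Exception as exc:  # a real client would catch narrower error types
            errors[name] = exc
    raise RuntimeError(f"all providers failed for {key!r}: {errors}")


# Hypothetical stand-ins: the primary cloud is down, the backup copy is reachable.
def primary(key):
    raise ConnectionError("primary cloud unreachable")


backup_copy = {"report.csv": b"id,total\n1,42\n"}


def secondary(key):
    return backup_copy[key]


source, data = fetch_with_fallback(
    "report.csv", [("primary", primary), ("secondary", secondary)]
)
print(source)  # secondary
```

The fallback is only as good as the backup behind it, which is why the regular backup comes first in the advice above.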
Government departments have a responsibility to supervise and to warn: laws and regulations covering cloud services should be introduced and steadily improved, and users should be reminded that a one hundred percent reliable cloud computing service does not yet exist.
Tags: data center, network outage, server