High-Availability

Introduction to Reliability

No matter what service is being performed by a computer system, users must have confidence in how the system operates in order to be able to use it under good conditions. The term "reliability" characterises how trustworthy a computer system is.

A failure occurs when a service does not function properly, i.e. when its state of operation is abnormal or, more precisely, not in accordance with its specifications. From the user's point of view, a service is in one of two states:

  • appropriate service, i.e. in accordance with expectations
  • inappropriate service, i.e. not in accordance with expectations

A failure is attributable to an error, i.e. a local malfunction. Not all errors lead to service failure.

There are several ways to limit service failure:

  • Error prevention, which consists of avoiding errors by anticipating them
  • Fault tolerance, the goal of which is to provide a service that is in accordance with specification despite errors by introducing redundancy
  • Error elimination, aiming to reduce the number of errors through corrective actions
  • Error prediction, by anticipating errors and their impact on service

Introduction to High-Availability

"High-availability" refers to all the measures that aim to guarantee service availability, i.e. to ensure around-the-clock operation of a service.

The term "availability" refers to the probability that a service is operating properly at a given time.

The term "reliability", which is also sometimes used, refers to the probability that a system is operating normally over a given period of time. This is called "continuity of service".

Availability is most often expressed by the availability rate (a percentage), which is measured by dividing the time the service is available by the total time.

Availability Rate    Downtime per Year
97%                  11 days
98%                  7 days
99%                  3 days and 15 hours
99.9%                8 hours and 46 minutes
99.99%               53 minutes
99.999%              5 minutes
99.9999%             32 seconds
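
The relationship between an availability rate and the corresponding downtime can be checked with a few lines of Python. This is only a minimal sketch: the function names and the HOURS_PER_YEAR constant are illustrative assumptions, and the calculation assumes a 365-day year.

```python
from datetime import timedelta

# Assumption: a non-leap year of 365 days.
HOURS_PER_YEAR = 365 * 24


def availability_rate(uptime_hours: float, total_hours: float) -> float:
    """Return availability as a percentage: available time / total time."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return 100.0 * uptime_hours / total_hours


def yearly_downtime(rate_percent: float) -> timedelta:
    """Return the downtime per year implied by an availability rate."""
    if not 0.0 <= rate_percent <= 100.0:
        raise ValueError("rate_percent must be between 0 and 100")
    unavailable_fraction = 1.0 - rate_percent / 100.0
    return timedelta(hours=HOURS_PER_YEAR * unavailable_fraction)
```

For example, yearly_downtime(99.0) yields a duration of about 3 days and 15 hours, which matches the 99% row above.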

Risk Evaluation

The failure of a computer system can cause losses in productivity and money and, in certain critical cases, even material and human losses. It is therefore necessary to evaluate the risks tied to the failure of each component of a computer system and to plan the means and measures needed to avoid incidents or to reestablish service in an acceptable amount of time.

There are numerous ways in which a networked computer system can fail. The causes of failures can be broken down as follows:

  • Physical causes (these can be natural or criminal in nature):
    • Natural disaster (flood, earthquake, fire)
    • Environment (bad weather, humidity, temperature)
    • Hardware failure
    • Network failure
    • Power cut
  • Human causes (these can be intentional or accidental):
    • Design error (software bug, poor network provisioning)
  • Operational causes (these are linked to system status at a given moment):
    • Software bug
    • Software failure

All of these risks can have different causes such as the following:

  • Intentional maliciousness

Fault Tolerance

Since it is impossible to totally prevent breakdowns, one solution consists in setting up redundancy mechanisms by duplicating critical resources.
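
The benefit of duplicating a resource can be estimated numerically. The sketch below is an illustration under stated assumptions, not a model taken from this article: it assumes that the replicas fail independently of one another and that the service stays up as long as at least one replica works. The function name is hypothetical.

```python
def combined_availability(single: float, replicas: int) -> float:
    """Availability of `replicas` redundant copies of one resource.

    `single` is the availability of one copy, as a fraction between 0 and 1.
    Assumption: failures are independent and one working copy is enough,
    so the service is down only when every copy is down at the same time.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - single) ** replicas
```

Under these assumptions, two replicas that are each available 99% of the time give a combined availability of about 99.99%. Real components often share causes of failure (power, network, software bugs), so this figure should be read as an upper bound.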

The ability of a system to operate despite the failure of one of its components is called fault tolerance.

When one of the resources breaks down, the other resources take over in order to give system administrators the time to find a solution to the problem. This is called "Fail-Over Service" (FOS).
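
The fail-over principle can be sketched in a few lines. The example below is a simplified illustration, not a production mechanism: each redundant resource is represented by a hypothetical zero-argument callable, and the helper name call_with_failover is an assumption made for this example.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def call_with_failover(resources: Sequence[Callable[[], T]]) -> T:
    """Try each redundant resource in order and return the first success."""
    last_error: Optional[Exception] = None
    for resource in resources:
        try:
            return resource()
        except Exception as error:
            # This resource has broken down: remember why, try the next one.
            last_error = error
    # Every resource failed (or none was given): the service itself fails.
    raise RuntimeError("all redundant resources failed") from last_error
```

A caller would pass the primary resource first and the standby resources after it. A real fail-over service would also detect failures proactively and alert administrators, which this sketch leaves out.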

Ideally, in the case of hardware failures, the faulty components should be hot swappable, i.e. capable of being removed and replaced without service interruption.

Backup

Setting up a redundant architecture ensures that system data will be available but does not protect the data against user-introduced errors or against natural disasters such as fires, floods or even earthquakes.

Therefore it is necessary to set up backup mechanisms (ideally remote) in order to guarantee long-term data preservation.

Moreover, a backup mechanism can also be used for archival storage, i.e. saving data in a state that corresponds to a given date.
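
As an illustration of archival storage, the sketch below saves a directory as a compressed archive whose name carries the date of the backup. It uses only the Python standard library; the function name and the naming scheme are assumptions made for this example, and a real setup would also copy the archive to a remote location.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_directory(source: Path, backup_root: Path, day: date) -> Path:
    """Save `source` as a dated .tar.gz archive inside `backup_root`."""
    backup_root.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: <directory name>-<YYYY-MM-DD>.tar.gz
    base_name = backup_root / f"{source.name}-{day.isoformat()}"
    archive = shutil.make_archive(str(base_name), "gztar", root_dir=str(source))
    return Path(archive)
```

Keeping one archive per date makes it possible to restore the data in the state it had on a given day, rather than only in its latest state.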

Last update on Thursday October 16, 2008 02:43:18 PM.

This document entitled « High-Availability » from Kioskea (en.kioskea.net) is made available under the Creative Commons license. You may copy and modify this page under the conditions stipulated by the licence, provided that this note remains clearly visible.
