Cloud-Niagara: A high availability and low overhead fault tolerance middleware for the cloud.


Fault tolerance is the ability of a system to continue functioning despite the presence of faults in its architecture. For a dynamic system such as the cloud, fault tolerance is required to ensure business continuity. This paper proposes a high availability middleware that ensures fault tolerance for cloud-based applications. Effective Descriptive Set Theory is used to determine the fault detection model for real-life applications running on the open source cloud. A deterministic algorithm for the middleware is provided that automatically allocates backup nodes to the system based on the detected faults. After detecting faults, the middleware directs the system to add new nodes as replicas of the failed nodes, ensuring continuity of the cloud applications. Next, a case study of seven real-life applications, including the PostgreSQL database, is described, and fault tolerance is ensured through the proposed middleware. An empirical performance analysis of the algorithm is carried out and the results are compared to traditional systems. Results show that, in the presence of faults induced during experimentation, the middleware can be effectively used to introduce replicas and ensure fault tolerance of bottleneck resources when executing 700 to 1000 processes per unit time.
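
The detect-then-replicate loop summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the `Cluster` class, the `detect_faults` and `recover` functions, the node names, and the boolean health flag are all hypothetical, and fault detection is reduced to reading that flag rather than the set-theoretic model the paper uses.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical in-memory model of a cloud deployment (an assumption, not from the paper)."""
    active: dict = field(default_factory=dict)      # node name -> True if healthy
    spares: list = field(default_factory=list)      # idle backup nodes, in allocation order
    replica_of: dict = field(default_factory=dict)  # backup node -> failed node it replaces


def detect_faults(cluster):
    """Return failed node names, sorted so that allocation is deterministic."""
    return sorted(name for name, healthy in cluster.active.items() if not healthy)


def recover(cluster):
    """Replace each failed node with the next spare; return (failed, backup) pairs."""
    assignments = []
    for failed in detect_faults(cluster):
        if not cluster.spares:
            break  # no backup capacity left; remaining faults stay unresolved
        backup = cluster.spares.pop(0)
        del cluster.active[failed]          # retire the failed node
        cluster.active[backup] = True       # bring the backup into service
        cluster.replica_of[backup] = failed
        assignments.append((failed, backup))
    return assignments
```

For example, with `Cluster(active={"db-1": False, "web-1": True, "web-2": False}, spares=["spare-a", "spare-b"])`, calling `recover` pairs `db-1` with `spare-a` and `web-2` with `spare-b`. Sorting the failed nodes and taking spares in order keeps the allocation deterministic for a given cluster state.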

16th International Conference on Computer and Information Technology (ICCIT), 2013