IJCTE 2012 Vol.4(5): 726-730 ISSN: 1793-8201
DOI: 10.7763/IJCTE.2012.V4.566

A Failure Detection and Prediction Mechanism for Enhancing Dependability of Data Centers

Qiang Guan, Ziming Zhang, and Song Fu

Abstract—Modern data centers continue to grow in scale and complexity. They also change dynamically due to the addition and removal of system components, changing execution environments, frequent updates and upgrades, online repairs, and more. Classical reliability theory and conventional methods rarely consider the actual state of a system and therefore cannot reflect the dynamics of runtime systems and failure processes. In this paper, we present an unsupervised failure detection and prediction method using an ensemble of Bayesian models. It characterizes normal execution states of the system and detects anomalous behaviors. We implement a prototype of our failure detection and prediction mechanism and evaluate its performance on a data center test platform. Experimental results show that our proposed method can forecast failure dynamics with high accuracy.

Index Terms—Data centers, failure detection, failure management, dependable computing.

Q. Guan, Z. Zhang, and S. Fu are with the Department of Computer Science and Engineering, University of North Texas, Denton, Texas 76203 USA (e-mail: QiangGuan@my.unt.edu; ZimingZhang@my.unt.edu; Song.Fu@unt.edu; tel.: +1-940-565-2341; fax: +1-940-565-2799).


Cite: Qiang Guan, Ziming Zhang, and Song Fu, "A Failure Detection and Prediction Mechanism for Enhancing Dependability of Data Centers," International Journal of Computer Theory and Engineering, vol. 4, no. 5, pp. 726-730, 2012.

