Abstract—Big data and cloud computing are emerging as promising technologies and are gaining noticeable momentum in today's IT. The amount of data now being produced is unprecedented and exceeds everything generated since the dawn of computing, mainly because of the pervasiveness of IT usage and the ubiquity of Internet access. Nevertheless, this big data is valuable only if it is processed and mined. Processing and mining big data requires substantial HPC (high-performance computing) power, which is not affordable for most unless we opt for a convenient avenue, e.g., cloud computing. In this paper, we propose a blueprint for deploying a real-world HPC testbed. This will help in simulating and evaluating HPC-relevant concerns at minimum cost.
Indeed, cloud computing provides a unique opportunity to circumvent the initial cost of owning private HPC platforms for big data processing by providing HPC as a service (HPCaaS). In this paper, we present the subtleties of a synergetic “fitting” between big data and cloud computing. We delineate opportunities and address relevant challenges. Concretely, we advocate using private clouds instead of public ones, and propose using Hadoop along with MapReduce, on top of OpenStack, as a promising avenue for scientific communities to own research-oriented private clouds meant to provide HPCaaS for big data mining.
Index Terms—High-performance computing, cloud computing, big data, Hadoop.
Mohamed Riduan Abid is with Al Akhawayn University, Ifrane, Morocco (e-mail: R.Abid@aui.ma).
Cite: Mohamed Riduan Abid, "HPC (High-Performance the Computing) for Big Data on Cloud: Opportunities and Challenges," International Journal of Computer Theory and Engineering, vol. 8, no. 5, pp. 423-428, 2016.