Academic Seminar

信电号角 (Xindian Haojiao) Academic Seminar Series: PortHadoop-R: Support the Merging of HPC and Cloud

Published: 2017-06-07

Speaker: Professor Xian-He Sun (孙贤和), Illinois Institute of Technology, USA

Time: Saturday, June 17, 2017, 10:00 AM

Venue: Lecture Hall 406, Xindian Building (信电楼)

Contact: Li Zhenbo (李振波), 62738751

Abstract

High Performance Computing (HPC) is becoming data intensive. Meanwhile, big data applications require more and more computing power. The merging of HPC and big data analytics is inevitable. However, the conventional HPC ecosystem, represented by MPI and Parallel File System (PFS) environments, and the newly emerged Cloud/big data ecosystem, represented by MapReduce/Spark and Hadoop Distributed File System (HDFS) environments, are designed for different applications and follow different design principles. They do not work together naturally. Even worse, by the CAP theorem, neither of the two ecosystems can be extended to have all the merits of the other. In other words, these two ecosystems will co-exist. The best we can have is a merged system that provides the functionality and merits of both ecosystems. In this study, we provide the PortHadoop-R solution to support the merging of HPC and Cloud at the file level. PortHadoop-R allows data to be read directly from PFS into the memory of Hadoop nodes and integrates the data transfer with R data analysis and visualization. PortHadoop-R is carefully optimized to utilize the merits of PFS and MapReduce to achieve concurrent data transfer and latency hiding. PortHadoop-R is tested on NASA climate modeling applications. Experimental results show that PortHadoop-R delivered a 15x speedup. Even without the 15x speedup of PortHadoop-R, the MapReduce environment is already significantly faster than MPI clusters at processing climate data. PortHadoop-R further demonstrates the potential of merging HPC and Cloud.

Speaker Biography

Dr. Xian-He Sun is a University Distinguished Professor in the Department of Computer Science at the Illinois Institute of Technology (IIT). He is the director of the Scalable Computing Software laboratory at IIT and a guest faculty member in the Mathematics and Computer Science Division at the Argonne National Laboratory. Before joining IIT, he worked at the DoE Ames National Laboratory, at ICASE, NASA Langley Research Center, and at Louisiana State University, Baton Rouge, and was an ASEE fellow at Navy Research Laboratories. Dr. Sun is an IEEE fellow and is known for his memory-bounded speedup model, also called Sun-Ni's Law, for scalable computing. His research interests include data-intensive high performance computing, memory and I/O systems, software systems for big data applications, and performance evaluation and optimization. He has over 250 publications and 5 patents in these areas. He is a former IEEE CS distinguished speaker, a former vice chair of the IEEE Technical Committee on Scalable Computing, and the past chair of the Computer Science Department at IIT, and he has served and continues to serve on the editorial boards of leading professional journals in the field of parallel processing. More information about Dr. Sun can be found at his web site www.cs.iit.edu/~sun/.


Tel: 010-62736746

Email: eic@cau.edu.cn

Address: 17 Qinghua East Road, Haidian District, Beijing

Postal code: 100083

©2017 College of Information and Electrical Engineering (信息与电气工程学院)