Title: PortHadoop-R: Support the Merging of HPC and Cloud
Abstract: High Performance Computing (HPC) is becoming data intensive. Meanwhile, big data applications require more and more computing power. The merging of HPC and big data analytics is inevitable. However, the conventional HPC ecosystem, represented by MPI and Parallel File System (PFS) environments, and the newly emerged Cloud/big data ecosystem, represented by MapReduce/Spark and Hadoop Distributed File System (HDFS) environments, are designed for different applications and follow different design principles. They do not work together naturally. Even worse, by the CAP theorem, neither of the two ecosystems can be extended to have all the merits of the other. In other words, the two ecosystems will co-exist. The best we can have is a merged system that provides the functionality and merits of both. In this study, we present the PortHadoop-R solution to support the merging of HPC and Cloud at the file level. PortHadoop-R allows data to be read directly from PFS into the memory of Hadoop nodes and integrates the data transfer with R data analysis and visualization. PortHadoop-R is carefully optimized to exploit the merits of PFS and MapReduce to achieve concurrent data transfer and latency hiding. PortHadoop-R has been tested on NASA climate modeling applications. Experimental results show that PortHadoop-R delivered a 15x speedup. Even without the 15x speedup of PortHadoop-R, the MapReduce environment is already significantly faster than MPI clusters at processing climate data. PortHadoop-R further demonstrates the potential of merging HPC and Cloud.
Speaker: Professor Xian-He Sun, Illinois Institute of Technology, USA
Speaker biography: Dr. Xian-He Sun is a University Distinguished Professor of Computer Science in the Department of Computer Science at the Illinois Institute of Technology (IIT). He is the director of the Scalable Computing Software laboratory at IIT and a guest faculty member in the Mathematics and Computer Science Division at the Argonne National Laboratory. Before joining IIT, he worked at DoE Ames National Laboratory, at ICASE, NASA Langley Research Center, and at Louisiana State University, Baton Rouge, and was an ASEE fellow at Navy Research Laboratories. Dr. Sun is an IEEE fellow and is known for his memory-bounded speedup model, also called Sun-Ni’s Law, for scalable computing. His research interests include data-intensive high-performance computing, memory and I/O systems, software systems for big data applications, and performance evaluation and optimization. He has over 250 publications and 6 patents in these areas. He is the Associate Chief Editor of IEEE Transactions on Parallel and Distributed Systems, a Golden Core member of the IEEE Computer Society, a former vice chair of the IEEE Technical Committee on Scalable Computing, the past chair of the Computer Science Department at IIT, and a member of the China Qian-Ren (B) program, and he has served and is serving on the editorial boards of leading professional journals in the field of parallel processing. More information about Dr. Sun can be found at his website www.cs.iit.edu/~sun/
Time: Saturday, December 16, 2017, starting at 10:00 a.m.
Venue: Room 338, Nanhai Building
All faculty and students are warmly welcome to attend!
College of Information Science and Technology / College of Cyber Security
December 13, 2017