Google and IBM have announced an initiative to promote new software development methods which will help students and researchers address the challenges of internet-scale applications in the future.
The goal of this initiative is to improve computer science students’ knowledge of highly parallel computing practices to better address the emerging paradigm of large-scale distributed computing. IBM and Google are teaming up to provide hardware, software and services to augment university curricula and expand research horizons.
With their combined resources, the companies hope to lower the financial and logistical barriers for the academic community to explore this emerging model of computing.
The University of Washington was the first to join the initiative. A small number of universities will also pilot the program, including Carnegie Mellon University, Massachusetts Institute of Technology, Stanford University, the University of California at Berkeley and the University of Maryland. In the future, the program will be expanded to include additional researchers, educators and scientists.
"Google is excited to partner with IBM to provide resources which will better equip students and researchers to address today’s developing computational challenges," says Eric Schmidt, CEO of Google. "In order to most effectively serve the long-term interests of our users, it is imperative that students are adequately equipped to harness the potential of modern computing systems and for researchers to be able to innovate ways to address emerging problems."
Fundamental changes in computer architecture and increases in network capacity are encouraging software developers to take new approaches to computer-science problem solving. For web software such as search, social networking and mobile commerce to run quickly, computational tasks often need to be broken into hundreds or thousands of smaller pieces to run across many servers simultaneously. Parallel programming techniques are also used for complex scientific analysis such as gene sequencing and climate modeling.
"This project combines IBM’s historic strengths in scientific, business and secure-transaction computing with Google’s complementary expertise in Web computing and massively scaled clusters," says Samuel Palmisano, chairman, president and chief executive officer, IBM. "We’re aiming to train tomorrow’s programmers to write software that can support a tidal wave of global Web growth and trillions of secure transactions every day."
For this project, the two companies have dedicated a large cluster of several hundred computers (a combination of Google machines and IBM BladeCenter and System x servers), which is planned to grow to more than 1,600 processors. Students will access the cluster via the Internet to test their parallel programming course projects.
The servers will run open source software including the Linux operating system, Xen systems virtualization and Apache’s Hadoop project, an open source implementation of Google’s published computing infrastructure, specifically MapReduce and the Google File System (GFS).
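For readers unfamiliar with the MapReduce model named above, the idea can be illustrated with a minimal sketch. This is not Hadoop code and not anything supplied by Google or IBM; it is a single-machine Python illustration in which the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the word-counting task are assumptions chosen for the example. A thread pool stands in for the many servers of a real cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(chunks, workers=4):
    """Count words across chunks; each chunk is mapped independently."""
    # The thread pool is a stand-in (an assumption of this sketch) for
    # distributing the map tasks across separate machines.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    groups = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in groups.items())


if __name__ == "__main__":
    sample_chunks = ["the quick brown fox", "the lazy dog", "the fox"]
    print(word_count(sample_chunks))
```

Because each chunk is mapped without reference to any other, the map tasks can run simultaneously, which is the property that lets a framework such as Hadoop spread the work over hundreds or thousands of servers.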
To simplify the development of massively parallel programs, Google and IBM have created the following resources:
* A cluster of processors running an open source implementation of Google’s published computing infrastructure (MapReduce and GFS from Apache’s Hadoop project);
* A Creative Commons licensed university curriculum developed by Google and the University of Washington focusing on massively parallel computing techniques available at: http://code.google.com/edu/content/parallel.html;
* Open source software designed by IBM to help students develop programs for clusters running Hadoop. The software works with Eclipse, an open source development platform. The plugin is currently available at: http://lucene.apache.org/hadoop/;
* Management, monitoring and dynamic resource provisioning of the cluster by IBM using IBM Tivoli systems management software; and
* A website to encourage collaboration among universities in the program. This will be built on Web 2.0 technologies from IBM’s Innovation Factory.