Henri Bal: Large-scale parallel computing on grids
Computational grids are interesting platforms for solving large-scale
computational problems, because they consist of many (geographically
distributed) resources. Thus far, grids have mainly been used for
high-throughput computing on independent (or trivially parallel) jobs.
However, advances in grid software (programming environments, schedulers)
and optical networking technology make it more and more feasible to
use grids for solving challenging large-scale problems.
The talk will first give a brief introduction to grid infrastructures,
using the Dutch DAS-3 Computer Science grid as an example. DAS-3 has a
flexible and reconfigurable 40 Gb/s optical network called StarPlane
between its five clusters and a 10 Gb/s dedicated optical link to the
French Grid'5000 system. From a parallel programming point of view,
grids like DAS-3 are characterized by a high-latency/high-bandwidth
network and a hierarchical structure.
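
To illustrate why a high-latency/high-bandwidth network shapes how parallel
programs communicate, here is a minimal cost-model sketch. It is not taken
from the talk. The function name, the batching parameter, and the numbers
in the usage note (1 ms latency, 16-byte messages) are hypothetical, and the
model assumes the fixed latency is paid once per batch with no overlap of
communication and computation.

```python
def transfer_time(num_messages, bytes_per_message, latency_s,
                  bandwidth_bytes_per_s, batch_size=1):
    """Estimate the time to send many small messages over one link.

    Simplified model (an assumption, not a measurement): each batch pays
    the link latency once, and all payload bytes share the full bandwidth.
    """
    # Ceiling division: number of batches needed to carry all messages.
    num_batches = -(-num_messages // batch_size)
    total_bytes = num_messages * bytes_per_message
    return num_batches * latency_s + total_bytes / bandwidth_bytes_per_s


# Hypothetical wide-area link: 1 ms latency, 10 Gb/s = 1.25e9 bytes/s.
LATENCY_S = 1e-3
BANDWIDTH = 1.25e9

unbatched = transfer_time(1_000_000, 16, LATENCY_S, BANDWIDTH)
batched = transfer_time(1_000_000, 16, LATENCY_S, BANDWIDTH, batch_size=10_000)
```

Under these assumed numbers the unbatched cost is dominated by latency,
while the batched cost is only a small multiple of the pure bandwidth cost.
This is the general reason why combining many small messages into fewer
large ones tends to pay off on such networks.
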
Next, the talk will discuss how algorithms and applications can
be optimized to run in such an environment. It focuses on search
applications like retrograde analysis, which, much like model checkers,
analyze huge search spaces. As a case study, we have implemented
an application that solves the game of Awari, which has 900 billion
different states. Several optimizations were needed to obtain high
performance on DAS-3/StarPlane.
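
As a rough illustration of retrograde analysis, the sketch below solves a
toy subtraction game rather than Awari. It is sequential, and it is not the
implementation discussed in the talk. The game is an assumption chosen for
brevity: a state is a number of stones, a move removes 1, 2, or 3 stones,
and a player who cannot move loses. The analysis starts from the terminal
states and works backwards through predecessor states.

```python
from collections import deque


def retrograde_solve(num_states, moves=(1, 2, 3)):
    """Label every state 'WIN' or 'LOSS' for the player to move.

    Toy game (hypothetical, for illustration): state s is a pile of s
    stones, a move removes m stones for some m in `moves` (all distinct),
    and a player with no legal move loses.
    """
    value = [None] * num_states
    # For each state, count the successors not yet proven to be a WIN
    # for the opponent. When the count reaches zero, the state is a LOSS.
    remaining = [sum(1 for m in moves if s - m >= 0)
                 for s in range(num_states)]

    queue = deque()
    # Terminal states have no successors: the player to move loses.
    for s in range(num_states):
        if remaining[s] == 0:
            value[s] = "LOSS"
            queue.append(s)

    # Propagate known values backwards to predecessor states.
    while queue:
        s = queue.popleft()
        for m in moves:
            p = s + m  # p is a predecessor: the move m leads from p to s
            if p >= num_states or value[p] is not None:
                continue
            if value[s] == "LOSS":
                # Some move from p reaches a lost position: p is a WIN.
                value[p] = "WIN"
                queue.append(p)
            else:
                # One more move from p is known to reach a won position.
                remaining[p] -= 1
                if remaining[p] == 0:
                    value[p] = "LOSS"
                    queue.append(p)
    return value


losing_states = [s for s, v in enumerate(retrograde_solve(13)) if v == "LOSS"]
```

In a distributed version the state table would be partitioned across
machines, so propagating a value to a predecessor can mean sending a small
message to another machine. That makes the communication pattern, not the
computation, the main thing to optimize.
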
The last part of the talk will discuss research on programming
environments that will make it easier to develop parallel
applications for grids. Grid programmers often have to use low-level
programming interfaces that change frequently, and they have to
deal with heterogeneity, connectivity problems, security issues, and
dynamically changing execution environments. The Ibis project aims
to drastically simplify the whole programming and deployment process
of high-performance grid applications. The philosophy of Ibis is
that grid applications should be developed on a local workstation
and simply be launched from there. Ibis uses middleware-independent
Application Programming Interfaces with different abstraction levels,
ranging from low-level message passing to high-level divide-and-conquer
parallelism and group communication.