Workshop on Clusters, Clouds,
and Data for Scientific Computing
CCDSC 2018
(last update 9/6/18 12:50 AM)
427 Chemin de Chanzé, France
Sponsored by:
NSF, AMD, PGI, Nvidia, Intel, Mellanox, STFC, ICL/UTK,
Vanderbilt University, Grenoble Alps University.
Clusters, Clouds, and Data for Scientific Computing
2018
427 Chemin de Chanzé, France
September 4th – 7th, 2018
CCDSC 2018 will be held at a resort outside of Lyon, France, called La Maison des Contes: http://www.chateauform.com/en/chateauform/maison/17/chateau-la-maison-des-contes
The address of the Chateau
is:
Châteauform' La Maison des Contes
427 chemin de Chanzé
69490 Dareizé
Telephone: +33 1 30 28 69 69
1 hr 30 min from the Saint Exupéry Airport
45 minutes from Lyon
GPS Coordinates: North latitude 45° 54' 20", East longitude 4° 30' 41"
Go to http://maps.google.com and type in: "427 chemin de Chanzé 69490 Dareizé".
These proceedings gather information
about the participants of the Workshop on Clusters, Clouds, and Data for
Scientific Computing, to be held at La Maison des Contes, 427 Chemin de
Chanzé, France, on September 4th-7th, 2018. This workshop is a continuation of a
series of workshops started in 1992 under the title Workshop on Environments and Tools
for Parallel Scientific Computing. These workshops have been held every two
years, alternating between the U.S. and France. The purpose of this
workshop, which is by invitation only, is to evaluate the state of the art and
future trends for cluster computing and the use of computational clouds for
scientific computing.
This workshop addresses a
number of themes for developing and using both clusters and computational clouds.
In particular, the talks aim to:
• Survey and analyze the key
deployment, operational, and usage issues for clusters, clouds, and grids,
especially focusing on discontinuities produced by multicore and hybrid architectures,
data-intensive science, and the increasing need for wide-area/local-area
interaction.
• Document the current state of the art in each of these areas,
identifying interesting questions and limitations, as well as experiences with
clusters, clouds, and grids relative to the science research communities and
domains that are benefiting from the technology.
• Explore interoperability among
disparate clouds, as well as interoperability between various clouds and grids,
and the impact on the domain sciences.
• Explore directions for future
research and development against the background of disruptive trends and
technologies and the recognized gaps in the current state of the art.
Speakers will present their
research and interact with all the participants on the future software
technologies that will make parallel computers easier to use.
This workshop was made
possible thanks to sponsorship from NSF, AMD, PGI, Nvidia, Intel, Mellanox, STFC,
ICL/UTK, Vanderbilt University, Grenoble Alps University.
Thanks!
Jack Dongarra, Knoxville,
Tennessee, USA.
Bernard Tourancheau, Grenoble,
France
Tuesday, September 4th |
Jack Dongarra, U of Tenn Bernard Tourancheau, U Grenoble |
Introduction and Welcome |
6:30 – 7:45 |
Session Chair: Jack Dongarra |
(5 talks – 15 minutes each) |
6:30 |
Luiz DeRose, Cray |
Scaling DL Training Workloads with the Cray PE Plugin |
6:45 |
Joe Curley, Intel |
Optimizing Deep Learning on General Purpose Hardware – Part 2 |
7:00 |
Steve Scalpone, PGI/Nvidia |
The F18 Fortran Compiler |
7:15 |
Bill Brantley, AMD |
AMD HPC Server Update |
7:30 |
|
|
8:00 pm – 9:00 pm |
Dinner |
|
Wednesday,
September 5th |
|
|
7:30 - 8:30 |
Breakfast |
|
8:30 - 10:30 |
Session Chair: Bernard Tourancheau |
(6 talks – 20 minutes each) |
8:30 |
Ian Foster |
Learning Systems for Science |
8:50 |
Rosa Badia |
TANGO: How to dance with multiple task-based programming approaches |
9:10 |
Bill Gropp |
Managing Code Transformations for Better Performance Portability |
9:30 |
Geoffrey Fox |
AI-Driven
Science and Engineering with the Global AI Supercomputer |
9:50 |
Ewa Deelman |
Building a Cyberinfrastructure Community |
10:10 |
Ilkay Altintas |
Collaborative
Workflow-Driven Science in a Rapidly Evolving Cyberinfrastructure
Ecosystem |
10:30 -11:00 |
Coffee |
|
11:00 - 1:00 |
Session Chair: Padma Raghavan |
(6 talks – 20 minutes each) |
11:00 |
Pete Beckman |
The Tortoise and the Hare: Is there still time for HPC to catch up in the performance race? |
11:20 |
Vaidy Sunderam |
Data Driven Systems For Spatio-Temporal Applications |
11:40 |
Anne Benoit |
Co-scheduling HPC Workloads on Cache-Partitioned CMP Platforms |
12:00 |
Ken Birman |
High Performance State Machine Replication for HPC: Bridging Two Worlds |
12:20 |
Nicola Ferrier |
Computing at the Edge |
12:40 |
Guillaume Aupy |
I/O Management in HPC Systems, from Burst-Buffers to I/O Scheduling |
1:00 - 2:00 |
Lunch - break |
|
2:30 – 3:00 |
Coffee |
|
3:00 - 5:20 |
Session Chair: Dorian Arnold |
(6 talks – 20 minutes each) |
3:00 |
Barbara Chapman |
Compiler Optimizations for
Parallelism and Locality on Emerging Hardware |
3:20 |
Al Geist |
Latest Results from Summit – the New #1 System on the TOP500 |
3:40 |
Tony Hey |
Machine Learning
and Big Scientific Data Benchmarks |
4:00 |
Franck Cappello |
Frontiers of Lossy Compression for Scientific Data |
4:20 |
Heike Jagode |
PAPI's new Software-Defined Events for In-depth Performance Analysis |
4:40 |
Yves Robert |
A Little Scheduling Problem |
6:30 – 7:30 |
Organic wine tasting in the « salon des contes » where we had the first welcome gathering |
|
8:00 – 9:00 |
Dinner |
|
Thursday, September 6th |
|
|
7:30 - 8:30 |
Breakfast |
|
8:30 - 10:30 |
Session Chair: Emmanuel Jeannot |
(6 talks – 20 minutes each) |
8:30 |
Joel Saltz |
Integrative Everything, Deep Learning and Streaming Data |
8:50 |
Judy Qiu |
Real-Time Anomaly Detection from Edge to HPC-Cloud |
9:10 |
Ron Brightwell |
Resource Management in the Era of Extreme Heterogeneity |
9:30 |
Andrew Grimshaw |
Timing is Everything: The CCC as an Alternative to Commercial Clouds |
9:50 |
Phil Papadopoulos |
Virtualization is the answer. What was the question? |
10:10 |
Alok Choudhary |
The Ultimate Self-Driving Machine |
10:30 – 11:00 |
Coffee |
|
11:00 - 1:00 |
Session Chair: Laurent
Lefevre |
(6 talks – 20 minutes each) |
11:00 |
Rich Vuduc |
Algorithm-Level Control of Performance and Power Tradeoffs |
11:20 |
Michela Taufer |
Modeling
Record-and-Replay for Nondeterministic Applications on Exascale Systems |
11:40 |
Emmanuel Jeannot |
Process Placement from Monitoring to Data Analysis |
12:00 |
Haohuan Fu |
Extreme-Scale Earthquake Simulation on Sunway TaihuLight |
12:20 |
Carl Kesselman |
Computation as an Experimental Science |
12:40 |
Torsten Hoefler |
Quantum Computing from an HPC System's Perspective |
1:00 - 2:00 |
Lunch |
|
2:00 – 4:00 |
Session Chair: Michela Taufer |
(6 talks – 20 minutes each) |
2:00 |
Dimitrios Nikolopoulos |
Realistic Fault Injection and Analysis for Exascale Systems |
2:20 |
Rob Ross |
Versatile Data Services
for Computational Science |
2:40 |
Mary Hall |
Mainstreaming Autotuning Compilers for Performance Portability: What will it Take? |
3:00 |
Manish Parashar |
Enabling Data-Driven Edge/Cloud Application Workflows |
3:20 |
David Abramson |
Energy Efficiency Modeling of Parallel Applications |
3:40 |
Dorian Arnold |
Big Deal, Little Deal or No Deal? The Realities of the HPC Resilience Challenge |
4:00 - 4:30 |
Coffee |
|
4:30 – 5:40 |
Session Chair: Rosa Badia |
(5 talks – 20 minutes each) |
4:30 |
Jeff Vetter |
Preparing for Extreme Heterogeneity in High Performance Computing |
4:50 |
George Bosilca |
MPI as perceived by the ECP community |
5:10 |
Padma Raghavan |
Rethinking the Computational Complexity and Efficiency in the Age of "Big Data" |
5:30 |
Laurent Lefevre |
Building and exploiting the table of energy and power leverage for energy efficient large scale HPC systems |
5:50 |
Hartwig Anzt |
Towards a Modular Precision Ecosystem |
8:00 – 9:00 |
Dinner |
|
9:00 pm - |
|
|
Friday, September 7th |
|
|
7:30 - 8:30 |
Breakfast |
|
8:30 - 10:30 |
Session Chair: Padma Raghavan |
(6 talks – 20 minutes each) |
8:30 |
Bernd Mohr |
On the ROI of Parallel Performance Optimization |
8:50 |
Christian Obrecht |
Building Simulation: an Illusion |
9:10 |
Laércio Lima Pilla |
Decoupling schedulers from runtime systems for increased reuse and portability |
9:30 |
Frederic Vivien |
A Generic
Approach to Scheduling and Checkpointing Workflows |
9:50 |
Martin Swany |
Network Microservices and Edge Computing |
10:10 |
Jonathan Churchhill |
Managing Mismatched Network Interface Performance in Multi Terabit Converged Ethernet Software Defined Storage |
10:30 -11:00 |
Coffee |
|
11:00 – 12:00 |
Session Chair: Bernard Tourancheau |
(3 talks – 20 minutes each) |
11:00 |
Frederic Desprez |
SILECS: Super Infrastructure for Large-scale Experimental Computer Science |
11:20 |
Minh Quan Ho |
Standard Libraries on Non-Standard Processors |
11:40 |
Rich Graham, Mellanox |
The Network's Role in
the Large-Scale Computational Eco-System |
12:00 - 1:30 |
Lunch |
|
1:30 |
Depart |
|
Here is some information on
the meeting in Lyon. We have
updated the workshop webpage http://bit.ly/ccdsc-2018 with the workshop agenda.
On Tuesday September 4th there will be a bus to pick up participants at Lyon's
Saint Exupéry (formerly Satolas) Airport at 3:00 pm. (Note that the Saint Exupéry
airport has its own train station with direct TGV connections to Paris via
Charles de Gaulle.) If you arrive by train at Saint Exupéry airport, please go
to the airport meeting point (point-rencontre), on the second floor, next to the
shuttles, near the hallway between the two terminals; see
http://www.lyonaeroports.com/en/practicals-informations/information-points .
The bus will be at the TGV station, which is after a long corridor from the
airport terminal. The bus stop is near the station entrance, in the parking lot
called "dépose minute".
The bus will then travel to
pick up people at the Lyon Part Dieu railway station at 4:45 pm. (There are two
train stations in Lyon; you want the Part Dieu station, not the Perrache station.)
There will be someone with a sign at the "Meeting Point/point de
rencontre" of the station to direct you to the bus.
The bus is expected to arrive
at La Maison des Contes around 5:30. We would like to hold the first
session on Tuesday evening from 6:30 pm to 8:00 pm, with dinner following the
session. La Maison des Contes is about 43 km from Lyon. For a map to La
Maison des Contes, go to http://maps.google.com and type in: "427 chemin de Chanzé 69490 Dareizé".
VERY IMPORTANT: Please send
your arrival and departure times to Jack so we can arrange the appropriate size
bus for transportation. VERY VERY
IMPORTANT: If your flight is such that you will miss the bus on Tuesday
September 4th at 3:00 pm, send Bernard your flight arrival information
so he can arrange for transportation to pick you up at the train station or
the airport in Lyon. It turns out that a taxi from Lyon to the Chateau can cost
as much as 100 Euros, and the Chateau may be hard to find at night if you rent a
car and are not a French driver :-).
At the end of the meeting on Friday
afternoon, we will arrange for a bus to transport people to the train station
and airport. If you are catching an early flight on the morning of Saturday,
September 8th, you may want to stay at the hotel located at Lyon's Saint
Exupéry Airport; see http://www.lyonaeroports.com/eng/Shops-facilities/Hotels
for details.
There are also many hotels in
the Lyon area; see: http://www.en.lyon-france.com/
Due to room constraints at
La Maison des Contes, we ask that you not bring a guest. Dress at the
workshop is informal. Please tell
us if you have special requirements (vegetarian food, etc.). We are expecting
to have internet and wireless connections at the meeting, but you know this is
France.
Please send this information
to Jack (dongarra@icl.utk.edu) by
August 5th.
Name:
Institute:
Title:
Abstract:
Participant's brief biography:
Arrival / Departure Details:
|
|
Arrival |
Departure |
Special |
David |
Abramson |
9/4 Part Dieu |
9/7 Part Dieu |
Vegetarian |
Ilkay |
Altintas |
9/4 UA9487 12:15pm |
9/8 UA8881 7:40am |
|
Hartwig |
Anzt |
9/4 Part Dieu |
9/7 Part Dieu 2:41pm Train |
|
Dorian |
Arnold |
9/4 Part Dieu |
9/7 Part Dieu |
|
Guillaume |
Aupy |
Car |
Bus to Part Dieu |
|
Rosa |
Badia |
9/4 VY1220 12:25pm |
9/7 EZY4417 2:10pm |
Will need a car on September 7th
mid morning. |
Pete |
Beckman |
9/4 Train to Airport 2:01 |
9/7 airport |
|
Anne |
Benoit |
Car |
Car |
|
Ken |
Birman |
9/4 TAP 476 11:15 am Taxi |
9/7 TAP 473 6:00am |
Will need a car on September 6th
for the early flight back on September 7th |
George |
Bosilca |
9/4 Part Dieu |
9/7 Early |
Will need a car on September 7th
mid morning. |
Bill |
Brantley |
9/4 Part Dieu |
9/7 Part Dieu |
|
Ron |
Brightwell |
9/4 4:00pm Part Dieu |
9/7 Part Dieu |
|
Franck |
Cappello |
9/4 Car |
9/7 Bus to airport |
|
Barbara |
Chapman |
9/4 UA9959 10:50am |
9/7 Part Dieu |
|
Alok |
Choudhary |
9/4 LH 1076 1:35pm |
9/7 LH 1077 2:15pm |
Will need a car on September 7th
mid morning. |
Jonathan |
Churchhill |
9/4 EZY8415 10:50am |
9/7 EZY8418 7:15pm |
|
Joe |
Curley |
9/4 LH1074 10:05am |
Bus to airport |
|
Ewa |
Deelman |
9/4 Part Dieu |
Bus back to Part Dieu |
|
Luiz |
DeRose |
9/4 Part Dieu |
9/7 Part Dieu |
|
Frederic |
Desprez |
Drive |
Drive |
|
Jack |
Dongarra |
9/4 KLM 1415 1:20pm |
9/7 KLM 1412
6:10am |
Will need a car on September 6th
for the early flight back on September 7th |
Fanny |
Dufosse |
9/4 Part Dieu |
|
|
Nicola |
Ferrier |
9/4 UA8914 10:05am |
9/7 bus |
|
Ian |
Foster |
9/4 UA9487 12:15pm |
9/7 UA9026 2:15pm |
Will need a car on September 7th
mid morning. |
Geoffrey |
Fox |
9/4 Airport Bus |
9/7 Airport Bus |
|
Haohuan |
Fu |
9/4 Will take a taxi to chateau |
9/7 Part Dieu |
|
Al |
Geist |
9/4 KLM 1415 1:20pm |
9/8 DL 8611 6:15am |
|
Rich |
Graham |
9/4 Late; Taxi to Chateaux |
9/7 Part Dieu |
|
Andrew |
Grimshaw |
9/4 LH1076 1:35pm |
9/6 LH2253 7:45pm |
Will need a car on September 6th
mid afternoon |
Bill |
Gropp |
9/4 Part Dieu |
9/7 Part Dieu |
|
Mary |
Hall |
9/4 Part Dieu |
9/7 Part Dieu |
|
Li |
Han |
Car |
|
|
Tony |
Hey |
9/4 BA 362
4:30pm |
9/7 BA363 5:20pm |
Arriving too late for the bus,
will arrange a car |
Torsten |
Hoefler |
9/4 Part Dieu |
9/7 Part Dieu |
|
Minh
Quan |
Ho |
9/4 Part Dieu |
9/7 Part Dieu |
|
Heike |
Jagode |
9/4 LH1076 1:35pm |
9/7 LH1077 2:15pm |
Vegan Will need a car on September 7th
mid morning. |
Emmanuel |
Jeannot |
9/4 Part Dieu |
9/7 3:50pm flight |
|
Carl |
Kesselman |
9/4 UA8916 1:35pm |
9/8 UA9486 9:20am |
|
Laurent |
Lefevre |
Drive to workshop |
Drive |
vegetarian
(no meat, but fish, milk, eggs OK) |
Laércio |
Lima Pilla |
Car |
Car |
|
Bernd |
Mohr |
9/4 4:00pm Part Dieu TGV 9828 |
9/7 3:04pm Part Dieu TGV 6622 |
|
Dimitrios |
Nikolopoulos |
9/3 4:30pm BA0362 |
9/6 6:50am BA0365 |
Will need a car on September 5th evening. |
Christian |
Obrecht |
Drive to workshop |
Drive |
|
Phil |
Papadopoulos |
9/4 KLM 1415 1:20pm |
9/9 DL 9499 9:55am |
|
Manish |
Parashar |
9/4 UA9959 10:50am |
9/7 UA9944 7:20am |
Vegetarian Will need a car on September 6th
for the early flight back on the 7th |
Judy |
Qiu |
9/4 Airport Bus |
9/7 Airport Bus |
|
Padma |
Raghavan |
9/4 AA 6487 4:30pm |
9/8 BA365 6:50am |
Arriving too late for the bus,
will arrange a car |
Yves |
Robert |
Drive to workshop |
Drive |
|
Rob |
Ross |
9/4 (AC 828) United 8024 8:10am |
9/7 (LH 4229) United 9488 12:50pm |
Will need a car on September 7th
early morning. |
Joel |
Saltz |
9/4 Part Dieu 2:44pm |
9/7 Part Dieu 6:04pm |
|
Steve |
Scalpone |
9/4 KLM1415 1:20pm |
9/8 LH1075 10:45am |
|
Vaidy |
Sunderam |
9/4 TGV #6615 Part Dieu 2:56pm |
9/8 TGV 08:00am Part Dieu |
|
Martin |
Swany |
Part Dieu |
Part Dieu |
|
Michela |
Taufer |
9/4 AA 6487 4:30pm |
9/8 AA 8602 10:30am |
Vegetarian (no meat, fish
OK) |
Bernard |
Tourancheau |
Airport bus |
|
|
Jeff |
Vetter |
9/4 AF 7652 1:35pm |
9/8 AF 7651 6:15am |
|
Frederic
|
Vivien |
Car |
Car |
|
Rich |
Vuduc |
9/4 Lyon Airport |
9/7 DL 85 3:20pm CDG |
Will need a car on September 7th
early morning or September 6th. |
David Abramson, U of Queensland
Energy Efficiency Modeling of Parallel Applications
Abstract: Energy efficiency has become increasingly important in high performance computing (HPC), as power constraints and costs escalate. Workload and system characteristics form a complex optimization search space in which optimal settings for energy efficiency and performance often diverge. Thus, we must identify trade-off options for performance and energy efficiency to find the desired balance between them. We present an innovative statistical model that accurately predicts the Pareto optimal performance and energy efficiency trade-off options using only user-controllable parameters. Our approach can also tolerate both measurement and model errors. We study model training and validation using several HPC kernels, then explore the feasibility of applying the model to more complex workloads, including AMG and LAMMPS. We can calibrate an accurate model from as few as 12 runs, with prediction error of less than 10%. Our results identify trade-off options allowing up to 40% improvement in energy efficiency at the cost of under 20% performance loss. For AMG, we reduce the required sample measurement time from 13 hours to 74 minutes (about 90%).
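To make the notion of Pareto-optimal trade-off options concrete, the sketch below (in C, with purely hypothetical configurations and measurements, not the statistical model from the talk) filters a set of measured (runtime, energy) points down to the configurations that are not dominated in both metrics.

/* Illustrative sketch: keep only the Pareto-optimal (runtime, energy)
 * configurations from a set of measured runs. The talk's model predicts
 * this frontier from user-controllable parameters instead of enumerating
 * measurements; all values below are hypothetical. */
#include <stdio.h>

typedef struct { const char *config; double time_s; double energy_j; } run_t;

/* run i is dominated if some run j is no worse in both metrics and
 * strictly better in at least one */
static int dominated(const run_t *r, int n, int i) {
    for (int j = 0; j < n; ++j) {
        if (j == i) continue;
        if (r[j].time_s <= r[i].time_s && r[j].energy_j <= r[i].energy_j &&
            (r[j].time_s < r[i].time_s || r[j].energy_j < r[i].energy_j))
            return 1;
    }
    return 0;
}

int main(void) {
    run_t runs[] = {                     /* hypothetical settings */
        {"2.4GHz/32t", 100.0, 5000.0},
        {"2.0GHz/32t", 115.0, 4300.0},
        {"2.4GHz/16t", 140.0, 4600.0},   /* dominated by the last entry */
        {"1.6GHz/32t", 135.0, 4100.0},
    };
    int n = (int)(sizeof runs / sizeof runs[0]);
    puts("Pareto-optimal trade-off options:");
    for (int i = 0; i < n; ++i)
        if (!dominated(runs, n, i))
            printf("  %-12s %6.1f s  %7.1f J\n",
                   runs[i].config, runs[i].time_s, runs[i].energy_j);
    return 0;
}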
Ilkay
Altintas, UCSD
Collaborative Workflow-Driven Science in a Rapidly Evolving
Cyberinfrastructure Ecosystem
ABSTRACT: Scientific workflows are powerful tools for computational data
scientists to perform scalable experiments, often composed of complex tasks and
algorithms distributed on a potentially heterogeneous set of
resources. Existing cyberinfrastructure provides powerful components that
can be utilized as building blocks within workflows to translate the newest
advances into impactful repeatable solutions that can execute at scale.
However, any workflow development activity today depends on the effective
collaboration and communication of a multi-disciplinary data science team, not
only with humans but also with analytical systems and infrastructure. Dynamic,
predictable and programmable interfaces to systems and scalable infrastructure
are key to building effective systems that can bridge the exploratory and
scalable activities in the scientific process. This talk will focus on our
recent work on the development of methodologies and tools for effective
workflow-driven collaborations, namely the PPoDS methodology and family of
SmartFlows tools for the practice and smart utilization of workflows.
Hartwig Anzt, Karlsruher Institut für Technologie
Towards a modular
precision ecosystem
Abstract: Over the last years, we have observed a growing mismatch between the
arithmetic performance of processors in terms of the number of floating point operations
per second (FLOPS) on the one side, and the memory performance in terms of how
fast data can be brought into the computational elements (memory bandwidth) on
the other side. As a result, more and more applications can utilize only a
fraction of the available compute power as they are waiting for the required
data. With memory operations being the primary energy consumer, data access is
pivotal also in the resource balance and the battery life of mobile
devices. In this talk, I will introduce a disruptive paradigm change with
respect to how scientific data is stored and processed in computing
applications. The goal is to 1) radically decouple the data storage format from
the processing format; 2) design a "modular precision ecosystem" that
allows for more flexibility in terms of customized data access; 3) develop
algorithms and applications that dynamically adapt data access accuracy to the
numerical requirements.
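As a concrete illustration of point 1) above, the small C sketch below (not the speaker's software, just an assumed minimal example) keeps a vector in single precision in memory, halving the memory traffic, while the arithmetic accumulates in double precision; a full modular precision ecosystem would additionally adapt the storage accuracy at runtime.

/* Illustrative sketch: decouple the storage format (float) from the
 * processing format (double). All sizes and values are arbitrary. */
#include <stdio.h>
#include <stdlib.h>

static double dot_mixed(const float *x, const float *y, size_t n) {
    double acc = 0.0;                   /* process in double precision */
    for (size_t i = 0; i < n; ++i)
        acc += (double)x[i] * (double)y[i];
    return acc;
}

int main(void) {
    size_t n = 1u << 20;
    float *x = malloc(n * sizeof *x);   /* ... but store in single     */
    float *y = malloc(n * sizeof *y);
    if (!x || !y) return 1;
    for (size_t i = 0; i < n; ++i) { x[i] = 1.0f / (float)(i + 1); y[i] = 1.0f; }
    printf("dot = %.12f (vectors stored in %zu MB instead of %zu MB)\n",
           dot_mixed(x, y, n),
           2 * n * sizeof(float) / (1u << 20),
           2 * n * sizeof(double) / (1u << 20));
    free(x); free(y);
    return 0;
}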
Dorian
Arnold, Emory U
Big
Deal, Little Deal or No Deal? The Realities of the HPC Resilience Challenge
Abstract: Considering fault-tolerant distributed systems, conceptual works date back to at least the 1960s, and practical software and hardware systems began to appear in the early 1970s. Today, fault-tolerance or resilience is stated as one of the major challenges to realizing exascale computational capabilities. Yet, there are widely varying perspectives on the extent to which fault-tolerance is or will be an impediment. In this talk, we briefly survey landmark hardware and software technologies from the early days to the present to posit answers to the questions: Should we be worried about fault-tolerance? If so, how much, and specifically what about?
Rosa
M Badia, Barcelona Supercomputing Center
TANGO: How to dance with multiple task-based programming
approaches
Abstract: In the EU-funded project TANGO, BSC
has been integrating two instances of task-based programming models:
COMPSs and OmpSs. The combination of both is very interesting, since it enables
parallelizing applications at the task level on distributed computing platforms
(including clouds) through COMPSs and exploiting finer-level parallelism by
offloading OmpSs tasks to GPUs and FPGAs. Additionally, a new elasticity
concept for large clusters has been integrated into COMPSs. The talk will
introduce the TANGO programming model and will illustrate its application with
several use cases, from HPC to embedded areas.
Pete
Beckman, ANL
The Tortoise and the Hare: Is there still time for
HPC to catch up in the performance race?
Abstract: Speed
and scale define supercomputing. By some metrics, our supercomputers are
the fastest, most capable systems on the planet. However over the last
twenty years, the HPC community has become overconfident. Instead of leading
the race for new architectures, methods, and software stacks, we pride
ourselves on uptime, reliability, and the performance of a handful of hero
computations. For many HPC deployments, lowering risk is more important
than sprinting ahead. Has the cloud computing community already won the
race? Can HPC regain leadership?
Anne
Benoit, ENS Lyon, France
Co-scheduling HPC workloads on
cache-partitioned CMP platforms
Abstract: Co-scheduling techniques are used to improve the
throughput of applications on chip multiprocessors (CMP), but sharing resources
often generates critical interferences. We focus on the interferences in the
last level of cache (LLC) and use the Cache Allocation Technology (CAT)
recently provided by Intel to partition the LLC and give each co-scheduled
application its own cache area. We consider m iterative HPC
applications running concurrently and answer the following questions: (i) how
to precisely model the behavior of these applications on the cache-partitioned
platform, and (ii) how many cores and cache fractions should be assigned to
each application to maximize the platform efficiency? Here, platform efficiency
is defined as maximizing the performance either globally, or as guaranteeing a
fixed ratio of iterations per second for each application. Through extensive
experiments using CAT, we demonstrate the impact of cache partitioning when
multiple HPC applications are co-scheduled onto CMP platforms.
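For readers unfamiliar with CAT, the sketch below shows one way a cache area can be assigned to a co-scheduled application on Linux, assuming the resctrl interface is mounted at /sys/fs/resctrl; the group name, the L3 way mask and the PID are hypothetical, and the exact schemata syntax depends on the kernel and processor, so this is an assumed minimal example rather than the experimental setup used in the talk.

/* Illustrative sketch: give one application its own L3 slice via the Linux
 * resctrl interface to Intel CAT (requires root and a mounted resctrl fs).
 * Group name, way mask and PID below are hypothetical. */
#include <stdio.h>
#include <sys/stat.h>

static int write_str(const char *path, const char *s) {
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); return -1; }
    fprintf(f, "%s\n", s);
    return fclose(f);
}

int main(void) {
    /* 1. create a resource group for application A */
    mkdir("/sys/fs/resctrl/appA", 0755);
    /* 2. restrict the group to the low 8 ways of the L3 on socket 0
     *    (mask syntax may differ; see the kernel resctrl documentation) */
    write_str("/sys/fs/resctrl/appA/schemata", "L3:0=00ff");
    /* 3. move the application's PID into the group */
    write_str("/sys/fs/resctrl/appA/tasks", "12345");
    return 0;
}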
Ken
Birman, Cornell U
High Performance State Machine Replication for HPC:
Bridging two worlds
Abstract: Our new Derecho system shows that by leveraging HPC
hardware (such as RDMA or Intel OMNI Path), state machine replication can run
at stunning speeds and scale. Derecho's main target is to support a new
kind of edge computing with massive data rates and demanding real-time response
requirements, a need also seen in many of today's most exciting HPC
settings. Indeed, many cloud-edge uses are basically HPC scenarios.
Meanwhile, the HPC community has long struggled with issues of fault-tolerance
for very large computations. Can high performance state machine
replication bridge the two worlds?
George
Bosilca, UTK
MPI as perceived
by the ECP community
Abstract: The Exascale Computing Project (ECP) is currently
the primary effort in the United States focused on developing "exascale" levels
of computing capabilities, including hardware, software and applications. In
order to obtain a more thorough understanding of how the software projects
under the ECP are using, and planning to use the Message Passing Interface
(MPI), and help guide the work of our own project within the ECP, we created a
survey. This talk presents some results of the survey, providing a picture of
MPI capabilities as perceived by some of its "power users".
Bill
Brantley, AMD
AMD HPC Server Update
Abstract: AMD
HPC products have changed a great deal since the last CCDSC. I will give
a very brief overview of current CPU and GPU server products, as well as the
public information available so far about 2019 products.
Ron
Brightwell, Sandia Labs
Resource Management in the Era of Extreme Heterogeneity
Abstract: Future HPC systems will be characterized by a large number and
variety of complex, interacting components including processing units,
accelerators, deep memory hierarchies, multiple interconnects, and alternative
storage technologies. In addition to extreme hardware diversity, there is a
broadening community of computational scientists using HPC as a tool to address
challenging problems. Systems will be expected to efficiently support a wider
variety of applications, including not only traditional HPC modeling and
simulation codes, but also data analytics and machine learning workloads. This
era of extreme heterogeneity creates several challenges that will need to be
addressed to enable future systems to be effective tools for enabling
scientific discovery. A recent DOE/ASCR workshop discussed these challenges and
potential research directions to address them. In this talk, I will give my
perspective on the resource management challenges and approaches stemming from
extreme heterogeneity and offer my views on the most important system software
capabilities that will need to be explored to meet these challenges.
Franck
Cappello, ANL
Frontiers of lossy compression for
scientific data
Abstract:
Lossy compression is becoming popular for scientific data because of the need
to reduce scientific data significantly. However, while the application of
lossy compression is well understood in many domains (audio, video, image), it
opens many questions when the data is produced and consumed by scientific
simulations. In this talk, we explore three frontiers of lossy compression for
scientific data: (i) the compression algorithms, (ii) the application of lossy
compression for scientific simulation and (iii) the methodology to evaluate,
compare and assess the impacts of lossy compression.
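On frontier (i), a common building block of error-bounded lossy compressors is quantization with a user-set absolute error bound; the hedged sketch below shows only that ingredient (production compressors add prediction and entropy coding, which are not reproduced here), with made-up sample values.

/* Illustrative sketch: error-bounded quantization. Each value is mapped to
 * an integer bin of width 2*err_bound, so the reconstruction error never
 * exceeds err_bound. Compile with -lm. */
#include <math.h>
#include <stdio.h>

int main(void) {
    const double err_bound = 1e-3;                      /* absolute bound */
    const double data[] = {0.0012, 0.4731, 0.4739, 1.2345};
    const int n = (int)(sizeof data / sizeof data[0]);

    for (int i = 0; i < n; ++i) {
        long bin = lround(data[i] / (2.0 * err_bound)); /* quantize       */
        double rec = bin * 2.0 * err_bound;             /* reconstruct    */
        printf("%+.4f -> bin %+5ld -> %+.4f (|err| = %.2e)\n",
               data[i], bin, rec, fabs(data[i] - rec));
    }
    return 0;
}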
Barbara
Chapman, SUNY Stony Brook
Compiler Optimizations for Parallelism and Locality on
Emerging Hardware
Abstract: Pre-exascale computing
systems are already giving us insight into the level of complexity that
next-generation HPC architectures will entail. In order to enable their exploitation,
and strive to meet the expectations of application developers, intra-node
programming interfaces may provide constructs that enable code to explicitly
use new architectural features. On
the other hand, approaches are also needed that reduce the level of effort
required to port codes to a potentially diverse array of computers.
The feature set of the OpenMP
API is being enhanced in its 5.0 specification, now available in a draft form,
to address these challenges. Its implementation technology will also need to be
extended to meet new challenges. In
this talk, we describe some of the ways in which we are working in the
ECP-funded SOLLVE project to implement anticipated new features and enhance the
state of the art in OpenMP implementations.
Alok
N. Choudhary, Northwestern University
The
Ultimate Self-Driving Machine
Abstract: HPC, ML, data mining, IOT, and control systems
technologies among others have played a central role in advancing the cause of
self-driving vehicles. This talk is not about the technologies. The talk will
explore the possible impact on business and society and potential
transformation that may occur or may be required if and when self-driving
vehicles become a reality.
Jonathan Churchhill, STFC
Managing Mismatched Network Interface Performance In Multi
Terabit Converged Ethernet Software Defined Storage
Abstract: In this talk I will
discuss some of the issues we've encountered with mixing 10/25/40/50/100Gb
equipped compute and storage servers in our very large multi-Terabit converged
Ethernet network that is the heart of JASMIN. This is in the context of our
move from traditional parallel file systems to similarly high performance
software defined object storage.
Joe Curley, Intel
Optimizing Deep
Learning on General Purpose Hardware – Part 2
Ewa
Deelman, ISI
Building a Cyberinfrastructure Community
Abstract: This talk will examine the opportunities of building a community around cyberinfrastructure design and deployment in scientific projects. It will examine what types of capabilities can be shared across large cyberinfrastructure projects. It will ask how such a community can be built and how it can be sustained over time.
Luiz
DeRose, Cray
Scaling DL Training Workloads with the Cray PE Plugin
Abstract: Deep Learning with convolutional neural networks is emerging as a powerful tool for
analyzing complex datasets through classification, prediction and regression.
Neural networks can also be trained to produce datasets in scenarios
traditionally addressed with simulation and at significantly lower
computational cost. However, training neural networks is a computationally
intensive workload with training times measured in days or weeks on a single
server or node. Thus, the computational resources needed to train sufficiently
complex networks can limit the use of Deep Learning in production. High
Performance Computing, in particular efficient scaling to large numbers of
nodes, is ideal for addressing this problem. In this talk I will present the
Cray Programming Environments Deep Learning Scalability Plugin, a portable
solution for high performance scaling of deep learning frameworks.
Frederic Desprez, INRIA
SILECS: Super Infrastructure for Large-scale Experimental
Computer Science
Abstract: SILECS, based on two existing infrastructures (FIT and Grid'5000),
aims to provide a large robust, trustable and scalable instrument for research
in distributed computing and networks. Experiments from the Internet of Things,
data centers, cloud computing, security services, and the networks connecting
them will be possible, in a reproducible way, on various hardware and software.
This instrument will offer a multi-platform experimental infrastructure (HPC,
Cloud, Big Data, Software Defined Storage, IoT, wireless, Software Defined
Network / Radio) capable of exploring the
infrastructures that will be deployed tomorrow and assist researchers and
industry in how to design, build and operate a multi-scale, robust and
safe computer system. Diverse digital resources (compute, storage, link,
IO devices) are assembled to support a "playground" at scale.
Nicola
Ferrier, ANL
Computing at the Edge
Ian
Foster, U of Chicago and ANL
Learning Systems for Science
Abstract: New learning
technologies seem likely to transform much of science, as they are already
doing for many areas of industry and society. We can expect these technologies
to be used, for example, to obtain new insights from massive scientific data and
to automate research processes. However, success in such endeavors will require
new learning systems: scientific computing platforms, methods, and software
that enable the large-scale application of learning technologies. These systems
will need to enable learning from extremely large quantities of data; the
management of large and complex data, models, and workflows; and the delivery
of learning capabilities to many thousands of scientists. In this talk, I
review these challenges and opportunities and describe systems that my
colleagues and I are developing to enable the application of learning
throughout the research process, from data acquisition to analysis.
Geoffrey
Fox, Indiana University
AI-Driven Science and Engineering with the
Global AI Supercomputer
Abstract:
Most things are dominated by Artificial Intelligence (AI). Technology Companies
like Amazon, Google, Facebook, and Microsoft are AI First organizations.
Engineering achievement today is highlighted by the AI buried in a vehicle or machine.
Industry (Manufacturing) 4.0 focuses on the AI-Driven future of the
Industrial Internet of Things. Software is eating the world. We can describe
much computer systems work as designing, building and using the Global AI
supercomputer which itself is autonomously tuned by AI. We suggest that this is
not just a bunch of buzzwords but has profound significance and examine
consequences of this for education and research. Naively, high-performance
computing should be relevant for the AI supercomputer but somehow the corporate
juggernaut is not making so much use of it. We discuss how to change this.
Haohuan
Fu, Tsinghua University
Extreme-Scale Earthquake Simulation on Sunway TaihuLight
Abstract: This talk will first
introduce and discuss the design philosophy about the Sunway TaihuLight system,
and then describe our recent efforts on performing earthquake simulations on
such a large-scale system. Our work in 2017 accomplished a complete redesign of
AWP-ODC for Sunway architectures, achieving over 15% of the system's peak,
better than the 11.8% achieved by similar software running on Titan, whose
byte-to-flop ratio is 5 times better than TaihuLight's. The extreme cases
demonstrate a sustained performance of over 18.9 Pflops, enabling the
simulation of the Tangshan earthquake as an 18-Hz scenario with an 8-meter
resolution. Our recent work further improves the simulation framework with
capabilities to describe complex surface topography, and to drive building
damage prediction and landslide simulation, which are demonstrated with a case
study of the Wenchuan earthquake with accurate surface topography and improved
coda wave effects.
Al
Geist, ORNL
Latest results from Summit – the new #1 system
on the TOP500
Abstract:
In June 2018 the Summit system at the Oak Ridge Leadership Computing Facility
became the new #1 system on the TOP500 at 122 PF. This talk will describe the
design of this system, the complex three
lab collaboration used in its development, and the latest application results from Summit. The design
of Summit provides a very green
computer that is able to do traditional high performance computing as well as
machine learning and data analytics. Summit's peak double-precision performance
is 200 PF, but more amazing is that its peak machine learning capability is
over 3 exaops. This talk will describe a data analytics application that has
already achieved 1.9 exaops on
Summit.
Rich
Graham, Mellanox
The Network's Role in
the Large-Scale Computational Eco-System
Abstract: As the volume of data transfers
increases, the opportunities to manipulate data in flight increase, providing
an opportunity to increase overall system efficiency and application
performance. This presentation will present several capabilities Mellanox
Technologies has introduced to support in-network computing, and describe
improvements that result from these. Technologies such as SHARP, MPI
hardware tag matching, and UMR will be discussed.
Andrew
Grimshaw, U of Virginia
Timing is Everything:
The CCC as an Alternative to Commercial Clouds
Abstract: "Grid computing is where I give you access to my resources and get nothing in return." - A Skeptical Resource Administrator.
Wide-area, federated, compute-sharing systems (such as Condor, gLite, Globus, and Legion) have been around for over twenty years. Outside of particular domains such as physics, these systems have not been widely adopted. Recently, however, universities are starting to propose and join resource-sharing platforms. Why this sudden change?
Mostly, this change has come in response to cost concerns. HPC managers are under new pressure from university administrators who demand that infrastructure outlays be economically justified. "Why not just put it all on Amazon?" goes the administration's refrain. In response, HPC managers have begun to document the true cost of university-, department-, and research-group-owned infrastructure, thus enabling a legitimate cost comparison with Amazon or Azure. Additionally, it may be noted, this pressure to consider outsourcing computing infrastructure has legitimized both remote computing and paying for computation.
In this talk I will briefly describe the Campus Compute Cooperative (CCC). I will then detail both the results of our market simulations and the take-aways from interviews with stakeholders. By both of these measures, the CCC is valuable and viable: first, the simulation results clearly show the gains in institutional value; second, stakeholders indicated that many institutions are open to trading resources. Most promisingly, some institutions expressed interest in selling resources and others expressed willingness to pay.
William
Gropp, University of Illinois at Urbana-Champaign
Managing Code Transformations for Better
Performance Portability
Abstract:
With the end of Dennard Scaling, performance has depended on innovations in
processor architecture. While these innovations have allowed per chip performance
to continue to increase, they have made it increasingly difficult to write and
maintain high performance code. Many different approaches to this problem have
been tried, including enhancements to existing languages, new programming languages,
libraries, tools, and even general techniques.
I
will discuss the Illinois Coding Environment (ICE), which is used to provide
code transformations for the primary code used by the Center for the Exascale
Simulation of Plasma-Coupled Combustion. ICE is an example of an approach that
uses annotations to an existing language to provide additional information that
can guide performance optimizations, and uses a framework that can invoke
third-party tools to apply performance enhancing transformations.
Mary
Hall, U Utah
Mainstreaming Autotuning Compilers for
Performance Portability: What will it Take?
Abstract:
We describe research on mainstreaming autotuning compiler technology, whereby
the compiler automatically explores a search space of alternative
implementations of a computation to find the best implementation for a target
architecture. Autotuning has demonstrated success in achieving
performance portability as it enables the compiler to tailor optimization and
code generation to a specific architectural context, starting from the same
high-level program specification. Still, mainstream adoption requires
availability in widely-used compilers and demonstrated impact on production
application codes while under development. This talk will highlight an example
of the impact of autotuning compiler technology, recent work on a brick data
layout and associated code generator for stencil computations that uses
fine-grained data blocking as a tunable abstraction for performance portability
across CPUs and GPUs. It also will describe research on migrating
autotuning technology into Clang/LLVM to support autotuning of OpenMP and
complex loop transformation sequences.
Tony Hey,
Science and Technology Facilities Council, UK
Machine Learning and Big Scientific Data Benchmarks
Abstract: This
talk will review the challenges posed by the growth of experimental data
generated by the new generation of large-scale experiments at UK national
facilities such as the Diamond Synchrotron at the Rutherford Appleton
Laboratory site at Harwell near Oxford. Increasingly, scientists now need to
use sophisticated machine learning and other AI technologies to automate parts
of the data pipeline and to find new scientific discoveries in the deluge of
experimental data. In industry, Deep Learning is now transforming many areas of
computing, and researchers are now exploring its use in analyzing their 'Big
Scientific Data'. The talk will
include a discussion about the creation of a set of Big Scientific Data Machine
Learning 'benchmarks' for exploring the use of these technologies in the
analysis of experimental research data. Such benchmarks could also be important
in providing new research insights into the robustness and transparency of
these algorithms.
Torsten
Hoefler, ETH Zurich
Quantum Computing from an HPC System's
Perspective
Abstract: Quantum computation may be a big paradigm shift in the next century.
Yet, the specifics of how computations happen are subtle and entangled with
quantum mechanical concepts. The situation is further confused by many popular-science
and real-science misconceptions of the basic concepts. This talk tries to
provide an as-intuitive-as-possible view of the field and its challenges from a
computer systems perspective.
Minh Quan Ho, University Grenoble
Standard libraries on non-standard processors
Abstract: The potential of non-conventional many-core processors is clear for
future HPC and AI platforms. However, the difficulty of developing
the standard software stack on those architectures deters
system and application developers. In this talk, we present our
approaches to porting and optimizing BLAS and FFT libraries on
the MPPA processor - a DMA-based many-core architecture - while
keeping a minimal footprint in a memory-constrained environment.
Heike
Jagode, UTK
PAPI's new Software-Defined Events for in-depth Performance
Analysis.
Abstract: One of the most recent developments of the Performance API (PAPI) is
the addition of Software-Defined Events (SDE). PAPI has successfully served the
role of the abstraction and unification layer for hardware performance counters
for over a decade. This talk presents our effort to extend this role to
encompass performance critical information that does not originate in hardware,
but rather in critical software layers, such as libraries and runtime systems.
Our overall objective is to enable monitoring of both types of performance
events, hardware- and software-related events, in a uniform way, through one
consistent PAPI interface. Performance analysts will be able to form a complete
picture of the entire application performance without learning new
instrumentation primitives. In this talk, we outline PAPI's new SDE API and
showcase the usefulness of SDE through its employment in software layers as
diverse as the math library MAGMA, the dataflow runtime PaRSEC, and the
state-of-the-art chemistry application NWChem. We outline the process of
instrumenting these software packages and highlight the performance information
that can be acquired with SDEs.
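The uniform-interface idea can be illustrated with a short, hedged sketch: once a library has registered a software-defined event, an analyst reads it through the same PAPI calls used for hardware counters. The event name below is hypothetical; the names actually available depend on the instrumented library (MAGMA, PaRSEC, NWChem, ...).

/* Illustrative sketch: reading a (hypothetical) software-defined event with
 * the standard PAPI event-set interface. */
#include <stdio.h>
#include <papi.h>

int main(void) {
    int evset = PAPI_NULL;
    long long value = 0;

    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT) return 1;
    PAPI_create_eventset(&evset);

    /* hypothetical event name; real names come from the instrumented library */
    if (PAPI_add_named_event(evset, "sde:::EXAMPLE_LIB::iterations") != PAPI_OK) {
        fprintf(stderr, "event not available in this build\n");
        return 1;
    }

    PAPI_start(evset);
    /* ... call into the instrumented library here ... */
    PAPI_stop(evset, &value);

    printf("software-defined event value: %lld\n", value);
    return 0;
}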
Emmanuel
Jeannot, INRIA Bordeaux
Process Placement from Monitoring to Data Analysis
Abstract: In this talk we will review the complete chain of topology-aware process mapping: gathering the topology, monitoring the application, mapping processes, and analyzing the results. We will review the latest advances in this research field by my group (MPI monitoring, Hwloc, TreeMatch, etc.).
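One standard MPI mechanism that such a tool chain can feed is shown in the hedged sketch below: the measured communication pattern is handed to the MPI runtime as a distributed graph topology with reordering allowed, so the library may place heavily communicating ranks close to each other. This is not TreeMatch itself; the ring pattern and edge weights stand in for real monitoring output.

/* Illustrative sketch: let MPI reorder ranks according to a communication
 * pattern (here a simple ring with hypothetical edge weights). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* each rank mostly talks to its two ring neighbours */
    int neigh[2]   = {(rank + size - 1) % size, (rank + 1) % size};
    int weights[2] = {100, 100};        /* e.g., message counts from monitoring */

    MPI_Comm topo;
    MPI_Dist_graph_create_adjacent(MPI_COMM_WORLD,
                                   2, neigh, weights,     /* incoming edges */
                                   2, neigh, weights,     /* outgoing edges */
                                   MPI_INFO_NULL, 1 /* allow reordering */, &topo);
    int newrank;
    MPI_Comm_rank(topo, &newrank);
    printf("old rank %d -> placed rank %d\n", rank, newrank);

    MPI_Comm_free(&topo);
    MPI_Finalize();
    return 0;
}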
Carl Kesselman, ISI
Computation as an
Experimental Science
Laurent
Lefevre, Inria
Building and exploiting the table of energy and power
leverage for energy efficient large scale HPC systems
Abstract: Large scale distributed
systems and supercomputers consume huge amounts of energy.
To address this issue, a set of hardware and software capabilities
and techniques (leverages) exist to modify power
and energy consumption in large scale systems.
Discovering, benchmarking and efficiently exploiting such
leverages remains a real challenge for most users. This talk will address
the building of the table of energy and
power leverages and will present how to exploit it for energy efficient
systems.
Laércio
Lima Pilla
Decoupling
schedulers from runtime systems for increased reuse and portability
Abstract: Global schedulers are components used in parallel
solutions, especially in dynamic applications, to optimize resource usage. Nonetheless,
their development is a cumbersome process due to necessary adaptations to cope
with the programming interfaces and abstractions of runtime systems. This
presentation will focus on our model to dissociate schedulers from runtime
systems in order to lower software complexity. Our model is based on the
scheduler breakdown into modular and reusable concepts that better express the
scheduler requirements. Through the use of meta-programming and design
patterns, we are able to achieve fully reusable workload-aware scheduling
strategies with increased reuse, fewer lines of code for the algorithms, and
negligible run time overhead.
Bernd Mohr, Juelich Supercomputing Centre, Germany
On the ROI of Parallel Performance Optimization
Abstract:
Developers of HPC applications can count on free advice from European experts
to analyse the performance of their scientific codes. The Performance
Optimization and Productivity (POP) Centre of Excellence, funded by the
European Commission under H2020, ran from October 2015 to March 2018. The POP
Centre of Excellence gathered together experts from BSC, JSC, HLRS, RWTH Aachen
University, NAG and Ter@tec. The objective of POP was to provide performance
measurement and analysis services to
the industrial and academic HPC community, help them to better understand the
performance behaviour of their codes and suggest
improvements to increase their efficiency. Training and user education
regarding application tuning was also provided. Further information can be
found at http://www.pop-coe.eu/. The talk will give an overview of the POP Centre of Excellence and
describe the common performance assessment strategy and metrics developed and
defined by the project partners. The presentation will close with the success
stories and reports from the over 150 performance assessments performed during
the project.
Dimitrios Nikolopoulos, Queen's University Belfast
Realistic fault injection and analysis for Exascale systems
Abstract: We explore compiler-based
tools for accelerating resilience studies on Exascale systems. We look into how
these tools can achieve accurate fault injection compared to binary-level
tools, how they can handle multithreaded and parallel code, and how they can be
scaled to conduct realistic fault resilience analysis campaigns.
Christian Obrecht, Centre for
Energy and Thermal Sciences of Lyon (CETHIL)
National Institute of Applied Sciences of Lyon (INSA Lyon)
Building simulation: an illusion?
Abstract: Global warming mitigation through the reduction of greenhouse gas
emission requires a drastic decrease of our energy consumption. In this
perspective, residential buildings represent one of the largest potential
sources of energy savings. However, harnessing this resource will require
considerable advancement in terms of building design. In this presentation, we
will first focus on the computational challenges of designing energy efficient
buildings and on why current practices in building simulation are inadequate.
Secondly, we will address the computational effort needed for accurate building
simulation and evaluate to which extent it is economically viable. To conclude,
we will give some insights on how recent advances in computational sciences
such as deep learning could help in designing more efficient buildings.
Phil Papadopoulos, UC Irvine
"Virtualization is the answer. What was the
question?"
Abstract: 10 years ago
virtualization in High-Performance computing was a
"non-starter" in the community. This talk will take an
abbreviated tour of the short history of virtualization in HPC starting with
para-virtualization (of the nearly dead Xen project), going through full-system
(KVM as the exemplar), high-performance virtual clusters (enabled fundamentally
by SRIOV) and finally to containers. There is demonstrated success for virtual
clusters on SDSC's Comet cluster where, as unseen
infrastructure, they facilitated some recent key science discoveries.
Virtualized systems bring more software control to the end user, but this can
exact some significant, but hidden, costs. As more users want to
"containerize" their applications there are open questions about how
container orchestration engines like Kubernetes fit within the landscape.
We'll use the Pacific Research Platform as a motivator for some possible
directions while illuminating some dark corners.
Manish Parashar, NSF/Rutgers University
Enabling
Data-Driven Edge/Cloud Application Workflows
Abstract: The proliferation of edge devices and associated data streams are
enabling new classes of dynamic data driven applications. However, processing
these data streams in a robust, effective and timely manner as part of
application workflows presents challenges that are not addressed by current
stream programming frameworks. In this talk, I will present R-Pulsar, a unified
cloud and edge data processing platform that extends the serverless computing
model to the edge to enable streaming data analytics across cloud and edge
resources in a location, content and resource aware manner. R-Pulsar has been
deployed on edge devices and is being used to support disaster recovery
workflows. This research is part of the Computing in the Continuum project at
the Rutgers Discovery Informatics Institute.
Judy Qiu, Indiana University
Real-Time Anomaly Detection from Edge to HPC-Cloud
Abstract: Detection of anomalies in real-time streaming data is of significant importance to a wide variety of application domains. These domains require high-performance analytics and prediction that give actionable information in critical scenarios such as racing cars, autonomous vehicles, medical condition monitoring, security detection, nanoparticle interactions, and fusion reactions. In the Indianapolis 500 motor racing event, telemetry data is observed sequentially and gathered from multiple vehicles at the edge of networks, and then stored in a MongoDB database. To enable car simulators and analytics on-the-fly, we leverage a novel HPC-Cloud convergence framework named Harp-DAAL and demonstrate that the combination of Big Data and HPC techniques can simultaneously achieve productivity and performance. We show how simulations and Big Data analytics can use common programming environments with a runtime based on a rich set of collectives and libraries.
Padma Raghavan, Vanderbilt University
Rethinking the
Computational Complexity and Efficiency in the Age of "Big Data"
Abstract: "Big data" sets are here, and as they continue to get bigger, there is an important and growing "small data" challenge, namely the energy costs of moving small numbers of bits and bytes within the hardware. This will impact high performance computing disproportionately, as there is higher susceptibility to hardware errors while single-thread performance is not improving, despite the multi-megawatt power consumption of even modest-sized systems which have multi-million-way thread parallelism. We need to rethink how we seek to optimize computational performance and resiliency, starting with the key measure, namely the computational complexity and efficiency of an algorithm, which has traditionally concerned the number of calculations. I will provide some illustrative examples drawn from sparse computations, where the number of data elements moved per operation is high for traditional algorithms, with a view to informing alternative approaches that could potentially increase performance, energy-efficiency and resiliency.
Yves
Robert, ENS Lyon
A Little Scheduling
Problem
Abstract: The talk addresses a
little scheduling problem related to these large HPC platforms
that we love to play with.
Participant's brief biography: Yves Robert has attended all CCDSC meetings but
one. That says it all!
In addition, please send me your arrival and departure information:
- driving my own car
- arriving on Tuesday around 5:30pm and leaving on Friday afternoon
- will likely have some passengers from LIP (Anne, Frédéric, ...)
Rob Ross,
ANL
Versatile Data Services
for Computational Science
Abstract: On the data management side of HPC, the adoption of new services
over the past decade has been slow. Globally-available parallel file
systems still dominate the scene, despite the availability and
success of alternatives outside the HPC community. At the same time,
the approach of composing codes from multiple coordinating components
is having great success in other areas of computational science. In
this presentation we motivate the composition of data services
from reusable components and describe our efforts in this direction under
the Mochi project (https://www.mcs.anl.gov/research/projects/mochi/). Mochi aims to provide
the tools for an ecosystem of specialized data services for HPC. We
will discuss the approach and our rationale, components built
to date, and describe some motivating computational science use
cases.
Joel Saltz, SUNY Stony Brook
Integrative Everything, Deep Learning and Streaming Data
Abstract: The need to label information and segment regions in individual sensor data sources, and to create syntheses from multiple disparate data sources, spans many areas of science, biomedicine and technology. The rapid evolution in sensor technologies – from digital microscopes to UAVs – drives requirements in this area. I will describe a variety of use cases and technical challenges, as well as tools, algorithms and techniques developed by our group and collaborators.
Steve Scalpone, PGI/Nvidia
The F18 Fortran Compiler
F18 is an all-new open source
Fortran compiler infrastructure. It is being developed as part of the
Flang project, a collaboration between NVIDIA, the US Dept of Energy and a
growing community of contributors to create a Fortran front-end for LLVM.
F18 is written in C++ in the style of Clang and LLVM, is designed to be integrated
with LLVM code generation, and is designed to facilitate ready development of
language extensions and Fortran-related tools. F18 source code is
available under the LLVM and Apache licenses, so it's friendly for both
research and commercial use. F18 is designed for a future in which all
proposed extensions to Fortran and its related directives can be implemented
and proven out prior to formal adoption in a standard or specification, where
Fortran language features, extensions, pragmas and directives are portable
across all HPC platforms, where each processor manufacturer can support their
latest hardware optimizations in an end-to-end Fortran compiler built around
modern software engineering principles, and where researchers and academics can
implement state-of-the-art language and optimization features built on a
production-quality Fortran source base.
Vaidy Sunderam, Emory University
Data driven systems for spatio-temporal applications
Abstract: Data driven systems are rapidly increasing in prevalence,
especially in spatio-temporal domains. Numerous smart devices collect and
report observations, for individual and collective value, but with variable
reliability and potential loss of privacy. We present several research
contributions aimed at addressing: (1) task assignment in crowdsourcing systems
with privacy protection; (2) truth discovery whereby reports or observations
from multiple entities can be fused to improve veracity; and (3) tensor
factorization methods for extracting patterns from spatio-temporal data.
Models, issues, approaches, and preliminary results will be presented.
Martin
Swany, Indiana University
Network Microservices and Edge Computing
Abstract: With proliferating sensor networks and Internet of Things-scale
devices, networks are increasingly diverse and heterogeneous. To enable
the most efficient use of network bandwidth with the lowest possible latency,
we propose InLocus, a stream-oriented architecture situated at (or near) the
network's edge which balances hardware-accelerated performance with the
flexibility of asynchronous software-based control.
Michela
Taufer, University of Tennessee
Modeling Record-and-Replay for Nondeterministic Applications on
Exascale Systems
Abstract: Record-and-replay
(R&R) techniques present an attractive method for mitigating the harmful
aspects of nondeterminism in HPC applications (e.g., numerical
irreproducibility and hampered debugging), but are hamstrung by two problems.
First, there is insufficient understanding of how existing R&R techniques'
cost of recording responds to changes in application communication patterns,
inputs, and other aspects of configuration, and to the degree of concurrency.
Second, current R&R techniques have insufficient ability to exploit
regularities in the communication patterns of individual applications.
To tackle these
problems, it is crucial that the HPC community is equipped with modeling and
simulation methodologies to assess the response of R&R tools, both in terms
of execution time overhead and memory overhead, to changes in the configuration
of the applications they monitor.
To realize effective modeling of the relationship between application
configuration, R&R tool configuration, and the cost of recording, we apply
a fourfold approach. First, we design a general and expressive representation
of executions of the parallel applications that record-and-replay tools target.
Second, we define a rigorous notion of the dissimilarity between multiple
executions of the same nondeterministic application. Third, we implement a method of determining for a particular
event graph the cost of recording that execution given a record-and-replay
tool. Finally, we implement a method of extracting, from the event graphs that
correspond to costly recordings, the components that contribute most to the cost.
In our talk, we describe our approach to each of these four contributions.
This is joint
work with Dylan Chapp and Danny Rudabaugh (UTK), Kento
Sato and Dong Ahn (LLNL).
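To make the notion of recording cost concrete, here is a small, purely illustrative Python sketch (not the authors' model or tooling): an execution is represented as a flat list of communication events, and the estimate simply charges one log entry per nondeterministic receive-from-any event, which is the kind of regularity a record-and-replay cost model would need to capture.

# Toy recording-cost estimate over a simplified event trace (illustrative only).
from collections import defaultdict

def recording_cost(events, bytes_per_entry=16):
    """events: list of (rank, kind, peer) tuples; kind is 'send', 'recv', or 'recv_any'."""
    entries = 0
    per_rank = defaultdict(int)
    for rank, kind, _peer in events:
        if kind == "recv_any":            # nondeterministic match order must be logged
            entries += 1
            per_rank[rank] += 1
    return {"log_entries": entries,
            "memory_bytes": entries * bytes_per_entry,
            "max_entries_per_rank": max(per_rank.values(), default=0)}

if __name__ == "__main__":
    trace = [(0, "send", 1), (1, "recv_any", None), (1, "recv_any", None), (2, "recv", 0)]
    print(recording_cost(trace))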
Jeff Vetter, ORNL
Preparing for
Extreme Heterogeneity in High Performance Computing
Abstract: Concerns about energy-efficiency and cost are forcing our community to reexamine system architectures, including the memory and storage hierarchy. While computing technologies have remained relatively stable for nearly two decades, new architectural features, such as heterogeneous cores, deep memory hierarchies, non-volatile memory (NVM), and near-memory processing, have emerged as possible solutions to address these concerns. However, we expect this 'golden age' of architectural change to lead to extreme heterogeneity and it will have a major impact on software systems and applications. Software will need to be redesigned to exploit these new capabilities and provide some level of performance portability across these diverse architectures. In this talk, I will sample these emerging memory technologies, discuss their architectural and software implications, and describe several new approaches to address these challenges. One system is Papyrus (Parallel Aggregate Persistent Storage); it is a programming system that aggregates NVM from across the system for use as application data structures, such as vectors and key-value stores, while providing performance portability across emerging NVM hierarchies.
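The aggregation idea can be pictured with a deliberately simplified sketch; the Python below is not the Papyrus API, only a toy key-value store that hashes keys across per-node dictionaries standing in for node-local NVM.

# Conceptual sketch: one logical key-value store built from per-node storage (not Papyrus code).
import hashlib

class AggregatedKVStore:
    def __init__(self, node_stores):
        self.node_stores = node_stores            # one dict per node, standing in for local NVM

    def _owner(self, key):
        h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
        return h % len(self.node_stores)          # deterministic key placement across nodes

    def put(self, key, value):
        self.node_stores[self._owner(key)][key] = value

    def get(self, key):
        return self.node_stores[self._owner(key)].get(key)

if __name__ == "__main__":
    store = AggregatedKVStore([{}, {}, {}, {}])   # pretend four nodes contribute local NVM
    store.put("checkpoint/step42", b"payload")
    print(store.get("checkpoint/step42"))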
Frédéric Vivien, Inria
A Generic Approach to
Scheduling and Checkpointing Workflows
Abstract:
This work deals with scheduling and checkpointing strategies to execute
scientific workflows on failure-prone large-scale platforms. To the best of our
knowledge, this work is the first to target fail-stop errors for arbitrary
workflows. Most previous work addresses soft errors, which corrupt the task
being executed by a processor but do not cause the entire memory of that
processor to be lost, unlike fail-stop errors. We revisit classical
mapping heuristics such as HEFT and MinMin and complement them with several
checkpointing strategies. The objective is to derive an efficient trade-off
between checkpointing every task (CkptAll), which is an overkill when failures
are rare events, and checkpointing no task (CkptNone), which induces dramatic
re-execution overhead even when only a few failures strike during execution.
Contrary to previous work, our approach applies to arbitrary workflows, not
just special classes of dependence graphs such as M-SPGs (Minimal
Series-Parallel Graphs). Extensive experiments report significant gains over
both CkptAll and CkptNone, for a wide variety of workflows.
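The CkptAll/CkptNone trade-off can be illustrated with a back-of-the-envelope model that is not the paper's actual strategy: assuming fail-stop failures arrive at an exponential rate lam and a failed task restarts from scratch, the expected completion time of w seconds of work is (exp(lam*w) - 1)/lam, and checkpointing a task pays off when it lowers the expected time of that task plus its downstream work. The hypothetical Python sketch below encodes exactly that comparison.

# Toy decision rule: checkpoint a task only if the expected time saved exceeds the checkpoint cost.
import math

def expected_time_with_restarts(work, lam):
    """Expected time to finish 'work' seconds when any failure forces a restart from scratch."""
    return (math.exp(lam * work) - 1.0) / lam

def should_checkpoint(task_work, downstream_work, ckpt_cost, lam):
    # Without a checkpoint, a failure during downstream work also loses this task's output.
    no_ckpt = expected_time_with_restarts(task_work + downstream_work, lam)
    with_ckpt = (expected_time_with_restarts(task_work, lam) + ckpt_cost
                 + expected_time_with_restarts(downstream_work, lam))
    return with_ckpt < no_ckpt

if __name__ == "__main__":
    lam = 1e-5  # roughly one failure per 28 hours, in seconds
    print(should_checkpoint(task_work=3600, downstream_work=7200, ckpt_cost=60, lam=lam))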
Rich Vuduc, GATech
Algorithm-level control of performance and power tradeoffs
Abstract: I'll discuss a novel technique to control power consumption by tuning
the amount of parallelism that is available during the execution of an
algorithm. The specific algorithm is a tunable variation of delta-stepping for
computing a single-source shortest path (SSSP); its available parallelism is
highly irregular and depends strongly on the input. Informed by an analysis of
these runtime characteristics, we propose a software-based controller that uses
online learning techniques to dynamically tune the available parallelism to
meet a given target, thereby improving the average available parallelism while
reducing its variability. We verify experimentally that this mechanism makes it
possible for the algorithm to "self-tune" the tradeoff between performance and
power. The prototype extends Gunrock's GPU SSSP implementation, and the
experimental apparatus consists of embedded CPU+GPU development boards (NVIDIA
Tegra series), which have separately tunable GPU core and memory frequency
knobs, attached to an external power monitoring device (PowerMon 2). This work
is led by Sara Karamati, a Ph.D. student, and joint with Jeff Young, both at
Georgia Tech.
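As a hedged sketch of the control idea (not the Gunrock-based prototype described above), the Python below implements a simple multiplicative-update controller that nudges the delta-stepping bucket width so that the observed frontier size, a proxy for available parallelism, tracks a target value; every constant here is illustrative.

# Illustrative online controller for the delta parameter of delta-stepping SSSP.
def tune_delta(delta, observed_parallelism, target_parallelism,
               gain=0.25, min_delta=1e-3, max_delta=1e3):
    """Widen buckets when observed parallelism is below target, narrow them when above."""
    error = (target_parallelism - observed_parallelism) / max(target_parallelism, 1)
    new_delta = delta * (1.0 + gain * error)      # proportional multiplicative correction
    return min(max(new_delta, min_delta), max_delta)

if __name__ == "__main__":
    delta, target = 1.0, 4096
    for frontier in [512, 900, 2000, 5000, 8000, 4100]:   # made-up frontier sizes per iteration
        delta = tune_delta(delta, frontier, target)
        print(f"frontier={frontier:5d}  next delta={delta:.3f}")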
David Abramson has
been involved in computer architecture and high performance computing research since
1979. He has held appointments at Griffith University, CSIRO, RMIT and
Monash University. Prior to joining UQ, he was the Director of the Monash
e-Education Centre, Science Director of the Monash e-Research Centre, and a
Professor of Computer Science in the Faculty of Information Technology at
Monash. From 2007 to 2011 he was an Australian Research Council
Professorial Fellow. David has expertise in High Performance Computing,
distributed and parallel computing, computer architecture and software engineering.
He has produced in excess of 200 research publications, and some of his
work has also been integrated in commercial products. One of these, Nimrod, has
been used widely in research and academia globally, and is also available as a
commercial product, called EnFuzion, from Axceleon. His world-leading
work in parallel debugging is sold and marketed by Cray Inc, one of the world's
leading supercomputing vendors, as a product called ccdb. David is a Fellow of
the Association for Computing Machinery (ACM), the Institute of Electrical and
Electronic Engineers (IEEE), the Australian Academy of Technology and
Engineering (ATSE), and the Australian Computer Society (ACS). He is currently
a visiting Professor in the Oxford e-Research Centre at the University of
Oxford.
Ilkay Altintas is the Chief Data Science Officer at the San Diego Supercomputer Center (SDSC), UC San Diego, where she is also the Founder and Director for the Workflows for Data Science Center of Excellence. In her various roles and projects, she leads collaborative multi-disciplinary teams with a research objective to deliver impactful results through making computational data science work more reusable, programmable, scalable and reproducible. Since joining SDSC in 2001, she has been a principal investigator and a technical leader in a wide range of cross-disciplinary projects. Her work has been applied to many scientific and societal domains including bioinformatics, geoinformatics, high-energy physics, multi-scale biomedical science, smart cities, and smart manufacturing. She is a co-initiator of the popular open-source Kepler Scientific Workflow System, and the co-author of publications related to computational data science at the intersection of workflows, provenance, distributed computing, big data, reproducibility, and software modeling in many different application areas.
Hartwig Anzt is
a Helmholtz-Young-Investigator Group leader at the Steinbuch Centre for
Computing at the Karlsruhe Institute of Technology. He obtained his PhD in Mathematics
at the Karlsruhe Institute of Technology, and afterwards joined Jack Dongarra's
Innovative Computing Lab at the University of Tennessee in 2013. Since 2015 he
also holds a Senior Research Scientist position at the University of Tennessee.
Hartwig Anzt has a strong background in numerical mathematics and specializes in
iterative methods and preconditioning techniques for next-generation
hardware architectures. His Helmholtz group on Fixed-point methods for numerics
at Exascale ("FiNE") is granted funding until 2022. Hartwig Anzt has a long
track record of high-quality software development. He is the author of the
MAGMA-sparse open source software package, the managing lead and developer of the
Ginkgo numerical linear algebra library, and part of the US Exascale Computing
Project delivering production-ready numerical linear algebra libraries.
Dorian Arnold is
an associate professor of Computer Science at Emory University with research
interests in operating and distributed systems, fault-tolerance, online
(streaming) data analysis and high-performance software tools. Dorian's
projects target a productive balance of principles and practice: his 60+
research articles have been cited over 1600 times, and two of his research
projects (NetSolve and STAT) have won Top 100 R&D awards in 1999 and 2011.
He is a senior member of the IEEE and an ACM Distinguished Speaker. Arnold
received Ph.D. and M.S. degrees in Computer Science from the Universities of
Wisconsin and Tennessee, respectively. He also received his B.S. in Math and
Computer Science from Regis University (Denver, CO) and his A.S. in Physics,
Chemistry and Math from St. John's Junior College (Belize).
Guillaume Aupy is a researcher at Inria Bordeaux Sud-Ouest. He currently
works on data-aware scheduling at the different levels of the memory hierarchy
(cache, memory, buffers, disks). He completed his PhD at ENS Lyon in 2014 on
reliable and energy efficient scheduling strategies in High-Performance
Computing. He served as the Technical Program vice-chair for SC'17, workshop
chair for SC'18, and algorithm track vice-chair for ICPP'18.
Rosa M. Badia holds a PhD in Computer Science (1994) from the Technical University
of Catalonia (UPC). She is the manager of the Workflows and Distributed
Computing research group at the Barcelona Supercomputing Center (BSC).
Her current research interests are programming models for complex platforms
(from multicore and GPUs to Cloud). The group led by Dr. Badia has been
developing the StarSs programming model for more than 10 years, with high success
in adoption by application developers. Currently the group focuses its efforts
on PyCOMPSs/COMPSs, an instance of the programming model for distributed
computing, including Cloud.
Dr. Badia has published nearly 200 papers in international conferences and
journals in the topics of her research. Her group is very active in projects
funded by the European Commission and in contracts with industry.
Pete Beckman is the co-director of the Northwestern University/Argonne Institute for Science and Engineering and a recognized global expert in high-end computing systems. During the past 25 years, his research has been focused on software and architectures for large-scale parallel and distributed computing systems. For the DOE's Exascale Computing Project, Pete leads the Argo team focused on extreme-scale operating systems and run-time software. He is the founder and leader of the Waggle project for smart sensors and edge computing that is used by the Array of Things project. Pete also coordinates the collaborative technical research activities in extreme-scale computing between the US Department of Energy and Japan's ministry of education, science, and technology and helps lead the BDEC (Big Data and Extreme Computing) series of international workshops. Pete leads the extreme computing research activities at Argonne National Laboratory. He received his Ph.D in computer science from Indiana University.
Anne Benoit received the PhD degree from Institut National Polytechnique de Grenoble in 2003, and the Habilitation à Diriger des Recherches (HDR) from Ecole Normale Supérieure de Lyon (ENS Lyon) in 2009. She is currently an associate professor in the Computer Science Laboratory LIP at ENS Lyon, France. She is the author of one book on algorithm design, 43 papers published in international journals, and 87 papers published in international conferences. She is the advisor of 9 PhD theses. Her research interests include algorithm design and scheduling techniques for parallel and distributed platforms, and also the performance evaluation of parallel systems and applications, with a focus on energy awareness and resilience. She is Associate Editor (in Chief) of Elsevier ParCo, and Associate Editor of IEEE TPDS and Elsevier JPDC. She is the program chair of several workshops and conferences, in particular she is the program chair for HiPC'16, ICPP'17, SC'17 (papers chair), and IPDPS'18. She is a senior member of the IEEE, and she has been elected a Junior Member of Institut Universitaire de France in 2009.
Ken Birman is the N. Rama Rao Professor of Computer Science at Cornell. An ACM Fellow and the winner of the IEEE Tsutomu Kanai Award, Ken has written 3 textbooks and published more than 150 papers in prestigious journals and conferences. Software he developed operated the New York Stock Exchange for more than a decade without trading disruptions, and plays central roles in the French Air Traffic Control System and the US Navy AEGIS warship. Other technologies from his group found their way into IBM's Websphere product, Amazon's EC2 and S3 systems, Microsoft's cluster management solutions, and the US Northeast bulk power grid. The new Derechos system is intended for demanding settings such as the smart power grid, smart highways and homes, and scalable vision systems. Download it, open source, from http://GitHub.com/Derecho-Project.
George Bosilca is a Research Director and Adjunct Assistant
Professor at the Innovative Computing Laboratory at University of Tennessee,
Knoxville. His research interests revolve around designing support for parallel
applications to maximize their efficiency, scalability, heterogeneity and
resiliency at any scale and in any setting. He is actively involved in
projects such as Open MPI, ULFM, PaRSEC, DPLASMA, TESSE.
Bill Brantley is a Fellow Design Engineer in the Research Division of
Advanced Micro Devices leading parts of *Forward research contracts as well as
other efforts. Prior to AMD he was at IBM T.J. Watson Research Center
where he was one of the architects and implementers of the 64 CPU RP3 (a DARPA
supported HPC system development in the mid-80s) including a hardware
performance monitor. In IBM Austin he held a number of roles, including
the analysis of server performance in the Linux Technology Center. Prior
to joining IBM, he completed his Ph.D. at Carnegie Mellon University in ECE
after working for 3 years at Los Alamos National Laboratory.
Ron Brightwell leads the Scalable System Software Department at Sandia National Laboratories. After joining Sandia in 1995, he was a key contributor to the high-performance interconnect software and lightweight operating system for the world's first terascale system, the Intel ASCI Red machine. He was also part of the team responsible for the high-performance interconnect and lightweight operating system for the Cray Red Storm machine, which was the prototype for Cray's successful XT product line. The impact of his interconnect research is visible in technologies available today from Atos/Bull, Intel, and Mellanox. He has also contributed to the development of the MPI-2 and MPI-3 specifications. He has authored more than 115 peer-reviewed journal, conference, and workshop publications. He is an Associate Editor for the IEEE Transactions on Parallel and Distributed Systems, has served on the technical program and organizing committees for numerous high-performance and parallel computing conferences, and is a Senior Member of the IEEE and the ACM.
Franck Cappello is a senior computer scientist at Argonne National Laboratory and an adjunct associate professor in the department of computer science at the University of Illinois at Urbana-Champaign. He is the director of the Joint-Laboratory on Extreme Scale Computing, gathering seven of the leading high-performance computing institutions in the world: Argonne National Laboratory (ANL), the National Center for Supercomputing Applications (NCSA), Inria, Barcelona Supercomputing Center (BSC), Julich Supercomputing Centre (JSC), Riken CCS and UTK-ICL. Franck is an expert in parallel/distributed computing and high-performance computing. Recently he started investigating lossy compression for scientific datasets to respond to the pressing needs of scientists performing large scale simulations and experiments for significant data reduction. Franck is a member of the editorial board of IEEE Transactions on Parallel and Distributed Systems and of the IEEE CCGRID steering committee. He is a fellow of the IEEE and the recipient of the 2018 IEEE TCPP outstanding service award.
Barbara Chapman is a Professor of Applied Mathematics and Statistics, and of Computer Science, at Stony Brook University, where she is affiliated with the Institute for Advanced Computational Science. She also directs Computer Science and Mathematics Research at Brookhaven National Laboratory. She performs research on parallel programming interfaces and the related implementation technology, and has been involved in several efforts to develop community standards for parallel programming, including OpenMP, OpenACC and OpenSHMEM. Her research group created the OpenUH compiler that enabled practical experimentation with proposed enhancements to application programming interfaces and a reference implementation of the library-based OpenSHMEM standard. Dr. Chapman has co-authored over 200 papers and two books. She obtained her Ph.D. in Computer Science from Queen's University of Belfast.
Alok Choudhary is the Henry & Isabelle Dever Professor of Electrical
Engineering and Computer Science and a professor at Kellogg School of
Management. He is also the founder, chairman and chief scientist (served as its
CEO during 2011-2013) of 4C insights (formerly Voxsup Inc.), a big data
analytics and marketing technology software company. He received the National
Science Foundation's Young Investigator Award in 1993. He is a fellow of IEEE,
ACM and AAAS. His research interests are in high-performance computing, data
intensive computing, scalable data mining, high-performance I/O systems,
software and their applications in science, medicine and business. Alok
Choudhary has published more than 400 papers in various journals and
conferences and has graduated 40+ PhD students. Alok Choudhary's work and
interviews have appeared in many traditional media including New York Times,
Chicago Tribune, The Telegraph, ABC, PBS, NPR, AdExchange, Business Daily and
many international media outlets all over the world.
Jonathan Churchhill
After a 20+ year career in the semiconductor business designing high-performance SRAMs and their associated CAD systems, he joined STFC in 2006, focusing on HPC systems and support. For the last 6 years he has been responsible for building the architecture and systems operations for JASMIN, the UK's platform for environmental research data analysis, taking it from day one to today's 70 racks and 45 PB of high-performance storage attached to cloud and physical HPC totalling ~10k cores. He is the named author on 15 US and UK patents.
Joe Curley serves Intel® Corporation as Senior Director, HPC Platform and Ecosystem Enablement in the High Performance Computing Platform Group (HPG). His primary responsibilities include supporting global ecosystem partners to develop their own powerful and energy-efficient HPC computing solutions utilizing Intel hardware and software products. Mr. Curley joined Intel Corporation in 2007, and has served in multiple other planning and business leadership roles.
Prior to joining Intel, Joe worked at Dell, Inc. leading the global workstation product line, consumer and small business desktops, and a series of engineering roles. He began his career at computer graphics pioneer Tseng Labs.
Ewa Deelman is a Research Professor at the USC Computer Science Department and a Research Director at the USC Information Sciences Institute (ISI). Dr. Deelman's research interests include the design and exploration of collaborative, distributed scientific environments, with particular emphasis on automation of scientific workflow and management of computing resources, as well as the management of scientific data. Her work involves close collaboration with researchers from a wide spectrum of disciplines. At ISI she leads the Science Automation Technologies group that is responsible for the development of the Pegasus Workflow Management software. In 2007, Dr. Deelman edited a book: "Workflows in e-Science: Scientific Workflows for Grids", published by Springer. She is also the founder of the annual Workshop on Workflows in Support of Large-Scale Science, which is held in conjunction with the Supercomputing conference. In 1997 Dr. Deelman received her PhD in Computer Science from the Rensselaer Polytechnic Institute.
Luiz DeRose is
a Senior Principal Engineer and the Programming Environments Director at Cray
Inc, where he is responsible for the programming environment strategy for all
Cray systems. Before joining Cray in 2004, he was a research staff member and
the Tools Group Leader at the Advanced Computing Technology Center at IBM
Research. Dr. DeRose has a Ph.D. in Computer Science from the University of
Illinois at Urbana-Champaign. With more than 25 years of high performance
computing experience and a deep knowledge of its programming environments, he
has published more than 50 peer-reviewed articles in scientific journals,
conferences, and book chapters, primarily on the topics of compilers and tools
for high performance computing.
Frédéric Desprez is a Chief Senior Research Scientist at Inria and holds a position at the LIG laboratory (UGA, Grenoble, France) in the Corse research team. He is also Deputy Scientific Director at Inria. He received his PhD in C.S. from Institut National Polytechnique de Grenoble, France, in 1994 and his MS in C.S. from ENS Lyon in 1990. His research interests include parallel high performance computing algorithms and scheduling for large scale distributed platforms. He leads the Grid'5000 project, which offers a platform to evaluate large scale algorithms, applications, and middleware systems. See https://fdesprez.github.io/ for further information.
Jack Dongarra holds an appointment at the University of Tennessee, Oak Ridge National Laboratory, and the University of Manchester. He specializes in numerical algorithms in linear algebra, parallel computing, use of advanced-computer architectures, programming methodology, and tools for parallel computers. He was awarded the IEEE Sid Fernbach Award in 2004; in 2008 he was the recipient of the first IEEE Medal of Excellence in Scalable Computing; in 2010 he was the first recipient of the SIAM Special Interest Group on Supercomputing's award for Career Achievement; in 2011 he was the recipient of the IEEE IPDPS Charles Babbage Award; and in 2013 he received the ACM/IEEE Ken Kennedy Award. He is a Fellow of the AAAS, ACM, IEEE, and SIAM and a member of the National Academy of Engineering.
Fanny
Dufosse
Nicola Ferrier is a Senior Computer Scientist in ANLÕs Mathematics and Computer Science Division, and a Senior Fellow of University of ChicagoÕs Consortium for Advanced Science and Engineering (UChicago CASE) and Institute of Molecular Engineering, and a member of the Northwestern Argonne Institute for Science and Engineering. FerrierÕs research interests are in the use of computer vision (digital images) to control robots, machinery, and devices, with applications as diverse as medical systems, manufacturing, and biology. At Argonne National Lab and University of Chicago she collaborates with scientists from the Institute for Molecular Engineering, Advanced Photon Source, Materials Science, and biological sciences on various projects where images and computation facilitate Òscientific discoveryÓ. Prior to joining MCS in 2013 she was a professor of mechanical engineering at the University of Wisconsin-Madison where she directed the Robotics and Intelligent Systems lab (1996-2013).
Ian Foster is Distinguished Fellow and director of the Data Science and Learning Division at Argonne National Laboratory. He is also the Arthur Holly Compton Distinguished Service Professor of Computer Science at the University of Chicago. Ian received a BSc (Hons I) degree from the University of Canterbury, New Zealand, and a PhD from Imperial College, United Kingdom, both in computer science. His research deals with distributed, parallel, and data-intensive computing technologies, and innovative applications of those technologies to scientific problems in such domains as materials science, climate change, and biomedicine. His Globus software is widely used in national and international cyberinfrastructures. Foster is a fellow of the American Association for the Advancement of Science, Association for Computing Machinery, and British Computer Society. His awards include the Global Information Infrastructure Next Generation award, the British Computer Society's Lovelace Medal, the IEEEÕs Kanai award, and honorary doctorates from the University of Canterbury, New Zealand, and the Mexican Center for Research and Advanced Studies of the National Polytechnic Institute (CINVESTAV). He co-founded Univa, Inc., a company established to deliver grid and cloud computing solutions, and Praedictus Climate Solutions, which combines data science and high performance computing for quantitative agricultural forecasting.
Geoffrey Fox is a professor of Engineering, Computing, and Physics at Indiana University where he is director of the Digital Science Center, and Department Chair for Intelligent Systems Engineering at the School of Informatics, Computing, and Engineering. He has supervised the Ph.D. of 71 students and is a Fellow of APS (Physics) and ACM (Computing).
Haohuan Fu is a professor in the Ministry of Education Key Laboratory for Earth System Modeling, and Department of Earth System Science in Tsinghua University, where he leads the research group of High Performance Geo-Computing (HPGC). He is also the deputy director of the National Supercomputing Center in Wuxi, leading the research and development division. Fu has a PhD in computing from Imperial College London. His research work focuses on providing both the most efficient simulation platforms and the most intelligent data management and analysis platforms for geoscience applications.
Al Geist is a Corporate Research
Fellow at Oak Ridge National Laboratory.
He is the Chief Technology Officer of ORNL's Leadership Computing Facility and Chief Scientist for the
Computer Science and Mathematics Division.
He is on the Leadership Team of the U.S. Exascale Computing Project. His recent research
is on Exascale computing and the resilience needs of its hardware and software.
Rich Graham is a Senior Director for HPC technology at Mellanox
Technologies, Inc. His primary focus is on High Performance Computing, including
Mellanox's HPC technical roadmap and working with customers on
their HPC needs. Prior to moving to Mellanox, Rich spent thirteen years
at Los Alamos National Laboratory and Oak Ridge National Laboratory, in
computer science technical and administrative roles, with a technical focus on
communication libraries and application analysis tools. He is a cofounder
of the Open MPI collaboration and was chairman of the MPI 3.0 standardization
effort.
Andrew Grimshaw received his Ph.D. from the University of
Illinois at Urbana-Champaign in 1988. He joined the University of Virginia as
an Assistant Professor of Computer Science, becoming Associate Professor in
1994 and Professor in 1999. He is the chief designer and architect of Mentat,
Legion, Genesis II, and the co-architect for XSEDE. In 1999 he co-founded Avaki
Corporation, and served as its Chairman and Chief Technical Officer until 2003.
In 2003 he won the Frost and Sullivan Technology Innovation Award. In 2008 he
became the founding director of the University of Virginia Alliance for
Computational Science and Engineering (UVACSE). The mission of UVACSE is to
change the culture of computation at the University of Virginia and to
accelerate computationally oriented research.
Andrew is the
chairman of the Open Grid Forum (OGF), having served both as a member
of the OGF's Board of Directors and as Architecture Area Director.
Andrew is the author or co-author of over 100 publications and book chapters.
His current projects are IT, Genesis II, and XSEDE. IT is a next generation
portable parallel language based on the PCubeS type architecture. Genesis II
is an open source, standards-based Grid system that focuses on making Grids
easy to use and accessible to non-computer-scientists. XSEDE (eXtreme Science
and Engineering Discovery Environment) is the NSF follow-on to the TeraGrid
project.
William Gropp holds the Thomas M. Siebel chair in computer science at
the University of Illinois at
Urbana-Champaign, is the Director and Chief Scientist of the National Center
for Supercomputing Applications, and was the founding director of the Parallel
Computing Institute. Prior to joining Illinois in 2007, he held positions
at Argonne National Laboratory, including Associate Director for the
Mathematics and Computer Science Division and Senior Computer Scientist.
He is known for his work on scalable numerical algorithms and software
(sharing an R&D100 award and the SIAM/ACM Prize in Computational Science
and Engineering for PETSc software) and for the Message Passing Interface
(sharing an R&D100 award for MPICH, the dominant high-end implementation,
as well as co-authoring the leading books on MPI). For his accomplishments in
parallel algorithms and programming, he received the IEEE Computer Society's
Sidney Fernbach award in 2008, the SIAM-SC Career Award in 2014, and the
ACM/IEEE-CS Ken Kennedy Award in 2016. He is a fellow of ACM, IEEE, and SIAM, and
is an elected member of the National Academy of Engineering.
Mary Hall is
a Professor at University of Utah, where she has been since 2008. Her
research interests focus on programming systems for high-performance computing,
with a particular interest in autotuning compilers, parallel code generation
and domain-specific optimization. She leads the Y-Tune project that is
part of the U.S. Dept. of Energy Exascale Computing Project, in collaboration
with Lawrence Berkeley National Laboratory and Argonne National Laboratory.
Mary Hall is an ACM Distinguished Scientist and serves on the Computing
Research Association Board of Directors.
Li Han
Tony Hey began his career as a theoretical physicist with a doctorate in particle physics from the University of Oxford in the UK. After a career in physics that included research positions at Caltech and CERN, and a professorship at the University of Southampton in England, he became interested in parallel computing and moved into computer science. In the 1980s he was one of the pioneers of distributed memory message-passing computing and co-wrote the first draft of the successful MPI message-passing standard.
After being both Head of Department and Dean of Engineering at Southampton, Tony Hey was appointed to lead the U.K.'s ground-breaking 'eScience' initiative in 2001. He recognized the importance of Big Data for science and wrote one of the first papers on the 'Data Deluge' in 2003. He joined Microsoft in 2005 as a Vice President and was responsible for Microsoft's global university research engagements. He worked with Jim Gray and his multidisciplinary eScience research group and edited a tribute to Jim called 'The Fourth Paradigm: Data-Intensive Scientific Discovery.' Hey left Microsoft in 2014 and spent a year as a Senior Data Science Fellow at the eScience Institute at the University of Washington. He returned to the UK in November 2015 and is now Chief Data Scientist at the Science and Technology Facilities Council.
In 1987 Tony Hey was asked by Caltech Nobel physicist Richard Feynman to write up his 'Lectures on Computation'. This covered such unconventional topics as the thermodynamics of computing as well as an outline for a quantum computer. Feynman's introduction to the workings of a computer in terms of the actions of a 'dumb file clerk' was the inspiration for Tony Hey's attempt to write 'The Computing Universe', a popular book about computer science. Tony Hey is a fellow of the AAAS and of the UK's Royal Academy of Engineering. In 2005, he was awarded a CBE by Prince Charles for his 'services to science.'
Torsten Hoefler is an Associate Professor of Computer Science at ETH Zürich, Switzerland. He is
best described as an HPC systems person with interests across the whole stack.
Recently, he started to investigate the potential of quantum computation.
Torsten won best paper awards at the ACM/IEEE Supercomputing Conference SC10,
SC13, SC14, EuroMPI'13, HPDC'15, HPDC'16, IPDPS'15, and other conferences.
He published numerous peer-reviewed scientific conference and journal
articles and authored chapters of the MPI-2.2 and MPI-3.0 standards. He
received the Latsis prize of ETH Zurich as well as an ERC starting grant in
2015. His research interests revolve around the central topic of
"Performance-centric System Design" and include scalable networks,
parallel programming techniques, and performance modeling. Additional
information about Torsten can be found on his homepage at htor.inf.ethz.ch.
Minh Quan Ho currently holds the position of Embedded library and High-performance computing solution expert at Kalray. He joined Kalray in 2014 during his PhD program on optimizing stencil computations and linear algebra on the Kalray MPPA processor. Minh Quan received his Master's degree in Computer Science from the Ecole Polytechnique de Grenoble and his PhD from the University Grenoble Alpes.
Heike Jagode is a Research Assistant Professor with the Innovative
Computing Laboratory at the University of Tennessee Knoxville. She specializes
in high-performance computing and the efficient use of advanced computer
architectures; focusing primarily on developing methods and tools for
performance analysis and tuning of parallel scientific applications. Her
research interests include the multi-disciplinary effort to convert
computational chemistry algorithms into a dataflow-based form to make them
compatible with next-generation task-scheduling systems, such as PaRSEC. She
received a Ph.D. in Computer Science from the University of Tennessee
Knoxville. Previously, she received an M.S. in High-Performance Computing from
The University of Edinburgh, Scotland, UK; an M.S. in Applied
Techno-Mathematics and a B.S. in Applied Mathematics from the University of
Applied Sciences Mittweida, Germany.
Emmanuel Jeannot is a Senior Research Scientist at Inria. He has been doing his
research at Inria Bordeaux Sud-Ouest and at the LaBRI laboratory since 2009.
From 2005 to 2006 he was a researcher at INRIA Nancy Grand-Est. In 2006 he was a
visiting researcher at the University of Tennessee, ICL laboratory. From 2000
to 2005 he was an assistant professor at the Université Henri Poincaré. During the
period from 2000 to 2009 he did his research at the LORIA laboratory. He got
his Master's and PhD degrees in computer science (in 1996 and 1999, respectively),
both from Ecole Normale Supérieure de Lyon, at the LIP laboratory. His main research
interests lie in parallel and high-performance computing and more precisely:
process placement, topology-aware algorithms, scheduling for heterogeneous
environments, data redistribution, algorithms and models for parallel machines,
distributed computing software, adaptive online compression, and programming
models.
Carl Kesselman
Laurent Lefevre is a permanent researcher in computer science at Inria
(the French Institute for Research in Computer Science and Control). He is a
member of the Avalon team (Algorithms and Software Architectures for
Distributed and HPC Platforms) from the LIP laboratory in Ecole Normale
Supérieure de Lyon, France. He has organized several conferences in high
performance networking and computing and he is a member of several
program committees. He has co-authored more than 100 papers published in
refereed journals and conference proceedings. For more than a decade, he has been
working on the energy efficiency of large-scale systems (HPC centers, datacenters,
clouds and big networks). His other interests include: high performance
computing, distributed computing and networking, high performance networks
protocols and services.
See
http://perso.ens-lyon.fr/laurent.lefevre
for further information.
Laércio Lima Pilla received his PhD in Computer Science from the Université Grenoble
Alpes, France, and the Universidade Federal do Rio Grande do Sul, Brazil, in
2014. He holds an Associate Professor position in the Universidade Federal de
Santa Catarina in Florianópolis, Brazil. He is currently working as a
postdoctoral researcher in the CORSE project-team in Grenoble on the hybrid parallelization
of a high order finite element solver for the numerical modeling of nanoscale
light/matter interaction. Starting this October, he will hold a position as
CNRS researcher in the ParSys team at LRI - University of Paris-Saclay. His
research interests are mainly related to parallel computing, runtime systems,
computer architecture, and global scheduling.
Bernd Mohr started to design and develop tools for performance analysis of parallel programs at the University of Erlangen in Germany in 1987. During a three year postdoc position at the University of Oregon, he designed and implemented the original TAU performance analysis framework. Since 1996 he has been a senior scientist at Forschungszentrum Juelich. Since 2000, he has been the team leader of the group "Programming Environments and Performance Analysis". Besides being responsible for user support and training in regard to performance tools at the Juelich Supercomputing Centre (JSC), he is leading the Scalasca performance tools efforts in collaboration with Prof. Felix Wolf of TU Darmstadt. Since 2007, he also serves as deputy head for the JSC division "Application support".
Dimitrios Nikolopoulos FBCS FIET is a Professor at Queen's University Belfast where he holds a personal chair in High Performance and Distributed Computing and is Director of the University's Global Research Institute on Electronics, Communication and Information Technologies. Dimitrios currently holds a Royal Society Wolfson Research Merit Award and an SFI-DEL Investigator Award.
Christian Obrecht is an associate professor of applied physics at the Department of Civil Engineering and Urban Planning of the National Institute of Applied Sciences in Lyon (INSA Lyon). Dr Obrecht first graduated in mathematics from University of Strasbourg in 1990 and served as a teacher of mathematics from 1993 to 2008. He obtained a master's degree in computer science from University of Lyon in 2009 and a doctoral degree in civil engineering from INSA Lyon in 2012. He was appointed associate professor in 2015 and joined the Centre for Energy and Thermal Sciences of Lyon (CETHIL). His research work is devoted to energy efficiency in buildings and focuses more specifically on innovative approaches in computational building physics.
Phil Papadopoulos received
his PhD in 1993 from UC Santa Barbara in Electrical Engineering. He spent 5
years at Oak Ridge National Laboratory (ORNL) as part of the
Parallel Virtual Machine (PVM) development team. In 1998, he moved to UC San
Diego as a research professor in computer science. In 1999, he began a
19-year career at the San Diego Supercomputer Center and became the
Chief Technology Officer at SDSC in 2008. He is the chief architect
of the NSF-funded Comet Cluster which supports high-performance virtual
clusters. In 2018, Dr. Papadopoulos moved to UC Irvine to become the
inaugural Director of the Research Cyberinfrastructure
Center. While his current job focuses more on CI
development and deployment for a leading research university, his own research
interests revolve around distributed, clustered, and cloud-based systems and
how they can be used more effectively in an expanding bandwidth-rich
environment. Dr. Papadopoulos has been a key investigator for several
research projects at UCSD including the National Biomedical Computation
Resource (NBCR) and the Pacific Rim Applications and Grid Middleware Assembly
(PRAGMA, OCI-1234983). He is well known for leading the development of the
open-source, NSF-funded Rocks Cluster toolkit (OCI-0721623), which
has an installed base of thousands of clusters. Since his formative days
at ORNL, Dr. Papadopoulos has focused on the practicalities and challenges of
defining and building cluster and distributed cyberinfrastructure for local,
national, and international communities. He likes to hike, too.
Manish Parashar is Distinguished Professor of Computer Science at Rutgers University. He is also the founding Director of the Rutgers Discovery Informatics Institute (RDI2). He is currently on an IPA appointment at the National Science Foundation. His research interests are in the broad areas of Parallel and Distributed Computing and Computational and Data-Enabled Science and Engineering. Manish is the founding chair of the IEEE Technical Consortium on High Performance Computing (TCHPC), Editor-in-Chief of the IEEE Transactions on Parallel and Distributed Systems. He has received a number of awards for his research and leadership, and is Fellow of AAAS, Fellow of IEEE/IEEE Computer Society and ACM Distinguished Scientist. For more information please visit http://parashar.rutgers.edu/.
Judy Qiu is an associate professor of Intelligent Systems Engineering at Indiana University. Her general area of research is in data-intensive computing at the intersection of Cloud and HPC multicore technologies. This includes a specialization in programming models that support iterative computation, ranging from storage to analysis, and that can scalably execute data-intensive applications. Her research has been funded by NSF, NIH, Microsoft, Google, Intel and Indiana University.
Padma Raghavan is
a Professor of Computer Science in the Department of Electrical Engineering and
Computer Science at Vanderbilt University, where she is also Vice Provost for
Research. Prior to joining Vanderbilt in February 2016, she was a Distinguished
Professor of Computer Science and Engineering at the Pennsylvania State
University and served as the Associate Vice President for Research and Director
of Strategic Initiatives, in addition to being the founding Director of the
Institute for CyberScience, the coordinating unit on campus for developing
interdisciplinary computation and data-enabled science and engineering and the
provider of high-performance computing services for the university. Raghavan
received her Ph.D. in computer science from Penn State. Prior to joining Penn
State in August 2000, she served as an associate professor in the Department of
Computer Science at the University of Tennessee and as a research scientist at
the Oak Ridge National Laboratory.
Raghavan specializes in high-performance computing and computational science
and engineering. She has led the development of "sparse algorithms" that derive
from and operate on compact yet accurate representation of high dimensional
data, complex models, and computed results. Raghavan has developed parallel
sparse linear solvers that limit the growth of computational costs and utilize
the concurrent computing capability of advanced hardware to enable the solution
of complex large-scale modeling and simulation problems that are otherwise
beyond reach. Raghavan was also among the first to propose the design of
energy-efficient supercomputing systems by combining results from sparse
scientific computing with energy-aware hardware optimizations used for
small embedded computers. In her professorial role, Raghavan is deeply involved
in education and research, with 46 Masters and Ph.D. theses supervised and more
than a hundred peer-reviewed publications. She has earned several awards
including an NSF CAREER Award (1995), the Maria Goeppert-Mayer Distinguished
Scholar Award (2002, University of Chicago and the Argonne National
Laboratory), and selection as an IEEE Fellow (2013). Raghavan is also a prominent
member of major professional societies including SIAM (Society of Industrial
and Applied Mathematics) and IEEE (Institute of Electrical and Electronics
Engineers). She served as the Chair of the Technical Program of the 2017
IEEE/ACM Conference on Supercomputing and she is a member of the SIAM Committee
on Science Policy and the SIAM Council, which together with its Board and
officers leads SIAM. Raghavan also
serves on the Advisory Board of the Computing and Information Science and
Engineering Directorate of the National Science Foundation.
Yves Robert
Robert Ross is a Senior Computer Scientist at Argonne National Laboratory and a Senior Fellow at the Northwestern-Argonne Institute for Science and Engineering. He is the Director of the DOE SciDAC RAPIDS Institute for Computer Science and Data. Rob's research interests are in system software for high performance computing systems, in particular distributed storage systems and libraries for I/O and message passing. Rob received his Ph.D. in Computer Engineering from Clemson University in 2000 and was a recipient of the 2004 Presidential Early Career Award for Scientists and Engineers.
Joel Saltz is an MD, PhD in Computer Science with a long career
spanning the development of compiler, runtime system, and filter-stream methods in
Computer Science, as well as multi-scale imaging and digital pathology related tools,
algorithms and methods in Biomedical Informatics. He is currently Chair
of Biomedical Informatics and Professor of Computer Science at Stony
Brook.
Steve Scalpone is Director of Engineering for PGI compilers and tools at NVIDIA. He has worked on compilers for 20 years at Verdix, Rational Software, ST Microelectronics and NVIDIA. He also worked on mobile device security at Nukona, Wind River, Intel, and Symantec.
Vaidy Sunderam is
Professor of Computer Science at Emory University and
Chair of the Computer Science Department. His research interests are in
parallel
and distributed computing systems, security and privacy issues in
spatiotemporal
systems, high-performance message passing environments, and infrastructures for
collaborative computing. His prior and current research efforts are supported
by
grants from NSF, DoE, AFOSR, and NASA and have focused on systems for
metacomputing middleware, collaboration, and data driven systems. Sunderam
teaches computer science at the beginning, advanced, and graduate levels, and
advises graduate theses in the area of computer systems and data science.
Martin Swany is
Associate Chair and Professor in the Intelligent Systems Engineering Department
in the School of Informatics and Computing at Indiana University, and the
Deputy Director of the Center for Research in Extreme Scale Technologies
(CREST). His research interests include high-performance parallel and
distributed computing and networking.
Michela Taufer holds the Jack Dongarra Professorship in High Performance Computing in the Department of Electrical Engineering and Computer Science at the University of Tennessee Knoxville (UTK). Before joining UTK, she was a Professor in Computer and Information Sciences and a J.P. Morgan Case Scholar at the University of Delaware where she also had a joint appointment in the Biomedical Department and the Bioinformatics Program. She earned her undergraduate degrees in Computer Engineering from the University of Padova (Italy) and her doctoral degree in Computer Science from the Swiss Federal Institute of Technology or ETH (Switzerland). From 2003 to 2004 she was a La Jolla Interfaces in Science Training Program (LJIS) Postdoctoral Fellow at the University of California San Diego (UCSD) and The Scripps Research Institute (TSRI), where she worked on interdisciplinary projects in computer systems and computational chemistry.
Bernard Tourancheau got an MSc in Applied Maths from Grenoble University in 1986 and an MSc in Renewable Energy Science and Technology from Loughborough University in 2007. He was awarded best Computer Science PhD by Institut National Polytechnique of Grenoble in 1989 for his work on Parallel Computing for Distributed Memory Architectures.
He was appointed assistant professor at Ecole Normale Supérieure de Lyon LIP lab in 1989 before joining CNRS as a junior researcher. After initiating a CNRS-NSF collaboration, he worked on leave at the University of Tennessee in a senior researcher position with the US Center for Research in Parallel Computation at the ICL laboratory.
He then took a Professor position at University of Lyon in 1995 where he created a research laboratory and the INRIA RESO team, specialized in High Speed Networking and HPC.
In 2001, he joined SUN Microsystems Laboratories for a 6-year sabbatical as a Principal Investigator in the DARPA HPCS project, where he led the backplane networking group.
Back in academia he oriented his research on wireless sensor networks for building energy efficiency at ENS LIP and INSA CITI labs.
He was appointed Professor at University Joseph Fourier of Grenoble in 2012. Since then, in the LIG lab Drakkar team, he has been developing research on protocols and architectures for the Internet of Things. He also pursues research on optimizing communication algorithms for HPC multicore and GPGPU systems. He is also a scientific promoter of renewable energy transition, relocalization and low tech to address peak oil and global warming.
He has authored more than 140 peer-reviewed publications and filed 10 patents.
Jeffrey Vetter, Ph.D., is a Distinguished R&D Staff Member at Oak Ridge National Laboratory (ORNL). At ORNL, Vetter is the founding group leader of the Future Technologies Group in the Computer Science and Mathematics Division. Vetter also holds joint appointments at the Georgia Institute of Technology and the University of Tennessee-Knoxville. Vetter earned his Ph.D. in Computer Science from the Georgia Institute of Technology. Vetter is a Fellow of the IEEE, and a Distinguished Scientist Member of the ACM. In 2010, Vetter, as part of an interdisciplinary team from Georgia Tech, NYU, and ORNL, was awarded the ACM Gordon Bell Prize. Also, his work has won awards at major conferences including Best Paper Awards at the International Parallel and Distributed Processing Symposium (IPDPS), the AsHES workshop, and EuroPar, Best Student Paper Finalist at SC14, and Best Presentation at EASC 2015. In 2015, Vetter served as the SC15 Technical Program Chair. His recent books, entitled "Contemporary High Performance Computing: From Petascale toward Exascale (Vols. 1 and 2)," survey the international landscape of HPC. See his website for more information: http://ft.ornl.gov/~vetter/.
Frédéric Vivien received his Ph.D. degree from the École Normale Supérieure de Lyon in 1997. From 1998 to 2002, he was an associate professor at the Louis Pasteur University in Strasbourg, France. He spent the year 2000 working with the Computer Architecture Group of the MIT Laboratory for Computer Science. He is currently a senior researcher from INRIA, working at ENS Lyon, France. He leads the INRIA project-team Roma, which focuses on designing models, algorithms, and scheduling strategies to optimize the execution of scientific applications. He is the author of two books, more than 35 papers published in international journals, and more than 50 papers published in international conferences. His main research interests are scheduling techniques and parallel algorithms for distributed and/or heterogeneous systems.
Rich Vuduc is an Associate Professor at the Georgia Institute of Technology ("Georgia Tech"), in the School of Computational Science and Engineering, a department devoted to the study of computer-based modeling and simulation of natural and engineered systems. His research lab, The HPC Garage (@hpcgarage), is interested in high-performance computing, with an emphasis on algorithms, performance analysis, and performance engineering. He is a recipient of a DARPA Computer Science Study Group grant; an NSF CAREER award; a collaborative Gordon Bell Prize in 2010; Lockheed-Martin Aeronautics Company Dean's Award for Teaching Excellence (2013); and Best Paper Awards at the SIAM Conference on Data Mining (SDM, 2012) and the IEEE Parallel and Distributed Processing Symposium (IPDPS, 2015), among others. He has also served as his department's Associate Chair and Director of its graduate programs. External to Georgia Tech, he was elected to be Vice President of the SIAM Activity Group on Supercomputing (2016-2018); co-chaired the Technical Papers Program of the "Supercomputing" (SC) Conference in 2016; and serves as an associate editor of both the International Journal of High-Performance Computing Applications and IEEE Transactions on Parallel and Distributed Systems. He received his Ph.D. in Computer Science from the University of California, Berkeley, and was a postdoctoral scholar at Lawrence Livermore National Laboratory's Center for Advanced Scientific Computing.
CCGSC 1992 Participants (Some of them)
CCGSC 1994 Participants (Some of
them), Blackberry Farm, Tennessee
Missing CCGSC 1996 - Anyone have a picture?
CCGSC 1998 Participants,
Blackberry Farm, Tennessee
CCGSC 2000 Participants, Faverges,
France
CCGSC 2002 Participants,
Faverges, France
CCGSC 2004 Participants, Faverges, France
CCGSC 2006 Participants, Flat
Rock North Carolina
Some additional
pictures can be found here.
http://web.eecs.utk.edu/~dongarra/ccgsc2006/
CCGSC 2008 Participants, Flat
Rock North Carolina
http://web.eecs.utk.edu/~dongarra/ccgsc2008/
CCGSC 2010 Participants, Flat
Rock North Carolina
http://web.eecs.utk.edu/~dongarra/ccgsc2010/
CCDSC 2012 Participants,
Dareize, France
http://web.eecs.utk.edu/~dongarra/CCDSC-2012/index.htm
CCDSC 2014 Participants,
Dareize, France
http://web.eecs.utk.edu/~dongarra/CCDSC-2014/index.htm
CCDSC 2016 Participants,
Dareize, France