MSPC 2014

ACM SIGPLAN Workshop on Memory Systems Performance and Correctness

June 13, 2014, Edinburgh, Scotland. Co-located with PLDI 2014

Important dates

Papers due

~~March 10, 2014~~

April 4, 2014

Acceptance notification

~~April 28, 2014~~

April 29, 2014

Final papers due

May 19, 2014

Workshop date

June 13, 2014

Workshop Organization

General Chair

Jeremy Singer, University of Glasgow

Program Co-chairs

Milind Kulkarni, Purdue University
Tim Harris, Oracle Labs

Program Committee

Cristiana Amza, University of Toronto
Hans Boehm, Google
Edouard Bugnion, EPFL
Mainak Chaudhuri, IIT-Kanpur
Dave Dice, Oracle Labs
Dave Grove, IBM Research
Jungwoo Ha, Google
Mary Hall, University of Utah
Matthew Hertz, Canisius College
Matt Horsnell, ARM
Derek Hower, Qualcomm
Engin Ipek, University of Rochester
Richard Jones, University of Kent
Sriram Krishnamoorthy, PNNL
Mikel Lujan, University of Manchester
Sally McKee, Chalmers University
Madan Musuvathi, Microsoft Research
Dimitrios Nikolopoulos, Queen's University, Belfast
Erez Petrank, Technion
Depei Qian, Beihang University
P Sadayappan, Ohio State University
Jennifer B. Sartor, Ghent University
Marc Shapiro, Inria & UPMC-LIP6
Harsha Vardhan Simhadri, LBNL
Dongping Zhang, AMD

Technical Program

9:00–9:15 Opening Remarks (Jeremy Singer, Tim Harris, Milind Kulkarni)
9:15–10:20 Keynote (Chair: Milind Kulkarni)
- Memory Hierarchy Visibility in Parallel Programming Languages— Paul Keir, Codeplay Research
  
  The choice of which levels of a memory hierarchy to expose within a programming language or API can be critical. Expose too many, and you risk programmability and performance portability.
  
  Heterogeneous computing and GPGPU aim to repurpose the data-parallel capability of graphics and commodity hardware for general calculations. GPGPU APIs, which now include OpenCL SYCL, Apple's Metal, and Qualcomm's MARE, must all decide on a suitable abstraction for hardware memory levels. Established GPGPU APIs such as CUDA, C++AMP, and OpenCL offer language support for four levels of volatile memory. However, while GPUs are now essentially ubiquitous, the diminished role of discrete graphics cards raises fresh questions about memory abstraction.
  
  The multicore revolutionaries have now ceded mobile computing to CPU-GPU systems-on-chip, firmly established in mainstream options such as the Qualcomm Snapdragon, Samsung Exynos, and the AMD APU series. Meanwhile, the HSA Foundation builds upon a bedrock of uniform memory access; the Android GPGPU API, Renderscript, eschews explicit memory address spaces; and CUDA now offers "unified" memory. Can caché once again mean hidden?
10:20–10:50 Coffee break
10:50–12:05 Session 1: Hardware, memory technologies and optimization (Chair: Hans Boehm)
- O-structures: Semantics for Versioned Memory—Eran Gilad, Eric W Mackay, Mark Oskin, Yoav Etsion
- Main Memory and Cache Performance of Intel Sandy Bridge and AMD Bulldozer—Daniel Molka, Daniel Hackenberg, Robert Schöne
- Trash in Cache: Detecting Eternally Silent Stores—Jonathan Shidal, Zachary Gottlieb, Ron K. Cytron, Krishna M. Kavi
12:05–13:35 Lunch
13:35–14:50 Session 2: Locality and memory allocation (Chair: Dave Grove)
- A Study of Connected Object Locality in NUMA Heaps—Khaled Alnowaiser
- Affinity-Based Hash Tables (Short Paper)—Brian Gernhardt, Rahman Lavaee, Chen Ding
- Feedback Directed Optimization of TCMalloc—Sangho Lee, Teresa Johnson, Easwaran Raman
14:50–15:20 Coffee break
15:20–16:10 Session 3: Semantics (Chair: Richard Jones)
- Outlawing Ghosts: Avoiding Out-of-Thin-Air Results—Hans-J. Boehm, Brian Demsky
- Non-volatile Memory is a Broken Time Machine (Short Paper)—Ben Ransford, Brandon Lucia
16:15–16:50 Session 4: Poster lightning session (Chair: Tim Harris)
- Cache-Conscious Memory Management—Chen Ding, Pengcheng Li
- DataProf: Exposing Data Movements Across The Memory Hierarchy—William Wang, Chris Emmons, Nigel Paver
- Field Pinning Garbage Collector—Erik Österlund, Welf Löwe
- High-level Portable Programming Language for Optimized Memory Use of Network Processors—Yasusi Kanada
- Nesoi: Static checking of transactional coverage in parallel programs—Daniel Goodman, Behram Khan, Mikel Lujan, Ian Watson
- Optimal Thread-to-Core Mapping for Pipeline Programs—Hao Luo, Chen Ding, Pengcheng Li
- Precise Memory Allocator for Compressed Swap Cache in Mobile Devices—Daejong Kim, Dongkun Shin
16:50–17:40 Poster session and discussions