Speaker Affiliation Talk Title
Prof. R Govindarajan IISc, Bangalore Improving Programmability, Portability and Performance in Heterogeneous Accelerator-Based Architectures
Rapid advances in processor architecture have made accelerators such as Graphics Processing Units, Many Integrated Cores, and specialized accelerators, along with multicore processors, major players in increasing the performance of high-performance computing systems. While the performance potential of these architectures is high, they also pose significant challenges in extracting that performance while ensuring portability and productivity. In this talk, the research contributions of our group in developing compiler and runtime systems for synergistic program execution on heterogeneous accelerator-based architectures will be presented. Specifically, our group's work on compilation techniques for programs written in languages like StreamIt or MATLAB, and on runtime approaches for automatic memory management and automatic work distribution across multiple devices, will be described.

Prof. Mainak Chaudhuri IIT Kanpur Tiny Directory: Making Coherence Tracking Feather-light
Classical directory-based cache coherence protocols rely on directory storage for maintaining coherence information. In a cache-coherent chip-multiprocessor, this storage typically takes the form of a sparse directory tracking the blocks in the private caches of the cores. Sparse directory structures are typically over-provisioned to avoid premature eviction of blocks from the private cache hierarchy. In this talk, I will introduce a novel infrastructure for significantly bringing down the sparse directory storage overhead while keeping the performance loss within a percent on a chip-multiprocessor having 100+ cores. This study also offers interesting insights into efficient private cache hierarchy design for many-core server processors.
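To illustrate why sparse directories are usually over-provisioned, here is a minimal sketch of a toy directory that must invalidate private copies whenever it runs out of entries. It is not the design presented in the talk: the class `SparseDirectory`, the helper `count_forced_invalidations`, the fully associative LRU organization, and the access trace are all assumptions made for illustration (real sparse directories are typically set-associative, and the private caches themselves are not modeled here).

```python
from collections import OrderedDict


class SparseDirectory:
    """Toy sparse directory (hypothetical): a fixed number of entries,
    each mapping a block address to the set of cores holding a copy."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.entries = OrderedDict()    # block -> set of sharer core ids, in LRU order
        self.forced_invalidations = 0   # private copies lost only because the directory was full

    def record_access(self, core, block):
        """Track that `core` now caches `block`.
        Returns the list of blocks whose directory entry was evicted."""
        evicted = []
        if block in self.entries:
            self.entries.move_to_end(block)  # refresh LRU position
        else:
            if len(self.entries) >= self.num_entries:
                victim, sharers = self.entries.popitem(last=False)  # evict LRU entry
                # The directory can no longer track the victim, so every
                # private copy of it has to be invalidated.
                self.forced_invalidations += len(sharers)
                evicted.append(victim)
            self.entries[block] = set()
        self.entries[block].add(core)
        return evicted


def count_forced_invalidations(num_entries, accesses):
    """Replay (core, block) accesses and count directory-induced invalidations."""
    directory = SparseDirectory(num_entries)
    for core, block in accesses:
        directory.record_access(core, block)
    return directory.forced_invalidations


# Assumed trace: 4 cores, each touching its own 4 blocks, repeated twice.
trace = [(core, core * 4 + i) for _ in range(2) for core in range(4) for i in range(4)]
for size in (16, 8):
    print(size, "entries ->", count_forced_invalidations(size, trace), "forced invalidations")
```

In this toy model, a directory with one entry per cached block never forces an invalidation, while a directory half that size invalidates blocks that the private caches could otherwise have kept. Shrinking the directory without paying that cost is the problem the abstract describes.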

Dr. Vivek Seshadri Microsoft Research Bangalore Rethinking Virtual Memory for Modern Systems
Virtual memory was invented in the 1960s and is arguably one of the most successful inventions in computer architecture. Virtual memory simplifies a number of things for the operating system, including capacity management, data protection, and sharing between processes. However, in recent years, the translation overhead introduced by virtual memory has begun to outweigh the benefits, and many techniques have been proposed and implemented to reduce this overhead. In this talk, we will examine the following question: if we were to redesign virtual memory from scratch, what should it look like?

Prof. Manu Awasthi Ashoka University Handheld Device Architectures: Are We Doing Enough?
Handheld devices, in the form of tablets and smartphones, are becoming increasingly commonplace. Since both performance and energy consumption/battery life remain top design criteria for these devices, their system-on-chip (SoC) architectures have to be explored differently from traditional server architectures. However, tools and benchmarks for design space exploration of mobile SoCs are either outdated or completely missing. In this talk, we will discuss some of the challenges that must be overcome to reinvigorate mobile SoC research. In addition, some of the work currently being done by our group to address these problems will be presented.

Prof. Biswabandan Panda IIT Kanpur Time for Compressed Memory Hierarchies
Effective management of last-level cache (LLC) and DRAM capacity is key to system performance. Increasing the cache capacity improves system performance, but at the cost of energy and power. The talk will focus on a recent cache compression scheme called dictionary sharing and how it can be adapted to DRAM as well. The talk will also briefly describe a cache layout that can be used for cache management and cache compression techniques.
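The abstract does not describe how dictionary sharing works, so the following is only a sketch of generic dictionary-based compression applied to a single cache block, not the scheme from the talk. The function names, the 4-byte word size, and the 64-byte example block are assumptions chosen for illustration.

```python
import struct


def compress_block(block, word_bytes=4):
    """Store each distinct word of `block` once in a dictionary and
    replace every word by its dictionary index (hypothetical helper)."""
    if len(block) % word_bytes != 0:
        raise ValueError("block length must be a multiple of the word size")
    dictionary = []   # distinct words, in order of first appearance
    index_of = {}     # word -> position in `dictionary`
    indices = []      # one dictionary index per word of the block
    for start in range(0, len(block), word_bytes):
        word = block[start:start + word_bytes]
        if word not in index_of:
            index_of[word] = len(dictionary)
            dictionary.append(word)
        indices.append(index_of[word])
    return dictionary, indices


def decompress_block(dictionary, indices):
    """Rebuild the original block from the dictionary and the indices."""
    return b"".join(dictionary[i] for i in indices)


def compressed_size_bits(dictionary, indices, word_bytes=4):
    """Bits needed for the dictionary plus one fixed-width index per word."""
    index_bits = max(1, (len(dictionary) - 1).bit_length())
    return len(dictionary) * word_bytes * 8 + len(indices) * index_bits


# Assumed 64-byte block with only three distinct 32-bit values.
block = struct.pack("<16I", *([0, 1, 0, 0xDEADBEEF] * 4))
dictionary, indices = compress_block(block)
print(len(block) * 8, "bits uncompressed,",
      compressed_size_bits(dictionary, indices), "bits compressed")
```

Blocks with few distinct values compress well under this kind of scheme, which is why it can increase effective cache capacity without adding storage.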

Prof. Arkaprava Basu IISc Bangalore Smarter virtual memory subsystem in GPGPUs
As the benefits from transistor scaling diminish, an increasing number of applications are making use of domain-specific accelerators, such as general-purpose graphics processing units (GPGPUs). Relative to traditional CPUs, these accelerators often provide significantly better performance and energy efficiency for the types of applications they are designed for. However, programming accelerators can be a challenge owing to their relatively constrained programming environments. Thus, a key research question is how to make accelerators more easily programmable without compromising their benefits.
In this talk, we will focus on the GPGPU, a popular accelerator that specializes in executing data-parallel programs. We will discuss a key programmability-enhancing feature of GPGPUs called shared virtual memory (SVM). We will see that SVM can ease GPGPU programming but can also significantly slow down applications. We will then analyze the application characteristics and design constraints in GPGPUs that lead to this performance overhead. Driven by this analysis, we will discuss our idea of smarter scheduling of virtual memory requests, which could lead to a 30% performance improvement. Finally, if time permits, we will conclude by discussing other relevant future research directions that are of interest to the speaker.

Dheryta Jaisinghani IIIT Delhi Hands on Session: Kernel Programming
Most of this session will focus on kernel module programming. We will briefly talk about the interaction of the different layers of the operating system, from user space to kernel space. Starting from simple Hello World kernel modules, we will learn to develop more sophisticated modules related to device drivers and interrupt handlers. We will also briefly touch upon shell scripts and how they can be used to extract system-level information. Since this will be a hands-on session, attendees are expected to try the examples on their machines. A basic understanding of operating systems and C programming is expected for the tutorial.