Search results for: Mahmut Kandemir

Items from 1 to 15 out of 15 results

chapter

DEMM: A Dynamic Energy-Saving Mechanism for Multicore Memories

Akbar Sharifi, Wei Ding, Diana Guttman, Hui Zhao, et al.

2017 IEEE 25th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) > 210 - 220

2017 IEEE 25th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS)

Since the main memory system contributes a large and increasing fraction of server/datacenter energy consumption, there have been several efforts to reduce its power and energy consumption. DVFS schemes have been used to reduce memory power, but they come with a performance penalty. In this work, we propose DEMM, an OS-based, high-performance DVFS mechanism that reduces memory power by dynamically...

chapter

A cache topology-aware multi-query scheduler for multicore architectures

Umut Orhan, Wei Ding, Praveen Yedlapalli, Mahmut Kandemir, et al.

2014 IEEE International Symposium on Workload Characterization (IISWC) > 86 - 87

2014 IEEE International Symposium on Workload Characterization (IISWC)

The growing performance gap between processors and main memory has made it worthwhile to consider off-chip data accesses in multi-query processing [2], [1], [3]. Exploiting data-sharing opportunities among concurrent queries can be critical for effective utilization of the underlying shared memory hierarchy. Given a set of queries, there may be a common retrieval operation for several cases to the same...

chapter

Quantifying and Optimizing the Impact of Victim Cache Line Selection in Manycore Systems

Mahmut Kandemir, Wei Ding, Diana Guttman

2014 IEEE 22nd International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems > 385 - 394

2014 IEEE 22nd International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems (MASCOTS)

In both architecture and software, the main goal of data locality-oriented optimizations has always been "minimizing the number of cache misses" (especially, costly last-level cache misses). However, this paper shows that other metrics such as the distance between the last-level cache and memory controller as well as the memory queuing latency can play an equally important role, as far as...

chapter

Trading cache hit rate for memory performance

Wei Ding, Mahmut Kandemir, Diana Guttman, Adwait Jog, et al.

2014 23rd International Conference on Parallel Architecture and Compilation (PACT) > 357 - 368

2014 23rd International Conference on Parallel Architecture and Compilation (PACT)

Most prior compiler-based data locality optimization work targets cache locality exclusively, and row-buffer locality in DRAM banks has received much less attention. In particular, to the best of our knowledge, there is no single compiler-based approach that can improve row-buffer locality in executing irregular applications. This presents a critical problem considering the fact that...

chapter

Optimizing sparse matrix vector multiplication on emerging multicores

Orhan Kislal, Wei Ding, Mahmut Kandemir, Ilteris Demirkiran

2013 IEEE 6th International Workshop on Multi-/Many-core Computing Systems (MuCoCoS) > 1 - 10

2013 IEEE 6th International Workshop on Multi-/Many-core Computing Systems (MuCoCoS)

After hitting the power wall, the dramatic change in computer architecture from single core to multicore/manycore brings us new challenges on high performance computing, especially for the data intensive applications. Sparse matrix-vector multiplication (SpMV) is one of the most important computations in this area, and has therefore received a lot of attention in recent decades. In contrast to the...

chapter

Courteous cache sharing: Being nice to others in capacity management

Akbar Sharifi, Shekhar Srikantaiah, Mahmut Kandemir, Mary Jane Irwin

DAC Design Automation Conference 2012 > 678 - 687

2012 49th ACM/EDAC/IEEE Design Automation Conference (DAC)

This paper proposes a cache management scheme for multiprogrammed, multithreaded applications, with the objective of obtaining maximum performance for both individual applications and the multithreaded workload mix. In this scheme, each individual application's performance is improved by increasing the priority of its slowest thread, while the overall system performance is improved by ensuring that...

chapter

Performance-reliability tradeoff analysis for multithreaded applications

Isil Oz, Haluk Rahmi Topcuoglu, Mahmut Kandemir, Oguz Tosun

2012 Design, Automation & Test in Europe Conference & Exhibition (DATE) > 893 - 898

2012 Design, Automation & Test in Europe Conference & Exhibition (DATE 2012)

Modern architectures become more susceptible to transient errors with the scale down of circuits. This makes reliability an increasingly critical concern in computer systems. In general, there is a tradeoff between system reliability and performance of multithreaded applications running on multicore architectures. In this paper, we conduct a performance-reliability analysis for different parallel...

chapter

Improving last level cache locality by integrating loop and data transformations

Wei Ding, Mahmut Kandemir

2012 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) > 65 - 72

2012 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)

Motivated by the observation that most existing data locality optimizations do not specifically target shared last-level caches of emerging multicores and that even multicore-specific locality-oriented techniques employ either loop or data layout optimizations but not both, in this paper we present an integrated loop and data layout optimization strategy, with the goal of improving the last-level...

chapter

Compiler Directed Data Locality Optimization for Multicore Architectures

Wei Ding, Jithendra Srinivas, Mahmut Kandemir, Mustafa Karakoy

2011 International Conference on Parallel Architectures and Compilation Techniques > 171 - 172

2011 International Conference on Parallel Architectures and Compilation Techniques (PACT)

This paper presents and evaluates a cache hierarchy-aware code parallelization/mapping and scheduling strategy for multicore architectures. Our proposed parallelization/mapping strategy determines a loop iteration-to-core mapping by taking into account the data access pattern of an application and the on-chip cache hierarchy of a target architecture. The goal of this step is to maximize data locality...

chapter

Optimizing Data Layouts for Parallel Computation on Multicores

Yuanrui Zhang, Wei Ding, Jun Liu, Mahmut Kandemir

2011 International Conference on Parallel Architectures and Compilation Techniques > 143 - 154

2011 International Conference on Parallel Architectures and Compilation Techniques (PACT)

The emergence of multicore platforms offers several opportunities for boosting application performance. These opportunities, which include parallelism and data locality benefits, require strong support from compilers as well as operating systems. Current compiler research targeting multicores mostly focuses on code restructuring and mapping. In this work, we explore automatic data layout transformation...

chapter

Improving energy efficiency of multi-threaded applications using heterogeneous CMOS-TFET multicores

Karthik Swaminathan, Emre Kultursay, Vinay Saripalli, Vijaykrishnan Narayanan, et al.

IEEE/ACM International Symposium on Low Power Electronics and Design > 247 - 252

2011 International Symposium on Low Power Electronics and Design (ISLPED)

Energy-Delay-Product-aware DVFS is a widely-used technique that improves energy efficiency by dynamically adjusting the frequencies of cores. Further, for multithreaded applications, barrier-aware DVFS is a method that can dynamically tune the frequencies of cores to reduce barrier stall times and achieve higher energy efficiency. In both forms of DVFS, frequencies of cores are reduced from the maximum...

chapter

A helper thread based dynamic cache partitioning scheme for multithreaded applications

Mahmut Kandemir, Taylan Yemliha, Emre Kultursay

2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC) > 954 - 959

2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC)

Focusing on the problem of how to partition the cache space given to a multithreaded application across its threads, we show that different threads of a multithreaded application can have different cache space requirements, propose a fully automated, dynamic, intra-application cache partitioning scheme targeting emerging multicores with multilayer cache hierarchies, present a comprehensive experimental...

chapter

Process variation-aware routing in NoC based multicores

Akbar Sharifi, Mahmut Kandemir

2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC) > 924 - 929

2011 48th ACM/EDAC/IEEE Design Automation Conference (DAC)

We propose a variation-aware source routing algorithm for a heterogeneous NoC where each router has a different operating latency, as a result of process variations. Our proposed scheme computes the best path for each communication, based on the inherent speed of the routers (dictated by process variations) and the current traffic pattern. Our results indicate that employing our proposed routing scheme...

chapter

Intra-application cache partitioning

Sai Prashanth Muralidhara, Mahmut Kandemir, Padma Raghavan

2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS) > 1 - 12

2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS)

Efficient management of shared on-chip resources such as the shared level 2 (L2) cache has become an important problem with the emergence of chip multiprocessors (CMPs). Partitioning the shared cache in chip multiprocessors (CMPs) among concurrently executing applications can provide important benefits such as throughput improvement, fairness guarantees, and quality of service (QoS) enhancements....

chapter

Feedback control for providing QoS in NoC based multicores

Akbar Sharifi, Hui Zhao, Mahmut Kandemir

2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010) > 1384 - 1389

2010 Design, Automation & Test in Europe Conference & Exhibition (DATE 2010)

In this paper, we employ formal feedback control theory to achieve desired communication throughput across a network-on-chip (NoC) based multicore. When the output of the system needs to follow a certain reference input over time, our controller regulates the system to obtain the desired effect on the output. In this work, targeting a multicore that executes multiple applications simultaneously, we...

