Search results

Items from 1 to 20 out of 273 results

chapter

Improvement of a TCP incast avoidance method using a fine-grained kernel timer

Shigeyuki Osada, Shogo Wakai, Yukinobu Fukushima, Tokumi Yokohira

2017 International Conference on Information and Communication Technology Convergence (ICTC) > 147 - 152

When a standard TCP implementation using the minimum retransmission timeout (RTOmin) of 200 ms is used in distributed file systems in data centers, a well-known throughput degradation called TCP Incast occurs, because 200 ms is too large as an RTOmin in data centers. In order to avoid TCP Incast, a TCP implementation using a much smaller RTOmin attained by a fine-grained kernel timer is proposed....

chapter

Enhancement in Data-Recovery and Re-Transmit Mechanisms of TCP

Mudassar Ahmad, Muhammad Asif Habib, Rehan Ashraf, Muhammad Shahid

2017 IEEE 42nd Conference on Local Computer Networks Workshops (LCN Workshops) > 183 - 187

Network performance is one of the most important entities in today’s long-distance networks. TCP congestion control mechanisms play an important role in these networks. Most of the current TCP congestion control mechanisms which are also known as TCP variants, detect congestion and slow down the packets transmission to avoid further congestion in the network. In this paper, three classes...

chapter

A network-centric TCP for interactive video delivery networks (VDN)

Md Iftakharul Islam, Javed I Khan

2017 IEEE 25th International Conference on Network Protocols (ICNP) > 1 - 6

Interactive video streaming requires very low latency and high throughput. Traditional latency-based congestion control algorithms perform poorly in fairness. This results in very poor video quality for adaptive video streaming. Software-defined networking (SDN) enables us to solve the problem by designing a network controller in the routers. This paper presents an SDN-centric TCP where sending rate of...

chapter

Accuracy/energy-flexible stochastic configurable 2D Gabor filter with instant-on capability

Naoya Onizawa, Kazumichi Matsumiya, Warren J. Gross, Takahiro Hanyu

ESSCIRC 2017 - 43rd IEEE European Solid State Circuits Conference > 43 - 46

This paper introduces an accuracy/energy-flexible configurable 2D Gabor filter based on stochastic computation, where bit streams representing information are used. The Gabor filters show a powerful feature extraction capability, but the calculation based on binary computation is complicated. As opposed to traditional memory-based methods that use fixed Gabor coefficients calculated by software in...

chapter

Mixed data layout kernels for vectorized complex arithmetic

Doru T. Popovici, Franz Franchetti, Tze Meng Low

2017 IEEE High Performance Extreme Computing Conference (HPEC) > 1 - 7

Implementing complex arithmetic routines with Single Instruction Multiple Data (SIMD) instructions requires the use of instructions that are usually not found in their real arithmetic counter-parts. These instructions, such as shuffles and addsub, are often bottlenecks for many complex arithmetic kernels as modern architectures usually can perform more real arithmetic operations than execute instructions...

chapter

POSTER: Accelerate GPU Concurrent Kernel Execution by Mitigating Memory Pipeline Stalls

Hongwen Dai, Zhen Lin, Chao Li, Chen Zhao, more

2017 26th International Conference on Parallel Architectures and Compilation Techniques (PACT) > 144 - 145

In this study, we demonstrate that the performance may be undermined in the state-of-the-art intra-SM sharing schemes for concurrent kernel execution (CKE) on GPUs, due to the interference among concurrent kernels. We highlight that cache partitioning techniques proposed for CPUs are not effective for GPUs. Then we propose to balance memory accesses and limit the number of inflight memory instructions...

chapter

Non-von-neumann heap for better streaming, capturing and storing of raw 8K video data

Mohamed Shaafiee, Rajasvaran Logeswaran

2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA) > 469 - 473

The advent of 8K and better resolutions of video pose problems for the capture and storage of data by these standards. The contemporary alternative is to compromise on quality and use various (often lossy) compression techniques to reduce the bandwidth required to move this data. This paper proposes a novel method for handling large volumes of video data without compromising its quality through space...

chapter

Enhancing VNF's performance using DPDK driven OVS user-space forwarding

Dani Vladislavic, Darko Huljenic, Julije Ozegovic

2017 25th International Conference on Software, Telecommunications and Computer Networks (SoftCOM) > 1 - 5

Network function virtualization (NFV) is a concept aiming to achieve telecom grade cloud ecosystem for new generation networks focusing on Capital and Operational expenditure (CAPEX and OPEX) savings. Keeping at least the same performances is one of the main requirements of the applications when being virtualized. This work presents a performance impact of Open Virtual Switch (OVS) user-space forwarding...

chapter

OpenCL-based design pattern for line rate packet processing

Jehandad Khan, Peter Athanas, Skip Booth, John Marshall

2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP) > 190 - 194

The ever changing nature of network technology requires a flexible platform that can change as the technology evolves. In this work, a complete networking switch designed in OpenCL is presented, identifying several high-level constructs that form the building blocks of any network application targeting FPGAs. These include the notion of an on-chip global memory and kernels constantly processing data...

chapter

Renovate high performance user-level stacks' innovation utilizing commodity network adaptors

Mao Miao, Xiaohui Luo, Fengyuan Ren, Wenxue Cheng, more

2017 IEEE Symposium on Computers and Communications (ISCC) > 906 - 911

Today's data center servers are equipped with high speed and complex network adaptors, featuring an array of functions, e.g. hardware TX/RX queues, packet filters, rate limiters, etc. Recent work like IX, Arrakis, MultiStack has made us rekindle the user-level network stacks' innovation utilizing these commodity network adaptors. In this paper, we revisit the idea to move stacks' design from in-kernel...

chapter

Fast and efficient implementation of Convolutional Neural Networks on FPGA

Abhinav Podili, Chi Zhang, Viktor Prasanna

2017 IEEE 28th International Conference on Application-specific Systems, Architectures and Processors (ASAP) > 11 - 18

State-of-the-art CNN models for Image recognition use deep networks with small filters instead of shallow networks with large filters, because the former requires fewer weights. In the light of above trend, we present a fast and efficient FPGA based convolution engine to accelerate CNN models over small filters. The convolution engine implements Winograd minimal filtering algorithm to reduce the number...

chapter

Taming Performance Degradation of Containers in the Case of Extreme Memory Overcommitment

Rina Nakazawa, Kazunori Ogata, Seetharami Seelam, Tamiya Onodera

2017 IEEE 10th International Conference on Cloud Computing (CLOUD) > 196 - 204

The efficiency of datacenters is an important consideration for cloud service providers to make their datacenters always ready for fulfilling the increasing demand for computing resources. Container-based virtualization is one approach to improving efficiency by reducing the overhead of virtualization. Resource overcommitment is another approach, but cloud providers tend to make conservative allocations...

chapter

A Communication-Aware Container Re-Distribution Approach for High Performance VNFs

Yuchao Zhang, Yusen Li, Ke Xu, Dan Wang, more

2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS) > 1555 - 1564

Containers have been used in many applications for isolation purposes due to the lightweight, scalable and highly portable properties. However, to apply containers in virtual network functions (VNFs) faces a big challenge because high-performance VNFs often generate frequent communication workloads among containers while the container communications are generally not efficient. Compared with hardware...

chapter

On Energy-Efficient Congestion Control for Multipath TCP

Jia Zhao, Jiangchuan Liu, Haiyang Wang

2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS) > 351 - 360

Multipath TCP (MPTCP) enables transmission via multiple routes for an end-to-end connection to improve resource usage of regular TCP. Due to the increasing concern in green computing, there has been significant interest in designing energy-efficient multipath transport. For existing MPTCP congestion control algorithms, the research community still lacks a comprehensive understanding of which components...

chapter

Rate-aware flow scheduling for commodity data center networks

Ziyang Li, Wei Bai, Kai Chen, Dongsu Han, more

IEEE INFOCOM 2017 - IEEE Conference on Computer Communications > 1 - 9

Flow completion times (FCTs) are critical for many cloud applications. To minimize the average FCT, recent transport designs, such as pFabric, PASE, and PIAS, approximate the Shortest Remaining Time First (SRTF) scheduling. A common, implicit assumption of these solutions is that the remaining time is only determined by the remaining flow size. However, this assumption does not hold in many real-world...

chapter

PACENet: Energy efficient acceleration for convolutional network on embedded platform

Adwaya Kulkarni, Tahmid Abtahi, Colin Shea, Amey Kulkarni, more

2017 IEEE International Symposium on Circuits and Systems (ISCAS) > 1 - 4

Lightweight convolutional neural network (CNN) on tiny embedded platforms can offer energy efficient solution for today's IoT devices. However, CNN implementation on embedded system faces processing bottleneck in convolutional layers and memory storage issues in fully connected layers. In past years, heterogeneous acceleration, where compute intensive tasks are performed on kernel specific cores,...

chapter

The actual cost of software switching for NFV chaining

Marcelo Caggiani Luizelli, Danny Raz, Yaniv Sa'ar, Jose Yallouz

2017 IFIP/IEEE Symposium on Integrated Network and Service Management (IM) > 335 - 343

Network Function Virtualization (NFV) is a novel paradigm that enables flexible and scalable implementation of network services on cloud infrastructure. An important enabler for the NFV paradigm is software switching, which should satisfy rigid network requirements such as high throughput and low latency. Despite recent research activities in the field of NFV, not much attention was given to understand...

chapter

MOCHA: Morphable Locality and Compression Aware Architecture for Convolutional Neural Networks

Syed Mohammad Asad Hassan Jafri, Ahmed Hemani, Kolin Paul, Naeem Abbas

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) > 276 - 286

Today, machine learning based on neural networks has become mainstream, in many application domains. A small subset of machine learning algorithms, called Convolutional Neural Networks (CNN), are considered as state-of-the-art for many applications (e.g. video/audio classification). The main challenge in implementing the CNNs, in embedded systems, is their large computation, memory, and bandwidth...

chapter

Co-Run Scheduling with Power Cap on Integrated CPU-GPU Systems

Qi Zhu, Bo Wu, Xipeng Shen, Li Shen, more

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) > 967 - 977

This paper presents the first systematic study on co-scheduling independent jobs on integrated CPU-GPU systems with power caps considered. It reveals the performance degradations caused by the co-run contentions at the levels of both memory and power. It then examines the problem of using job co-scheduling to alleviate the degradations in this less understood scenario. It offers several algorithms...

chapter

Clustering Throughput Optimization on the GPU

Michael Gowanlock, Cody M. Rude, David M. Blair, Justin D. Li, more

2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS) > 832 - 841

Large datasets in astronomy and geoscience often require clustering and visualizations of phenomena at different densities and scales in order to generate scientific insight. We examine the problem of maximizing clustering throughput for concurrent dataset clustering in spatial dimensions. We introduce a novel hybrid approach that uses GPUs in conjunction with multicore CPUs for algorithmic throughput...

Keywords:
KERNEL
THROUGHPUT
