Accurate capacity measurement of Internet services is critical to ensure high-performing production computing environments. In this work, we present our solution for performing accurate capacity measurement. Referred to as "Redliner", it uses live traffic in production environments to drive the measurement, hence avoiding many pitfalls that prevent capacity measurement from obtaining accurate...
Accurate capacity measurement of Internet services is critical to ensure high-performing production computing environments. In this work, we present our solution for performing accurate capacity measurement. Referred to as “LiveRedliner”, it uses live traffic in production environments to drive the measurement, hence avoiding many pitfalls that prevent capacity measurement from obtaining accurate values...
The Linux kernel feature cgroups (Control Groups) is increasingly being adopted for running applications in multi-tenant environments. Many projects (e.g., Docker) rely on cgroups to isolate resources such as CPU and memory. It is critical to ensure high performance for such deployments. At LinkedIn, we have been using cgroups and have investigated their performance. This work presents our findings about...
Large-scale web services like LinkedIn serve millions of users across the globe. The user experience depends on the high availability and performance of these services. Capacity measurement is therefore critical for such cloud services. Resources should be provisioned such that the service can easily handle peak traffic without experiencing bottlenecks or compromising on latency. In addition,...
Today's applications increasingly use memory-mapped files to manage large volumes of data, in the hope of enjoying the performance benefits of memory mapping over traditional file IO. Memory-mapped files use the OS page caching mechanism to save expensive system calls and copying. However, as we find, naive use of memory-mapped files can cause severe performance problems due to...
For enterprise applications that deal with large-scale data, storage IO is often the performance bottleneck. SSD (Solid State Drive) is increasingly being adopted by companies/applications to alleviate this IO bottleneck. However, migrating from HDD (Hard Disk Drive) to SSD is not justified for every application/product, as such migration will incur more business cost due to SSD's higher...
The increasing adoption of Big Data in business environments has driven the need for real-time stream joining. Multi-stream joining is an important type of stream processing in today's Internet companies, and it has been used to generate higher-quality data in business pipelines. Multi-stream joining can be performed in two models: (1) All-In-One (AIO) Joining and (2) Step-By-Step (SBS) Joining...
For PaaS-deployed (Platform as a Service) customer-facing applications (e.g., online gaming and online chatting), ensuring low latency is not just a preferred feature but a must-have. Given the popularity and power of the Java platform, a significant portion of today's PaaS platforms run Java. The JVM (Java Virtual Machine) manages a heap space to hold application objects. The heap space...
SSD (Solid State Drive) is being increasingly adopted to alleviate the IO performance bottlenecks of applications. Numerous measurement results have been published to showcase the performance improvement brought by SSD as compared to HDD (Hard Disk Drive). However, in most deployment scenarios, SSD is simply treated as a "faster HDD". Hence its potential is not fully utilized...
Data quality is essential in the big data paradigm, as poor data can have serious consequences when dealing with large volumes of data. While it is trivial to spot poor data in small-scale, offline use cases, it is challenging to detect and fix data inconsistency in a large-scale, online (real-time or near-real-time) big data context. An example of such a scenario is spotting and fixing poor data using...
Modern cloud computing platforms (e.g., Linux on Intel CPUs) feature an ACPI-based (Advanced Configuration and Power Interface) mechanism that dynamically scales CPU frequencies/voltages based on workload intensity. With this feature, CPU frequency is reduced when the workload is relatively light in order to save energy, and increased when the workload intensity is...
Ensuring low replication latency of database events is business-critical but challenging with big data. A proposed capacity-planning model helps achieve this goal by forecasting future traffic rates, predicting replication latency, and determining required replication capacity. The Web extra at http://youtu.be/ZupPlrS8dGA is a video in which author Zhenyun Zhuang demonstrates Naarad, an open-source...
Cloud computing promises a cost-effective and administration-effective solution to the traditional needs of computing resources. While shared hardware and software bring efficiency to users, the multi-tenancy characteristics also bring unique challenges to the backend cloud platforms. In particular, the JVM mechanisms used by Java applications, coupled with OS-level features,...