In multi-tier storage systems with large amounts of data, most of the data is stored on inexpensive slower tiers such as cloud or tape to achieve cost savings. This also implies that retrieving the data from the slower storage tiers incurs high latency. Therefore, it would be beneficial to proactively prefetch data from slower tiers to faster tiers by predicting future data accesses. State-of-the-art...
Sales forecasting is widely recognized, and it has evidently improved the quality of business strategy. It is necessary to understand the sales trend of a service provider business to monitor or predict future income or profit loss. COPYTRADE is a photocopier service provider that houses its branches near the schools and inside the malls where students are commonly found. Since there were several branches...
Continuous training is crucial for creating and maintaining the right skill profile for an industrial organization's workforce. There is a tremendous variety in the trainings available within an organization: technical, project management, quality, leadership, domain-specific, soft skills, etc. Hence it is important to assist the employee in choosing the best trainings, which perfectly suit her background,...
Nowadays, a hot challenge for supermarket chains is to offer personalized services to their customers. Market basket prediction, i.e., supplying the customer a shopping list for the next purchase according to her current needs, is one of these services. Current approaches are not capable of capturing at the same time the different factors influencing the customer's decision process: co-occurrence,...
Secondary use of biomedical data has gained much attention recently to facilitate rapid knowledge discovery in biomedicine. Association Rule Mining (ARM) has been a popular technique for biomedical researchers to perform exploratory data analysis and discover potential relationships among variables in biomedical datasets. However, ARM of a high-dimensional biomedical dataset may produce a large number...
A cataract is a clouding of the eye lens, and studies have reported many risk factors for the development of cataracts. However, the cumulative effect of multiple factors, along with clinical and systemic disease conditions, has not been adequately tested due to a limitation in methodology. The collection of a large volume of Electronic Health Records (EHR) offers an opportunity to apply computational tools...
Recommender systems, which help users explore new items in many applications, have attracted much attention in recent decades. As a popular technique in recommender systems, item recommendation works by recommending items to users based on their historical interactions. Conventional item recommendation methods usually assume that users and items are stationary, which is not always the case in...
Our research group at Nagoya Institute of Technology is developing “MMDAgent” as a voice interaction toolkit. Using MMDAgent, system developers can create various speech dialogue contents. When developers create voice interaction contents, it is important to consider user needs. Therefore, an approach is necessary to elicit preference information of the user. In this paper, we propose a method to...
With the rapid development of hospital information technologies, more and more hospitals are building electronic medical record (EMR) systems, which provide a comprehensive source for medical data mining and analysis. Most current EMR systems adopt a mixed structure. On the other hand, most data mining algorithms are designed for highly structured data. In this paper, we study the problem of interesting...
Software change recommendation seeks to suggest artifacts (e.g., files or methods) that are related to changes made by a developer, and thus identifies possible omissions or next steps. While one obvious challenge for recommender systems is to produce accurate recommendations, a complementary challenge is to rank recommendations based on their relevance. In this paper, we address this challenge for...
Frappé is a code comprehension tool developed by Oracle Labs that extracts the code dependencies from a codebase and stores them in a graph database, enabling advanced comprehension tasks. In addition to traditional text-based queries, such context-sensitive tools allow developers to express navigational queries of the form “Does function X or something it calls write to global variable Y?”, providing...
It is increasingly common to use function words as an important text feature of Chinese, as in Li Xianping's research on “A Dream of Red Mansions”. But using all function words as a feature is not particularly effective for distinguishing writers' writing styles. Our study finds that using the classical Chinese function words and sentence tail function words as a feature is better than differentiated...
We present the Modernizing Analytics for MELanoma (MAMEL) dataset: a real-world, dermatology-specific research dataset specifically crafted to advance data mining and machine learning research in the field of melanoma diagnosis, analysis, and treatment. This dataset was collected and curated from Modernizing Medicine’s EMA Dermatology™ application, a cloud-based Electronic Health Record (EHR) platform...
We propose and demonstrate an approach for the often attempted problem of market prediction. We restrict our study to a widely purchased and well recognized commodity, crude oil, which experiences significant volatility. Robust debate exists over the applicability of the weak and semi-strong versions of the Efficient Market Hypothesis (EMH) to financial markets. In this paper we train nine learners...
Large amounts of time-series data are frequently used to extract useful patterns and trends and to visualize them for better understanding. This work focuses on visualizing personal lifelogging data for tracing back personal histories. To that end, we present several similarity measures between multi-dimensional data at two different time points. For human evaluation, the method has...
As e-commerce has grown quickly in recent years, more and more people have become used to this popular channel for purchasing products and services on the Internet. Therefore, it becomes very important for shopping sites to predict precisely which items their customers will buy so as to increase sales or improve customer satisfaction. Traditional algorithms such as Collaborative Filtering have been...
One of the most common causes of bugs is overlooking changes. To prevent bugs and improve the quality of the products, numerous studies have been undertaken on change guides based on logical couplings extracted from developers' past process histories, such as change history. While valuable change rules based on logical couplings can be gleaned from the change history, these rules often fail...
Refactoring is an important technique for improving the maintainability of software, and developers often use it during the development process. Researchers have previously proposed some techniques for finding refactoring opportunities for developers. Finding refactoring opportunities means identifying locations to be refactored. However, there are no specific criteria for developers to determine...
Social bots are regarded as the most common kind of malware on social platforms. They can produce fake messages, spread rumours, and even manipulate public opinion. Recently, massive numbers of social bots have been created and widely spread on social platforms, bringing negative effects to public and netizen security. Bot detection aims to distinguish bots from humans, and it has attracted more and more attention in recent...
Each commit in a version control repository should include code changes for only a single task. However, in real repositories, there are many commits that cover multiple tasks and many tasks that are split into multiple commits. We call the latter split commits. In this research, we first investigate how many and what kinds of split commits are included in repositories. Then, we classify the found split commits...