Hyperspectral images (HSIs) provide hundreds of narrow spectral bands for land covers and can therefore provide more powerful discriminative information for land-cover classification. However, HSIs suffer from the curse of high dimensionality, so dimension reduction and feature extraction are essential for the application of HSIs. In this paper, we propose an unsupervised feature extraction...
Magnitude-only resting-state fMRI data have been widely investigated via independent component analysis (ICA) for extracting spatial maps (SMs) and time courses. However, the native complex-valued fMRI data have rarely been studied. Motivated by the significant improvements achieved by ICA of complex-valued task fMRI data over magnitude-only task fMRI data, we present an efficient method for de-noising...
With more companies turning towards cloud computing for storage and processing of their data, the security of the cloud becomes essential. However, cloud computing is vulnerable to many security threats, including data leakage, compromised credentials, the presence of unauthorized users or entities, the execution of insecure applications or application programming interfaces (APIs), shared technology vulnerabilities,...
Data mining is an efficient technique for knowledge discovery from existing databases. The performance of existing algorithms degrades when they are applied to imbalanced datasets. The imbalanced nature of Twitter datasets also hinders efficient knowledge discovery. In this paper, we propose an efficient approach for knowledge discovery from imbalanced datasets specifically designed for opinion...
Software Product Line Engineering is a key approach to constructing applications with systematic reuse of architecture, documents, and other relevant components. To migrate legacy software into a product line system, it is essential to identify the code segments that should be constructed as features from the source base. However, this could be an error-prone and complicated task, as it involves exploring...
Infrastructure as Code (IaC) is the practice of specifying computing system configurations through code and managing them through traditional software engineering methods. The wide adoption of configuration management and the increasing size and complexity of the associated code call for assessing, maintaining, and improving the configuration code's quality. In this context, traditional software engineering...
The paper presents an analysis of developer commit logs for GitHub projects. In particular, developer sentiment in commits is analyzed across 28,466 projects within a seven-year time frame. We use the Boa infrastructure's online query system to generate commit logs as well as files that were changed during the commit. We analyze the commits in three categories (large, medium, and small) based on the...
Diabetes mellitus is a group of metabolic diseases characterized by hyperglycemia resulting from defects in insulin secretion, insulin action, or both. Diabetes mellitus has become a major health problem among people of all ages globally. Early diagnosis of diabetes-related heart, kidney, and eye complications is difficult and challenging. Data mining techniques are applied...
Latent Semantic Analysis (LSA) is a novel method for extracting the principal components of a text corpus and was initially used for categorization and information search. However, because of the significant results obtained, which are similar to human processing, LSA has become much more than a simple method for analyzing text. In this work, we propose to use LSA to infer the degree of similarity of syslog messages...
Nowadays, APT attacks pose an extreme threat and challenge to network information security. Based on an analysis of big data techniques, the paper presents an APT security protection framework that integrates deep and three-dimensional defense strategies. In addition, big data are used to explore and analyze possible APT attacks as well as to locate and track threats.
Data mining is an emerging field of research in Information Technology as well as in agriculture. The present study focuses on the applications of data mining techniques in tea plantations in the face of climate change, to help farmers make farming decisions and achieve the expected economic return. This paper presents an analysis using data mining techniques for estimating the future yield...
Nowadays, the analysis of social networks and of community evolution has become a hotly discussed topic in the social computing field. In this paper, we focus on mining and tracking dynamic communities based on social network analysis. Based on a generic framework for dynamic community discovery, a computational approach is developed to extract users' static and dynamic features for...
Normally, when developers obtain a list of defects from users, the development team decides which defects should be fixed first. The software maintenance plan, which consists of a list of defects to be fixed sequentially, is mostly generated using developer experience to prioritize the defects. With the current strategy, the software maintenance plan may not serve customer needs well. This research...
The association rules mining process enables the end users to analyze, understand, and use the extracted knowledge in an intelligent system or to support the decision-making processes. To find valuable association rules from a large number of redundant rules, this paper proposes a deeper mining process, multi-mode and high value association rules mining (MH-ARM). This method takes into account the...
A program crash is an annoying experience for users. Whenever a program crashes, an event log is generated. Sometimes built-in crash reporting programs send crash reports automatically to the developers' site, whereas at other times the user is presented with an option to report the crash themselves. This reporting is often useful for the development team to diagnose and fix the problem. It happens quite often...
We propose a traffic jam prediction method based on mining frequent patterns correlated with traffic jams. For traffic jam prediction at a given sensor, we first apply a one-dimensional clustering scheme to automatically identify which sensors are correlated with the given sensor, and to what degree, in the sense that certain volume values with a compact distribution co-occur frequently with the traffic jams...
With no limit on time and location [1], the number of users attracted by massive open online courses (MOOCs) has increased rapidly, and many platforms have been built to provide a variety of courses. All of these trigger an explosive growth in data volume. As we know, people have encountered big data in many areas and have proposed many techniques and methods to deal with them. However, people still have no sense...
This paper presents a novel scheme for precisely extracting knowledge from complex, massive streams of live scene data from crowded places. The prime contribution of the proposed system is to perform enough processing over the raw and unstructured distributed data from multiple locations so that processing over distributed storage and mining can be done with...
Given a very large dataset of moderate-to-high dimensionality, how can useful patterns be mined from it? In such cases, dimensionality reduction is essential to overcome the "curse of dimensionality". Although algorithms exist to reduce the dimensionality of Big Data, they all fail to identify or eliminate non-linear correlations between attributes. This paper tackles the problem...
Time series classification is an important task in data mining that has traditionally been addressed with similarity-based classifiers. The 1-NN DTW is typically considered the most accurate model for temporal data. Nevertheless, some authors have recently proposed ingenious alternatives to the 1-NN DTW by using diverse time series representations or by using DTW for feature extraction...