We present the results of an experiment to assess the validity of prior polarities available in sentiment lexicons. We designed a ranking task, elicited through pairwise comparisons, and compared the results to those predicted by two popular sentiment lexicons. The results show a moderate level of agreement between the lexicons and human judgments.
Background: Code smells are indicators of quality problems that make software hard to maintain and evolve. Given the importance of smells for the maintainability of source code, many studies have explored the characteristics of smells and analyzed their effects on software quality. Aim: We aim to investigate fundamental characteristics of code smells through an empirical study on frequently...
Driven by the dramatic growth of data in both size and number of sources, learning from heterogeneous data is emerging as an important research direction for many real applications. One of the biggest challenges of this type of problem is how to integrate heterogeneous data meaningfully so as to considerably improve the generality and quality of the learning model. In this paper, we first present a unified...
Social media serves as a unified platform for users to express their thoughts on subjects ranging from their daily lives to their opinion on consumer brands and products. These users wield an enormous influence in shaping the opinions of other consumers and influence brand perception, brand loyalty and brand advocacy. In this paper, we analyze the opinion of 19M Twitter users towards 62 popular industries,...
In this paper, we study relation ranking and object classification for multi-relational data, where objects are interconnected by multiple relations. The relations among objects should be exploited to achieve good classification. While most existing approaches exploit the relations either by directly counting the number of connections among objects or by learning the weight of each relation from labeled data...
In order to balance the number of outbound and inbound tasks of each aisle and improve the working efficiency of the multi-tier shuttle system, three principles of storage location assignment were put forward: the principle of minimum correlation degree, the principle of equalization of product items and the principle of equalization of tasks, which were expected to be followed when storing products...
With the rising popularity of smartphones and the rapid growth of mobile applications, understanding the app usage behavior of mobile users is of growing importance for both app designers and service providers. Unlike previous studies that mine the correlation between apps and physical-world factors, e.g. location and time, in this paper we focus on the interdependency among apps and try to...
The growth of digital technologies results in the growth of digital crimes. Digital forensics aims to collect crime-related evidence from various digital media and analyze it. This survey reviews several tools and methods in the literature which extract pieces of evidence from the system and analyze them. It also discusses the challenges during the collection and analysis of low-level data from the...
Principal component analysis (PCA) makes it possible to extract, from a data matrix describing several objects by a large number of features, only 1–3 vectors containing 90–95% of the information. The problem of estimating these principal components is usually solved by the iterative NIPALS procedure or the algebraic SVD procedure; however, both of these methods often give ambiguous estimates. For the purpose of elimination...
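The abstract above names NIPALS as one of the standard iterative procedures for estimating principal components. A minimal sketch of the textbook NIPALS iteration for the first component, in plain Python, might look like the following. It is not the method the authors propose (that part of the abstract is truncated); the function names, tolerance, and toy data are illustrative assumptions.

```python
import math


def center_columns(X):
    """Subtract each column's mean so the data matrix is mean-centered."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[x - m for x, m in zip(row, means)] for row in X]


def nipals_first_component(X, tol=1e-12, max_iter=1000):
    """Return (scores t, loadings p) of the first principal component.

    X must be mean-centered and must not be all zeros.
    """
    n, m = len(X), len(X[0])
    # Start from the column with the largest sum of squares.
    col_ss = [sum(row[j] ** 2 for row in X) for j in range(m)]
    j0 = col_ss.index(max(col_ss))
    t = [row[j0] for row in X]
    p = [0.0] * m
    for _ in range(max_iter):
        tt = sum(v * v for v in t)
        # Loadings: p = X^T t / (t^T t), then normalize to unit length.
        p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        norm = math.sqrt(sum(v * v for v in p))
        p = [v / norm for v in p]
        # Scores: t = X p.
        t_new = [sum(X[i][j] * p[j] for j in range(m)) for i in range(n)]
        change = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if change < tol * sum(v * v for v in t):
            break
    return t, p


def explained_fraction(X, t):
    """Share of the total sum of squares in X captured by the scores t."""
    total = sum(v * v for row in X for v in row)
    return sum(v * v for v in t) / total


if __name__ == "__main__":
    # Hypothetical toy data: 10 objects described by 2 features.
    raw = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
           [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]
    X = center_columns(raw)
    t, p = nipals_first_component(X)
    print("loadings:", p)
    print("explained fraction:", explained_fraction(X, t))
```

Note that the pair (t, p) is determined only up to a common sign: (-t, -p) describes the same component. Whether this is the ambiguity the abstract refers to cannot be told from the truncated text.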
The construction of a knowledge graph of dangerous goods (KGDG) is of great significance for inferring related information about dangerous goods (DG), developing corresponding policies for their storage and transport, preventing disasters caused by dangerous goods, and providing emergency plans when a disaster happens. Since distributed representation of natural language is an effective method for knowledge...
Automatic essay evaluation (AEE) systems are designed to assist a teacher in the task of classroom assessment in order to alleviate the demands of manual subject evaluation. However, although numerous AEE systems are available, most of these systems do not use elaborate domain knowledge for evaluation, which limits their ability to give informative feedback to students and also their ability to constructively...
PLS is widely used for quality control in process systems, but it performs poorly in fault diagnosis for systems with strong local nonlinearity. To enhance the monitoring of this type of fault, a novel statistical model based on global plus local projection to latent structures (GPLPLS) is proposed. Firstly, the characteristics and nature of quality-related global and local partial least squares...
Cloud computing represents one of the most significant shifts in information technology, and it enables cloud-based security services such as Security-as-a-Service (SECaaS). As cloud computing technologies improve, the traditional SIEM paradigm can shift to cloud-based security services. In this paper, we propose a SIEM architecture that can be deployed to the SECaaS platform...
Today's high-performance computing (HPC) systems are heavily instrumented, generating logs containing information about abnormal events, such as critical conditions, faults, errors and failures, system resource utilization, and about the resource usage of user applications. These logs, once fully analyzed and correlated, can produce detailed information about the system health, root causes of failures,...
Image Multi-label Classification (IMC) assigns a label or a set of labels to an image. The high demand for image annotation and archiving on the web has led researchers to develop many algorithms for this application domain. Multi-Instance Multi-Label Learning (MIML) is an important machine learning framework proposed recently for IMC. In this framework, an image is described with...
Blind signal extraction is particularly attractive for solving signal mixture problems when only one or a few source signals are desired. Many desired biomedical signals exhibit distinct periods. A sequential method based on second-order statistics is introduced in this paper. One can choose to recover one source signal or all signals in a specific order. The validity and performance of the proposed...
Spatial clustering analysis is a spatial data mining tool for exploring the spatial autocorrelation of things that occur in a particular space, that is, whether the observed values of the spatial variables are related to the spatial position where they occur. In this paper, cigarette data for 2013 in Guizhou province, China, were collected, and the spatial correlation analysis was carried out...
The analysis of biological data is a challenging problem in the fields of bioinformatics and data mining. Given the complexity of this analysis, several methods have been proposed for analyzing biological information in databases, mostly in the form of genetic sequences and protein structures. Actually, genetic sequences are represented by matrices that indicate the expression...
This paper proposes an improvement to the PageRank algorithm. Most existing PageRank algorithms expect a strong correlation among consecutively accessed webpages, which in reality should be a fuzzy relationship when a user accesses pages on an arbitrary basis. We mine data from search-behavior logs by analyzing chronological sequential patterns, and cluster all webpages using fuzzy C clustering...
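For orientation, the baseline that the abstract above sets out to improve can be sketched as the standard PageRank power iteration. This is the classical algorithm only, not the authors' fuzzy, log-based variant (which is truncated in the abstract); the function name, the damping factor of 0.85, and the toy link graph are illustrative assumptions.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Classical PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Pages without outgoing links spread their rank evenly over all pages.
    """
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by dangling pages is redistributed uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            if targets:
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Hypothetical four-page web: every page except "d" has an incoming link.
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(graph).items()):
        print(page, round(score, 4))
```

In this baseline every outgoing link of a page receives an equal share of its rank, which is the kind of rigid page-to-page relationship the abstract contrasts with a fuzzy one.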
In the era of the Internet, people are active in multiple online services, and they usually have accounts on more than one online service. Each account is a virtual identity of the user. In order to trace an individual's online behavior at any time and place, linking virtual identities belonging to the same natural person across different online service domains is very important. Existing methods...