This paper discusses machine learning and data mining approaches to analyzing maritime vessel traffic based on the Automatic Identification System (AIS). We review recent efforts to apply machine learning techniques to AIS data and put them in the context of the challenges posed by the need for both algorithmic performance generalization and interpretability of the results in real-world maritime Situational...
Constructing accurate models that represent the underlying structure of Big Data is a costly process that usually involves a compromise between computation time and model accuracy. Methods addressing these issues often employ parallelisation to handle processing. Many of these methods target the Support Vector Machine (SVM) and provide a significant speedup over batch approaches. However, the...
Drivers use kernel extension functions to manage devices, and there are often many rules on how these functions should be used. Among these rules, the use of paired functions, i.e., functions that must be called in pairs across two different functions, is extremely complex and important. However, such pairing rules are not well documented, and they can easily be violated by programmers when...
In recent years, the world has experienced huge growth in the volume of shared web texts. Web users generate a huge volume of comments and reviews every day, related to different aspects of their lives. In general, opinion mining/sentiment analysis refers to the task of identifying positive and negative opinions, emotions and evaluations related to articles, news, products, services, etc. [1]. Arabic...
With the rapid maturing of internet and web technology over the last decades, the number of Indonesian online news articles on the web is growing at a pace we have never experienced before. In this paper, we introduce a combination of rule-based and machine learning approaches to find the sentences that contain tropical disease information, such as the incidence date and the number of casualties,...
In our research, we study the classification of music preference based on brain waves. We assume that there is a clear difference between the brain waves recorded when hearing favorite music and those recorded when hearing disliked music, and we collect the brain waves of human subjects while they listen to the music. There are two collection methods: one separates favorite and disliked music clips, and the other mixes them...
We describe a new computational approach to edge detection. Our digital algorithm emulates the propagation of light through a physical medium with a specific nonlinear diffractive property. The method uses the phase profile of the output complex-amplitude image to identify edges of different strengths in a digital image. This technique is related to the recently introduced Discrete Anamorphic Stretch...
Semantic relation extraction is an important part of information extraction. It has application value in automatic question answering systems, retrieval systems, ontology learning, semantic web annotation, and many other areas. Previous semi-supervised semantic relation extraction based on bootstrapping uses context patterns as its pattern representation, but it did not consider the role of the...
In recent years, data mining techniques have been used to identify companies that issue fraudulent financial statements. However, most of the research conducted thus far uses datasets that are balanced. This does not always represent reality, especially in fraud applications. In this paper, we demonstrate the effectiveness of cost-sensitive classifiers to detect financial statement fraud using South...
With the recent advent of ubiquitous computing and sensor technologies, human mobility data can be acquired for monitoring and analysis purposes, e.g., daily routine identification. Mining mobility data is challenging due to the spatial and temporal variations of human mobility, even for the same activity. In this paper, we propose a methodology to first summarize indoor human mobility traces...
Clustering is an important analysis method commonly used in many areas, including data mining, image processing, statistics, biology, and machine learning. In this paper, we introduce a novel and effective clustering method based on Euclidean distance, called Self-Increase Clustering (SIC), for detecting well-separated clusters that can be either convex or non-convex sets. Unlike most of the prevalent clustering...
As many real-world data can elegantly be represented as graphs, various graph kernels and methods for computing them have been proposed. Surprisingly, many of the recent graph kernels do not employ the kernel trick anymore but rather compute an explicit feature map and report higher efficiency. So, is there really no benefit of the kernel trick when it comes to graphs? Triggered by this question,...
With the emergence of networked data, graph classification has received considerable interest in recent years. Most approaches to graph classification focus on designing effective kernels to compute similarities for static graphs. However, they become computationally intractable in terms of time and space when a graph is presented in an incremental fashion with continuous updates, i.e., insertions...
Steganography is the practice of hiding information so that data remains safe and untouched by eavesdroppers. In this paper we demonstrate a way to transmit data from sender to receiver without interception by an eavesdropper, using a new steganographic technique. We use an audio file to hide our data, because changes made to audio are difficult to notice. Audio files in WAV format are represented...
As an important branch of biomedical information extraction, Protein-Protein Interaction extraction (PPIe) from the biomedical literature has been widely researched, and machine learning methods have achieved great success on this task. However, the word feature generally adopted in existing methods suffers badly from the vocabulary gap and data sparseness, weakening classification performance....
Physical memory forensics has grown in popularity in recent years. Since malware typically operates in user space, it is important to reconstruct and track its process behavior. This paper focuses on detecting malware through a comparison of the information in the user space memory data structures. In order to expedite information extraction and ensure accuracy, the data in multiple memory management...
This paper presents a new sequential clustering algorithm based on sequential hard c-means clustering. The term sequential cluster extraction means that the algorithm extracts one cluster at a time. Sequential hard c-means is one of the typical, conventional sequential clustering methods. The proposed new sequential clustering algorithm is based on Dave's noise clustering approach. A characteristic...
Chronic obstructive pulmonary disease (COPD) is a complex disorder ranked as the third leading cause of death worldwide. So far, we know that this disease is progressive and cannot be cured. In recent years, although some genes have been reported to be associated with COPD, the overlapping genetic associations cannot be replicated. Therefore, it is difficult to synthesize and interpret these different...
Considering the low accuracy of existing traffic sign recognition methods, a new traffic sign recognition method using histogram of oriented gradients - support vector machine (HOG-SVM) and grid search (GS) is proposed. First, the histogram of oriented gradients (HOG) is used to extract the features of traffic signs. Then the grid search technique is applied to optimize the parameters of support...
Since mirror-like odd and even features in face recognition reflect the symmetrical and asymmetrical image information, respectively, their proper combination can improve recognition rates to some extent. However, the face imaging process can easily be affected by external factors and pick up noise, which degrades the performance of face recognition based on combinational mirror-like odd...