With large elastic and scalable infrastructures, the Cloud is the ideal storage repository for Big Data applications. Big Data is typically characterized by three V's: Volume, Variety and Velocity. Supporting these properties raises significant challenges in a cloud setting, including partitioning for scale out; replication across data centers for fault-tolerance; significant latency overheads due...
Recently, jump-oriented programming (JOP) attacks have become widespread in various systems, including servers, desktops, and smart devices. A JOP attack rearranges existing code snippets in a program to build gadget sequences, and hijacks the program's control flow by chaining and executing those gadget sequences consecutively. However, existing defense schemes have limitations such as high execution overhead, high...
In this paper, we consider the tensor decomposition (TD) of Toeplitz Jacket (TJ) matrices for big data processing using the conventional higher-order singular value decomposition (HOSVD) algorithm and tensor train (TT) decomposition. To apply the HOSVD algorithm and TT decomposition, we reshape the given matrix into a tensor. Due to the property of Toeplitz matrices, we use a truncated...
Recent studies show that ubiquitous smartphone data, e.g., the universal cell tower IDs, WiFi access points, etc., can be used to effectively recover individuals' mobility. However, recording and releasing the data containing such information without anonymization can hurt individuals' location privacy. Therefore, many anonymization methods have been used to sanitize these datasets before they are...
M81 and B95-8 are distinct strains of Epstein-Barr virus (EBV). Because M81 targets epithelial cells whereas B95-8 targets B cells, and the common EBV vaccine is effective only against B95-8, the two strains evidently have different characteristics. In this paper, we analyzed DNA sequences using three algorithms: Apriori, decision tree, and support vector machine...
We introduce a question-answering system that responds to a keyword query by extracting information from linked data and generating reports in natural language (NL). Using entity disambiguation and distributed word similarity, we matched each keyword to a related entity and property in linked data. To extract keyword-related information, we used the entity and property to generate a SPARQL query...
Due to the growing number of unlabeled documents, it is becoming important to develop unsupervised methods capable of automatically extracting information. Topic models and neural networks represent two such methods, and parameter approximation algorithms are typically employed to estimate their parameters because the parameters cannot be computed exactly. One of...
A number of reasoning studies on big ontologies have been carried out in recent years. However, most of the existing studies have focused heavily on Hadoop MapReduce. In this paper, we propose a reasoning approach for Resource Description Framework Schema (RDFS) that employs optimized methods based on Spark. Spark is a general distributed in-memory framework for large-scale data processing that is...
A question answering (QA) system constructs answers to questions automatically by querying a structured database, known as a knowledge base, or an unstructured collection of documents. Paraphrase approaches are widely used to solve paraphrastic problems in natural language QA systems. In machine-learning-based Korean paraphrasing, the system requires a large-scale mono/bi-lingual corpus....
What emotion do we feel when we see a situation? Multimodal sentiment analysis has been used to answer this question, but most of the research considers only low-level perceptual information such as textual, acoustic, and visual features. However, these features are not appropriate for the classification of situations as it is difficult to depict real-life complexities with low-level features. In...
Many datasets used for clustering are 2-dimensional matrices, including sets of documents, citation networks, and web graphs. However, many real-world datasets have three or more modes, which require at least 3-dimensional matrices or tensors. Focusing on the clustering algorithm known as cross-association, we extend the algorithm to deal with a 3-dimensional matrix. Our proposed method is fully...
Counting triangles in networks is a fundamental problem in network science. In addition, because we are forced to manage very large real-world networks, current triangle counting algorithms naturally require a distributed computing system. In this paper, we propose a distributed triangle counting algorithm based on both the vertex-centric and node-iterator models and using the multi-level branching...
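For orientation, the node-iterator idea named in the abstract above can be sketched on a single machine. This is a minimal illustration under stated assumptions, not the paper's distributed algorithm: the function name `count_triangles` and the edge-list input format are hypothetical, and the multi-level branching technique is not shown.

```python
from collections import defaultdict
from itertools import combinations


def count_triangles(edges):
    """Count triangles in an undirected simple graph given as (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    total = 0
    for v, neighbors in adj.items():
        # Node iterator: for each vertex, test pairs of its neighbors.
        # Restricting to neighbors larger than v counts each triangle
        # exactly once, at its smallest vertex.
        higher = sorted(n for n in neighbors if n > v)
        for a, b in combinations(higher, 2):
            if b in adj[a]:
                total += 1
    return total
```

Vertex identifiers only need to be mutually comparable. A distributed variant would partition the outer loop over vertices across workers.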
Due to the huge number of research articles in the biomedical domain, it is becoming more and more important to develop methods for finding articles relevant to specific research interests. Keyword extraction is a useful method for finding important topics in documents and summarizing their major information. Unfortunately, it is hard to select appropriate keywords extracted by the traditional method of keyword...
A large number of corporations are jointly developing various Cloud platforms. However, problems have occurred in resource management because the multiple platforms were not integrated when the Cloud federation service was developed. If a user works with heterogeneous platforms, resource management becomes difficult. Therefore, there is a need to develop a federation monitoring...
The number of users of location-based services (LBS) is increasing rapidly along with the proliferation of mobile devices such as the smartphone. However, LBS users are concerned about their privacy because the collected individual location information can lead to privacy violations. Therefore, it is no wonder that a lot of research is being conducted on topics such as location k-anonymity and...
Mutations in DFNA5 are known to cause autosomal dominant non-syndromic hearing loss (ADNSHL). To date, DFNA5 has mostly been studied using DNA samples from particular families. Using three Homo sapiens deafness sequences provided by NCBI, we investigated amino acid sequences in order to analyze relationships among mutants. We used the Apriori algorithm to find common amino acid which indicates...
In recent years, the amount of data produced by mobile devices and the Internet has increased rapidly. To facilitate the storage of such a large amount of data, open-source-based cloud storage services have also increased. However, most administration tools specialized in managing open-source-based storage services have shortcomings such as a lack of sufficient features and difficulty in...
As the demand for stereo 3D content increases in various fields, many types of devices for displaying stereo content have been developed, such as TVs, laptops, and mobile devices. Thus, efficient rendering methods that can process massive stereo data on those devices are becoming more important, as is reducing the viewer's visual discomfort. In this paper, we analyze binocular...
We developed the Therapeutic Lifestyle Change Decision Aid (TLC DA) system to support an informed choice about which behavior change to work on when multiple unhealthy behaviors are present. The system collects a significant amount of information, which is used to generate tailored messages that persuade consumers to follow certain healthy lifestyles. One of the current limitations of the...
In this paper, we present an approach to perform reasoning for scalable OWL ontologies in a Hadoop-based distributed computing cluster. Rule-based reasoning is typically used for scalable OWL-Horst reasoning; the system repeatedly performs many operations involving semantic axioms on big ontology triples until no further data can be inferred. Thus, the reasoning systems suffer from performance...
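The "repeat until no further inferred data exists" loop described above is a fixed-point iteration. The sketch below illustrates that loop only; it assumes two simplified RDFS-style rules (subclass transitivity and type propagation) rather than the OWL-Horst rule set, runs in memory rather than on Hadoop, and uses a hypothetical function name and plain-string predicates.

```python
def infer_closure(triples):
    """Apply two illustrative rules until no new triple is inferred."""
    facts = set(triples)
    while True:
        new = set()
        sub = [(s, o) for s, p, o in facts if p == "subClassOf"]
        for a, b in sub:
            # Rule 1 (assumed): subClassOf is transitive.
            for c, d in sub:
                if b == c:
                    new.add((a, "subClassOf", d))
            # Rule 2 (assumed): an instance of a subclass is also an
            # instance of the superclass.
            for s, p, o in facts:
                if p == "type" and o == a:
                    new.add((s, "type", b))
        if new <= facts:  # fixed point: nothing new was inferred
            return facts
        facts |= new
```

Each pass rescans every known triple, which is the repeated work the abstract identifies as the performance problem.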