Compositional reuse of software libraries is important for productivity. To promote reliability and correctness, the field also needs a way to compose specifications for reuse. However, specifications cannot be adapted by the use of wrappers in the same ways as code can, which leads to specifications being copied and modified. This copying and modification of specifications leads to poor maintainability...
This paper presents a formal method to verify execution time bounds of programs at the source level, where timing constraints along with other functional requirements are specified in the routines' contracts and are verified in a modular manner. The approach works based on a countdown time budget mechanism to guarantee the termination of the input program, and incorporates the concepts of separation...
Entity Resolution is the process of determining records (mentions) in a database that correspond to the same real-world entity. Traditional pairwise ER methods can lead to inconsistencies and low accuracy due to localized decisions. Leading ER systems solve this problem by collectively resolving all records using a probabilistic graphical model and Markov chain Monte Carlo (MCMC) inference. However,...
Healthcare has been and continues to be an integral component of people's lives, especially for the rising elderly population. One such healthcare program that provides for the needs of the elderly is Medicare. It is important that any such program be affordable but, unfortunately, this is not always the case. Out of the many possible factors for the rising cost of healthcare, fraud is a major contributor,...
This paper describes a data driven approach to studying the science of cyber security (SoS). It argues that science is driven by data. It then describes issues and approaches towards the following three aspects: (i) Data Driven Science for Attack Detection and Mitigation, (ii) Foundations for Data Trustworthiness and Policy-based Sharing, and (iii) A Risk-based Approach to Security Metrics. We believe...
Personal data storages (PDSs) give individuals the ability to store their personal data in a unified data repository and control the release of their data to data consumers. Being able to gather personal data from different data sources (e.g., banks, hospitals), PDSs will play a strategic role in individual privacy management. As such, PDSs demand new privacy models for protecting personal data. In...
In this paper, we propose techniques for detecting anomalies in user accesses by learning profiles of normal access patterns of users based on both the syntactic and semantic features of past user queries stored in database logs. New accesses are checked against these profiles, and deviations are considered anomalous accesses that may indicate potential insider attacks. We consider two scenarios...
Elastic systems utilize both human and machine working units to accomplish tasks that are eligible for crowdsourcing. The quality of the results of work completed by either type of computing unit is contingent on the characteristics they bear. In this paper we draw on our previous work to examine the suitability of working units for completing viable tasks in crowdsourcing. We seek...
There is currently a big demand for automating big data analysis. In the data analysis field, data abstraction or summarization plays an important role in the extraction of generalized information from large-scale data. We developed an artificial intelligence computer system with the aim of automating big data analysis and came up with a method that can abstract numerical-type data (age, height,...
Language endangerment is one of the most urgent problems facing humanities, with roughly one language disappearing every two weeks. Currently most of the linguistics tools used for endangered language documentation and analysis are desktop-based standalone software systems that do not expose libraries or services to be integrated by user applications, nor do they sufficiently support data sharing...
With their lasting popularity, many social Q&A websites, such as Yahoo! Answers and ResearchGate, have become valuable knowledge repositories where people search for answers to questions on various aspects of life. Finding the most relevant questions is often a non-trivial task, and a fine-grained classification system of questions will be an important aid. Existing work has mainly focused on classifying...
The development of distributed systems based on poorly specified abstractions can hinder unambiguous understanding and the creation of common formal analysis methods. In this paper, we outline the design of a system modeling language called DS2, and point out how its primitives are well matched with concerns that naturally arise during distributed system design. We present an operational semantics...
Question answering (QA) is an important research issue in natural language processing, and most state-of-the-art question answering systems are based on statistical models. After witnessing recent achievements in Artificial Intelligence (AI), many businesses wish to apply those techniques to an automatic QA system that is capable of providing 24-hour customer services for their clients. However, one...
Tremendous growth and progress have shown the potential of big data (i.e., structured, unstructured, and semi-structured) to extract valuable information and make reliable predictions for several industries. Social networking data has created additional opportunities for data scientists and researchers to utilize the data points to advance the predictive and mining models and techniques. However, predictive...
Companies in today's world need to cope with an ever greater need for flexible and agile IT systems to keep up with the competition and rapidly changing markets. This leads to increasingly complex system landscapes that are often realized using service-oriented architectures (SOA). Companies often struggle to handle the complexity and the governance activities necessary after this paradigm shift....
Blockchain represents a technology for establishing a shared, immutable version of the truth among a network of participants that do not trust one another, and therefore has the potential to disrupt any financial or other industries that rely on third parties to establish trust. In order to better understand the current ecosystem of Blockchain applications, a scalable proof-of-concept pipeline for...
Traditional machine learning requires data to be described by attributes prior to applying a learning algorithm. In text classification tasks, many feature engineering methodologies have been proposed to extract meaningful features; however, no best-practice approach has emerged. Traditional methods of feature engineering have inherent limitations due to loss of information and the limits of human...
We consider two problems in the context of tree-structured data sets (e.g., XML): (1) searching for a data element, and (2) synchronizing two data trees (replicas) stored at remote locations. We propose to compute Bloom filters for the interior tree nodes; this Bloom filter tree is used for both data search and synchronization. It is more efficient than tree traversal since it prunes out entire subtrees,...
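The search half of this idea can be illustrated with a minimal sketch. It is not the authors' implementation: the filter size `M`, the hash count `K`, the SHA-256-based hashing, and the names `Node`, `build`, and `search` are all assumptions chosen for illustration. Each node stores a Bloom filter summarizing every value in its subtree, and the search skips any subtree whose filter rules the target out.

```python
import hashlib
from dataclasses import dataclass, field

M = 256  # filter size in bits (assumed for illustration)
K = 3    # number of hash functions (assumed for illustration)


def bloom_of(item: str) -> int:
    """Return a Bloom filter (as an int bitmask) containing only `item`."""
    bits = 0
    for i in range(K):
        digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
        bits |= 1 << (int.from_bytes(digest[:8], "big") % M)
    return bits


def might_contain(bits: int, item: str) -> bool:
    """False means definitely absent; True means possibly present."""
    mask = bloom_of(item)
    return (bits & mask) == mask


@dataclass
class Node:
    value: str
    children: list = field(default_factory=list)
    bloom: int = 0  # summary of this node's value and all descendants


def build(node: Node) -> int:
    """Fill in filters bottom-up: a node's filter is the OR of its subtree."""
    node.bloom = bloom_of(node.value)
    for child in node.children:
        node.bloom |= build(child)
    return node.bloom


def search(node: Node, target: str) -> bool:
    """Depth-first search that prunes subtrees whose filter excludes target."""
    if not might_contain(node.bloom, target):
        return False  # the whole subtree is skipped
    if node.value == target:
        return True
    return any(search(child, target) for child in node.children)


# Hypothetical example tree.
root = Node("library", [
    Node("book", [Node("title"), Node("author")]),
    Node("journal", [Node("issue")]),
])
build(root)
print(search(root, "author"))   # True: Bloom filters have no false negatives
print(search(root, "missing"))  # False: the final check compares values exactly
```

A false positive in a filter only costs an unnecessary descent, because the final comparison is exact. The same per-node summaries could in principle be compared between two replicas to find differing subtrees, but that synchronization step is not sketched here.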