In Data Mining (DM) projects, more specifically in the Data Understanding and the Data Preparation phases, several techniques found in the literature are used to detect and handle data quality problems such as missing data, outliers, inconsistent data or time-variant data. However, the main limitation in the application of these techniques is the complexity caused by a lack of anticipation in the...
This report describes the challenges and experiences with the incremental migration from a BPEL to a BPMN 2.0 process engine. The transition is motivated by a strategic reorientation towards the new standard as well as the end of life of the previous product. The solution reflects the preliminary steps of integrating the new platform into the existing application and support for parallel operation. This...
Big Data is colloquially described in terms of the three Vs: Volume, Velocity, and Variety. Volume and velocity receive a disproportionate amount of research attention; however, variety is frequently cited by practitioners as the Big Data problem that “keeps them up at night”: the problem that resists direct attacks in terms of new algorithms, systems, and approaches. We find that the cloud-based...
The detection of vulnerabilities in computer systems and computer networks, as well as weakness analysis, are crucial problems. The presented method tackles the problem with automated detection. To identify vulnerabilities, the approach uses a logical representation of the preconditions and postconditions of vulnerabilities. The conditional structure simulates requirements and impacts of each...
This article proposes a novel data model for text corpora and discusses issues in corpus querying. First, a formalized definition of the corpus data is presented. Second, a data model is proposed in terms of the relational model, which is also proved to be complete. On this basis, we extend the query semantics of the traditional corpus query that generates KWIC (Keyword in Context) concordances and...
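For readers unfamiliar with the term, a KWIC concordance lists every occurrence of a keyword together with a window of surrounding words. The following is a minimal sketch of that idea only; it is not the data model or query semantics proposed in the article, and the function name `kwic` and the `width` parameter are hypothetical names chosen for illustration.

```python
def kwic(tokens, keyword, width=3):
    """Return (left context, hit, right context) for each keyword occurrence.

    Illustrative only: matching is case-insensitive on whole tokens, and
    `width` is the number of tokens kept on each side of the hit.
    """
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            hits.append((" ".join(left), tok, " ".join(right)))
    return hits


if __name__ == "__main__":
    text = "the cat sat on the mat while the dog slept"
    for left, hit, right in kwic(text.split(), "the", width=2):
        # Right-align the left context so the keyword lines up in a column.
        print(f"{left:>12} | {hit} | {right}")
```

A real corpus query engine would typically work over an index and annotated tokens rather than a plain token list.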
Business process automation improves organizations' efficiency in performing work. In existing business process management systems, process instances run independently of each other. However, synchronizing instances carrying similar characteristics, i.e., sharing the same data, can reduce process execution costs. For example, if an online retailer receives two orders from one customer, there is a chance...
Threats to modern ICT systems are rapidly changing these days. Organizations are not mainly concerned about virus infestation, but increasingly need to deal with targeted attacks. Attacks of this kind are specifically designed to stay below the radar of standard ICT security systems. As a consequence, vendors have begun to ship self-learning intrusion detection systems with sophisticated heuristic...
A space information system (SIS) is a typical complex system, which challenges traditional modeling methods. Agent-based modeling and simulation (ABMS) can be an appropriate and effective approach to the modeling and simulation of SIS. Firstly, the advantages and basic concepts of ABMS are introduced briefly. Secondly, the complexity of SIS is analyzed, and it is concluded that the complexity of SIS results...
Discrete event simulation (DES) is a technique used extensively and effectively by large companies; however, it is not widely used by small to medium-sized enterprises (SMEs) because its complexity and related costs are prohibitively high. In SMEs, DES-related data can be stored in a variety of formats and it is not always evident what data is required (if even available) to support a DES model in relation...
One of the main objectives of software engineers is to provide software-related solutions for social problems and to increase the availability of social welfare. In that sense, the quality of the software is directly related to how well it addresses the users' needs and to their level of satisfaction. To reflect user requirements in the software processes, the correct design of the database model provides a critical...
A database instance can become inconsistent with respect to its integrity constraints (ICs), for instance, after update operations. When this happens, it is possible to compute the repairs of the database. A minimal repair is a new database instance that satisfies the ICs, is obtained by applying update operations over the original instance, and differs minimally from the original instance. We can...
Based on an object deputy database, the newly proposed web data management system (WDMS) provides users with personal data spaces to flexibly manage their various web data. Because database capacity is limited, WDMS should gather the data that users need from the Web according to the user demand implied in their personal data spaces. However, user demand that is expressed in SQL statements in data spaces can't be comprehended and...
One often finds that multiple values of the same entity reside in a database. While all of these values were once correct, most of them may have become stale and inaccurate. Worse still, the values often do not carry reliable timestamps. With this comes the need for studying data currency, to identify the current value of an entity in a database and to answer queries with the current values, in the...
Graphs are a very powerful and widely used tool for data representation in various fields of science and engineering. Due to their versatile representational power, graphs are widely used for dealing with structural information in different domains such as pattern recognition, computer vision, networks, biochemical applications, psycho-sociology, image interpretation, and many others. In many applications,...
In this paper, we consider relational databases containing uncertain attribute values, in the situation where some knowledge is available about the more or less certain value (or disjunction of values) that a given attribute in a tuple can take. We propose a possibility-theory-based model suited to this context and extend the operators of relational algebra in order to handle such relations in a “compact”...
Frequent items detection is a valuable technique in many applications, such as network monitoring, network intrusion detection, worm virus detection, and so on. This technique has been well studied on deterministic databases. However, it is a new task on emerging uncertain databases. In this paper, a new definition of frequent items detection on uncertain data is given. Based on it, two efficient...
We propose a probabilistic, non-intrusive method for quality assessment of speech that takes into consideration the bounded character of the preference scores. The quality ratings are modeled as iid Beta random variables, whose mean and precision are parametrized directly in terms of the signal features. Maximum likelihood estimation is used to learn the model parameters in view of a training database...
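The mean and precision parametrization of a Beta distribution mentioned above can be made concrete with the standard conversion to shape parameters, alpha = mean * precision and beta = (1 - mean) * precision. The sketch below shows only that conversion and the resulting log-density for a score rescaled to the open interval (0, 1). How the paper maps signal features to the mean and precision is not stated in the excerpt and is not modelled here; the function names are hypothetical.

```python
import math


def beta_shape(mean, precision):
    """Convert a (mean, precision) pair to Beta shape parameters (alpha, beta).

    Assumes 0 < mean < 1 and precision > 0.
    """
    if not 0.0 < mean < 1.0 or precision <= 0.0:
        raise ValueError("need 0 < mean < 1 and precision > 0")
    return mean * precision, (1.0 - mean) * precision


def beta_logpdf(x, mean, precision):
    """Log-density of Beta(alpha, beta) at x, with 0 < x < 1."""
    a, b = beta_shape(mean, precision)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x)


def log_likelihood(scores, mean, precision):
    """Sum of log-densities for i.i.d. scores sharing one mean and precision."""
    return sum(beta_logpdf(x, mean, precision) for x in scores)
```

Maximum likelihood estimation would then search for the parameters that maximize `log_likelihood` over a training set. Higher precision concentrates the density around the mean, which is why this form suits ratings that live on a bounded scale.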
Data(base) reverse engineering is the process through which the missing technical and/or semantic schemas of a database (or, equivalently, of a set of files) are reconstructed. If carefully performed, this process allows legacy databases to be safely maintained, extended, migrated to modern platforms or merged with other, possibly heterogeneous, databases. Although this process is mostly pertinent...
XML is a commonly used data representation format for Web applications. One of the reasons for the attractiveness of XML is its flexibility to store unstructured, semi-structured and structured data. However, supporting this flexibility is challenging from a technical perspective and several approaches have been proposed for storage of XML. The focus of this paper is hybrid storage, combining relational...
We show that a large fraction of the data-structure lower bounds known today in fact follow by reduction from the communication complexity of lopsided (asymmetric) set disjointness! This includes lower bounds for: (a) high-dimensional problems, where the goal is to show large space lower bounds; (b) constant-dimensional geometric problems, where the goal is to bound the query time for space O(n polylg...