Understanding the relationship between coverage and test-templates (a generic term we use to describe the inputs for the random stimuli generator) is an important layer in understanding the state and progress of the verification process. Today, this is extremely hard to achieve and is based on expert knowledge. Template Aware Coverage (TAC) is a novel approach to meeting this challenge. Based on collecting...
With the rapid deployment of large numbers of sensors, it is crucial to efficiently manage their data streams, which have heterogeneous properties. To enable sensor applications such as discovery and mashup, a method of retrieving meaningful information from raw sensor data is required. However, it is hard to analyze and represent sensor data, since sensors generate streaming data with different patterns...
As an extension of hidden Markov models, hidden semi-Markov models allow the probability distribution of staying in the same state to be a general distribution. Hidden semi-Markov models are therefore good at modeling sequences with successions of homogeneous zones, by choosing appropriate state duration distributions. Hidden semi-Markov models are generative models. They are most often trained...
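The generative nature of hidden semi-Markov models can be illustrated with a small sampler. The following sketch (a hypothetical two-state model with hand-picked duration, transition, and emission tables, not taken from the paper) shows the key difference from a plain HMM: each state's sojourn time is drawn from an explicit duration distribution rather than being implicitly geometric.

```python
import random

def sample_hsmm(n_steps, init, trans, duration, emit, seed=0):
    """Generate a state/observation sequence from a hidden semi-Markov model.

    init, trans[state], duration[state], emit[state] are all
    {outcome: probability} dicts; durations are explicit, which is what
    distinguishes an HSMM from an ordinary HMM.
    """
    rng = random.Random(seed)
    states, obs = [], []
    state = rng.choices(list(init), weights=list(init.values()))[0]
    while len(obs) < n_steps:
        durs = duration[state]
        d = rng.choices(list(durs), weights=list(durs.values()))[0]
        for _ in range(d):                      # dwell d steps in this state
            if len(obs) >= n_steps:
                break
            states.append(state)
            e = emit[state]
            obs.append(rng.choices(list(e), weights=list(e.values()))[0])
        nxt = trans[state]
        state = rng.choices(list(nxt), weights=list(nxt.values()))[0]
    return states, obs

# Hypothetical two-state model: state "A" dwells 3 or 5 steps, "B" exactly 2.
init = {"A": 1.0}
trans = {"A": {"B": 1.0}, "B": {"A": 1.0}}
duration = {"A": {3: 0.5, 5: 0.5}, "B": {2: 1.0}}
emit = {"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.1, "y": 0.9}}
states, obs = sample_hsmm(20, init, trans, duration, emit)
```

The homogeneous zones in the abstract correspond to the runs of identical states that the duration distributions produce.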
The problem of data stream classification is challenging because of many practical aspects associated with efficient processing and temporal behavior of the stream. Two such well studied aspects are infinite length and concept-drift. Since a data stream may be considered a continuous process, which is theoretically infinite in length, it is impractical to store and use all the historical data for...
Uncertain and imprecise datasets increasingly characterize real database applications. Such data are naturally captured by so-called probabilistic data models, which are attracting a great deal of interest from a large community of database researchers. Effectively and efficiently computing OLAP data cubes over probabilistic data is a relevant research challenge that naturally...
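One common semantics for aggregating probabilistic data is tuple-level existence probabilities with independence, under which expected aggregates are probability-weighted sums. The sketch below illustrates that idea for a single cube group-by; it is a minimal stand-in, not the paper's method, and the rows are invented.

```python
from collections import defaultdict

def expected_cube(rows, dims, measure):
    """Expected SUM and COUNT per group over probabilistic tuples.

    Each row carries an existence probability p; under independent
    tuple semantics E[SUM] = sum(p_i * v_i) and E[COUNT] = sum(p_i).
    """
    cube = defaultdict(lambda: {"esum": 0.0, "ecount": 0.0})
    for row in rows:
        key = tuple(row[d] for d in dims)
        cube[key]["esum"] += row["p"] * row[measure]
        cube[key]["ecount"] += row["p"]
    return dict(cube)

# Hypothetical probabilistic relation: each tuple exists with probability p.
rows = [
    {"region": "EU", "sales": 100, "p": 0.9},
    {"region": "EU", "sales": 50,  "p": 0.5},
    {"region": "US", "sales": 200, "p": 1.0},
]
cube = expected_cube(rows, dims=("region",), measure="sales")
# E[SUM] for EU = 0.9*100 + 0.5*50 = 115.0
```

Computing full distributions of aggregates (rather than expectations) is where the real research challenge lies.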
A robust mixture model-based clustering algorithm using genetic techniques is proposed in this paper. In many engineering and application domains, noisy samples and outliers often exist in data collections, degrading the performance of data mining methods that are not robust to such elements. Classical probabilistic mixture-based clustering, in particular, is known to be very sensitive to...
In designed experiments, we often encounter non-normal response variables. Data transformation (Transf) approaches are frequently employed to deal with these problems. One has to realize, however, that analyzing such data via transformations poses many drawbacks. A better approach to these problems is the Generalized Linear Model (GLM). The problem becomes more complicated...
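A GLM models the response on its natural scale through a link function instead of transforming the data. As a minimal illustration (not the paper's analysis, and with invented count data), the sketch below fits a Poisson regression with log link by gradient ascent on the log-likelihood.

```python
import math

def fit_poisson_glm(X, y, lr=0.05, epochs=2000):
    """Poisson regression with log link, E[y] = exp(w.x + b),
    fitted by gradient ascent: grad = sum_i (y_i - mu_i) * x_i."""
    w = [0.0] * (len(X[0]) + 1)          # last entry is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xs, t in zip(X, y):
            feats = list(xs) + [1.0]
            mu = math.exp(sum(wi * xi for wi, xi in zip(w, feats)))
            for j, xj in enumerate(feats):
                grad[j] += (t - mu) * xj
        w = [wi + lr * g / len(X) for wi, g in zip(w, grad)]
    return w

# Hypothetical counts that roughly double per unit of x (so log mu is linear).
X = [(0.0,), (1.0,), (2.0,), (3.0,)]
y = [1, 2, 4, 8]
w = fit_poisson_glm(X, y)
```

Unlike a log-transform-then-ANOVA approach, the Poisson GLM keeps the mean-variance relationship of the counts intact, which is exactly the drawback of transformations the abstract alludes to.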
The naive Bayes classifier is a simple but useful model. A Bayesian network is likewise a probabilistic graphical model, and even a simple one can give good results. A two-layered Bayesian network was used to predict data in the MLSP 2008 competition and achieved the best result among the 5 submitted valid entries.
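For reference, the baseline model the abstract starts from can be written in a few lines. This is a plain categorical naive Bayes with Laplace smoothing on toy weather data (the data and class names are invented; this is not the competition's two-layered network).

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Categorical naive Bayes with Laplace (add-one) smoothing."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.priors = Counter(y)            # class counts
        self.n = len(y)
        # counts[c][j][v] = how often feature j took value v in class c
        self.counts = {c: defaultdict(Counter) for c in self.classes}
        self.values = defaultdict(set)      # observed values per feature
        for xs, c in zip(X, y):
            for j, v in enumerate(xs):
                self.counts[c][j][v] += 1
                self.values[j].add(v)
        return self

    def predict(self, xs):
        best, best_lp = None, -math.inf
        for c in self.classes:
            lp = math.log(self.priors[c] / self.n)
            for j, v in enumerate(xs):      # conditional independence assumption
                num = self.counts[c][j][v] + 1
                den = self.priors[c] + len(self.values[j])
                lp += math.log(num / den)
            if lp > best_lp:
                best, best_lp = c, lp
        return best

# Toy data: (outlook, temperature) -> play?
X = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "cool")]
y = ["no", "no", "yes", "yes"]
clf = NaiveBayes().fit(X, y)
```

A two-layered Bayesian network relaxes exactly the conditional-independence assumption the inner loop encodes.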
Bayesian networks (BNs) are probabilistic graphical models that are widely used for building diagnosis- and decision-support expert systems. The construction of BNs with the help of human experts is a difficult and time-consuming task, which is prone to errors and omissions, especially when the problems are very complicated. Learning the structure of a Bayesian network model and causal relations from...
In this paper, we introduce the concept of Simplex Decompositions and present a new Semi-Nonnegative decomposition technique that works with real-valued datasets. The motivation stems from the limitations of topic models such as Probabilistic Latent Semantic Analysis (PLSA), that have found wide use in the analysis of non-negative data apart from text corpora such as images, audio spectra, gene array...
Classifiers based on parametric or non-parametric learning methods have different advantages and disadvantages. To take advantage of the strengths of both methods, we propose an algorithm that combines a parametric model (logistic regression) with a non-parametric classification method (k-nearest neighbors). This combination is based on a measure of appropriateness that uses a heuristic to decide...
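The parametric/non-parametric combination can be sketched with a toy rule: trust k-NN when the query's neighborhood is unanimous, otherwise fall back to logistic regression. This unanimity rule is a stand-in for the paper's appropriateness measure, and the training data are invented.

```python
import math

def knn_vote(X, y, q, k=3):
    """Labels of the k nearest neighbours of query q (Euclidean distance)."""
    order = sorted(range(len(X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], q)))
    return [y[i] for i in order[:k]]

def train_logreg(X, y, lr=0.5, epochs=2000):
    """Plain batch-gradient-descent logistic regression (labels 0/1)."""
    w = [0.0] * (len(X[0]) + 1)                 # last weight is the bias
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xs, t in zip(X, y):
            feats = list(xs) + [1.0]
            z = sum(wi * xi for wi, xi in zip(w, feats))
            p = 1.0 / (1.0 + math.exp(-z))
            for j, xj in enumerate(feats):
                grad[j] += (p - t) * xj
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w

def predict_combined(X, y, w, q, k=3):
    votes = knn_vote(X, y, q, k)
    if len(set(votes)) == 1:                    # unanimous neighbourhood: use kNN
        return votes[0]
    z = sum(wi * xi for wi, xi in zip(w, list(q) + [1.0]))
    return 1 if z > 0 else 0                    # else fall back to the parametric model

# Toy linearly separable data.
X = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1)]
y = [0, 0, 1, 1]
w = train_logreg(X, y)
```

The point of such hybrids is that the local method dominates where the data support it and the global parametric model covers sparse regions.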
We propose five heuristic rules for improving the quality of the final results from fully automated sequential procedures in steady-state simulation of communication networks. The goal is to reduce the probability of finishing too early. The effectiveness of these rules of thumb is quantitatively assessed on the basis of the results of coverage analysis of a method of sequential output data analysis...
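The shape of such a sequential procedure is easy to show in miniature. The sketch below stops once the relative half-width of an approximate 95% confidence interval falls below a threshold, with a minimum-sample-size floor as one simple rule of thumb against finishing too early; it is illustrative only and is not one of the paper's five rules.

```python
import math
import random
import statistics

def sequential_mean(sample, eps=0.05, z=1.96, min_n=30, max_n=100000):
    """Sequentially estimate a steady-state mean.

    Stops when the relative half-width of the ~95% CI drops below eps.
    The min_n floor reduces the chance of stopping on a lucky early run.
    """
    data = []
    for _ in range(max_n):
        data.append(sample())
        n = len(data)
        if n < min_n:
            continue
        mean = statistics.fmean(data)
        half = z * statistics.stdev(data) / math.sqrt(n)
        if mean != 0 and half / abs(mean) < eps:
            return mean, half, n
    return statistics.fmean(data), None, len(data)

# Simulated observations: exponential service times with true mean 10.
rng = random.Random(42)
mean, half, n = sequential_mean(lambda: rng.expovariate(1 / 10.0))
```

Coverage analysis then asks how often the resulting interval actually contains the true mean, which is exactly where naive stopping rules tend to finish too early.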
Probability density estimation is a very important technology which has been widely used in data mining and data analysis. In this paper, we generalize the traditional Parzen window method to data streams and propose a new method of tilted Parzen window (TPW) for probability density estimation. To adapt to the evolvement of the data streams, we use the tilted window size that is proportional to the data's...
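The core idea of tilting a Parzen window toward recent data can be sketched with exponential down-weighting of old points, a simplifying assumption here; the paper's TPW instead ties the window to properties of the stream, so treat this as a stand-in.

```python
import math

class TiltedKDE:
    """Streaming Gaussian kernel density estimate that exponentially
    down-weights old points, so the estimate tracks an evolving stream."""

    def __init__(self, bandwidth=0.5, decay=0.99):
        self.h = bandwidth
        self.decay = decay          # weight multiplier applied per new arrival
        self.points = []            # (value, weight) pairs

    def update(self, x):
        self.points = [(v, w * self.decay) for v, w in self.points]
        self.points.append((x, 1.0))

    def density(self, x):
        total_w = sum(w for _, w in self.points)
        if total_w == 0:
            return 0.0
        s = sum(w * math.exp(-((x - v) / self.h) ** 2 / 2)
                for v, w in self.points)
        return s / (total_w * self.h * math.sqrt(2 * math.pi))

kde = TiltedKDE()
for x in [0.0] * 50 + [5.0] * 50:   # the stream drifts from 0 to 5
    kde.update(x)
```

After the drift, the decayed weights make the estimate put more mass near the recent values than near the old ones, which is the adaptation the abstract describes.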
In this paper, we address the problem of extending a relational database system to facilitate efficient real-time application of dynamic probabilistic models to streaming data. We use the recently proposed abstraction of model-based views for this purpose, by allowing users to declaratively specify the model to be applied, and by presenting the output of the models to the user as a probabilistic database...