In applications of Web data integration, we frequently need to identify whether data objects in different data sources represent the same entity in the real world. This problem is known as entity resolution. In this paper, we propose a generic framework for entity resolution for relational data sets, called BARM, consisting of the Blocker, Attribute matchers and the Record Matcher. BARM is convenient...
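The three stages named in the abstract above (a blocker, attribute matchers, and a record matcher) can be sketched in a few lines. This is only an illustrative sketch, not the BARM implementation: the function names, the blocking key, the use of `difflib.SequenceMatcher` as the attribute matcher, the weights, and the 0.85 threshold are all assumptions made for the example.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def block(records, key):
    """Blocker: group record ids by a cheap key so that only records
    sharing a block are compared (hypothetical helper)."""
    blocks = defaultdict(list)
    for rid, rec in records.items():
        blocks[key(rec)].append(rid)
    return blocks


def attribute_similarity(a, b):
    """Attribute matcher: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def record_match(r1, r2, weights, threshold):
    """Record matcher: weighted mean of attribute similarities,
    compared against an assumed decision threshold."""
    total = sum(weights.values())
    score = sum(
        w * attribute_similarity(r1[attr], r2[attr]) for attr, w in weights.items()
    ) / total
    return score >= threshold


def resolve(records, key, weights, threshold=0.85):
    """Run blocker, then attribute and record matchers inside each block."""
    matches = []
    for ids in block(records, key).values():
        for a, b in combinations(sorted(ids), 2):
            if record_match(records[a], records[b], weights, threshold):
                matches.append((a, b))
    return sorted(matches)


# Toy data, invented for illustration only.
records = {
    1: {"name": "Jon Smith", "city": "Boston"},
    2: {"name": "John Smith", "city": "Boston"},
    3: {"name": "Maria Garcia", "city": "Boston"},
    4: {"name": "John Smith", "city": "Denver"},
}
pairs = resolve(
    records,
    key=lambda r: r["city"].lower(),
    weights={"name": 2.0, "city": 1.0},
)
print(pairs)
```

Blocking on the city means record 4 is never compared with record 2, even though the names are identical. That is the usual trade-off of blocking: fewer comparisons at the risk of missing matches that fall into different blocks.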
This article presents an agent-based model and simulation used to design a decision support system for a healthcare emergency department, with the aim of helping to set management guidelines that improve its operation. This ongoing research is being carried out by the Research Group in Individual Oriented Modeling at the University Autonoma of Barcelona, in close collaboration with the hospital staff team. The objective of the...
In this paper, an effective hybrid optimization strategy, which incorporates the Metropolis acceptance criterion of Simulated Annealing (SA) into the crossover operator of a Genetic Algorithm (GA), is used to simultaneously optimize the input feature subset selection, the type of kernel function and the kernel parameter setting of SVR; the resulting model is called GASA-SVR. The developed GASA-SVR model is being applied for monthly...
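One plausible reading of "Metropolis acceptance inside crossover" is sketched here: a child produced by crossover always replaces its parent when it is better, and replaces it with probability exp(-Δ/T) when it is worse. The abstract does not give the exact scheme, so the one-point crossover, the bit-string encoding, the toy cost function, and all function names are assumptions.

```python
import math
import random


def metropolis_accept(parent_cost, child_cost, temperature, rng):
    """Accept an improvement unconditionally; accept a worse child with
    probability exp(-(child_cost - parent_cost) / temperature)."""
    delta = child_cost - parent_cost
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def one_point_crossover(a, b, rng):
    """Standard one-point crossover on equal-length sequences."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def crossover_with_metropolis(parent, mate, cost, temperature, rng):
    """Return the child if the Metropolis test accepts it, else the parent."""
    child = one_point_crossover(parent, mate, rng)
    if metropolis_accept(cost(parent), cost(child), temperature, rng):
        return child
    return parent


# Toy example: bit strings as feature masks, cost = number of selected features.
rng = random.Random(42)
cost = lambda bits: sum(bits)
parent = [1, 1, 1, 1, 0, 0]
mate = [0, 0, 0, 0, 1, 1]
result = crossover_with_metropolis(parent, mate, cost, temperature=0.5, rng=rng)
print(result, cost(result))
```

A high temperature lets many worse children through, which keeps the population diverse early on. As the temperature is lowered, the operator approaches plain greedy replacement.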
This paper describes a set of software tools which facilitate the automated processing of clinical voice (laryngeal) evaluation results, thus assisting doctors in voice disorder diagnostics, treatment and related research. The software tools include a website and a number of client/server applications developed by the authors at the Massachusetts General Hospital (MGH) for patient voice data entry,...
In this paper we investigate the problem of providing durability for Web Service transactions in the presence of system failures. We show that the popular lazy replica update propagation method is vulnerable to loss of transactional updates in the presence of hardware failures. We propose an extension to the lazy update propagation approach to reduce the risk of data loss. Our approach is based on...
The ability to hide private data and confidential patterns from potential adversaries while still maintaining data mining value is an important aspect in privacy preserving data mining. In this paper, we study a nonnegative matrix factorization technique, where we show how to define objective functions and derive corresponding multiplicative update functions. We then use that knowledge to propose...
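For readers unfamiliar with multiplicative updates, here is a minimal sketch of the classic updates for the squared Frobenius objective ||V - WH||², usually attributed to Lee and Seung. It is not the objective or update proposed in the paper, whose details are not given in the abstract. The helper names, the toy matrix, the rank, and the small `eps` added to the denominators are assumptions.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0])))


def nmf_step(v, w, h, eps=1e-12):
    """One round of multiplicative updates:
    H <- H * (W^T V) / (W^T W H), then W <- W * (V H^T) / (W H H^T)."""
    wt = transpose(w)
    num, den = matmul(wt, v), matmul(matmul(wt, w), h)
    h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(len(h[0]))]
         for i in range(len(h))]
    ht = transpose(h)
    num, den = matmul(v, ht), matmul(w, matmul(h, ht))
    w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(len(w[0]))]
         for i in range(len(w))]
    return w, h


# Toy nonnegative matrix, invented for illustration.
rng = random.Random(0)
v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 1.0, 2.0], [6.0, 2.0, 4.0]]
rank = 2
w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in v]
h = [[rng.uniform(0.1, 1.0) for _ in v[0]] for _ in range(rank)]

errors = [frobenius_error(v, w, h)]
for _ in range(50):
    w, h = nmf_step(v, w, h)
    errors.append(frobenius_error(v, w, h))
print(errors[0], errors[-1])
```

Because each update multiplies an entry by a ratio of nonnegative quantities, W and H stay nonnegative without any explicit projection step.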
Feature selection is an important preprocessing step when learning from bioinformatics datasets. Since these datasets often have high dimensionality (a large number of features), selecting the most important ones both improves performance and reduces computation time. In addition, when the features in question are genes (as is the case for microarray datasets), knowing the important genes is useful...
Aspects of modern information systems that challenge computational and statistical analysis include dynamic complexity, high dimensionality and inherent stochasticity. We outline the use of geometric methods to provide information neighbourhoods for data visualization and monitoring of algorithms, and dynamics of stochastic behaviour trajectories. Geometrization of models of real phenomena gives valuable...
More advanced and complex applications in social networks, gaming, entertainment, medical research, and GIS, to name a few, drive the requirements to process large data sets, commonly called BigData. As mentioned in an IBM report [1], BigData is characterized not only by volume (terabytes and petabytes) but also by velocity (speed requirements) and variety (types of data sets). The manipulation of...
In end-of-life cancer care, nurse clinicians strive to deliver care to minimize pain, manage debilitating symptoms, and improve patients' quality of life as they approach end of life. To improve symptom management in end-of-life care, research suggests that nurse clinicians require access to contextually relevant, medical knowledge resources at the point-of-care to assist with clinical decision making...
Feature reduction is a major problem in data mining. Though traditional methods such as feature ranking and subset selection have been widely used, there has been little consideration given to assuring satisfactory performance of a learning machine in relation to the minimum of features required or the “critical dimension”. This critical dimension is unique to a specific dataset, learning machine,...
The growing number of textual reports poses a great challenge for investigative analysis. However, text visualization has the potential to address this problem by automating the analysis of text reports, thus reducing workloads and providing new insights for crime analysts. We are developing a crime report visualization system for such investigative analysis. Our system leverages natural language...
The incorporation of soft human-generated data into the fusion process is an emerging trend in the data fusion community. This paper describes an extension of our original Random Set (RS) theoretic soft/hard data fusion system from the single-target to the multi-target tracking case. Leveraging recent developments in the RS theoretic data fusion community, we propose a novel soft measurement-to-track association...
This paper offers a new approach for the efficient processing and analysis of groups of multispectral images of the same objects. It comprises several tools: the Inverse pyramid decomposition for still images, the invariant object representation with the Modified Mellin-Fourier transform, and the hierarchical search in image databases, for which the invariant representation is used. The new approach...
With increasing opportunities for cheaper outsourcing of data, more and more organizations are seriously considering this option to reduce storage and processing costs. However, it has also given rise to the possibilities of security and privacy violations of data in outsourced environments. In this paper, we look at the privacy aspect, often referred to as data confidentiality. Our solution employs...
Automatic gender classification is an interesting and challenging task that impacts important applications in biometrics, security, surveillance, and human-computer interaction systems. This paper presents an effective facial feature descriptor based on the directional ternary pattern (DTP) for gender classification. The DTP operator encodes the texture information of a local neighborhood by labeling...
We survey some of the most important principles of anonymization in the literature. We identify different kinds of attacks that can be mounted against an anonymized dataset and give formulas for the maximum probability of success for each. For each principle, we identify whether it is monotonic, what attacks it is suited to counter, if any, and what principles imply other principles...
A nonlinear statistical ensemble prediction modeling method has been developed for predicting monthly mean rainfall using the Particle Swarm Optimization (PSO) algorithm and a neural network (NN) technique. Comparison results of prediction experiments show that the PSO-NN ensemble prediction (PNNEP) model is superior to the traditional linear statistical forecast method in prediction capability. Computation...