Software testing is a fundamental software engineering activity for quality assurance that is also traditionally very expensive. To reduce testing effort, some design metrics have been used to predict the fault-proneness of a software class or module. Recent works have explored the use of machine learning (ML) techniques for fault prediction. However, most of the ML techniques used cannot...
Association rule mining plays an important role in knowledge and information discovery. Often the number of discovered rules is huge and many of them are redundant, especially for multi-level datasets. Previous work has shown that the mining of non-redundant rules is a promising approach to solving this problem, with work focusing on single-level datasets. Recent work by Shaw et al. has extended...
The application of scientific methodology to clinical practice is typically realized through recommendations, policies and protocols represented as Clinical Practice Guidelines (CPG). CPG help clinicians in their choices, improving the patient care process. The representation of guidelines and their introduction into medical information systems can lead to efficient Clinical Decision Support Systems...
One aspect to consider when choosing a clustering algorithm for an unsupervised learning task is stability. A clustering algorithm is stable (on a dataset) if running it on a (sub)sample of the dataset yields the same clustering as running it on the whole dataset. In this paper, we report the results of an empirical study on the...
In this paper, a new dynamic clustering algorithm based on random sampling is proposed. The algorithm addresses well-known challenges in clustering such as dynamism, stability, and scaling. The core of the proposed method is based on the definition of a function, named the Oracle, which can predict whether two random data points belong to the same cluster or not. Furthermore, this algorithm is also equipped...
In this paper, we use a flock-of-agents swarm intelligence approach for simultaneously clustering and visualizing high-dimensional Web usage data. Our approach is based on improvements that overcome several limitations of the FClust algorithm. Our proposed approach is a hybrid, combining the strengths of the spherical k-means algorithm for fast clustering of high-dimensional data sets in the...
Learning from imbalanced datasets is a well-known problem in the data mining community. Many techniques have been proposed to alleviate the problems associated with class imbalance, including data sampling and boosting. While data sampling has received the bulk of the attention from the research community, our results show that boosting often results in better classification performance than even...
We present a new ensemble learning method that employs a set of regional classifiers, each of which learns to handle a subset of the training data. We split the training data and generate classifiers for different regions in the feature space. When classifying new data, we apply weighted voting among the classifiers whose regions include the data. We used 10 datasets to compare the performance...
The maximum likelihood (ML) method for estimating parameters of Bayesian networks (BNs) is efficient and accurate for large samples. However, ML suffers from overfitting when the sample size is small. Bayesian methods, which are effective at avoiding overfitting, have difficulty determining optimal hyperparameters of prior distributions with a good balance between theoretical and practical points of...
The design of course timetables for academic institutions is a very demanding task because the number of possible feasible timetables grows exponentially with the problem size. This process involves many constraints that must be respected and a huge search space to be explored, even if the size of the problem input is not significantly large. On the other hand, the problem itself does not have a...
The Web service composition (WSC) problem on behavioral descriptions deals with the automatic construction of a coordinator web service to control a set of web services to reach the goal states. As such, WSC is one of the fundamental techniques to enable the Service Oriented Architecture on the Web. Despite its importance and implications, however, very few studies exist on the computational complexities...
The aim of educational systems is to design a sequence of learning objects on a set of topics tailored to the learner's goals and individual properties. However, some of the main difficulties that current educational systems have to face are the generation of learning routes for multiple learners, the lack of explicit management of time and resources, and the synchronization of group activities. We claim...
Control architectures, such as the LAAS architecture, CLARATY and HARPIC, have been developed to provide autonomy to robots. To achieve a robot's task, these control architectures plan sequences of sensorimotor behaviors. Currently carried out by roboticists, the design of sensorimotor behaviors is a truly complex task that can require many hours of hard work and intensive computations. In this paper,...
With the progress of the Semantic Web, the encoding of knowledge bases as ontologies has increased. Information retrieval applications are employing this knowledge organization to enhance the quality of results by returning documents semantically related and relevant to the user's initial query. The proposed fuzzy information retrieval model retrieves information providing a framework to encode a knowledge base composed...
In this paper, a high-speed, adaptive depth segmentation method is proposed, which results in superior performance over currently employed segmentation algorithms when applied in real-time tracking applications. Existing segmentation methods are difficult to implement in real time due to their slow performance; by improving their run-time, they become more applicable to real-time approaches, i.e...
Walksat-like algorithms are considered among the most powerful local search methods for solving the satisfiability problem. Such algorithms introduce a diversification mechanism based on a random walk strategy. This strategy is controlled by a noise parameter whose optimal value depends strongly on the instance being solved. In this paper, we propose to extend previous work in order to reduce...