Linear Discriminant Analysis (LDA) is widely used for supervised dimension reduction and linear classification. Classical LDA, however, suffers from an ill-posed estimation problem on data with high dimension and low sample size (HDLSS). To cope with this problem, in this paper we propose an Adaptive Wishart Discriminant Analysis (AWDA) for classification, which makes predictions in an ensemble way...
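The HDLSS failure mode this abstract refers to is easy to see directly: with more features than samples, the within-class scatter matrix is singular, so the classical Fisher direction is undefined. The sketch below illustrates this with a generic ridge-style shrinkage fix, not the AWDA method itself; the toy data and the parameter `lam` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# HDLSS toy data: 20 samples, 100 features, two classes whose means differ.
n, d = 20, 100
X0 = rng.normal(0.0, 1.0, (n // 2, d))
X1 = rng.normal(0.5, 1.0, (n // 2, d))
X = np.vstack([X0, X1])
y = np.array([0] * (n // 2) + [1] * (n // 2))

mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)

# With d > n the within-class scatter is singular: rank at most n - 2 = 18,
# so Sw^{-1} (mu1 - mu0) — the classical Fisher direction — does not exist.
print(np.linalg.matrix_rank(Sw))

# A common remedy is shrinkage: Sw + lam*I is positive definite, so the
# regularized Fisher direction is well defined.
lam = 1.0
w = np.linalg.solve(Sw + lam * np.eye(d), mu1 - mu0)
scores = X @ w
pred = (scores > scores.mean()).astype(int)
print((pred == y).mean())  # training accuracy of the regularized rule
```

The shrinkage intensity `lam` trades bias against variance; methods like AWDA differ precisely in how such regularization is chosen adaptively.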
As part of ongoing research into extracting mission-critical information from Search and Rescue speech communications, a corpus of unscripted, goal-oriented, two-party spoken conversations has been designed and collected. The Sheffield Search and Rescue (SSAR) corpus comprises about 12 hours of data from 96 conversations by 24 native speakers of British English with a southern accent. Each conversation...
Credit data, the data describing the attributes of customer credit collected by enterprises or institutions, contains a wealth of credit information and is an important basis for customer credit scoring. Using data mining technology to analyze credit data and evaluate customer credit has become a highly efficient method of customer credit estimation. Related research has become a hot...
Mass estimation, an alternative to density estimation, has recently been shown to be an effective base modelling mechanism for three data mining tasks: regression, information retrieval, and anomaly detection. This paper advances this work in two directions. First, we generalise the previously proposed one-dimensional mass estimation to multidimensional mass estimation, and significantly reduce the...
Outlier mining is an important branch of data mining and has attracted much attention recently. The density-based method LOF is widely used in practice. However, the complexity of the method is quadratic in the size of the dataset, and it is very sensitive to its parameter MinPts. In this paper, we propose a new outlier detection method based on the Voronoi diagram, called Voronoi based Outlier Detection...
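Both drawbacks named here are visible in a naive LOF implementation: the full pairwise distance matrix gives the quadratic cost, and `k` plays the role of MinPts. The sketch below is plain LOF on assumed toy data, not the proposed Voronoi-based method.

```python
import numpy as np

def lof_scores(X, k):
    """Naive Local Outlier Factor. Building the full pairwise distance
    matrix makes this O(n^2) in the dataset size; k is MinPts."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)                   # exclude self-distance
    knn = np.argsort(D, axis=1)[:, :k]            # k nearest neighbours
    k_dist = D[np.arange(n), knn[:, -1]]          # distance to k-th neighbour
    # reachability distance: max(k_dist(o), d(p, o)) for each neighbour o
    reach = np.maximum(k_dist[knn], D[np.arange(n)[:, None], knn])
    lrd = k / reach.sum(axis=1)                   # local reachability density
    return lrd[knn].mean(axis=1) / lrd            # LOF score per point

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 2)), [[8.0, 8.0]]])  # cluster + outlier
scores = lof_scores(X, k=5)
print(int(np.argmax(scores)))  # the isolated point gets the largest score
```

Re-running with a different `k` shifts every score, which is the MinPts sensitivity the abstract criticizes.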
Software defect (bug) repositories are a great source of knowledge. Data mining can be applied to these repositories to uncover useful and interesting patterns. The complexity of a bug helps the development team plan future software builds and releases. In this paper, a prediction model is proposed to predict a bug's complexity. The proposed technique is a three-step method. In the first step, fix duration...
Function Points (FP) are widely used as a basis to estimate software development cost and effort. At the requirements level, several estimation tools have been developed, but these tools rely on unified modeling language (UML) diagrams. However, not all requirements documents include supplementary UML diagrams. This paper describes the development of an automated tool to estimate the size of software projects...
Training sequences for data-aided timing estimation in multi-input multi-output systems are designed. It is observed that for low complexity implementation, the sequences must necessarily satisfy the zero cross-correlation zone property. By restricting our search to a more tractable subset of this class of sequences, we are able to minimize the modified Cramer-Rao bound in closed form and obtain sequences...
Target intention inference is an important aspect of situation assessment. The evidence system for target intention inference is discussed according to the independence relationship between a target's intention and the input evidence. A probability inference model of target intention is proposed based on a static Bayesian network. In order to expand the application domain and simplify the parameter learning...
Estimating the cost of development is one of the most crucial and daunting tasks for a software project manager. Many cost estimation models have been reported in the literature, but a number of them have become obsolete because of the rapid changes in technology. Earlier cost estimation models used the size of the final software product as the primary factor, which, in many cases, was difficult to...
Mean shift spectral clustering (MSSC) offers an alternative for image segmentation. However, because it is based on the classical Parzen window estimator (PW) and employs the full data sample for density estimation, the usefulness of MSSC is weakened. In this paper, an improved mean shift spectral clustering (IMSSC) algorithm is proposed that replaces PW with the reduced set density estimator...
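As context for the cost being reduced, plain mean shift with a Gaussian Parzen-window kernel re-weights every sample for every point at every iteration. The sketch below shows that full-sample baseline on assumed toy data; it is the density step IMSSC replaces, not the IMSSC algorithm itself.

```python
import numpy as np

def mean_shift(X, bandwidth, iters=50):
    """Plain mean shift with a Gaussian (Parzen-window) kernel.
    Each iteration weighs ALL n samples for every point, so one
    iteration costs O(n^2) — the full-sample cost IMSSC avoids."""
    modes = X.copy()
    for _ in range(iters):
        d2 = ((modes[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # (n, n)
        w = np.exp(-d2 / (2 * bandwidth ** 2))
        modes = (w @ X) / w.sum(axis=1, keepdims=True)           # shift step
    return modes

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (30, 2)), rng.normal(2, 0.1, (30, 2))])
modes = mean_shift(X, bandwidth=0.3)
# each point converges to the density mode of its own cluster
```

A reduced set density estimator keeps only a weighted subset of the samples in the kernel sum, which is exactly where the speedup comes from.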
The LMMSE algorithm is one of the best linear channel estimators for time domain synchronous orthogonal frequency division multiplexing (TDS-OFDM) systems. Reducing the computational complexity of the LMMSE algorithm remains a challenging topic. Since the combination of the fast Fourier transform with singular value decomposition (FFT-SVD) simplifies the channel autocorrelation matrix computation...
Wireless sensor networks (WSNs) have attracted high interest over the past couple of years. One of the most important aspects of WSN research is location estimation. As a good solution for fine-grained localization, Reichenbach et al. introduced the distributed least squares (DLS) algorithm, which splits the costly localization process into a complex precalculation and a simple postcalculation which is...
A Web-based knowledge-gathering task is a common but complex activity carried out by many users on the Web. The complexity is of two types: one is the inherent complexity of the task, which is essentially the information need of the user; the other is the complexity perceived by the user, which varies according to the user's proficiency in the particular subject matter of the...
Relay networks have attracted a lot of attention for their potential to increase spatial diversity. However, limited attention has been paid to practical detector design and implementation issues. In this paper, we investigate the design of maximum-likelihood detectors for half-duplex amplify-and-forward relay networks in intersymbol interference (ISI) channels. In particular, we study the case when...
The increasing attention to global scheduling algorithms for identical multiprocessor platforms has produced different, independently developed schedulability tests. However, the relations among these tests have not been sufficiently clarified, so it is difficult to understand which strategy provides the best performance in a particular scenario. In this paper, we will summarize the main...
Fractal dimension is widely adopted in spatial databases and data mining as a measure of dataset skewness, among other uses. State-of-the-art algorithms for estimating the fractal dimension exhibit linear runtime complexity, whether based on box counting or approximation schemes. In this paper, we revisit a correlation fractal dimension estimation algorithm that redundantly rescans the dataset and, extending...
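For concreteness, the classical box-counting estimator behind such algorithms counts occupied grid cells at several scales and fits the log-log slope. A stdlib-only sketch on an assumed toy set (a straight line segment in 2D, whose box-counting dimension should come out near 1); the simpler exponent here, not the correlation dimension variant the paper revisits:

```python
import math

def box_count(points, s):
    """Number of grid cells of side s containing at least one 2D point."""
    return len({(math.floor(x / s), math.floor(y / s)) for x, y in points})

def box_dimension(points, sizes):
    """Fit log N(s) = -D log s + c by least squares; return the slope D."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope

line = [(i / 4000, i / 4000) for i in range(4001)]  # line segment in 2D
print(box_dimension(line, [0.1, 0.05, 0.025, 0.0125]))
```

Each scale in `sizes` requires a full pass over the data, which is the redundant rescanning that the revisited algorithm targets.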
Individual heterogeneity is important information but has not yet been considered in most research on software outsourcing. This paper makes individual heterogeneity explicit by using a linear mixed model (LMM) on a dataset of Japanese software outsourcing. The estimates confirm the existence of individual heterogeneity. We also find that well-defined and easy-to-monitor software is preferred in Japanese...
Distributed video coding is a coding paradigm that allows complexity to be shared between the encoder and the decoder, in contrast with conventional video coding. To improve coding efficiency, accurate motion estimation at the encoder is required. However, the encoder in distributed video coding performs only incomplete motion estimation so as to limit its complexity. Therefore, coding efficiency is decreased by...
Assessment of the (Multi) Similarity among a set of protein structures is achieved through an ensemble of protein structure comparison methods/algorithms. This generates a multitude of data that varies in both type and size. After standardization and normalization, this data is further used in consensus development, providing domain-independent and highly reliable...