The Gaussian process framework models a function as a stochastic process such that the training data result in a finite number of jointly Gaussian random variables, whose properties can then be used to infer the statistics (the mean and variance) of the function at test values of the input. The computation can be implemented in a batch setting, i.e., one-shot over the entire training data, or...
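The batch computation described in this abstract can be sketched as follows. This is a minimal illustration and not the paper's implementation: the squared-exponential kernel, the helper names `rbf_kernel` and `gp_posterior`, the noise variance, and the toy sine data are all assumptions introduced here.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two sets of 1-D inputs (assumed kernel choice)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_train, y_train, x_test, noise_var=1e-2, length_scale=1.0):
    """Batch GP regression: posterior mean and variance of the function at x_test."""
    # Covariance of the noisy training observations.
    k_tt = rbf_kernel(x_train, x_train, length_scale) + noise_var * np.eye(len(x_train))
    # Cross-covariance between training and test inputs.
    k_ts = rbf_kernel(x_train, x_test, length_scale)
    # Prior covariance at the test inputs.
    k_ss = rbf_kernel(x_test, x_test, length_scale)

    # A Cholesky factorization avoids forming an explicit matrix inverse.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_ts.T @ alpha

    v = np.linalg.solve(chol, k_ts)
    var = np.diag(k_ss) - np.sum(v ** 2, axis=0)
    return mean, var


# Toy data (hypothetical): noise-free samples of sin(x) on [0, 5].
x_train = np.linspace(0.0, 5.0, 20)
y_train = np.sin(x_train)
x_test = np.array([1.0, 2.5, 10.0])

mean, var = gp_posterior(x_train, y_train, x_test)
print(mean, var)
```

Under these assumptions, the posterior variance stays small inside the training range and returns towards the prior variance at the distant test input `10.0`, where the training data carry almost no information.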
Real world social networks typically consist of actors (individuals) that are linked to other actors or different types of objects via links of multiple types. Different types of relationships induce different views of the underlying social network. We consider the problem of labeling actors in such multi-view networks based on the connections among them. Given a social network in which only a subset...
Bipolar magnetic regions (BMRs) are the cornerstone of solar variability. They are tracers of the large-scale magnetic processes that give rise to the solar cycle, shapers of the solar corona, building blocks of the large-scale solar magnetic field, and significant contributors to the free-energetic budget that gives rise to flares and coronal mass ejections. Surprisingly, no homogeneous catalog...
Big data platforms like Hadoop and Spark are being widely adopted both by academia and industry. In this paper, we propose a runtime intrusion detection technique that understands and works according to the memory properties of such distributed compute platforms. The proposed method is based on runtime analysis of memory access patterns of tasks running on the slave nodes of a distributed compute...
Nearly all existing dimension reduction methods on 2D matrix-valued image predictors are unsupervised or supervised without preserving matrix structure, which can result in loss of the structure-specific relation between the response and predictors. In this paper, we propose a kernel-based solution for supervised dimension reduction which preserves the matrix structure of the reduced predictors. This...
The classification of graphs is a key challenge within many scientific fields using graphs to represent data and is an active area of research. Graph classification can be critical in identifying and labelling unknown graphs within a dataset and has seen application across many scientific fields. Graph classification poses two distinct problems: the classification of elements within a graph and the...
Understanding ongoing topics and their evolution in social media is of great importance. Although topic analysis is not a novel research question, the social media environment has presented new challenges. First, with insufficient co-occurrence information, short texts have undermined the applicability of many word co-occurrence oriented topic models. Second, real time message streams make traditional discretized...
While kernel methods using a single Gaussian kernel have proven to be very successful for nonlinear classification, for learning problems with a more complex underlying structure it is often desirable to use a linear combination of kernels with different widths. To address this issue, this paper presents a classification algorithm based on a jointly convex constrained optimization formulation...
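The idea of combining Gaussian kernels of different widths can be illustrated with a short sketch. It shows only the kernel combination itself, not the paper's optimization formulation, which is truncated in this listing. The function names, the widths, the fixed non-negative weights, and the sample points are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(x, y, width):
    """Gaussian kernel matrix between the rows of x and the rows of y."""
    sq_dist = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * width ** 2))


def combined_kernel(x, y, widths, weights):
    """Non-negative linear combination of Gaussian kernels with different widths."""
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        # Non-negative weights keep the combination positive semidefinite.
        raise ValueError("kernel weights must be non-negative")
    return sum(w * gaussian_kernel(x, y, s) for w, s in zip(weights, widths))


# Hypothetical sample points; the weights are fixed by hand here, whereas a
# learning algorithm would normally choose them from data.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
gram = combined_kernel(points, points, widths=[0.5, 2.0], weights=[0.3, 0.7])
print(gram)
```

A narrow width captures fine local structure and a wide width captures coarse structure, so the weighted sum lets a single classifier respond to both scales.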
Community detection has been an important task for social and information networks. Existing approaches usually assume the completeness of linkage and content information. However, the links and node attributes can usually be partially observable in many real-world networks. For example, users can specify their privacy settings to prevent non-friends from viewing their posts or connections. Such incompleteness...
Container-based virtualization is rapidly growing in popularity for cloud deployments and applications as a virtualization alternative due to its ease of deployment coupled with high performance. Emerging byte-addressable, nonvolatile memory technologies, commonly called Storage Class Memory or SCM, promise both byte-addressability and persistence near DRAM speeds operating on the main memory...
Traditionally, investors try to estimate short-term portfolio volatility based on daily returns. When tick-by-tick data are available, investors use different volatility estimators based on high-frequency data to evaluate the portfolio risk in the hope of outperforming those based on low-frequency data. In this paper, we optimize the block realized kernel estimator of Hautsch et al. (2015) and propose...
Scaling up scientific data analysis and machine learning algorithms for data-driven discovery is a grand challenge that we face today. Despite the growing need for analysis from science domains that are generating ‘Big Data’ from instruments and simulations, building high-performance analytical workflows of data-intensive algorithms has been daunting because: (i) the ‘Big Data’ hardware and software...
In this paper, we pose and address some of the unique challenges in the analysis of scientific Big Data on supercomputing platforms. Our approach identifies, implements and scales numerical kernels that are critical to the instantiation of theory-inspired analytic workflows on modern computing architectures. We present the benefits of scalable kernels towards constructing algorithms such as principal...
Limited access to supervised information may create scenarios in real-world data mining applications where training and test data are related by a covariate shift, i.e., they have equal class conditional distributions but unequal covariate distributions. Traditional data mining techniques assume that both training and test data represent an identical distribution, and therefore suffer in the presence of...
Seaports play a vital role in the global economy, as they operate as the connection corridors to all other modes of transport and as engines of growth for the wider region. But ports today are faced with numerous unique challenges and for them to remain competitive, significant investments are required. In support of greater transparency in policy making, decisions regarding investment need to be...
The Support Vector Machine (SVM) is a classical classification algorithm that has a wide range of applications. With a kernel function, an SVM can handle datasets that are not linearly separable in their original feature space, making it more flexible in practical use than a linear model. However, its training complexity is an obstacle to handling large-scale datasets. This paper proposes...
We introduce a kernel formulation of the recently proposed minimum density hyperplane approach to clustering. This enables the identification of clusters that are not linearly separable in the input space by mapping them into a feature space. This mapping also extends the applicability of the minimum density hyperplane to datasets whose features are not necessarily continuous. The location of minimum...