Lack of safety and lack of efficacy are the two major reasons for the failure of drug candidates in drug discovery and development. Reliable prediction of blood-brain barrier permeation before chemical synthesis remains one of the major challenges in drug discovery. New approaches and models that are reliable and can reduce experimental evaluations of pre-clinical candidates are urgently needed...
Very short-term bus reactive load forecasting allows the electrical system operator to determine the optimal amount of energy needed to meet demand with quality, safety and reliability. With this premise, this paper used a knowledge data discovery approach to handle the forecast, from raw data to results analysis, using a neural network for data mining. Two forecasting models were developed:...
Deep learning techniques have been successfully applied to solve many problems in climate and geoscience using massive-scale observed and modeled data. For extreme climate event detection, several models based on deep neural networks have recently been proposed and attain superior performance that overshadows all previous handcrafted, expert-based methods. The issue arising, though, is that accurate...
This paper explores recent achievements and novel challenges of the annoying privacy-preserving big data stream mining problem, which consists of applying mining algorithms to big data streams while ensuring the privacy of the data. Recently, the emerging big data analytics context has cast new light on this exciting research area. This paper follows that research trend.
The data anonymization landscape has become quite complex in the last decades. On the methodology side, the statistical disclosure control methods designed in official statistics have been supplemented by a number of privacy models proposed by computer scientists. On the data side, static data sets now coexist with big data, and particularly data streams. In the quest for a unified and conceptually...
Online job boards are used by millions of job seekers, who browse through the postings for jobs that match their interests. Queries are crafted using terminology generated by the users, which may not match the language used in the job postings. Semantic enrichment methods attempt to fill such a lexical gap by re-writing the queries based on richer terms, which are mined using behavioral logs. However,...
Predictive modeling of nested geospatial data is a challenging problem as the models must take into account potential interactions among variables defined at different spatial scales. These cross-scale interactions, as they are commonly known, are particularly important to understand relationships among ecological properties at macroscales. In this paper, we present a novel, multi-level multi-task...
Besides the text content, documents and their associated words usually come with rich sets of meta information, such as categories of documents and semantic/syntactic features of words, like those encoded in word embeddings. Incorporating such meta information directly into the generative process of topic models can improve modelling accuracy and topic quality, especially in the case where the word-occurrence...
Driven by the dramatic growth of data in terms of both size and sources, learning from heterogeneous data is emerging as an important research direction for many real applications. One of the biggest challenges of this type of problem is how to meaningfully integrate heterogeneous data to considerably improve the generality and quality of the learning model. In this paper, we first present a unified...
We introduce a dynamical spatio-temporal model formalized as a recurrent neural network for forecasting time series of spatial processes, i.e. series of observations sharing temporal and spatial dependencies. The model learns these dependencies through a structured latent dynamical component, while a decoder predicts the observations from the latent representations. We consider several variants of...
In machine learning, data augmentation is the process of creating synthetic examples in order to augment a dataset used to learn a model. One motivation for data augmentation is to reduce the variance of a classifier, thereby reducing error. In this paper, we propose new data augmentation techniques specifically designed for time series classification, where the space in which they are embedded is...
Many real-world applications are characterized by temporal data collected from multiple modalities, each sampled with a different resolution. Examples include manufacturing processes and financial market prediction. In these applications, an interesting observation is that within the same modality, we often have data from multiple views, thus naturally forming a 2-level hierarchy: with the multiple...
Selecting an efficient classifier for medical data is considered one of the most important parts of today's computer-aided diagnosis. The performance of single classifiers such as a decision tree classifier can be improved by ensemble methods. However, this approach relies on the data quality and missing values. In this paper, we propose a new ensemble classifier to overcome overfitting and bias...
Adverse Events (AEs) are a significant concern in healthcare, since they are among the leading causes of morbidity and mortality[12]. According to the Food and Drug Administration (FDA), between 2006 and 2014, there was a 232% increase in AE cases reported to have caused mortality[13]. In fact, the volume of all AE cases reported to the FDA has increased almost fivefold since 1997[13]. Pharmaceutical...
Personal networks formed within scientific communities and the collaborations they yield are among the driving forces behind innovation and new discoveries. Luckily, successful collaboration produces analyzable data points in the form of publications that allow us to learn and understand some of the connections and collaborative structures in a scientific community. Co-author information is one important...
A transient is defined as a process where a system transforms into an abnormal state from a normal state. In order to enhance safety and achieve greater economic benefits, it is very important to detect and identify transients in a timely manner during the operation of nuclear power plants. Thus, according to the ideas employed in image processing and photo retouching, we propose a novel method for...
Cassava flour (Manihot esculenta Crantz) is produced in different regions of Brazil and is part of Brazilian eating habits. The product is characterized by its energy value and its fiber and carbohydrate content. However, the cassava flour houses have generated negative impacts on the environment by burning firewood. Solar energy can be a free, clean and renewable alternative energy source. This paper...
One of the most important and challenging problems in recommendation systems is that of modeling temporal behavior. Typically, modeling temporal behavior increases the cost of parameter inference and estimation. It also requires a large amount of data to reliably learn the parameters of the model. Therefore, it is often difficult to model temporal behavior...
It is crucial for news aggregator websites that are new to the market to actively engage their existing users. A recommendation system would help to tackle such a problem. However, due to the lack of a sufficient amount of data, most state-of-the-art methods perform poorly in terms of recommending relevant news items to the users. In this paper, we propose a novel approach for Item-based...
Networks are models representing relationships between entities. Often these relationships are explicitly given, or we must learn a representation which generalizes and predicts observed behavior in underlying individual data (e.g. attributes or labels). Whether given or inferred, choosing the best representation affects subsequent tasks and questions on the network. This work focuses on model selection...