Social media offers a wealth of insight into how significant events -- such as the Great East Japan Earthquake, the Arab Spring, and the Boston Bombing -- affect individuals. The scale of available data, however, can be intimidating: during the Great East Japan Earthquake, over 8 million tweets were sent each day from Japan alone. Conventional word vector-based event-detection techniques for social media...
A time series discord is a subsequence that is maximally different from all the other subsequences of a longer time series. Classic discord discovery has been used to detect anomalous or interesting patterns, which usually represent the most unusual subsequences within a time series. However, an anomalous or interesting pattern may occur two or more times, so that any instance of this pattern is...
Pointwise matches between two time series are of great importance in time series analysis, and dynamic time warping (DTW) is known to provide generally reasonable matches. There are situations where time series alignment should be invariant to scaling and offset in amplitude or where local regions of the considered time series should be strongly reflected in pointwise matches. Two different variants...
New data sources from sensor networks and Internet-of-Things applications promise a wealth of interaction data that can be naturally represented as time-varying networks. This brings forth new challenges for the identification and removal of time-varying graph anomalies that entail complex correlations of topological features and temporal activity patterns. Here we present an anomaly detection approach...
In many real-world networks, interactions between entities are observed at specific moments in continuous time, such as email, SMS messaging, and IP traffic. The majority of methods for analyzing such data first aggregate communication over designated time blocks, resulting in one or more discrete time series, to which existing tools can be applied. However, regardless of how the block lengths are...
A new criterion is introduced for determining the order of an autoregressive model fit to time series data. The proposed technique is shown to give a consistent and asymptotically efficient order estimation. It has the benefits of the two well-known model selection techniques, the Akaike information criterion and the Bayesian information criterion. When the true order of the autoregression is relatively...
Dynamic community detection algorithms try to identify the communities of a dynamic network that consists of a series of network snapshots. To address this issue, here we propose a new dynamic community detection algorithm based on incremental identification according to a vertex-based metric called permanence. We incrementally analyze the community ownership of partial vertices, so as to...
Classifying sequential data is an important problem in machine learning with applications in time series, sensor streams, and image analysis. The ordered structure of sequential data presents a difficulty for the standard classification models, which has motivated the task of generating features for vector-based discriminative models. Shapelet methods, which have been extensively studied in this topic,...
Using a proper model to characterize a time series is crucial in making accurate predictions. In this work we use a time-varying autoregressive (TVAR) process to describe a non-stationary time series and model it as a mixture of multiple stable autoregressive (AR) processes. We introduce a new model selection technique based on Gap statistics to learn the appropriate number of AR filters needed to model...
The development of an accurate flood prediction model could reduce the number of fatalities. In this paper, water level time series, spatio-temporal precipitation and hydrological data are used for flood prediction. Since our data is high dimensional and not all features are correlated with floods, our proposed algorithm is designed to find influential spatial features, or features at locations which are highly...
The paper proposes the vicinities merging algorithm for prediction with side information. The algorithm is based on specialist experts techniques. We use vicinities in the side information domain to identify relevant past examples, apply standard learning techniques to them, and then use prediction with expert advice tools to merge those predictions. Guarantees from the theory of prediction with expert...
A lack of global knowledge of land-cover changes limits our understanding of the earth system, hinders natural resource management and also compounds risks. Remote sensing data provides an opportunity to automatically detect and monitor land-cover changes. Although changes in land cover can be observed from remote sensing time series, most traditional change point detection algorithms do not perform...
In this paper, we focus on the problem of how to design a methodology which can improve prediction accuracy as well as speed up the prediction process for stock market prediction. As market news and stock prices are commonly regarded as two important market data sources, we present the design of our stock price prediction model based on those two data sources concurrently. Firstly, in order to get...
It is becoming increasingly common for organizations to collect very large amounts of data over time, and to need to detect unusual or anomalous time series. For example, Yahoo has banks of mail servers that are monitored over time. Many measurements on server performance are collected every hour for each of thousands of servers. We wish to identify servers that are behaving unusually. We compute...
We demonstrate a near real-time service monitoring system for detecting and diagnosing issues from high-dimensional time series data. For detection, we have implemented a learning algorithm that constructs a hierarchy of detectors from data. It is scalable, does not require labelled examples of issues for learning, runs in near real-time, and identifies a subset of counter time series as being relevant...