Users' repetitive daily or weekly activities may constitute user profiles. For example, a user's frequent command sequences may represent a normative pattern of that user. Finding normative patterns over dynamic data streams of unbounded length is challenging. To address this, our prior work proposed an unsupervised learning approach that exploits a compressed/quantized dictionary to model common behavior...
Insider threats are veritable needles within the haystack. They occur rarely and, when they do occur, are usually well masked within normal operation. Detecting these threats requires identifying these rare anomalous needles in a contextualized setting where behaviors are constantly evolving over time. For this refined search, this paper proposes and tests an unsupervised, ensemble based...
Insider threat detection requires the identification of rare anomalies in contexts where evolving behaviors tend to mask such anomalies. This paper proposes and tests an incremental learning algorithm based on unsupervised learning that addresses this challenge by maintaining repetitive sequences in a compressed dictionary to identify anomalies over dynamic data streams of unbounded length. For unsupervised...
Insider threat detection requires the identification of rare anomalies in contexts where evolving behaviors tend to mask such anomalies. This paper proposes and tests an ensemble-based stream mining algorithm based on supervised learning that addresses this challenge by maintaining an evolving collection of multiple models to classify dynamic data streams of unbounded length. The result is a classifier...
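The idea of maintaining an evolving collection of models over a stream can be illustrated with a minimal sketch. It is not the paper's algorithm: the chunk format, the centroid base learner, the ensemble size `k`, and the rule of keeping the models that score best on the newest labeled chunk are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Assumed data format: a chunk is a list of (feature value, label) pairs,
# where label 1 marks an anomaly and label 0 marks normal behavior.
Chunk = List[Tuple[float, int]]
Model = Callable[[float], int]


def train_centroid_model(chunk: Chunk) -> Model:
    """Hypothetical base learner: predict the label whose class mean is nearest."""
    means = {}
    for label in {y for _, y in chunk}:
        values = [x for x, y in chunk if y == label]
        means[label] = sum(values) / len(values)
    return lambda x: min(means, key=lambda label: abs(x - means[label]))


def accuracy(model: Model, chunk: Chunk) -> float:
    """Fraction of points in the chunk that the model labels correctly."""
    return sum(model(x) == y for x, y in chunk) / len(chunk)


def update_ensemble(ensemble: List[Model], chunk: Chunk, k: int = 3) -> List[Model]:
    """Train a model on the newest chunk, then keep the k models that score best on it."""
    candidates = ensemble + [train_centroid_model(chunk)]
    candidates.sort(key=lambda m: accuracy(m, chunk), reverse=True)
    return candidates[:k]


def predict(ensemble: Sequence[Model], x: float) -> int:
    """Classify a point by majority vote across the ensemble."""
    votes = Counter(m(x) for m in ensemble)
    return votes.most_common(1)[0][0]
```

Calling `update_ensemble` once per incoming labeled chunk keeps the ensemble bounded at `k` models while letting older models be displaced as behavior drifts; `predict` then combines the surviving models' votes.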
Evidence of malicious insider activity is often buried within large data streams, such as system logs accumulated over months or years. Ensemble-based stream mining leverages multiple classification models to achieve highly accurate anomaly detection in such streams even when the stream is unbounded, evolving, and unlabeled. This makes the approach effective for identifying insider threats who attempt...
In view of the need for a highly distributed and federated architecture, robust query expansion has a great impact on the performance of information retrieval. We aim to determine ontology-driven query expansion terms using different weighting techniques. For this, we consider each individual ontology and the user's query keywords to determine the Basic Expansion Terms (BET) using a number of semantic measures...
Resolving semantic heterogeneity across distinct data sources remains a highly relevant problem in the GIS domain requiring innovative solutions. Our approach, called GSim, semantically aligns tables from respective GIS databases by first choosing attributes for comparison. We then examine their instances and calculate a similarity value between them called entropy-based distribution (EBD)...
In view of the need for a highly distributed and federated architecture, ranking ontologies from different data sources in a specific domain has a great impact on the performance of web applications. Since ontologies for the same domain usually overlap, we aim to rank ontologies based on the commonality of overlapping entities and the distance between each pair of ontologies. Overlapping entities are determined...
In this paper, we propose an effective near real-time face recognition system for consumer applications. The application domain requires both real-time results and high accuracy, which poses a serious challenge. To address this challenge, we study various classification techniques, namely, support vector machine (SVM), linear discriminant analysis (LDA) and k-nearest neighbor (KNN). We observe...