Aspect extraction is one of the most important tasks in text mining. Semi-supervised methods have been proposed to solve this problem. However, these methods require the seed terms to be given in advance, and current methods categorize aspects without expanding the set of aspect terms. Moreover, most of the methods are based on English corpora, so there is considerable room for research on aspect extraction...
Visualizing information extracted from text is helpful for intuitively understanding the information. Extracting and visualizing personal relationships from text is one of the promising applications of this approach. Existing methods usually estimate personal relationships from direct co-occurrences of personal names that appear in a text. In our previous work, we proposed a method for extracting...
Unlike large companies, start-up companies usually do not have the resources to afford traditional mass marketing campaigns such as TV commercials or magazine advertisements. However, social networking services (e.g., Facebook, Twitter, Weibo) provide a more economically viable opportunity for these new companies to communicate directly with their potential customers. Social networks...
Sentiment analysis, or opinion mining, is the computational study of people's opinions expressed in written language or text. Sentiment analysis brings together various research areas such as natural language processing, data mining and text mining, and is fast becoming of major importance to organizations as they integrate online commerce into their operations. This paper proposes improved...
One key step in text mining is the categorization of texts, i.e., putting texts with the same or similar content into one group so as to distinguish them from texts with different content. However, traditional word-frequency-based statistical approaches, such as the vector space model (VSM), fail to reflect the complicated meaning in texts. This paper introduces domain ontology and constructs a new conceptual vector space model in...
An effective XML cluster method called neighbor center clustering algorithm (NCC) is presented in this paper, whose similarity is obtained through both structural and content information contained in XML files. Structural similarity is measured by the idea of Longest Common Subsequence, while content similarity is achieved using TF-IDF principles. It reduces computation complexity by avoiding direct...
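The structural measure mentioned above can be illustrated with a minimal sketch. It assumes each XML document has already been flattened into a sequence of tag names, and it normalizes the Longest Common Subsequence length by the longer sequence; both choices, and the function names `lcs_length` and `structural_similarity`, are illustrative assumptions rather than the NCC paper's actual definitions.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)          # DP row for the previous element of a
    for x in a:
        curr = [0]                     # column 0: empty prefix of b
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def structural_similarity(tags_a, tags_b):
    """LCS length normalized by the longer tag sequence (assumed normalization)."""
    if not tags_a or not tags_b:
        return 0.0
    return lcs_length(tags_a, tags_b) / max(len(tags_a), len(tags_b))


doc_a = ["book", "title", "author", "year"]
doc_b = ["book", "author", "price", "year"]
print(structural_similarity(doc_a, doc_b))  # 0.75
```

The two example documents share the subsequence `book, author, year`, so three of four tags match in order.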
This paper first studies methods for web document mining and text clustering and summarizes the fuzzy clustering algorithms and similarity measure functions. It then proposes a modified similarity function that can solve the problems of feature selection and feature extraction in high-dimensional space. Finally, this paper puts forward a dynamic fuzzy clustering algorithm (DCFCM) by combining...
Online Social Networks are so popular nowadays that they are a major component of an individual's social interaction. They are also emotionally-rich environments where close friends share their emotions, feelings and thoughts. In this paper, a new framework is proposed for characterizing emotional interactions in social networks, and then using these characteristics to distinguish friends from acquaintances...
Scientific and technical literature is a useful resource from which people can extract interesting knowledge or patterns using text mining tools. Text mining technologies have been widely used to reveal topics and the structure of topics. In this paper, the selected articles, in the form of textual data, are first represented as a network structure, and then a text clustering algorithm is applied...
The field of text mining seeks to extract useful information from unstructured textual data through the identification and exploration of interesting patterns. The techniques employed usually do not involve deep linguistic analysis or parsing, but rely on simple "Bag-Of-Words" (BOW) text representations based on vector space. In this paper we combine the BOW representation and the Apriori algorithm...
In recent years, text data has gradually become a new research topic in text mining, and the study of text clustering in particular has attracted wide attention. This paper proposes an improved fuzzy text clustering method based on the fuzzy C-means clustering algorithm and the edit distance algorithm. We use feature evaluation to reduce the dimensionality of high-dimensional text...
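As a point of reference for the edit distance component named above, here is a minimal sketch of the standard Levenshtein distance with unit costs for insertion, deletion and substitution. How the paper combines this distance with fuzzy C-means is not stated in the truncated abstract, so that step is not shown; the function name `edit_distance` is illustrative.

```python
def edit_distance(s, t):
    """Levenshtein distance between two strings (unit-cost edits)."""
    prev = list(range(len(t) + 1))     # distance from empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]                     # deleting i characters of s
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```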
Text clustering is an automatic technique that groups texts by using feature extraction and term connection to calculate the similarities among the subject contents of texts. Since the properties of terms in Chinese text (e.g. segmentation and annotation) are not as clear as in other languages, extracting and distinguishing features from Chinese text is much more difficult, which...
The boom of opinion-rich resources such as online review Websites, discussion groups, personal blogs and forums on the Web has attracted many research efforts on opinion mining. Positive and negative opinions represented in review documents are helpful information for governments to improve their services, for companies to market their products, and for customers to purchase their commodities. In...
In text mining processes, the importance indices of the technical terms play a key role in finding valuable patterns from various documents. Further, methods for finding emergent terms have attracted considerable attention as an important issue called temporal text mining. However, many conventional methods are not robust against changes in technical terms. In order to detect remarkable temporal trends...
The main problem in generating an extractive automatic text summary is detecting the most relevant information in the source document. To this end, some recent approaches have successfully employed word sequence information from the text itself to detect the candidate text fragments for composing the summary. In this paper, we employ the so-called n-grams and maximal frequent word sequences...
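A minimal sketch of word n-gram extraction and frequency filtering follows. It assumes whitespace tokenization and lowercasing, and the names `word_ngrams`, `frequent_ngrams` and the `min_count` threshold are illustrative. It covers only fixed-length n-grams; mining maximal frequent word sequences is a separate step that is not shown here.

```python
from collections import Counter


def word_ngrams(text, n):
    """All contiguous word n-grams of the text, as tuples."""
    words = text.lower().split()       # naive whitespace tokenization (assumption)
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def frequent_ngrams(text, n, min_count=2):
    """N-grams occurring at least min_count times, with their counts."""
    counts = Counter(word_ngrams(text, n))
    return {gram: c for gram, c in counts.items() if c >= min_count}


sample = "text mining is fun and text mining is hard"
print(frequent_ngrams(sample, 2))
# {('text', 'mining'): 2, ('mining', 'is'): 2}
```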
Text clustering is an important task in text mining. The purpose of text clustering is to group similar text documents together efficiently to meet human interests in information searching and understanding. The clustering procedure should involve a cognitive process of text understanding or comprehension. This paper introduces an innovative research effort, CogHTC, a hierarchical text clustering...
Based on the Discovery Feature Sub-space Model (DFSSM), this paper proposes a new web text clustering algorithm characterized by self-stability and strong noise resistance. The definitions of cluster and distance measures in the concept space are given. The algorithm can distinguish the most meaningful features in the concept space without an evaluation function. The application in the modern long-distance...
This paper presents a novel model for social network analysis in which, rather than analyzing the quantity of relationships (co-authorships, business relations, friendship, etc.), we analyze their communicative content. Text mining and clustering techniques are used to capture the content of communication and to identify the most popular themes. The social analyst is then able to perform a study of...
The three most important issues for anomaly-detection-based intrusion detection systems that use data mining methods are feature selection, data value normalization, and the choice of data mining algorithms. In this paper, we primarily study the feature selection of network traffic and its impact on detection rates. We use the KDD CUP 1999 dataset as the sample for the study. We group the features of...