Analyzing user-generated content in social media has become essential for companies seeking to remain competitive. These data contain information such as consumer opinions and recommendations, which are seen as rich sources of information for the development of decision support systems. A review of the state of the art found a lack of antecedents that address...
With the widespread adoption of social networking services (SNS), including Twitter, Facebook, and Flickr, there is great demand for analyzing the associated Web content, which consists of a vast number of opinions posted by anonymous users. Such opinions usually have explicit or implicit polarities. Determining the polarity of short texts such as Twitter posts is, however, very difficult. In this paper, we propose a method for sentiment...
The Internet has become the world's largest information repository. With the explosive growth of text data on the web, two disadvantages have become more obvious: acquiring and updating web pages takes much more time, and precision is not high. This paper proposes a text mining algorithm based on a focused crawler; it classifies and integrates whole web pages by topic using topic...
In the field of deep learning, research is ongoing into training systems by applying various algorithms and techniques. To develop a well-trained system, many language corpora are built and the system is then made to recognize the data. In this paper, we propose and implement an automated comment advisor system that suggests emotions for comments after extracting writings (status...
Using topic modeling, we analyse the titles and abstracts of nearly 10,000 papers published over 20 years in 11 top-ranked Software Engineering (SE) conferences between 1993 and 2013. Seven topics are identified as the dominant themes in modern software engineering. We show that these topics are not static; rather, some of them are becoming decidedly less prominent over time (modeling) while others...
Text mining discovers and extracts useful information from documents; as the size and number of documents increase, the number of features multiplies. This huge number of features poses a challenge to text mining known as high dimensionality. The aim of the proposed study is to reduce the high dimensionality of the documents and to improve Arabic text mining using clustering. In order to achieve this goal, we propose...
The vast growth of biomedical databases has increased researchers' focus on the field of text mining. The documents appear in unstructured format. To process and discover knowledge from these data, the unstructured databases must be converted to a structured format. Text mining plays a vital role in this task. Text preprocessing is an essential step in text mining. The common preprocessing tasks...
Text mining is widely applied in biology to infer relationships between biological entities. In biology, disease-gene relationships are important for discovering the cause of disease. Therefore, we propose a useful method called SSL, which infers disease-related genes using sentence structure and literature data. Using sentence structure, the proposed method decreases the number of candidate disease-related...
In the era of the knowledge explosion, the quantity of information is growing ever faster owing to the rapid development of computers and networks, and new words are emerging on the Internet in an endless stream. Automatic extraction of new words has become an essential prerequisite for many NLP tasks such as machine translation, Chinese word segmentation, and sentiment analysis. This...
The vast and ever-increasing volume of text posted on social networks such as Facebook and Twitter over the last 15 years has produced an immense and rich text repository for several areas of knowledge. Therefore, text mining has recently become a very active and attractive area of research in computer science. The limited current understanding of the knowledge represented in these repositories...
Inferring Bloom's Taxonomy among knowledge units is important and challenging. This paper proposes a novel method that can identify the revised Bloom's Taxonomy levels among knowledge units in the semantic cognitive graph (SCG) by using graph triangularity. The method determines significant relationships among knowledge units by utilizing the triangularity of knowledge units in the computer science...
Semantic analysis among knowledge units in text is a very interesting problem in numerous applications. Besides the semantic relationships expressed in the text, relationships are also encoded in knowledge structures in our brains. However, the relationships among knowledge units are highly sophisticated and require human judgment. In this paper, we propose a Graph-Triangularity-based system for...
A powerful and flexible organization of documents can be obtained by mixing fuzzy and possibilistic clustering. In such an organization, documents can belong to more than one cluster simultaneously with different compatibility degrees. Clusters represent topics, which are identified by one or more descriptors extracted by a proposed method. In this manuscript, we investigate whether or not the descriptors...
The increasing amount of text information on Internet web pages affects clustering analysis. Text clustering is a favorable analysis technique for partitioning a massive amount of information into clusters. A major problem affecting the text clustering technique is the presence of uninformative and sparse features in text documents. Feature selection (FS) is an important...
The evaluation of new knowledge should be based on the gain from plan refinement, taking into account the new information connected with that knowledge: the value of perfect information (VPI). Knowledge discovery (KD) using text mining (TM) is therefore a form of ontology learning (OL), in which the ontology structure provides the means of expression for planning.
In today's world, many real-world problems are based on multi-label classification: a single document may belong to a set of class labels simultaneously. The process of ranking, i.e. the strict ordering of class labels, is of great concern here. We use the concept of quantifiers to rank class labels. We propose eight new quantifiers, which calculate the degree of membership of class labels...
With the development of social media applications, short text mining is becoming more and more important. Due to the sparseness of short text data, both the feature correlation information (word co-occurrence) and the data contiguity information (context information) are less reliable; thus, most existing text mining methods, which are designed for regular text data, are less efficient in short text...
Correspondence analysis is a popular research technique in this Big Data era for analyzing the correspondence relations between data in multiple categories expressed by cross tabulation. It is frequently used for text mining and for analyzing questionnaire surveys with mutually exclusive choices, and it is also applicable as a visualization technique for data mining. However,...
Most existing Web search engines match query keywords to pieces of information in order to identify the data that satisfy the user's request. These methods are not only inefficient but also waste much of the user's time in finding satisfactory results. To address this problem, we present a different approach to identifying the user's request, one that attracts more interest, called...
In this digital era, most information is made available in digital form. For many years, people have held the hypothesis that using phrases to represent documents and topics should perform better than using terms. In this paper, we examine and investigate this hypothesis by considering several state-of-the-art data mining methods that give satisfactory results to improve the effectiveness of the...