On web pages, reviews are written in natural language as unstructured free text. Online product reviews are considered a significant information resource, useful to both potential customers and product manufacturers. Manually scanning through large numbers of reviews one by one is a heavy burden and is not practical with respect to businesses...
Collaborative filtering is the most widely used and successful technology for building recommender systems. However, it faces challenges in scalability and recommendation accuracy. Collaborative filtering can be divided into memory-based and model-based approaches. The former is more accurate, while the latter scales better. This paper proposes a hybrid user model. The recommender system based...
Analyzing the positive and negative sentiments about each topic of a product is very useful to customers and manufacturers. In this paper we propose a new topic-sentiment mixture model, which we call the Semi-supervised Co-LDA model, to obtain the positive and negative opinions from the reviews about each product. The Semi-supervised Co-LDA can model the topic and sentiment of the product reviews...
At present, several feature meta-models have been proposed. However, they cannot meet the requirements of the dynamic Internet environment or of software reuse. This paper proposes a feature meta-model based on ontology. In particular, it can adapt to various types of changes in the dynamic environment and can make design and implementation more convenient than traditional methods. Finally, we give...
In recent years, MP3 music objects have become a popular type of music file in many Internet audio applications, including surveillance systems. However, little attention has been paid to the content-based classification of audio data. With cloud services booming, the classification of MP3 music has become more and more important. It is necessary to process much audio data when Cloud Computing...
Recently, some researchers have found that the many available search engines cannot support exploratory search effectively. In such cases, search engines need to better understand the imprecise queries provided by end users. Indeed, it is hard for users even to formulate the queries, let alone for the engines to understand them. However, in our study, we find that the search logs in the web community...
Most previous research on sentiment analysis concentrates on the binary distinction of positive vs. negative. This paper presents the multi-class sentiment classification problem, which attempts to mine the implied rating information from reviews. We use four machine learning methods and two feature selection methods to find out whether or not the multi-class sentiment classification problem...
Since its emergence, the blog has represented not only a new network technology but also the beginning of a new lifestyle. How to utilize and mine blog content, which contains hidden sentiment and is updated in real time, is a big challenge in the data-mining domain. As most existing methods for topic mining of network text work by clustering the text's topic and label, which are labeled...
The ontology-based arms information extraction system consists of two parts: a knowledge base and a processing program. It determines the arms category through text categorization and the arms object through named entity recognition. It extracts information according to extraction rules based on syntactic and semantic constraints. It...
Detecting anomalous nodes in graphs is an important objective in many applications, ranging from social networks to the World Wide Web. Recently, several methods have been proposed to address this problem. A limitation of most of these methods is that they are based on random walks on the graph and often fail to be effective. In this paper, we propose a new framework to detect anomalous nodes within a...
Really Simple Syndication (RSS) is widely used in daily life, but RSS feeds do not always collect interesting articles, so users have to sift through every subscription for articles they like. Ranking unread RSS articles has the potential to relieve users of this heavy burden. Although user preferences can be learned from explicit feedback such as rating or tagging, implicit feedback...
Along with the rapid development of information retrieval and web technology, web entity retrieval has become a popular new way of getting specific information, such as looking for a book or a movie. As in document retrieval, a query generally returns too many results, so ranking is still a necessary step in the entity retrieval process. This paper will focus on the ranking...
Many ranking algorithms are used in Web information retrieval. However, current algorithms have some problems: they either rely on different formulas to calculate the similarity between documents and the query, or they train on large amounts of data to obtain a corresponding formula for that similarity. We know that this process is very complex, and sometimes...
Information hiding technology is a hot topic in information security and is applied in digital multimedia copyright protection and secret communication. Based on an analysis of how browsers parse the HTML of a web page, and given the small capacity available for hiding information in a web page, a new efficient web page information hiding method using tag attributes has...
Deep Web information integration has become more and more important due to its rich, high-quality data. Selecting the web database most relevant to the user's requirement is challenging. However, existing research focuses only on the data source selection method and ignores the expression of the user's query requirement. This paper presents a user query requirements modeling language,...
The rapid development of Web 2.0 has brought a flourishing of web reviews. Web reviews are usually released in the form of structured records. As an important information source for many popular applications (e.g. monitoring and analysis of public opinion), review records need to be extracted accurately from web pages. To the best of our knowledge, little work in the literature has systematically investigated this...
With the rapid spread of the Internet, crime information on the web is becoming increasingly rampant, and the majority of it is in the form of text. Because a lot of crime information in documents is described through events, event-based semantic technology can be used to study the patterns and trends of web-oriented crimes. In our research project on cyber crime mining, we construct...
Knowledge discovery is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data. The complicated computational environment of the 21st century, with its ultra-large-scale, heterogeneous, highly dynamic, and semantically implicit data, poses new problems and challenges for traditional knowledge discovery. As a solution, Semantic Web and Cloud...
The use of folksonomies involves several problems due to the lack of semantics associated with them. The nature of these structures makes it difficult to enrich them semantically by associating meaningful terms from the Semantic Web. This task involves a disambiguation phase and an expansion phase for the initial tagset, returning an enlarged, contextualised set where synonyms, hyperonyms,...
The explosive growth of the Internet inevitably leads to the proliferation of harmful information such as pornography, drugs, and violence. A great many filtering techniques based on image and text categorization have been proposed in the literature. Among them, text-based filtering plays a leading role because of its good performance. Existing text filtering algorithms can be seen as a classical text categorization...