Blog clustering is an important approach to web public opinion analysis. In this paper, an integrated graph-based approach for representing and clustering Chinese blogs by embedded sentiment is proposed. A graph-based representation and a corresponding clustering algorithm are applied. This graph-based blog representation model considers not only sentiment words but also some structural information. Experimental...
Analyzing the positive and negative sentiments about each topic of a product is very useful to customers and manufacturers. In this paper we propose a new topic sentiment mixture model, which we call the Semi-supervised Co-LDA model, to obtain the positive and negative opinions from the reviews about each product. The Semi-supervised Co-LDA can model the topic and sentiment of the product reviews...
Recently, some researchers have found that the many available search engines cannot support exploratory search effectively. Such search requires the engines to better understand the imprecise queries provided by end users. In fact, it is hard for users even to formulate the queries, let alone for the engines to understand them. However, in our study, we find that the search logs in the web community...
Since its emergence, the blog has represented not only a new network technology but also the beginning of a new lifestyle. How to utilize and mine blog content, which contains hidden sentiment and is updated in real time, is a big challenge in the data-mining domain. As most existing methods for topic mining of network text work by clustering the text's topics and labels, which are labeled...
Really Simple Syndication (RSS) is widely used in daily life, but RSS does not always collect interesting articles, so users have to sift through every subscription for articles they like. Ranking unread RSS articles has the potential to relieve users of this heavy burden. Although user preferences can be learned from explicit feedback such as rating or tagging, implicit feedback...
Information hiding technology is a hot topic in information security, and is applied in digital multimedia copyright protection and secret communication. Based on an analysis of how browsers parse the HTML of a web page and of the small capacity available for information hidden in a web page, a new efficient web page information hiding method with tag attributes has...
This paper presents an entity answer extraction method based on list web tables. First, tables are extracted from the page using the features of web page tables and labels; the table that includes the potential entity answers is segmented by calculating the relevance of the web table's title and the query context; the table elements of each column are merged according to table properties; and the web table's title is merged with the...
The rapid development of Web 2.0 has brought a flourishing of web reviews. Web reviews are usually released in the form of structured records. As an important information source for many popular applications (e.g. monitoring and analysis of public opinion), review records need to be extracted accurately from web pages. To the best of our knowledge, little work in the literature has systematically investigated this...
With the rapid spread of the Internet, crime information on the web is becoming increasingly rampant, and the majority of it is in the form of text. Because a lot of crime information in documents is described through events, event-based semantic technology can be used to study the patterns and trends of web-oriented crimes. In our research project on cyber crime mining, we construct...
The use of folksonomies involves several problems due to the lack of semantics associated with them. The nature of these structures makes it difficult to enrich them semantically by associating meaningful terms from the Semantic Web. This task implies a phase of disambiguation and another of expansion of the initial tagset, returning an expanded contextualised set where synonyms, hyperonyms,...
To deal with the problem of too many answers returned from a Web database in response to a user query, this paper proposes a novel categorization approach which takes advantage of the user's contextual preferences to construct a navigational tree in order to reduce information overload. Based on the user's original query, we first infer how much the user cares about each attribute in the specified...
Internet public opinion is receiving more and more attention, as it increasingly influences government decision making and public opinion as a whole. Validating the accuracy and analyzing the tendency of critical Internet public opinions has become more important for government and society to understand the situation accurately. In this paper, a quick tendency forecast method is proposed and implemented...
With the widespread use of Internet applications, more and more enterprises build their Web sites and provide business information through Web pages. Web page classification can be used to assign enterprise Web pages to one or more predefined business categories. For the purpose of Internet-based enterprise administration in E-government systems, algorithms and applications related to web page classification...
The problem of extracting data from Web pages has been studied in many works. In this paper, we present a novel approach that extracts data records from Web pages based on XML encoding techniques. First, our approach formats a given Web data page into an XML document. Then, instead of using DOM-based approaches, we make use of an XML encoding model to transform the XML document into a linear sequence...