Question answering (Q&A) platforms are an important source of information and a first place to go when searching for help. Q&A sites, like StackOverflow (SO), use reward systems to incentivize users to answer quickly and accurately. In this paper we study and predict the response time for those questions on StackOverflow that benefit from an additional incentive through so-called bounties...
3,4,3ʹ,4ʹ-tetrachloroazobenzene (TCAB) is not commercially manufactured but formed as an unwanted by-product in the manufacturing of 3,4-dichloroaniline (3,4-DCA) or metabolized from the degradation of chloranilide herbicides, like propanil. While a considerable amount of research has been done concerning the toxicological and ecotoxicological effects of propanil and 3,4-DCA, limited information is...
Image search and recommendation engines try to extract relevant images for a user's information need. Existing approaches use manual tags from networks like Flickr, or the surrounding webpages, to create context that supports the search. Pinterest, a new and upcoming social bookmarking service, allows us to gain more context for an image than before. By using the board headline, pin descriptions, and the actual...
Finding potential customers in social networks is a hard challenge for today's businesses. But by listening to the noise of social network posts, we identify users who express a demand for a certain product. We achieve this identification with a two-stage text categorization classifier: First, we detect whether the post expresses a demand for some product in general. Second, we detect which product...
In recent years, blogs have become a very popular way to publish information, express opinions and hold discussions. Hence, researchers and industry have an interest in analyzing the blogosphere. Due to the increasing diversity of blog usage, an initial categorization into web genres is the first necessary step before any analysis. In this research, we focus on the distinction between traditional blogs,...
The number of newspaper and blog articles keeps growing, and the analysis of this unstructured data is gaining importance in both research and the business environment. We focus on interviews as a special kind of article. In contrast to regular articles, interviews consist of two or more speakers with different viewpoints. We propose a semi-supervised approach to detect webpages containing...
The number of documents on the web increases rapidly, and often there is an enormous information overlap between different sources covering the same topic. Since it is impractical to read through all posts regarding a subject, there is a need for summaries combining the most relevant facts. In this context, combining information from different sources in the form of stories is an important method to provide...
The blogosphere allows analysts to track opinions and sentiments of individuals, groups or the general public with large sample sizes regarding many topics. Visualizations are essential for sentiment analysis. A visual understanding of the sentiment of large corpora is far more effective than relying on textual representations of the analyzed content. Users are very interested in changes in the public...
In this paper we present a novel approach for the early detection of events in blog entries. The detection of trends has already been discussed frequently. Nevertheless, in our understanding the detection of events goes one step further. The presented algorithm detects unique happenings at a given point in time by perceiving unusually frequent occurrences of words or word groups. We introduce an implementation...
Hierarchical Cluster Labeling helps users to quickly understand and analyze hierarchical clusters. It may be used to enhance search engine results or interactive browsing, as is done in the Blog Intelligence application. The hierarchical organization of data helps to represent different levels of detail. Hierarchical clustering may be quite common, but there are few good solutions for...
Much research effort is going into mining emotions on the World Wide Web. The BlogIntelligence application analyzes a large number of blog posts and extracts emotions from this large amount of data. We therefore considered how to visualize these emotions in a meaningful way. While we applied a smart map as a proven technique, we overcame conceptual and technical challenges...
Being able to identify locations associated with a Web resource is essential for providing location-based Web applications. However, geographical information in Web documents is rarely supplied in a machine-readable way and is therefore not easily discoverable. As a consequence, it is necessary to extract geographical keywords from Web documents and to associate locations with them. This method is called...
Information about upcoming trends is considered to be a valuable source of knowledge for both companies and individuals. A large number of market analysts work at monitoring a particular business field, with many employing manual methods to do so. Since the amount of available data on the internet is far too high for humans to monitor, which carries a major risk of a substantial amount of information...
Blogs, news portals and discussion forums are of high interest for today’s social interaction research. But the automatic extraction of information from the raw HTML pages of those media channels is still a well-known problem. We introduce a novel approach to infer website templates based on the syndication format of blogs and news portals, called feeds. In comparison to related approaches that infer templates...
Information about upcoming trends is valuable knowledge for both companies and individuals. Detecting trends for a certain topic is of special interest. According to the latest information, over 200 million blogs exist on the World Wide Web. Hence, every day millions of posts are published. These blogs contain an enormous think tank of open-source intelligence. Considering the continuously growing...
Current ranking algorithms, such as PageRank, Technorati authority, and BI-Impact, favor blogs that report on a diversity of topics, since those attract a large audience and thus more visitors, links, and comments. On the other hand, niche blogs with a very specific topic attract only a small audience and thus have only a small reach. This results in a low ranking in today's blog retrieval systems...
Current blog search engines use rankings, such as BIImpact or B2Rank, that focus on the link structure and thus on criteria extracted externally for blogs. A good criterion that is rarely used, because it is usually unavailable, is visitor engagement. This metric can greatly improve the quality of a ranking. For this reason, we propose to gather visitor information from blog authors by providing a new blog...
The massive adoption of social media has provided new ways for individuals to express their opinions online. The blogosphere, an inherent part of this trend, contains a vast array of information about a variety of topics. Thus, it is a huge think tank that creates an enormous and ever-changing archive of open source intelligence. Modeling and mining this vast pool of data to extract and describe meaningful...
Data-intensive applications, e.g. in life sciences, pose new efficiency challenges to the service composition problem. Since computing power today is mainly increased by multiplying CPU cores, algorithms have to be redesigned to benefit from this evolution. In this paper we present a framework for parallelizing service composition algorithms, investigating how to partition the composition problem...
The massive adoption of social media has provided new ways for individuals to express their opinions online. The blogosphere, an inherent part of this trend, contains a vast array of information about a variety of topics. It is thus a huge think tank that creates an enormous and ever-changing archive of open source intelligence. Modeling and mining this vast pool of data to extract, exploit and describe...