We study online compound decision problems in the context of sequential prediction of real-valued sequences. In particular, we consider finite state (FS) predictors that are constructed based on the sequence history, whose length is quite large for applications involving big data. To mitigate overtraining problems, we define hierarchical equivalence classes and apply the exponentiated gradient (EG)...
Effective Big Data Mining requires scalable and efficient solutions that are also accessible to users of all levels of expertise. Despite this, many current efforts to provide effective knowledge extraction via large-scale Big Data Mining tools focus more on performance than on use and tuning, which are complex problems even for experts. Weka is a popular and comprehensive Data Mining workbench with...
Underground caves and their specific structures are important for geomorphological studies. In this paper we present a new tool to identify and map speleothems by surveying cave chamber interiors. One of the research problems that we had to solve was handling the large number of points produced by the laser scan. The cave chamber was surveyed using Terrestrial Laser Scanning...
Frequent item set mining is an exploratory data mining technique that has been fruitfully exploited to extract recurrent co-occurrences between data items. Since in many application contexts items are enriched with weights denoting their relative importance in the analyzed data, pushing item weights into the item set mining process, i.e., mining weighted item sets rather than traditional item sets,...
The processing of large volumes of RDF data requires an efficient storage and query processing engine that can scale well with the volume of data. The initial attempts to address this issue focused on optimizing native RDF stores as well as conventional relational database management systems. But as the volume of RDF data grew to exponential proportions, the limitations of these systems became apparent...
The support vector machine (SVM) is a popular classifier for small-scale datasets. It has outstanding performance compared to other classifiers. However, the execution time is extremely long when training on Big Data. The Graphics Processing Unit (GPU) is a massively parallel device which performs very well as a co-processor. NVIDIA proposed a programming platform, CUDA, in 2006, which makes it much...
In the last few years, the data generated by social networking systems have become of interest for analyzing local and global social phenomena. A useful metric to identify influential people or opinion leaders is the betweenness centrality index. The computation of this index is a very demanding task since its exact calculation exhibits O(nm) time complexity for unweighted graphs. This complexity has...
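For orientation, the O(nm) bound quoted above for unweighted graphs is the one achieved by Brandes' algorithm, which runs one breadth-first search per source node and then accumulates path dependencies in reverse BFS order. The following is a minimal sketch of that standard algorithm, not the method proposed in the paper. The function name `betweenness_centrality` and the adjacency-dictionary input format are assumptions made for illustration, and the graph is assumed to be undirected.

```python
from collections import deque


def betweenness_centrality(adj):
    """Exact betweenness centrality for an undirected, unweighted graph.

    `adj` (hypothetical input format) maps each node to an iterable of its
    neighbours. Runs one BFS per source: O(n * m) overall.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}   # number of shortest s-v paths
        sigma[s] = 1
        dist = {s: 0}                 # BFS distance from s
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        order = []                    # nodes in order of discovery
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:     # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is counted from both endpoints.
    return {v: c / 2.0 for v, c in bc.items()}


if __name__ == "__main__":
    # Path graph a - b - c: only b lies between another pair of nodes.
    path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
    print(betweenness_centrality(path))
```

The values returned are unnormalized pair counts. Dividing by (n-1)(n-2)/2 gives the usual normalized index for undirected graphs.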
In this paper we propose a Twitter recommender based on a semantic description of users' interests. To express interests we use friendship information, which is readily available in users' profiles, not only in Twitter but in the majority of social networks, thus presenting a substantial advantage in terms of computational complexity with respect to methods based on content mining. To obtain a synthetic...
In this paper, we present a novel method for making recommendations by leveraging Tie Strength, an integrated social relationship measurement calculated from various user information gathered from social media. Moreover, the proposed method adopts Least Absolute Errors in the factorization scheme to reduce the sensitivity to data outliers. We have conducted comprehensive experiments over real datasets...