In the Web environment, providing Web services that match users' needs requires quickly and accurately extracting contents such as main-content, menu-list, article-list, and comments from Web documents. In this paper, we propose an efficient method that extracts various contents from Web documents. In the method, text blocks are separated from the document and context information...
Query expansion is an important component of information retrieval systems. It makes it possible to reformulate the initial user query by adding new terms. In this paper, we propose a new approach for term selection in the relevance feedback process. This approach, based on the Rocchio formula, is an adaptation to the XML information retrieval context. It can resolve two major problems specific to...
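For readers unfamiliar with the formula this abstract builds on, the following is a minimal sketch of the classical Rocchio update used for relevance feedback. It is not the XML adaptation the paper proposes, which is not described in the truncated abstract. The function names, the sparse dictionary representation, and the default weights (alpha, beta, gamma) are illustrative assumptions.

```python
from collections import defaultdict


def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Classical Rocchio update over sparse term-weight vectors (dicts).

    q' = alpha*q + beta*mean(relevant docs) - gamma*mean(non-relevant docs)
    The default weights are assumed, commonly quoted values.
    """
    updated = defaultdict(float)
    for term, weight in query.items():
        updated[term] += alpha * weight
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue  # an empty feedback set contributes nothing
        for doc in docs:
            for term, weight in doc.items():
                updated[term] += coeff * weight / len(docs)
    # Terms pushed to zero or below are dropped from the new query.
    return {t: w for t, w in updated.items() if w > 0}


def expansion_terms(query, updated, k=2):
    """Pick the k highest-weighted terms that were not in the original query."""
    candidates = [(t, w) for t, w in updated.items() if t not in query]
    candidates.sort(key=lambda tw: (-tw[1], tw[0]))
    return [t for t, _ in candidates[:k]]


# Hypothetical toy data: one-term query, two relevant and one non-relevant doc.
query = {"xml": 1.0}
relevant = [{"xml": 1.0, "xpath": 1.0}, {"xml": 1.0, "schema": 0.5}]
non_relevant = [{"html": 1.0}]

updated = rocchio(query, relevant, non_relevant)
new_terms = expansion_terms(query, updated)
```

In this toy run the terms appearing only in relevant documents become expansion candidates, while the term appearing only in the non-relevant document receives a negative weight and is dropped.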
Recommender systems are a subclass of information filtering systems and are widely used in the e-commerce domain [13]. They filter huge amounts of data to provide personalized recommendations on services or products to users. Most of the existing approaches to developing a recommender system do not take into account contextual information such as weather, day, time, distance and location to provide recommendations...
The paper presents an approach, namely iHDev, to recommend developers who are most likely to implement incoming change requests. The basic premise of iHDev is that the developers who interacted with the source code relevant to a given change request are most likely to best assist with its resolution. A machine-learning technique is first used to locate source code entities relevant to the textual...
In the Internet of Things (IoT) era, we need to face increased masses of cross-domain data stored in different formats (relational, XML, JSON, or textual) and data streams (produced by sensors) that can be highly or loosely structured and need to be integrated for analysis. Recently, many NoSQL systems (e.g., MongoDB, Cassandra, HBase) have emerged to cope with the scalability issues of current...
Long-term preservation of data, and especially of digital videos, is a challenging task. In this paper, we summarize the challenges that need to be addressed in this context. We suggest the use of a high-level file format for the long-term preservation of digital videos. Based on this idea, we introduce an XMT-based approach for the long-term preservation of digital videos...
We present ToMaR, a scalable application that supports the efficient integration of legacy applications within a MapReduce environment. The work is motivated by scenarios for scalable content processing developed in the context of the EC project SCAPE. ToMaR specifically addresses the need for extracting data sets from large volumes of binary content based on existing, content-specific applications...
Data file layout inference refers to the problem of identifying the organizational characteristics associated with a structured text file, where every record in a text file shares the same structural properties. These properties include: character encoding, record length, field length (indicated by delimiting characters or fixed length), field position, and field semantic content. Within this paper,...
Framework-based applications are common in current commercial software. They are often controlled by XML configuration files. However, most of these frameworks are complex or not well documented, which makes it a great challenge for programmers to use them correctly. To overcome these difficulties, we propose a new method to recommend XML configuration snippets...
In the current economic climate of tightening budgets and strong competition, organizations need to be customer-focused and provide customized service to ensure customer loyalty. To achieve this, Customer Relationship Management (CRM) systems help organizations to deal with and answer various customer queries. However, with a change in the type of information being created (for example from structured...
Extracting information from semistructured documents is a very hard task, and is going to become more and more critical as the amount of digital information available on the Internet grows. Indeed, documents are often so large that the data set returned as the answer to a query may be too big to convey interpretable knowledge. In this paper, we describe an approach based on Tree-Based Association Rules...
Since XML documents can appear in any semi-structured form, structural and integrity constraints are often imposed on the data that are to be modified or processed. These constraints are formally defined in a schema. However, despite the obvious advantages, the presence of a schema is not mandatory, and many XML documents are not associated with one. Consequently, no integrity constraints are specified as well...
With the rise of XML as a standard for representing business data, XML data warehousing appears as a suitable solution for decision-support applications. In this context, it is necessary to allow OLAP analyses on XML data cubes. Thus, XQuery extensions are needed. To define a formal framework and allow much-needed performance optimizations on analytical queries expressed in XQuery, defining an algebra...
In the past several years, various optimization algorithms have been implemented for the project total cost minimization problem. Recently, an application was developed that serves as a central platform integrating all those previous implementations. Currently, the platform allows access to and execution of each of the other utilities as modules/plugins. Each of these can be configured to...
The goal of this work is to develop and test a new software archetype to aid the competence management process in postgraduate Production Engineering courses. This system will be designed using the JADE agent framework to read and analyze XML data. Those technologies have been used to build an innovative environment for software building. The research methodology used in this scientific work is...
A metadata repository acts as the backbone of a data warehouse, as it stores and manages the metadata that is the basis for all the operations of a data warehouse. The generalized metadata repository presented in this paper is a comprehensive approach for creating a data warehouse from multiple and heterogeneous data sources in a semi-automatic way. The approach addresses all the issues involved in fetching...
This position paper presents some techniques for revealing personalization and users' calling behavioral patterns by using XML-based communication records from simulations. A short description of the most common methods applied is provided, along with a case study for potential application employing real telecommunication call detail records (CDRs). Finally, we describe an intelligent architecture...
A user interface description language (UIDL) consists of a specification language that describes various aspects of a user interface under development. A comparative review of some selected user interface description languages is produced in order to analyze how they support the various stages of the user interface development life cycle and development goals, such as support for multi-platform, device-independence,...
TermPedia is a human language technology (HLT) application for document enrichment that automatically provides definitions for technical terms (TTs). A technical term (TT) may hinder document comprehension if it is introduced without any definition or explanation. In some cases when a term is defined, the definition may contain additional technical terms that instigate a similar problem. This is why...
Resource-Oriented Architecture (ROA) is a new Web service modeling method. Based on the Capability-Injection (CI) pattern, ROA could be extended for telecommunication value-added service description and creation. With the CI-based ROA requirements analysis, a Pub/Sub-based capability injection method is proposed in this paper. The Pub/Sub capability utilization will transform the value-added service control...