Semantic similarity of texts is an important area of Natural Language Processing, and there are several approaches to measuring it: statistical, WordNet-based, and hybrid. All of these approaches rely on lexical knowledge such as a corpus or a semantic network. WordNet is one of the most preferred and mature lexical knowledge bases. In this study, we have focused on measuring semantic...
This paper presents an approach to the semantic annotation of historical documents from the 19th century that discuss the constitution of the mother language, the Portuguese language in Brazil. The objective is to generate a group of semantically annotated documents in agreement with a domain ontology. To provide this domain ontology, the Linguistic Instrument Ontology was employed, and it...
The paper presents the importance of a proper understanding of the novel content through the perspective of the bridge relationship type. Furthermore, we are interested in developing an instrument that can automatically identify the semantic bridge relationships for any sort of text. In this regard, we also suggest the automatic recognition of anaphoric relationships with high precision. This can be...
Continuous Delivery (CD) enables mobile developers to release small, high-quality chunks of working software in a rapid manner. However, faster delivery and higher software quality guarantee neither user satisfaction nor positive business outcomes. Previous work demonstrates that app reviews may contain crucial information that can guide developers' software maintenance efforts to obtain higher...
This research focuses mainly on automating the generation of Unified Modeling Language (UML) diagrams from analyzed requirement text using Natural Language Processing (NLP). The proposed system is an efficient and accurate way to obtain the elements of use case and class diagrams using the proposed methods. This research mainly focuses on the design phase of software. Nowadays everybody needs a quick and reliable...
Databases are now used in many fields, such as banking, human resources, and universities. Everyone needs to deal with databases to extract the data they require. But it is very difficult for a common user to retrieve information from a database using database query languages. Accessing any kind of data from a database using a natural language such as English is a convenient and easy method instead...
Obtaining a semantic argument representation of a sentence is necessary in natural language processing tasks such as information extraction and question answering. In semantic role labeling, the selection of features influences performance and affects the recall and precision obtained. The problem now is how to combine the features, and which combinations to use, in order to get the...
One of the major problems in the software development process is managing software artefacts. As software evolves, inconsistencies between the artefacts evolve as well. To resolve the inconsistencies in change management, a tool named “Software Artefacts Traceability Analyzer (SAT-Analyzer)” was introduced in previous work of this research. Changes in software artefacts in requirement specification,...
Much work on sentence compression has been done for languages other than Arabic. Unfortunately, little effort has been devoted to Arabic sentence compression. One of the reasons for this is the absence of Arabic sentence compression corpora. In order to build and evaluate sentence compression systems, parallel corpora consisting of source...
Treebanks are essential resources for both data-driven approaches to natural language processing (NLP) and empirical linguistic research. Developing these resources is time-consuming and costly and requires specialized expertise. Therefore, they should be designed to be reused for different purposes. Currently, there are several dependency treebanks for some languages which are annotated in CoNLL...
This paper presents methods for automatic generation of phonetic databases (The Morphological and Phonetic Dictionary, The Phonetic Dictionary of Syllables, The Rhyming Dictionary) for a natural language, starting from a set of linguistic knowledge bases. The knowledge bases are developed by means of the GRAALAN (Grammar Abstract Language) system. The exemplification of this process will be described...
With the prominent advances in Web interaction and the enormous growth in user-generated content, sentiment analysis has attracted growing interest for commercial and academic purposes. Recently, sentiment analysis of Arabic user-generated content has increasingly been viewed as an important research field. However, the majority of available approaches target the overall polarity of the text. To the best of our...
For the purpose of localization, textual output alone does not meet the needs of Machine Translation unless it is in a usable format. In the Indian scenario, localization has not yet been recognized as an industry, which has led to a lack of language standards and, in turn, to varied translation quality. Localization is the process of adapting a product or service to a particular language, culture, and...
Artefact management in a software development process is a challenging problem. Often there is a wide variety of artefacts maintained separately within a software development process, such as requirement specifications, architectural concerns, design specifications, source code and test cases, all of which are essential to software engineering. Artefact inconsistency is a major problem since...
This paper presents an approach to text analysis using time series and text mining to extract knowledge from texts and to analyze the types of texts that are closely related to the time factor, that is, texts whose content can be represented on a temporal axis, such as chat logs. These techniques are applied to chat documents to determine correlations between words that occur most frequently...
In most existing Intelligent Tutoring Systems (ITS), the initial clustering of students is based on their previous semester marks. This paper introduces SoftTutor, an ITS for teaching complex data structures in C to 4th semester B. Tech (Computer Science and Information Technology) students. Here, instead of using their previous semester marks, SoftTutor conducts a pretest which includes questions...
A labeled text corpus made up of the titles, abstracts and keywords of Turkish papers is collected. The corpus covers 35 different disciplines, with 200 documents per subject. This study presents the collection and content of the text corpus. The classification performance of Term Frequency-Inverse Document Frequency (TF-IDF) and topic probabilities of Latent Dirichlet Allocation (LDA) features are...
Information available in different formats cannot be understood by a computer or a machine due to the lack of a proper knowledge representation mechanism. It always requires more human effort to feed the knowledge to the computers or the knowledge base. XML covers the basic level of knowledge representation, but is incapable of utilizing the concepts and semantics in a proper way. Onto_X is an effort...
Recently, virtual surgery technologies have made much progress, and many simulators have been developed for education, planning rehearsal and so on. However, developing a VR-based surgical simulator takes much labor and cost, not only for implementing simulation modules but also for setting up the surgical environment. Considering that operative surgery manuals describe the knowledge of manipulations and...
Business analysis helps companies make decisions, and this involves evaluating and comparing business data. Our research focuses on the development of a question answering (QA) system capable of interpreting and answering comparative and evaluative questions in the domain of business intelligence (BI). The paper describes the architecture, the approaches, the question and predicate...