Natural language processing methods are widely used to study the relationship between traditional Chinese medicine (TCM) prescriptions and diseases in textual data, and the results can reveal the essence of TCM literature. In this paper, we first obtain TCM treatment information from abstract text using web crawlers. Second, eigenvectors are selected from the cleaned abstract...
Clinical summarization means the collection and synthesis of a patient's significant data, undertaken in order to support health-care providers in the process of patient care. Considering that medical information comes from multiple sources, a system for the automatic generation of problem lists could prove to be very effective in terms of saving time in the analysis of large amounts of medical data...
Authorship analysis deals with the identification of authors, which is a problem in text data mining and classification. Numerous techniques and algorithms have been published so far in the field of stylometry. In this regard, the primary objective of the present review is to provide the status of the different studies carried out on authorship analysis based on the important research...
Recently, there has been a surge in the amount of data collected from computational systems. This vast amount of data can be useful in many applications and fields, particularly in Big Data Analytics. However, with a large collection of data, discovering important information is difficult. Automatic Document Summarization (ADS) systems are suitable for the task of outlining useful...
With today's heavy internet usage, people share much of their information with each other online. In this paper, we propose to monitor user activity on Gmail and Twitter for hazardous behavior such as terrorism. We apply Natural Language Processing (NLP) techniques such as POS tagging, chunking, stemming, and WordNet processing to extract keywords and check whether the information...
Visualization is the process of representing data graphically and interacting with these representations in order to gain insight into the data and to assist human information processing by reducing demands on attention, working memory, and long-term memory. The graphical representation of data is also used on the Web as a means of carrying visual, easy-to-understand information. However, graphics...
The results of the requirements engineering process are predominantly documented in natural language requirements specifications. Besides the actual requirements, these documents contain additional content such as explanations, summaries, and figures. For the later use of requirements specifications, it is important to be able to differentiate between legally relevant requirements and other auxiliary...
Service requirements documentation plays a crucial role in the quality of the service-oriented systems to be developed. A large number of service requirements are documented in natural language; such documents are usually human-centric and therefore error-prone and inaccurate. In order to improve the quality of service requirements documents, we propose a service requirements modeling and validation...
The notion of events plays a crucial role in narrative texts. Events are situations that happen or occur at a particular place and time. The extraction and representation of events play a significant role in many natural language text applications, such as text summarization and question answering systems. Several methods have been developed so far, but they addressed the problem in domain aspect...
A lot of time is spent on email communication in today's IT world; people prefer to send email for business purposes and information exchange. Email management is necessary because, once an inbox is full, we avoid reading messages one by one, and some important emails may be missed. Users always try to avoid reading unnecessary email, and for that a better email management system is required...
Advances in social media technology have resulted in a boom of massive public data. The availability of these huge data sets offers numerous research opportunities for deriving meaningful cause-effect relationships for many applications. One important application domain is the cause of drug side effects. In this paper, we applied supervised learning to extract useful cause-and-effect...
Nowadays, most of the data on the Web is still in the form of unstructured text. Knowledge extraction from unstructured text is highly desirable but extremely challenging due to the inherent ambiguity of natural language. In this article, we present an architecture of an information extraction system based on the concept of Embedded Controlled Language that allows for extracting formal semantic knowledge...
With the prominent advances in Web interaction and the enormous growth in user-generated content, sentiment analysis has gained more interest for commercial and academic purposes. Recently, sentiment analysis of Arabic user-generated content has increasingly been viewed as an important research field. However, the majority of available approaches target the overall polarity of the text. To the best of our...
Search engines return ranked documents as the result of any query, and the user must struggle to navigate them and find the correct answer. This process wastes the user's navigation time, which makes the need for automated question answering systems more urgent. We need a system that is capable of returning an exact and concise answer to a question posed in natural language. The best...
Software developers have long been supported by a variety of tools, such as version control systems (e.g., Git), issue tracking systems (e.g., Bugzilla), and mailing list services (e.g., Mailman). These tools accumulate a wide range of information, which is recorded in the repositories where they store their data. This information comprises two significantly different types of data: structured...
The Internet provides many sources of different opinions, expressed through user reviews of products, blogs, and forum discussions. Systems that could automatically summarize these opinions would be immensely useful for those who wish to use this information to make decisions. Previous work in automatic summarization has focused completely on extractive summarization, in which key sentences are...
Multiword expressions (MWEs) are important for practical applications, such as machine translation (henceforth, MT), multilingual information retrieval, data mining and other natural language processing tasks. A method combining a similarity measure and a statistical tool is proposed for automatically extracting English MWEs from the corpus of Chinese government white papers and work reports from 1991 to...
In software development, the knowledge of developers, architects and end users is spread out across dozens of development artifacts. Historically, structured development artifacts such as source code have been the primary focus of software engineering research, but the last couple of years have seen a dramatic increase of research on unstructured data, such as free-form text requirements and specifications,...
Telugu is an Indian language spoken by more than 50 million people in the country. The language is very rich in literature and requires advances in computational approaches. Applications like machine translation, speech recognition, speech synthesis and information retrieval need a powerful morphological generator to give morphological forms of nouns and verbs. The existing Telugu morphological...
In many NLP applications, text topic identification is a common problem. Traditional topic identification methods generate a single-layered topic structure, which usually yields an inaccurate topic division even when produced manually by human experts. This paper proposes the concept of a hierarchical topic, which uses a multi-layer topic tree structure to represent a text or text set. Second, this...