True Muslims around the world believe in Al Quran and Hadith. Al Quran is the principal religious text of Islam, a revelation from Allah. Hadith is also one of the fundamental sources of Islamic references and guidance for Muslims after the Holy Book, Al-Quran. Hadith refers to a report, statement, act, story, narration or discourse. Hadith, originally in Arabic, covers a wide range of issues...
Phishing email fraud has been considered one of the main cyber-threats in recent years. Its development has been closely related to social engineering techniques, where different fraud strategies are used to deceive a naïve email user. In this work, a latent semantic analysis and text mining methodology is proposed for the characterisation of such strategies, and further classification using...
depend probabilistically both on other properties of that object and on properties of related objects. In this paper an attempt is made to address keyword extraction. Keywords are not only essential for academic papers but also important for web page retrieval, text mining, and document classification. In this paper, a C
We develop and analyze an unsupervised and domain-independent method for extracting keywords from single documents. Our approach differs from the previous ones in the way of identifying candidate keywords, pruning the list of candidate keywords with several filtering heuristics and selecting keywords from the list of
A document surrogate is usually represented in a list of words. Because not all words in a document reflect its content, it is necessary to select important words from the document that relate to its content. Such important words are called keywords and are selected with a particular equation based on Term Frequency
This paper proposes an extended vector space model (VSM), which is called M2VSM (meta keyword-based modified VSM). When conventional VSM is applied to document clustering, it is difficult to adjust the granularity of clusters in terms of topic. In order to solve the problem, M2VSM considers meta keywords such as
Keyword selection is one of the most important tasks for patent retrieval. However, few researchers have focused on how to choose keywords appropriately, compared with how to improve retrieval performance via techniques from bibliometrics, such as patent counts, citations and so on. The paper thus proposes a new
This paper presents a keyword extraction technique that can be used for tracking topics over time. In our work, keywords are a set of significant words in an article that give a high-level description of its contents to readers. Identifying keywords from a large amount of on-line news data is very useful in that it can
The demand for extracting keywords related to national issues from various sources and using them to retrieve R&D information has recently increased rapidly. In order to satisfy this demand, three methodologies are proposed in this study: a hybrid methodology for extracting and integrating national issue
Keyword extraction plays a very important role in the text-mining domain, since keywords can represent the asserted main point of a document. Based on the term network and deleting actor index, an effective keyword extraction algorithm is proposed to extract high-frequency terms as well as important terms with
propose the Term Frequency and Inverse Document Frequency (TF-IDF) method to rank keywords of the top twenty most-followed Instagram users based on their Instagram image captions. The objective of this research is to automatically identify the main idea of Instagram users based on the 50 most recent image captions posted. In our experiments, TF-IDF
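The abstract above is truncated, so the following is only a minimal sketch of generic TF-IDF keyword ranking, not the authors' implementation. The function names `tokenize` and `tfidf_rank`, the whitespace tokenizer, and the sample captions are all assumptions made for illustration; each caption set is treated as one document.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase, split on whitespace, keep alphabetic tokens.
    return [w for w in text.lower().split() if w.isalpha()]


def tfidf_rank(docs, top_n=3):
    """Return, for each document, its top_n (term, score) pairs by TF-IDF."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    rankings = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        # TF = relative frequency in the document; IDF = log(N / df).
        scores = {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        rankings.append(ranked[:top_n])
    return rankings


# Made-up captions, one string per (hypothetical) user.
captions = ["coffee morning coffee", "morning run park", "park sunset"]
for user_keywords in tfidf_rank(captions, top_n=2):
    print(user_keywords)
```

With this unsmoothed IDF, a term that appears in every document scores zero, so ubiquitous words drop to the bottom of each ranking.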
This study first used the text mining method to analyze the keywords of the Chinese news reports related to Macau's gambling industry from June to September 2012. The study obtained 19 major keywords in the first step. In order to comprehend the influence of each keyword in each document, the study applied the Fruit Fly
Keywords are a subset of words or phrases from a document that can describe the meaning of the document. Many text mining applications can take advantage of them. Unfortunately, a large portion of documents still do not have keywords assigned. On the other hand, manual assignment of high-quality keywords is time
literature infrastructure was obtained using bibliometrics and literature of the co-keyword network was visualized. It shows how co-word analysis techniques can be used to study R&D in enterprises. The results of the study can help support strategic decision-making on the direction of S&T programs in enterprises.
Due to the exponential growth of available text documents in digital form, it is of great importance to develop techniques for automatic document classification based on the textual contents. Earlier document classification techniques have used keyword-based features and related statistics to achieve good results when
Pittsburgh dataset. We experimented with 3 different topic modeling methods including LDA and 2 ICD-based methods and a keyword search method for the identification of delirium related documents and sentences in clinical notes. As expected, the keyword search method is highly specific but insufficiently sensitive when searching
this problem by automatically dividing the social network of a Twitter user into personal cliques, and annotating each clique with keywords to identify the common ground of a clique. Our proposed clique annotation method extracts keywords from the tweet history of the clique members and individually weights the extracted
Nowadays, blogs are one of the important web services for publishing and sharing various kinds of information. Accordingly, evaluation of various keywords in blogs is one of the important research topics for effective and efficient classification and retrieval of blogs in the blogosphere. In this paper, we propose a method to identify
document. We think that our graph captures many properties of the text documents and can be used for different applications in the field of text mining and NLP, such as keyword extraction and determining the nature of the document. Our approach to constructing a semantic graph is independent of any language. We performed an
-specific keywords. An automated profiling algorithm is proposed for this purpose, which starts from generic/noisy reviewer profiles extracted using Google Scholar and derives custom conference-centric reviewer and paper profiles. Each reviewer is an expert on a few sub-topics, whereas the pool of reviewers and the conference may