In recent years, increased competition between companies in the services sector has made predicting customer churn, in order to retain customers, very important. The impact of brand loyalty and customer churn on an organization, as well as the difficulty of attracting a new customer for each lost customer, is very painful for organizations. Obtaining a predictive model of customer behaviour to plan for and deal with...
The Support Vector Machine (SVM) is well known in machine learning and artificial intelligence for its high performance in data classification, regression and forecasting. For large-scale datasets, an incremental training algorithm is usually applied to balance training cost against accuracy in SVM applications. This paper presents an improved incremental training approach for large...
With the increased volume of patent databases in recent years, it has become necessary for companies to correctly classify and identify innovative patents in a timely manner through the use of automation. Although many patent classification methods have been proposed, accuracy remains the most challenging factor for the success of a classification model. This paper presents an empirical study...
Named entities in a text are the atomic elements that represent the name of something, such as a person, an organization, or a place or location. In the field of information extraction, the identification and classification of named entities is quite an important task. The identification and classification of the named entities in a text into some pre-defined classes...
Named Entity Recognition (NER) is used to classify each word of a document into predefined named entity classes and is important for Natural Language Processing (NLP) tasks such as information retrieval, question answering, and machine translation. Mising is a Tibeto-Burman language spoken by over 500,000 Mising people who live in Assam. Mising is a resource-constrained language. The corpus...
Named Entity Recognition (NER) is a very important task in the field of computational linguistics. In this paper, we discuss NER in Hindi using a Hidden Markov Model (HMM). We also discuss the challenges faced while performing NER in Indian languages.
Research that explores the use of machine learning for automatic security classification of information objects is about to emerge. In this paper we investigate the opportunity to increase machine learning performance by taking advantage of time information that is "hidden" in the documents of the training set. This paper presents a technique to do so, and confirms that this is a promising...
With the increasing risk of data leakage, information guards have emerged as a novel concept in the field of security, bearing similarity to spam filters that examine the content of exchanged messages. A guard is defined as a high-assurance device used to control the information flow, typically from a domain with a "high" level of confidentiality, such as a corporate or military network,...
Email communication carrying malicious attachments or links is often used as an attack vector for initial penetration of a targeted organization. Existing defense solutions prevent executables from entering organizational networks via email; recent attacks therefore tend to use non-executable files such as PDFs. Machine learning algorithms have recently been applied for detecting malicious PDF files...
Maintenance costs can be substantial for organizations with very large and complex software systems. This paper describes research for reducing anomaly report turnaround time which, if successful, would contribute to reducing maintenance costs and at the same time maintaining a good customer perception. Specifically, we are addressing the problem of the manual, laborious, and inaccurate process of...
Although data mining techniques have made tremendous progress, being "knowledge-poor" is still a major gap in current data mining systems. Few researchers have noted that useful knowledge is not only the final result of an intelligent classification, clustering or prediction algorithm, but also runs through the whole process of data mining, in which much potentially useful information is...
This paper presents an empirical study of five large department stores in Baoding to explore the mechanism by which influencing factors act on customer loyalty. The study finds that traffic facilities, product value, reputation, individual consumer factors, service function quality and shopping environment are the six main dimensions influencing customer loyalty. Then, taking Baoding department...
In this paper, a hybrid pattern for Chinese organization names based on the Support Vector Machine (SVM) is proposed, which fuses multiple features. Taking the characteristics of Chinese organization names into account, local features and global features are abstracted, feature vectors are expressed in binary, and the training collection is established. From the experimental results on testing set for 1998...
Feature selection is the process of selecting a subset of the original features. It can improve efficiency and accuracy by removing redundant and irrelevant terms. Feature selection is commonly used in machine learning and has been widely applied in many fields. We propose a new feature selection method, an integrative hybrid method. It first uses Affinity Propagation and SVM sensitivity analysis...
This paper reports on the development of a Named Entity Recognition (NER) system for Bengali that combines the outputs of classifiers such as Maximum Entropy (ME), Conditional Random Field (CRF) and Support Vector Machine (SVM) using a majority voting approach. The training set consists of approximately 150K word forms and has been manually annotated with the four major NE tags such as Person name,...
Asset valuation is specialized work, and it requires the certified public valuer (CPV) to have very strong professional competence. Along with the development of the economy in China, the reform of the stock system, the tax reform, and the application of fair value in the new business accounting standards, new valuation domains continue to develop, which places new demands on the...
This paper establishes an index system for assessing the operation benefit of expressway enterprises, starting from the operating characteristics of expressway enterprises and the content of operation benefit assessment. After turning the assessment indexes into data, I carry out a synthetic assessment of operation benefit in expressway enterprises using the SVM model established in this article...
Organization name recognition is the most difficult part of named entity recognition. In order to reduce the use of tagged corpora and make use of a large amount of untagged corpus, we first present the use of co-training, a semi-supervised machine learning algorithm, combined with conditional random field models and support vector machines for Chinese organization name recognition. Based on the principles of compatible...
In this paper we explore a way to allow a user to interactively organize a multimedia database through a dynamic interface, freely creating their own "audiovisual concepts". The user defines distances on a small subset of documents, using low-level audio and video descriptors that are automatically extracted off-line. The semi-supervised learning process, relying on support vector regression used...
A TBL-based post-processing approach is proposed for Japanese named entity recognition (NER) in this paper. First, tuning rules are automatically acquired from the results of Japanese NER by error-driven learning. Then, the tuning rules are optimized according to given threshold conditions. After filtering, the rules are used to revise the results of Japanese NER. Above all, this approach could...