Search results

Items from 1 to 20 out of 417 results

chapter

Machine Identification of High Impact Research through Text and Image Analysis

Marko Stamenovic, Sam Schick, Jiebo Luo

2017 IEEE Third International Conference on Multimedia Big Data (BigMM) > 98 - 104

2017 IEEE Third International Conference on Multimedia Big Data (BigMM)

The volume of academic paper submissions and publications is growing at an ever increasing rate. While this flood of research promises progress in various fields, the sheer volume of output inherently increases the amount of noise. We present a system to automatically separate papers with a high from those with a low likelihood of gaining citations as a means to quickly find high impact, high quality...

chapter

Unstructured data treatment for big data solutions

Shintaro Sato, Akihiro Kayahara, Shin-ichi Imai

2016 International Symposium on Semiconductor Manufacturing (ISSM) > 1 - 4

2016 International Symposium on Semiconductor Manufacturing (ISSM)

We constructed a system infrastructure capable of processing unstructured data, with the aim of practical application of the system for document data analysis in the manufacturing industry. Using past ISSM research paper data, papers were classified and verified. Using morphological analysis, the extracted parts of speech were used as feature quantities, and machine learning was executed. Since effective...

chapter

Classifying sentiments in Nepali subjective texts

Lal Bahadur Reshmi Thapa, Bal Krishna Bal

2016 7th International Conference on Information, Intelligence, Systems & Applications (IISA) > 1 - 6

2016 7th International Conference on Information, Intelligence, Systems & Applications (IISA)

With the advent of online social media such as Facebook, Twitter and blogs, the way people perceive things around them has dramatically changed. One simple example is how people buy a mobile phone today. If in the past shopping involved moving from one store to another, these days one cares more about the opinions people express in product reviews. There is an increasing tendency...

chapter

Handwritten and machine printed text separation from Kannada document images

Rajmohan Pardeshi, Mallikarjun Hangarge, Srikanth Doddamani, K.C. Santosh

2016 10th International Conference on Intelligent Systems and Control (ISCO) > 1 - 4

2016 10th International Conference on Intelligent Systems and Control (ISCO)

Handwritten and machine printed (H&P) text separation from document images is a precursor to advance the performance of the OCR system. This paper demonstrates the competence of frequency domain features for the classification of H&P text words. We propose wavelet-like discrete cosine transform (WDCT) based features. We conduct an experiment on a large dataset of 2000 text words of popular...

chapter

Text localization in video/scene images using Kirsch Directional Masks

B.H. Shekar, Smitha M.L.

2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI) > 1436 - 1440

2015 International Conference on Advances in Computing, Communications and Informatics (ICACCI)

Text plays a vital role in visual content analysis and understanding. Videos contain text with diverse patterns and complex backgrounds. In this paper, we propose an approach based on a compass operator for detecting edges. We obtain the edge maps by convolving the Kirsch Directional Masks along eight different directions with the preprocessed video frame. The resultant images are...

chapter

Classifying emotion in Thai youtube comments

Phakhawat Sarakit, Thanaruk Theeramunkong, Choochart Haruechaiyasak, Manabu Okumura

2015 6th International Conference of Information and Communication Technology for Embedded Systems (IC-ICTES) > 1 - 5

2015 6th International Conference of Information and Communication Technology for Embedded Systems (IC-ICTES)

To add more value on YouTube, a popular portal of social media clips, it is worth recognizing automatically the mood of a media clip using the comments given to such clip. This paper presents a method to classify emotion of a Thai media clip on YouTube using the comments given to the clip. Six basic emotions considered are Anger, Disgust, Fear, Happiness, Sadness and Surprise. Performances using three...

chapter

Opinion mining and analysis: A literature review

Vandana Singh, Sanjay Kumar Dubey

2014 5th International Conference - Confluence The Next Generation Information Technology Summit (Confluence) > 232 - 239

2014 5th International Conference - Confluence The Next Generation Information Technology Summit (Confluence)

Sentiment analysis, or opinion mining, draws on many different fields, such as natural language processing, text mining, decision making and linguistics. It is a type of text analysis that classifies text and makes decisions by extracting and analyzing it. Opinions can be categorized as positive or negative, and the analysis measures the degree of positivity or negativity associated with an event (people, organization,...

chapter

Recognition of Spatial Relations in Mathematical Formulas

Fotini Simistira, Vassilis Papavassiliou, Vassilis Katsouros, George Carayannis

2014 14th International Conference on Frontiers in Handwriting Recognition > 164 - 168

2014 14th International Conference on Frontiers in Handwriting Recognition (ICFHR)

A critical issue in recognition of mathematical expressions is the identification of the spatial relations of the symbols or/and sub-expressions that comprise the entire mathematical formula. This paper addresses the problem of structural analysis of mathematical expressions by constructing appropriate feature vectors to represent the spatial affinity of the objects (mathematical symbols or sub-expressions)...

chapter

Evaluating Feature Sets and Classifiers for Sentiment Analysis of Financial News

Pal Christian S. Njolstad, Lars S. Hoysaeter, Wei Wei, Jon Atle Gulla

2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) > 2 > 71 - 78

2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT)

Work on sentiment analysis has thus far been limited in the news article domain. This has mainly been caused by 1) news articles lacking a clearly defined target, 2) the difficulty in separating good and bad news from positive and negative sentiment, and 3) the seeming necessity of, and complexity in, relying on domain-specific interpretations and background knowledge. In this paper we propose, define,...

chapter

Local descriptors to improve off-line handwriting-based gender prediction

Nesrine Bouadjenek, Hassiba Nemmour, Youcef Chibani

2014 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR) > 43 - 47

2014 6th International Conference of Soft Computing and Pattern Recognition (SoCPaR)

Gender prediction based on handwritten text is gaining considerable importance in the document analysis community. It is helpful for person identification as well as in some situations where one needs to classify a population according to women-men categories. However,...

chapter

A Two Level Algorithm for Text Detection in Natural Scene Images

Li Rong, Wang Suyu, Zhixin Shi

2014 11th IAPR International Workshop on Document Analysis Systems > 329 - 333

2014 11th IAPR International Workshop on Document Analysis Systems (DAS)

In this paper we present a two-level method to detect text in natural scene images. In the first level, connected components (referred to as CCs) are obtained from the images. Then candidate text lines are extracted, and groups of connected components that align in the horizontal or vertical direction are obtained. We assume that CCs in these groups have a high probability of being text. To validate which CC is text, an SVM is...

chapter

Ruling lines removal in handwritten documents

Farzad Alipour, Karim Faez, Sahar Seifzadeh

2013 8th Iranian Conference on Machine Vision and Image Processing (MVIP) > 80 - 83

2013 8th Iranian Conference on Machine Vision and Image Processing (MVIP)

In this paper, we present a method for removing ruling lines from handwritten documents without damaging the existing characters. It is argued that ruling lines have a predictable position in the page, but their thickness and the distance between them may differ from one document to another; these are estimated with a simple algorithm. Another important challenge in this regard is detecting the edge...

chapter

Evaluation of SVM, MLP and GMM Classifiers for Layout Analysis of Historical Documents

Hao Wei, Micheal Baechler, Fouad Slimane, Rolf Ingold

2013 12th International Conference on Document Analysis and Recognition > 1220 - 1224

2013 12th International Conference on Document Analysis and Recognition (ICDAR)

This paper presents a comparison between three classifiers based on Support Vector Machines, Multi-Layer Perceptrons and Gaussian Mixture Models respectively to detect physical structure of historical documents. Each classifier segments a scaled image of historical document into four classes, i.e., areas of periphery, background, text and decoration. We evaluate them on three data sets of historical...

chapter

A Two-Stage Approach for Word Spotting in Graphical Documents

Arundhati Tarafdar, Umapada Pal, Partha Pratim Roy, Nicolas Ragot, et al.

2013 12th International Conference on Document Analysis and Recognition > 319 - 323

2013 12th International Conference on Document Analysis and Recognition (ICDAR)

Presence of multi-oriented characters, connected characters with graphical lines, intersection of text and symbols with graphical lines/curves etc. are very common in graphical documents. As a result word spotting in graphical documents is still a challenging task that we try to solve (partially) in this paper. The proposed approach proceeds in two stages. In the first stage, recognition of isolated...

chapter

Alternatives for Page Skew Compensation in Writer Identification

Jin Chen, Daniel Lopresti

2013 12th International Conference on Document Analysis and Recognition > 927 - 931

2013 12th International Conference on Document Analysis and Recognition (ICDAR)

Traditionally, page images undergo pre-processing before the later stages of document analysis are applied. One common pre-processing step is to calculate and correct for the presence of simple page skew through a compensating rotation. Such operations modify the original input image, however, and in doing so may discard or obscure useful information. In this paper, we examine the impact of page deskewing...

chapter

Mathematical Formula Identification in PDF Documents

Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xiaofan Lin, et al.

2011 International Conference on Document Analysis and Recognition > 1419 - 1423

2011 International Conference on Document Analysis and Recognition (ICDAR)

Recognizing mathematical expressions in PDF documents is a new and important field in document analysis. It is quite different from extracting mathematical expressions in image-based documents. In this paper, we propose a novel method by combining rule-based and learning-based methods to detect both isolated and embedded mathematical expressions in PDF documents. Moreover, various features of formulas,...

chapter

Table Detection in Noisy Off-line Handwritten Documents

Jin Chen, Daniel Lopresti

2011 International Conference on Document Analysis and Recognition > 399 - 403

2011 International Conference on Document Analysis and Recognition (ICDAR)

Table detection can be a valuable step in the analysis of unstructured documents. Although much work has been conducted in the domain of machine-print including books, scientific papers, etc., little has been done to address the case of handwritten inputs. In this paper, we study table detection in scanned handwritten documents subject to challenging artifacts and noise. First, we separate text components...

chapter

The ICDAR2011 Arabic Writer Identification Contest

Abdelaali Hassaïne, Somaya Al-Maadeed, Jihad Mohamad Alja'am, Ali Jaoua, et al.

2011 International Conference on Document Analysis and Recognition > 1470 - 1474

2011 International Conference on Document Analysis and Recognition (ICDAR)

Arabic writer identification is a very active research field. However, no standard benchmark is available for researchers in this field. The aim of this competition is to gather researchers and compare recent advances in Arabic writer identification. The competition was hosted by Kaggle and attracted thirty participants from both academia and industry. This paper gives details on this competition,...

chapter

Fast Rule-Line Removal Using Integral Images and Support Vector Machines

Jayant Kumar, David Doermann

2011 International Conference on Document Analysis and Recognition > 584 - 588

2011 International Conference on Document Analysis and Recognition (ICDAR)

In this paper, we present a fast and effective method for removing pre-printed rule-lines in handwritten document images. We use an integral-image representation which allows fast computation of features and apply techniques for large scale Support Vector learning using a data selection strategy to sample a small subset of training data. Results on both constructed and real-world data sets show that...

chapter

A CRF Based Scheme for Overlapping Multi-colored Text Graphics Separation

Ritu Garg, Ehtesham Hassan, Santanu Chaudhury, M. Gopal

2011 International Conference on Document Analysis and Recognition > 1215 - 1219

2011 International Conference on Document Analysis and Recognition (ICDAR)

In this paper, we propose a novel framework for segmentation of documents with complex layouts. The document segmentation is performed by combination of clustering and conditional random fields (CRF) based modeling. The bottom-up approach for segmentation assigns each pixel to a cluster plane based on color intensity. A CRF based discriminative model is learned to extract the local neighborhood information...

Keywords:
SUPPORT VECTOR MACHINES
TEXT ANALYSIS

Content availability

Available (404)
None (13)

Keywords

FEATURE EXTRACTION (163)
TRAINING (153)
TEXT CATEGORIZATION (150)
SUPPORT VECTOR MACHINE (139)
CLASSIFICATION ALGORITHMS (137)
PATTERN CLASSIFICATION (133)
DATA MINING (125)
LEARNING (ARTIFICIAL INTELLIGENCE) (93)
ACCURACY (91)
TEXT CLASSIFICATION (89)
MACHINE LEARNING (88)
SVM (86)
CLASSIFICATION (77)
NATURAL LANGUAGE PROCESSING (68)
INTERNET (51)
FEATURE SELECTION (45)
KERNEL (45)
INFORMATION RETRIEVAL (42)
SUPPORT VECTOR MACHINE CLASSIFICATION (37)
TEXT MINING (30)
SVM CLASSIFIER (29)
BAYES METHODS (25)
TESTING (21)
INDEXING (20)
IMAGE CLASSIFICATION (19)
NATURAL LANGUAGES (19)
ARTIFICIAL NEURAL NETWORKS (18)
HANDWRITING RECOGNITION (18)
STATISTICAL ANALYSIS (18)
WEB SITES (18)
DOCUMENT IMAGE PROCESSING (17)
PIXEL (16)
ALGORITHM DESIGN AND ANALYSIS (15)
COMPUTATIONAL LINGUISTICS (15)
HIDDEN MARKOV MODELS (15)
IMAGE SEGMENTATION (15)
INFORMATION FILTERING (15)
NIOBIUM (15)
PATTERN CLUSTERING (15)
SPEECH (15)
VECTOR SPACE MODEL (15)
CONFERENCES (14)
INFORMATION EXTRACTION (14)
MACHINE LEARNING ALGORITHMS (14)
SEMANTICS (14)
WORD PROCESSING (14)
DECISION TREES (13)
SENTIMENT CLASSIFICATION (13)
WEB PAGES (13)
CHARACTER RECOGNITION (12)
DATABASES (12)
IMAGE EDGE DETECTION (12)
ONTOLOGIES (12)
PROBABILITY (12)
SUPERVISED LEARNING (12)
DOCUMENT CLASSIFICATION (11)
ENTROPY (11)
NEURAL NETS (11)
ONTOLOGIES (ARTIFICIAL INTELLIGENCE) (11)
ROUGH SET THEORY (11)
SENTIMENT ANALYSIS (11)
COMPUTERS (10)
CONTEXT (10)
DISTANCE MEASUREMENT (10)
EQUATIONS (10)
FILTERING (10)
HANDWRITTEN CHARACTER RECOGNITION (10)
MEDICAL INFORMATION SYSTEMS (10)
PROBABILITY DENSITY FUNCTION (10)
SHAPE (10)
TEXT PROCESSING (10)
TRAINING DATA (10)
BAYESIAN METHODS (9)
DICTIONARIES (9)
ELECTRONIC MAIL (9)
FUZZY SET THEORY (9)
GENETIC ALGORITHMS (9)
K-NEAREST NEIGHBOR (9)
MATHEMATICAL MODEL (9)
NAIVE BAYES (9)
TAGGING (9)
VECTORS (9)
CHINESE TEXT CATEGORIZATION (8)
COMPUTATIONAL MODELING (8)
GRAMMARS (8)
KNN (8)
LATENT SEMANTIC INDEXING (8)
MUTUAL INFORMATION (8)
PREDICTION ALGORITHMS (8)
SUPPORT VECTOR MACHINE CLASSIFIER (8)
TEXT DETECTION (8)
UNSOLICITED E-MAIL (8)
ARTIFICIAL INTELLIGENCE (7)
BLOGS (7)
CHINESE TEXT (7)
CLUSTERING ALGORITHMS (7)
DECISION TREE (7)
DOCUMENT HANDLING (7)

INFONA - science communication portal
