This paper describes a database of on-line handwritten patterns that mix text, figures, tables, maps, diagrams and so on. Pen-based and touch-based interfaces are becoming widespread, and their surfaces are getting larger. People can write and draw mixed objects without paying attention to the differences between objects or to mode changes. Moreover, they may write text in any direction in combination...
This paper introduces a new offline handwriting database developed for performance evaluation, result comparison and the development of new methods in handwriting analysis and recognition. The database is particularly suited to signature verification, writer recognition and writer demographics classification. In addition, the database also supports isolated digit recognition,...
Codebook-based representations have been effectively employed for writer identification. Most of the codebook-based methods generate a codebook by clustering a set of patterns extracted from an independent data set. The probability of occurrence of the codebook patterns in a given writing is then used to characterize its author. This study investigates the hypothesis that the codebook is merely a...
In this paper we present IBM_UB_1, a new dual-mode, twin-folio structured English handwriting dataset. IBM_UB_1 is our first major release from a large multilingual handwriting corpus. Containing over 6000 pages of handwritten matter, this dataset can be used for unconstrained handwriting recognition; more importantly, the dataset's unique twin-folio structure presents a natural fit for research...
The paper examines the word frequency, readability, sentence length, lexical density and syllable counts of College English Test Band 4 papers from the past five years. Based on this analysis, a list of high-frequency words is compiled using text-analysis tools.
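As a rough illustration of the kind of measures named in this abstract, the sketch below computes word frequency, mean sentence length and lexical density for a short text. It is not the paper's tooling: the function name `text_stats`, the tiny `FUNCTION_WORDS` list and the sample sentence are all assumptions made for the example, and lexical density is taken here to be the share of words that are not in that list.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny function-word list (an assumption for
# this sketch); a real study would use a much fuller list or a POS tagger.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "are", "it"}


def text_stats(text):
    """Return simple corpus statistics for a piece of English text."""
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    freq = Counter(words)
    # Content words are approximated as everything outside FUNCTION_WORDS.
    content = [w for w in words if w not in FUNCTION_WORDS]
    return {
        "top_words": freq.most_common(3),
        "mean_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "lexical_density": len(content) / len(words) if words else 0.0,
    }


sample = "The cat sat on the mat. The dog barked."
stats = text_stats(sample)
print(stats)
```

Readability scores and syllable counts, which the abstract also mentions, would need additional heuristics or a pronunciation dictionary and are left out of this sketch.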
This study of textual comprehensibility, based on a French corpus, falls under the umbrella of corpus-based text research. Systematic random sampling is used to compare six different groups extracted from the same sampled French text. The formula A+BX≤C is provided to extract linguistic fragments at equal intervals from the famous French...
Style-based text authorship identification extracts features from texts of known authorship, constructs a classifier and then identifies the authors of disputed texts. Authorship identification belongs to the domain of style classification and is a branch of text classification. In contrast with text classification, which deals with the content of texts, authorship identification focuses on the formal properties of texts...
Libraries and museums are digitizing their collections of historical cultural objects, such as historical Chinese calligraphy, to enable public access. These collections are available only in image format, and practical technology for offering a basic search service to the public is lacking. This paper proposes a quick search approach based on a coarse-to-fine strategy. First, a long list of calligraphy characters...
On the face of it, scoring student essays would seem to push AI capabilities to their limits. After all, students express themselves through their writing in vastly different ways. Furthermore, they might misunderstand the essay questions they've been asked to write about, or drift off the topic in the course of writing. Even so, for decades now, researchers have known ways to automatically evaluate...