We deal with the problem of initial analysis of data coming from evaluation sheets of subjects with Autism Spectrum Disorders (ASDs). In our research, we use an original evaluation sheet including questions about competencies grouped into 17 spheres. In the paper, we are focused on a feature selection problem. The main goal is to use appropriate data to build simpler and more accurate classifiers...
We critically examine the problem of quality assessment of algorithms for Boolean matrix factorization. We argue that little attention is paid to this problem in the literature. We view this problem as a multifaceted one and identify key aspects with respect to which the quality of algorithms should be assessed. Because of its utmost importance, we focus on assessment of quality of sets of factors...
The traditional relation extraction methods require the pre-defined relation types and a corpus with human tags. The information extracted by the current open relation extraction (ORE) methods is incomplete, and the relation types are finite. To solve the above problems, we propose ClausORE, which is an n-ary ORE method for Chinese text and extracts the entities and relations between entities from...
Data mining has been defined as "The nontrivial extraction of implicit, previously unknown, and potentially useful information from data". Clustering is the automated search for groups of related observations in a data set. The K-Means method is one of the most commonly used clustering techniques for a variety of applications. This paper proposes a method for making the K-Means algorithm...
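For orientation, the standard K-Means procedure that the abstract above builds on can be sketched as follows. This is a minimal sketch of the textbook Lloyd iteration only, not the method the paper proposes (the abstract is truncated before describing it). The function name `kmeans` and its parameters are illustrative assumptions.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Textbook Lloyd's algorithm on a list of equal-length numeric tuples.

    Hypothetical helper for illustration; not the paper's modified method.
    """
    rng = random.Random(seed)
    # Initialise centroids with k distinct input points.
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        # (squared Euclidean distance).
        new_assignment = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        if new_assignment == assignment:
            break  # no label changed, so the iteration has converged
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, 2)
```

On this toy input the two well-separated groups end up in different clusters; which group receives label 0 depends on the random initialisation.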
The partition clustering problem is one of the hardest problems in current research. It is also a basic problem in data mining. In most cases, partition clustering is an NP problem. This paper introduces an algorithm that solves the partition clustering problem in polynomial time by using a reduction technique. This algorithm is available after...
Outlier detection is an interesting data mining task, which detects rare events. This paper focuses on outlier detection based on frequent patterns (the FP method for short). First we analyze the drawback of this method, and then we present an improved method (the LFP method for short). Finally, we evaluate the two methods on several datasets, and the experimental results show that...
In this paper we present a sorting algorithm, which uses the methodology of insertion sort efficiently to give a much better performance than the existing sorting algorithms of the O(n^2) class. We prove the correctness of the algorithm and give a detailed time complexity analysis of the algorithm. We also describe various applications of the algorithms.
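As a point of reference, the plain insertion sort whose methodology the abstract above refers to can be sketched as below. This is the classic O(n^2) baseline under the usual textbook formulation, not the improved algorithm the paper presents; the function name `insertion_sort` is an illustrative assumption.

```python
def insertion_sort(items):
    """Classic insertion sort; returns a new sorted list.

    Baseline only: the paper's improved variant is not described in the abstract.
    """
    result = list(items)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift larger elements one slot right to open a gap for key.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # prints [1, 2, 3, 4, 5, 6]
```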
Broadcasting is an information dissemination problem in a connected network, in which one node, called the originator, disseminates a message to all other nodes by placing a series of calls along the communication lines of the network. Once informed, the nodes aid the originator in distributing the message. Finding the minimum broadcast time of a vertex in an arbitrary graph is NP-complete. The problem...
Sorting is an important concept in the field of computer science. In this paper we present an incremental sorting algorithm, which traverses the list in both directions (right and left), comparing the first element of the list with the next two elements, and comparing the last element with the previous two elements in the list. By iteratively scanning the list in this way, the algorithm brings the list into sorted order.
This research article presents a new character-based indexing algorithm which generates all locations of the target text to the inverted list in existed bit form. The algorithm searches efficiently on streaming character input that requires an immediate response.
Pair-wise testing is widely used to detect faults in software systems. In many applications where pair-wise testing is needed, the whole test set cannot be run completely due to time or budget constraints. In these situations, it is essential to prioritize the tests. In this paper, we derive a weight for each value of each parameter, and adapt the UWA algorithm to generate an ordered pair-wise coverage...
A large family of p-phase highest coordinate sequences, where p is an odd prime, is constructed by using the highest coordinate of the generalized Kerdock codes over the ring Z_{p^l}, for l ≥ 3. Utilizing the local Weil bound and spectral analysis over the additive group of Z_{p^l}, we derive an estimate of the correlation of the sequences. And the result shows that these sequences have low nontrivial autocorrelation...
The quality of an immune-based negative selection algorithm depends heavily on the quality of the generated detectors. First, they should cover the nonself space to a sufficient degree to guarantee high detection rates. Second, the duration of classification is proportional to the cardinality of the detector set. The reaction time to anomalies is especially important in on-line classification systems, e.g. spam and...
Identifying the similarity of strings is an essential step in data cleaning and data integration processes. However, information on the Web is mostly composed of semi-structured and unstructured data, and is mixed with a variety of inaccurate information, such as noisy data, repeated characters, and abbreviated names. This makes traditional string similarity algorithms aiming at some particular environment...
Matching has been recognized as a plausible solution for the semantic heterogeneity problem in many traditional applications, such as schema integration, ontology integration, data warehouses, data integration, and so on. In this paper, we introduce a new method of schema matching based on top-k schema mapping and user feedback. The essence of this method is the simultaneous generation of K best...
To improve the efficiency of attribute reduction and obtain the minimal attribute reduction, the notion of consistent simplified decision table is proposed. The concentrated discernibility set is created in the consistent simplified decision table, and an algorithm of completeness attribute reduction based on concentrated discernibility set is put forward. The least information of attribute reduction...
This paper presents two algorithms for string pattern matching. These algorithms employ inverted lists to accommodate the string pattern to be searched for. The first solution scans the text in a single pass for all occurrences of the string pattern. The second solution, which improves on the first one, requires a number of comparisons equal to the length of the pattern plus the number of comparisons that lead...
Secure multi-party computation (SMC) deals with the problem of secure computation among participants who are not trusted by others. Privacy-preserving computational geometry is a special area in SMC and has been applied to various areas such as the military, commerce, and government. In this paper, we propose an efficient secure protocol, which is based on numerical computation, for the...
Triangle counting is an important problem in graph mining. The clustering coefficient and the transitivity ratio, two commonly used measures, effectively quantify the triangle density and thereby capture the fact that friends of friends tend to be friends themselves. Furthermore, several successful graph mining applications rely on the number of triangles. In this paper, we study the problem of counting...
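To make the two quantities named in the abstract above concrete, the sketch below counts triangles exactly by checking neighbor pairs and computes the transitivity ratio as closed wedges divided by all wedges (equivalently, 3 × triangles / connected triples). It is a simple exact baseline under the standard definitions, not the counting method studied in the paper; `triangle_stats` is a hypothetical helper name.

```python
from itertools import combinations


def triangle_stats(edges):
    """Return (triangle_count, transitivity_ratio) for an undirected simple graph.

    Illustrative exact baseline; not the paper's algorithm.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    wedges = 0  # paths of length 2, counted at their centre vertex
    closed = 0  # wedges whose two endpoints are themselves adjacent
    for nbrs in adj.values():
        d = len(nbrs)
        wedges += d * (d - 1) // 2
        closed += sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])

    triangles = closed // 3  # every triangle closes one wedge at each of its 3 vertices
    transitivity = closed / wedges if wedges else 0.0
    return triangles, transitivity


# A triangle 1-2-3 with a pendant vertex 4 attached to 3.
print(triangle_stats([(1, 2), (2, 3), (1, 3), (3, 4)]))  # prints (1, 0.6)
```

The pendant edge adds two open wedges at vertex 3, which is why the ratio drops from 1.0 to 3/5 even though the triangle count stays at one.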
Learning is a central task in computer science, and there are various formalisms for capturing the notion. One important model studied in computational learning theory is the PAC model of Valiant (CACM 1984). On the other hand, in cryptography the notion of "learning nothing" is often modelled by the simulation paradigm: in an interactive protocol, a party learns nothing if it can produce a...