Oriented graphs belong to graph theory, a branch of combinatorics. One of its fundamental concepts is the tree. Tree structures are widely used, and not only in mathematics: they also serve as data mining tools in decision theory. In the present paper we point out the use of decision trees as models for financial services, namely in credit scoring, fraud and churn...
The Unified Modeling Language (UML) is widely taught in academia and well accepted in industry. However, no ample dataset of UML diagrams is publicly available. Our aim is to offer a dataset of UML files, together with metadata on the software projects to which the UML files belong. Therefore, we have systematically mined over 12 million GitHub projects to find UML files in them....
The analysis of clinical pathways from event logs provides new insights about care processes. In this paper, we propose a new methodology to automatically perform simulation analysis of patients' clinical pathways based on a national hospital database. Process mining is used to build highly representative causal nets, which are then converted to state charts in order to be executed. A joint multi-agent...
Quantitative research has been extensively applied in sociology. The traditional approach of running the statistical computing tools R, SPSS and Stata on a stand-alone machine cannot cope with the challenges of big data; furthermore, the demand for complex computing in mobile conditions is increasing because of the fieldwork that sociological research involves. Considering the computing needs of sociological...
Person names and location names are essential building blocks for identifying events and social networks in historical documents written in literary Chinese. We take the lead in exploring the algorithmic recognition of named entities in literary Chinese for historical studies with language-model-based and conditional-random-field-based methods, and extend our work to mining the...
Life cycle assessment (LCA), a decision support tool for evaluating the environmental load of products, has been widely used in many fields. However, applying LCA in the building industry is expensive and time-consuming. This is due to the complexity of building structures along with the large amount of high-dimensional heterogeneous building data. So far building environmental impact assessment (BEIA)...
In this paper we focus on the task of clustering in data mining applications. We introduce a formulation of a new clustering algorithm by modelling the system as a cooperative game in strategic form using game theory. The goal is to partition a dataset into k clusters. Our approach has been applied to both simulated and real-world datasets. In addition, we have implemented functions based on the calculation...
The motivation behind a spatio-temporal visual saliency model is to extract salient information from two distinct pathways: static (intensity) and dynamic (motion). Consequently, the information from these pathways is combined to get the final visual saliency map. Since the pathways respond differently, the step that combines the maps is important. As a consequence, we study six recent...
To obtain more accurate and more reliable stock price predictions, this paper proposes an effective method based on fuzzy rough sets and data mining technology. First, stock prices were classified into groups according to their time attributes by means of fuzzy sets and rough sets. Then we calculated the truth values of these groups respectively based on the given fuzzy...
In this work we present a software tool for identifying trends that describe the evolution of a discipline of scientific knowledge in which information resources are classified. The tool seeks to support data mining as part of the knowledge discovery process, and the identification is supported by production analysis of information resources in science and technology and its visualization in graphs...
To make the best use of the hot rolling process, an intelligent database system covering rolling conditions and rolling materials was developed in this research. The system is designed to consider how to optimize the manufacturing conditions of the rolling process and the rolling materials. The system was built in an Internet environment, so that all data and information can be obtained conveniently over the Internet...
Several flaviviruses are important human pathogens, including dengue virus, which causes a disease against which neither a vaccine nor specific antiviral therapies currently exist. A QSAR study was carried out to search for new competitive dengue inhibitors with properties similar to those of the existing inhibitors (i.e. the data set). The approach began with the development of rigorously validated QSAR model...
This paper presents an annotation specification in the form of attribute-value pairs for representing and capturing strategic information for solving decision problems in the context of Economic Intelligence (also referred to as competitive intelligence). The aim of this approach is to facilitate information reuse for similar problems. While most available annotation tools allow the annotator to add annotation...
Health scientists worldwide are producing, accessing, analyzing, integrating, and storing massive amounts of digital medical data daily, through observation, experimentation, and simulation. If we were able to effectively transfer and integrate data from all possible resources, then a deeper understanding of all these data sets and better exposed knowledge, along with appropriate insights and actions,...
To improve the intelligibility and efficiency of knowledge expression in land evaluation, this paper proposes a land evaluation method that combines simplified fuzzy classification association rules with fuzzy decision. To reduce the complexity of the land evaluation models and further improve the efficiency and intelligibility of fuzzy classification association rules, an algorithm to eliminate...
Modeling market dynamics is a challenge that has attracted the interest of practitioners and researchers alike. This problem has been addressed from the perspective of Game Theory, in models that explicitly include profit-maximization schemes for the companies, and also from the point of view of Data Mining, with models that consider multivariate functions to model customer demands and related phenomena...
Based on the popular J2EE architecture, and using the MVC design pattern together with Struts and Hibernate, two common frameworks in Java web development, this paper presents the design and implementation of a comprehensive statistical information management system and describes the implementation process.
Query term suggestion, which interactively expands queries, is an indispensable technique for helping users formulate high-quality queries and has attracted much attention in the web search community. Existing methods usually suggest terms based on statistics in documents as well as query logs and external dictionaries, and they neglect the fact that topic information is crucial because it...
In this paper we propose the utilisation of an evolutionary approach for the task of classifying microarray data by using prior knowledge in the form of existing gene sets. The purpose of the work is to obtain an accurate classification model that uses a biologically relevant, and previously defined, gene set. The proposed algorithm will be integrated within geneCBR, a successful system able to perform...
The Open Innovation model is the best choice for firms that cannot afford research and development (R&D) costs but intend to keep playing the innovation game. This model offers any firm companies spread worldwide and across all research fields as possible R&D partners. However, possible partnerships cannot be restricted to the manager's know-who. The patent documents can...