The paper presents a definition of Big Data and a description of its main characteristics. A model of association between entities and characteristics is constructed. A method for sharing heterogeneous data and mapping it to the relational “entity-characteristic” data model was created. Test results for the developed methods and algorithms are presented.
Curriculum design and implementation in higher medical education can be a great challenge. Although there are well-defined standards, such as the Curriculum Inventory and Competency Framework by MedBiquitous Consortium, existing systems are incapable of a visual representation of the various components, attributes, and relations. In this paper, we present the MEDCIN platform, a pilot tool which uses...
Query comparison over large data must process a large number of datasets and is central to basic research projects. When selecting analysis factors, choosing a better way to provide a data analysis library can greatly help rapid analysis and search comparison. In this study, we try to improve the speed of data query and comparison in big datasets, and find...
This paper focuses on the use of Matlab for data mining. There is a wide range of data mining software, in which free or cheaper solutions offer similar possibilities. We wanted to try Matlab for these purposes. Our data consist of parameters that describe cloud usage at an IT company that offers cloud services. We used phases from the CRISP-DM methodology in our work. We built clustering and classification...
The classical ways to access data, i.e. through a search engine or by querying a database, have today become insufficient when confronted with the new needs of data exploration and interpretation: at a user request, a search engine retrieves a very large collection of documents, while the query engine of a DBMS grants access to all the data that satisfy certain criteria. Once the documents or the...
Healthcare information systems (HISs) are multifarious in nature. They are thus best implemented using multiple data stores, because one database cannot fulfill all the storage requirements of such complex applications. The amalgamation of different databases within an application is known as polyglot persistence. To achieve a polyglot-persistent solution, different database types must be available. As...
As scientific applications like Computational Fluid Dynamics (CFD) simulations generate more and more data, co-processing becomes the most cost-effective way to process the vast amount of data generated by these simulations. In a co-processing environment, analysis and/or visualization of intermediate results occur concurrently with the simulation itself. Improved efficiency and early insight into the...
In this paper we present a new biomimetic method named CL-AntInc for incremental data clustering. This algorithm uses the behavior of real ants. We deal with the issue of data volume through a clustering heuristic. Dynamic graphs are constructed according to a simulation of colonial odors and pheromone mechanisms. We used numerical databases extracted from the Machine Learning Repository. The experimental...
Recent developments of modern technologies such as cloud computing, wearable sensor devices and big data have significantly impacted people's daily lives, and offer real potential for an Internet-wide, people-centric ecosystem. These advances in technology will considerably extend human capabilities in acquiring, consuming and sharing personal health information. A future in which we are all equipped...
A simulation training system can improve the trainee's ability, rapid response, and safe operating capability when operating a real ship, reducing operational accidents at sea. An intelligent examination scoring system uses professional theory and practical experience as the evaluation criteria, and analyzes the operator's operation process to realize the automatic scoring through the program...
The clinical data warehouse has been developed as a fundamental data infrastructure for large-scale TCM clinical data management and decision support services. However, as a key component, data extraction, transformation and loading (ETL) is a complicated and labor-intensive task needed to ensure high data quality before all kinds of data analyses. This paper introduces an enhanced ETL technique framework, which...
Interval-valued data arise in practical situations such as recording monthly interval temperatures at meteorological stations, daily interval stock prices, etc. This paper introduces a multinomial logistic regression method for interval-valued data in order to classify items described by interval-valued variables into a pre-defined number of a priori classes. Applications of the proposed approach...
A knowledge-based tool for the integrative analysis of simulation experiment data is designed. It uses a knowledge model of sample data extraction and a database of analysis projects to implement integrative analysis operations, including simulation experiment data import, sample data extraction, data analysis, appraisal of simulation system characteristics, and the generation of analysis results report...
Scientific progress depends increasingly on collaborative efforts that involve exchange of data and re-analysis of previously recorded data. With increasing complexity of the data it becomes more difficult to access both data and metadata for application of specific analysis methods, for exchange with collaborators, or for further analysis some time after the initial study was completed. This effort...
Online Analytical Processing (OLAP) has been widely used to visualize complex data for efficient, interactive and meaningful analysis. Its power lies in visualizing huge volumes of operational data for interactive analysis. Data mining (DM) techniques are strong at detecting patterns and mining knowledge from historical data. OLAP and DM are believed to be able to complement each other to analyze large data sets...
Visual analytics aims at combining interactive data visualization with data analysis tasks. Given the explosion in volume and complexity of scientific data, e.g., data associated with biological or physical processes or social networks, visual analytics is called to play an important role in scientific data management. Most visual analytics platforms, however, are memory-based, and are therefore limited...
The CONNecticut Joint University Research (CONNJUR) team is a group of biochemical and software engineering researchers at multiple institutions. The vision of the team is to develop a comprehensive application that integrates a variety of existing analysis tools with workflow and data management to support the process of protein structure determination using Nuclear Magnetic Resonance (NMR). The...
In this paper we propose a web log mining-based network user behavior analysis scheme, which plays an important role in network structure optimization and website server configuration. Based on clustering and regression model, we studied the network user's visit model in a university by analyzing a large amount of web log data which is collected from the university campus network. The data analyzing...
Numerous studies have generated cost estimating relationships (CERs) for transportation projects via data analysis. Some studies collected data from databases, while others sourced data from conventional paper-based formats. When cost data were not in a consistent format, many studies failed to discuss the streamlining of pattern recognition. This work adopts a standard procedure for identifying CERs...
In this paper we describe a methodology and an automatic procedure for inferring accurate and easily understandable expert-system-like rules from forensic data. This methodology is based on fuzzy set theory. The algorithms we used are described in detail and were tested on forensic data sets. We also present in detail some examples that are representative of the obtained results.