The problem of limited minority-class data arises in many class-imbalanced applications but has received little attention. Synthetic over-sampling, a popular family of class-imbalance learning methods, can introduce considerable noise when the minority class has limited data, since the synthetic samples are not i.i.d. samples of the minority class. Most sophisticated synthetic sampling methods tackle this problem...
Open source projects currently receive various kinds of issues daily because of the extreme openness of the Issue Tracking System (ITS) in GitHub. Categorizing issues in the ITS is a labor-intensive and time-consuming task for project managers. However, a contributor is only required to provide a short textual abstract to report an issue in GitHub. Thus, most traditional classification approaches based on detailed...
The construction of a knowledge graph of dangerous goods (KGDG) is of great significance for inferring related information about dangerous goods, developing corresponding policies for their storage and transport, preventing disasters caused by dangerous goods (DG), and providing emergency plans when a disaster happens. Since distributed representation of natural language is an effective method for knowledge...
Advanced pattern mining, which extracts hidden but useful information by using a proper structure, is vitally important for efficient information mining in large-scale practical datasets. Existing algorithms cannot effectively handle the fuzzy uncertainty of items or determine the appropriate structure of the studied patterns. In order to generate more proper practical patterns, a...
Decision tree algorithms are very popular in the field of data mining. This paper proposes a distributed decision tree algorithm and shows examples of its implementation on big data platforms. The major contribution of this paper is the novel KS-Tree algorithm, which builds a decision tree in a distributed environment. KS-Tree is applied to some real-world data mining problems and compared with state-of-the-art...
Traditional Chinese medicine (TCM) is a holistic medical approach, and the principles governing formula composition are still a mystery. Detecting a formula's structure and herb communities/clusters in TCM Formula networks (TCMF) is a major existing problem in mining these data sets. In this paper, we devise a novel community similarity calculation method for the clustering process, which is called...
Because of the rapid growth of open source software, choosing software from many alternatives has become a great challenge. Traditional ranking approaches mainly focus on the characteristics of the software itself. In this paper we investigate the market demand for software engineers and propose a novel approach for ranking software by analyzing the market requirements for special software...
In this paper, the telemetry data of Satellite TX-I are analyzed in order to better understand the satellite's operating status and to lay the foundation for the fault detection task. Given the high-dimensional data, locally linear embedding (LLE), a manifold learning scheme, is applied to perform dimensionality reduction and feature extraction. Furthermore the data-driven fault...
With the dramatic growth of E-commerce's popularity, the number of product and service reviews is increasing rapidly. Many researchers have put effort into mining user opinions from reviews, and there are many research works on opinion mining. However, all current methods focus on how to handle opinion mining for English, Chinese, Japanese and so on. There is a lack of work on how to conduct...
Open Source Forge (OSF) websites provide information on a massive number of open source software projects; extracting these web data is important for open source research. Traditional extraction methods use string matching among pages to detect the page template, which is time-consuming. A recent work published in VLDB exploits redundant entities among websites to detect the web page coordinates of these entities. The...
To help turn battlefield information superiority into decision superiority (i.e. to rapidly arrive at better decisions than adversaries can respond to), many scientific, technical and technological challenges must be addressed. The most critical of those are information fusion and management at different levels, and communication. This paper describes battlefield information as data streams and mining...
Formal Concept Analysis and Rough Set Theory provide two different methods for data analysis and knowledge processing. The basis of Rough Set Theory is an equivalence relation on a universe of objects, and that of Formal Concept Analysis is an ordered hierarchical structure, the concept lattice. This paper discusses the basic connection between Formal Concept Analysis and Rough Set Theory, and then we...
CUDA is a new computing architecture introduced by NVIDIA Corporation, aimed at general-purpose computation on GPUs. The architecture offers strong compute power for compute-intensive and data-intensive applications, so in recent years how to apply the framework to scientific computing has become a hot research topic. The iterative method for solving systems of linear equations in engineering...
Mining concept-drifting data streams is a challenging area for data mining research. Recent years have witnessed an averaging ensemble classifier based on the learnable assumption. Although this ensemble classifier is an efficient algorithm for mining concept-drifting data streams, it is still inadequate to represent real-world data streams with noisy data. In this paper, we propose a novel...
Market orientation is regarded as one of the executive tools of the marketing concept and has been one of the most important research topics in academia, including its antecedents and consequences and the relationship between market orientation and corporate performance. The existing theories are not unanimous about the impact of market orientation on corporate performance. The underlying causation...
In this paper, we analyze the relationship among government control, diversification and corporate performance. The investigation was performed using a panel data procedure for a sample of 320 Chinese companies listed on the Shanghai Stock Exchange during the period from 2001 to 2006. We find that diversification under government control has a negative influence on firm performance, while...
This paper discusses data, manipulation and services, which form the foundation of an information system. A fine-grained access control model for object resources is then built, which can be extended to many information systems. Fine-grained access control to the internal resources of an information system can be realized very easily. The access authority and application become more...
We propose a scheme for generating multiple 16QAM signals at 40 Gbit/s using a novel transmitter. The scheme is based on a dual-parallel Mach-Zehnder modulator (DPMZM) followed by a phase modulator (PM). We demonstrate the proposed transmitter by VPI simulation, and three types of 16QAM signals are obtained, which shows the feasibility of the transmitter. In order to further investigate the transmission...
The transmittances of a ZrO2/SiO2 25-layer film with and without UV irradiation were compared with a theoretical model. With UV irradiation, infiltration was controlled in a ZrO2/SiO2/ZrO2 three-layer film, and the refractive index was increased by the treatment.
The coal mine environment is complex and its parameters are difficult to measure accurately, so fuzzy theory and neural network technology were applied to construct an intelligent fuzzy neural network sensor system for coal mine safety monitoring. The research focuses on the formation and training of intelligent fuzzy sensor system membership function network samples and fuzzy inference...