Data uncertainty is common in real-world applications. Various reasons lead to data uncertainty, including imprecise measurements, network latency, outdated sources and sampling errors. These kinds of uncertainties have to be handled cautiously, or else the data mining results could be unreliable or wrong. In this demo, we will show uRule, a new rule-based classification and prediction system for...
Feature selection continues to grow in importance in many areas of science and engineering, as large datasets become increasingly common. In particular, bioscience and medical datasets routinely contain several thousands of features. For effective data mining in such datasets, tools are required that can reliably distinguish the most relevant features. The latter is a useful goal in itself (e.g. such...
Recent progress in bioinformatics technology has enabled large-scale screening of biomarker candidates. In this paper, we propose a new method called LEAF: LEAve-one-out Forward selection method for the analysis of gene expression data. The proposed method makes it possible to rank informative genes using a parameter that evaluates the efficiency of the class discriminant...
This study is the first to use SEER Public-Use Data to predict breast cancer recurrence with data mining techniques. The SEER Public-Use Data 2005 is used in this research. We present a new data pre-classification method and, for the first time, find a possible way to extract information on breast cancer recurrence from SEER data. After preprocessing the dataset, we investigate...
Due to the high dimensionality of microarray data, feature selection is an indispensable task in classification to identify a smaller subset of relevant genes. However, feature selection techniques that rely solely on gene expression values might not be able to identify biologically meaningful genes. Thus, this paper presents an integrative feature selection method that is able to incorporate...
Classification is one of the most efficient data mining techniques in machine learning. In classification, decision trees can handle high-dimensional data. However, decision trees yield poor performance in medical health care. In this paper, we investigate the use of the Receiver Operating Characteristic (ROC) curve for the evaluation of machine learning algorithms. In particular, we investigate the use...
Mammography is the most effective and available tool for breast cancer screening. However, the low positive predictive value of breast biopsy resulting from mammogram interpretation leads to approximately 70% unnecessary biopsies with benign outcomes. Data mining algorithms could be used to help physicians in their decisions to perform a breast biopsy on a suspicious lesion seen in a mammogram image...
Due to the high fatality rate of patients with radiation pneumonitis (RP), a complication of radiation therapy (radiotherapy), great attention has been paid to the treatment plans of individual RP patients. Therefore, not only technological advances in the development of treatment planning systems but also new prognostic models are urgently required to lessen the complication and to predict the...
Lung cancer patients who receive radiotherapy as part of their treatment are at risk of radiation-induced lung injury known as radiation pneumonitis (RP). RP is a potentially fatal side effect of treatment. Hence, new methods are needed to guide physicians in prescribing targeted therapy dosage to patients at high risk of RP. Several predictive models based on traditional statistical methods and machine...
In previous studies, performance improvement of nearest neighbor classification of high-dimensional data, such as microarrays, has been investigated using dimensionality reduction. It has been demonstrated that the fusion of dimensionality reduction methods, either by fusing classifiers obtained from each set of reduced features or by fusing all reduced features, is better than using any single dimensionality...
This paper proposes a new type of regularization in the context of the multi-class support vector machine for simultaneous classification and gene selection. By combining the huberized hinge loss function and the elastic net penalty, the proposed support vector machine can perform automatic gene selection and further encourage a grouping effect in the process of building classifiers, thus leading to a sparse...
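For orientation, the two ingredients named in the abstract above can be written in one commonly used binary form. This is a sketch under assumptions, not the paper's multi-class formulation: the symbols $\delta$, $\lambda_1$, $\lambda_2$, $\beta$, $\beta_0$ and the margin $t = y\,(\beta_0 + x^\top \beta)$ are introduced here for illustration and do not appear in the truncated abstract.

\[
\phi_\delta(t) =
\begin{cases}
0, & t > 1, \\
\dfrac{(1 - t)^2}{2\delta}, & 1 - \delta < t \le 1, \\
1 - t - \dfrac{\delta}{2}, & t \le 1 - \delta,
\end{cases}
\qquad
\min_{\beta_0,\,\beta}\; \sum_{i=1}^{n} \phi_\delta\bigl(y_i(\beta_0 + x_i^\top \beta)\bigr) + \lambda_1 \lVert \beta \rVert_1 + \frac{\lambda_2}{2} \lVert \beta \rVert_2^2 .
\]

In this form the $\ell_1$ term drives coefficients of uninformative genes to zero, while the $\ell_2$ term tends to give correlated genes similar coefficients, which is the grouping effect the abstract mentions.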
This paper presents a computer-aided diagnosis (CAD) system based on a combined support vector machine (SVM) and linear discriminant analysis (LDA) classifier for the detection and classification of breast cancer in digital mammograms. The proposed system has been implemented in four stages: (a) Region of interest (ROI) selection of 32×32 pixels, which identifies suspicious regions, (b) Feature extraction...
When computer-assisted detection is approached as a pattern recognition problem, the lesion data are typically high-dimensional and inhomogeneous, so most traditional classifiers do not perform well. In this paper, a novel approach based on the dynamic feature subset selection and the EM algorithm with Naive Bayesian classifier integration algorithm (DSFS+EMNB)...
In this paper, we introduce a method of functionally classifying lung cancer cells from normal cells by using Tetrakis Carboxy Phenyl Porphine (TCPP) and well-known computational intelligence techniques. TCPP is a porphyrin that is able to label cancer cells due to the increased numbers of low density lipoproteins coating the surface of cancer cells and the porous...
In this paper, we investigated the use of gene co-expression network analyses to identify potential biomarkers for breast carcinoma prognosis. The network mining algorithm CODENSE is used to identify highly connected genome-wide gene co-expression networks among a variety of cancer types, and the resulting gene clusters are applied to a series of breast cancer microarray sets to categorize the patients...
Providing clinical predictions for cancer patients by analyzing their genetic make-up is a difficult and very important issue. With the goal of identifying genes more correlated with the prognosis of breast cancer, we used data mining techniques to study the gene expression values of breast cancer patients with known clinical outcome. The focus of our work was the creation of a classification model to...
By viewing a gene expression profile as a pseudo-time signal, we apply wavelet transformation (WT) to analyze gene expression data in a time-frequency manner. As a result, two pattern extraction approaches, a continuous wavelet transformation (CWT)-based one and a discrete wavelet transformation (DWT)-based one, are proposed to extract hidden expression patterns for cancer classification and are compared...
Facing the phenomenon of "ageing population", the diagnosis and treatment of prostate cancer have become a serious men's health issue in Taiwan. This study aimed to provide new scientific and quantitative information for traditional Chinese medicine (TCM) physicians in clinical practice of prostate cancer. In this study, data mining techniques were employed to explore the hidden knowledge...
Prostate cancer is the most common disease in men and also the second deadliest. When prostate cancer is diagnosed early, surgery can be performed and the disease can be treated. In this study, the aim is to design a classifier-based expert system for early diagnosis of the organ in constraint phase. The other purpose is to reach informed decision making without...
Feature selection plays an important role in cancer classification, because gene expression data usually have a large number of dimensions and a relatively small number of samples. In this paper, we use the support vector machine (SVM) for cancer classification. We propose a mixed two-step feature selection method. The first step uses a modified t-test method to select discriminatory features. The second...
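As a rough illustration of what a t-test filter step can look like, the sketch below ranks features by the absolute Welch t-statistic between two classes. It is not the paper's modified t-test, whose details are not given in the truncated abstract, and it does not attempt the second step. The function names `welch_t` and `rank_features` and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Absolute Welch t-statistic between two samples (each needs >= 2 values)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def rank_features(X, y, k):
    """Return indices of the k features with the largest |t| for binary labels 0/1."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        # Split column j by class label (assumption: labels are exactly 0 and 1).
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(welch_t(a, b))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


# Toy data (made up): rows are samples, columns are genes.
# Only the first column differs clearly between the two classes.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 4.0, 0.1],
    [0.9, 6.0, 0.3],
    [3.0, 5.5, 0.2],
    [3.1, 4.5, 0.1],
    [2.9, 5.0, 0.3],
]
y = [0, 0, 0, 1, 1, 1]
selected = rank_features(X, y, k=1)
```

The selected column indices could then be used to subset the expression matrix before training an SVM, which is the classifier the abstract names.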