Search results

Items from 1 to 20 out of 59 results

chapter

Where Is the Road for Issue Reports Classification Based on Text Mining?

Qiang Fan, Yue Yu, Gang Yin, Tao Wang, more

2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) > 121 - 130

2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)

Currently, open source projects receive various kinds of issues daily, because of the extreme openness of Issue Tracking System (ITS) in GitHub. ITS is a labor-intensive and time-consuming task of issue categorization for project managers. However, a contributor is only required a short textual abstract to report an issue in GitHub. Thus, most traditional classification approaches based on detailed...

chapter

Mining Version Control System for Automatically Generating Commit Comment

Yuan Huang, Qiaoyang Zheng, Xiangping Chen, Yingfei Xiong, more

2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) > 414 - 423

2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)

Commit comments increasingly receive attention as an important complementary component in code change comprehension. To address the comment scarcity issue, a variety of automatic approaches for commit comment generation have been intensively proposed. However, most of these approaches mechanically outline a superficial level summary of the changed software entities, the change intent behind the code...

chapter

GenLog: Accurate Log Template Discovery for Stripped X86 Binaries

Maosheng Zhang, Ying Zhao, Zengmingyu He

2017 IEEE 41st Annual Computer Software and Applications Conference (COMPSAC) > 1 > 337 - 346

2017 IEEE 41st Annual Computer Software and Applications Conference (COMPSAC)

Log analysis plays an important role for computer failure diagnosis. With the ever increasing size and complexity of logs, the task of analyzing logs has become cumbersome to carry out manually. For this reason, recent research has focused on automatic analysis techniques for large log files. However, log messages are texts with certain formats and it is very challenging for automatic analysis to...

chapter

Are big data talents different from business intelligence expertise?: Evidence from text mining using job recruitment advertisements

Jun Wu, Honglei Shi, Jiaping Yang

2017 International Conference on Service Systems and Service Management > 1 - 6

2017 14th International Conference on Service Systems and Service Management (ICSSSM)

As more and more companies become aware of the benefits of collecting and analyzing data, hiring employee with data analytics expertise is a key issue faced by HR practitioners. Although previous research empirically highlighted the differences of knowledge and skill requirements between big data (BD) and business intelligence (BI) in English-speaking countries, limited similar study is conducted...

chapter

On Accelerating Ultra-Large-Scale Mining

Ganesha Upadhyaya, Hridesh Rajan

2017 IEEE/ACM 39th International Conference on Software Engineering: New Ideas and Emerging Technologies Results Track (ICSE-NIER) > 39 - 42

2017 IEEE/ACM 39th International Conference on Software Engineering: New Ideas and Emerging Technologies Results Track (ICSE-NIER)

Ultra-large-scale mining has been shown to be useful for a number of software engineering tasks e.g. mining specifications, defect prediction. We propose a new research direction for accelerating ultra-large-scale mining that goes beyond parallelization. Our key idea is to analyze the interaction pattern between the mining task and the artifact to cluster artifacts such that running the mining task...

chapter

Prevalence of Botched Code Integrations

Ward Muylaert, Coen De Roover

2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR) > 503 - 506

2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR)

Integrating code from different sources can be an error-prone and effort-intensive process. While an integration may appear statically sound, unexpected errors may still surface at run time. The industry practice of continuous integration aims to detect these and other run-time errors through an extensive pipeline of successive tests. Using data from a continuous integration service, Travis CI, we...

chapter

A Dataset for Dynamic Discovery of Semantic Changes in Version Controlled Software Histories

Chenguang Zhu, Yi Li, Julia Rubin, Marsha Chechik

2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR) > 523 - 526

2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR)

Over the last few years, researchers proposed several semantic history slicing approaches that identify the set of semantically-related commits implementing a particular software functionality. However, there is no comprehensive benchmark for evaluating these approaches, making it difficult to assess their capabilities. This paper presents a dataset of 81 semantic change data collected from 8 real-world...

chapter

SPYSE - A Semantic Search Engine for Python Packages and Modules

Shiva Krishna Imminni, Mir Anamul Hasan, Michael Duckett, Puneet Sachdeva, more

2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C) > 625 - 628

2016 IEEE/ACM 38th International Conference on Software Engineering Companion (ICSE-C)

Code reuse is a common practice among software developers, whether novices or experts. Developers often rely on online resources in order to find code to reuse. For Python, the Python Package Index (PyPI) contains all packages developed for the community and is the largest catalog of reusable, open source packages developers can consult. While a valuable resource, the state of the art PyPI search has very...

chapter

Locating Bugs without Looking Back

Tezcan Dilshener, Michel Wermelinger, Yijun Yu

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR) > 286 - 290

2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR)

Bug localisation is a core program comprehension task in software maintenance: given the observation of a bug, where is it located in the source code files? Information retrieval (IR) approaches see a bug report as the query, and the source code files as the documents to be retrieved, ranked by relevance. Such approaches have the advantage of not requiring expensive static or dynamic analysis of the...

chapter

Multi-type Co-clustering of General Heterogeneous Information Networks via Nonnegative Matrix Tri-Factorization

Xianchao Zhang, Haixin Li, Wenxin Liang, Jiebo Luo

2016 IEEE 16th International Conference on Data Mining (ICDM) > 1353 - 1358

2016 IEEE 16th International Conference on Data Mining (ICDM)

Many kinds of real world data can be modeled by a heterogeneous information network (HIN) which consists of multiple types of objects. Clustering plays an important role in mining knowledge from HIN. Several HIN clustering algorithms have been proposed in recent years. However, these algorithms suffer from one or more of the following problems: (1) inability to model general HINs, (2) inability to...

chapter

Duplication Detection for Software Bug Reports based on Topic Model

Jie Zou, Ling Xu, Mengning Yang, Meng Yan, more

2016 9th International Conference on Service Science (ICSS) > 60 - 65

2016 9th International Conference on Service Science (ICSS)

The traditional duplicate bug reports detection approaches are usually based on vector space model. However, the experimental result is rarely satisfying since this method cannot distinguish semantic correlation among bug reports which written by natural languages. Topic model, as a method to model underlying topics of texts, can solve the problem of document similarity calculation methods used in...

chapter

Using Temporal and Semantic Developer-Level Information to Predict Maintenance Activity Profiles

Stanislav Levin, Amiram Yehudai

2016 IEEE International Conference on Software Maintenance and Evolution (ICSME) > 463 - 467

2016 IEEE International Conference on Software Maintenance and Evolution (ICSME)

Predictive models for software projects' characteristics have been traditionally based on project-level metrics, employing only little developer-level information, or none at all. In this work we suggest novel metrics that capture temporal and semantic developer-level information collected on a per developer basis. To address the scalability challenges involved in computing these metrics for each...

chapter

Hidden social networks analysis by semantic mining of noisy corpora

Christophe Thovex

2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM) > 868 - 875

2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)

The present work proposes a paradigm for the analysis of social networks hidden within incomplete data models, based on the semantic mining of noisy corpora. A proof of concept is implemented and experimented on a partial database resulting from the capture of short text messages in line with the international project ‘Sms4science’.

chapter

NLP-driven ontology modeling for handling event semantics in NL constraints

Mamoona Malik, Mehak Saleem

2016 Sixth International Conference on Innovative Computing Technology (INTECH) > 485 - 490

2016 Sixth International Conference on Innovative Computing Technology (INTECH)

To assist decision makers, there is need of providing an insight on current of scenario of market, considering really sensitive news involving economic events like acquisitions, stock splits, or dividend announcements. Similar to the work discussed above, to machine process natural language constraints, there is need of a mechanism that can automate events related information extraction and knowledge...

chapter

SPARQL Queries over Source Code

Mattia Setzu, Maurizio Atzori

2016 IEEE Tenth International Conference on Semantic Computing (ICSC) > 104 - 106

2016 IEEE Tenth International Conference on Semantic Computing (ICSC)

We introduce a framework to extract and parse Java source code, serialize it into RDF triples by applying an appropriate ontology and then analyze the resulting structured code information by using standard SPARQL queries. We present our experiments on a sample of 134 Java repositories collected from Github, obtaining 17 Million triples about methods, input and output types, comments, and other source...

chapter

General LTL Specification Mining (T)

Caroline Lemieux, Dennis Park, Ivan Beschastnikh

2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE) > 81 - 92

2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE)

Temporal properties are useful for describing and reasoning about software behavior, but developers rarely write down temporal specifications of their systems. Prior work on inferring specifications developed tools to extract likely program specifications that fit particular kinds of tool-specific templates. This paper introduces Texada, a new temporal specification mining tool for extracting specifications...

chapter

"What Parts of Your Apps are Loved by Users?" (T)

Xiaodong Gu, Sunghun Kim

2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE) > 760 - 770

2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE)

Recently, Begel et al. found that one of the most important questions software developers ask is "what parts of software are used/loved by users." User reviews provide an effective channel to address this question. However, most existing review summarization tools treat reviews as bags-of-words (i.e., mixed review categories) and are limited to extract software aspects and user preferences...

chapter

Lightweight Semantic approach for enterprise interoperability issues

Martin Seleng, Stefan Dlugolinsky, Martin Tomasek, Karol Furdik, more

2015 IEEE 19th International Conference on Intelligent Engineering Systems (INES) > 395 - 400

2015 IEEE 19th International Conference on Intelligent Engineering Systems (INES)

In this paper we present an ongoing FP7 project VENIS, where we are focusing on interoperability problems between Large Enterprise (LE) and Small/Medium Enterprise (SME) as well as Micro Enterprise (ME). We are proposing a solution for the enterprise search and enterprise interoperability, which follows the lightweight semantic approach. This approach is more suitable for SMEs and MEs, which are not...

chapter

Matching GitHub Developer Profiles to Job Advertisements

Claudia Hauff, Georgios Gousios

2015 IEEE/ACM 12th Working Conference on Mining Software Repositories > 362 - 366

2015 IEEE/ACM 12th Working Conference on Mining Software Repositories (MSR)

GitHub is a social coding platform that enables developers to efficiently work on projects, connect with other developers, collaborate and generally "be seen" by the community. This visibility also extends to prospective employers and HR personnel who may use GitHub to learn more about a developer's skills and interests. We propose a pipeline that automatizes this process and automatically suggests...

chapter

Mapping informal settlements using WorldView-2 imagery and C4.5 decision tree classifier

Barbara Maria Giaccom Ribeiro

2015 Joint Urban Remote Sensing Event (JURSE) > 1 - 4

2015 Joint Urban Remote Sensing Event (JURSE)

Recent developments in geotechnologies provide resources to propose innovative strategies for urban and environmental management, including remote sensing data and computational resources for processing them. With the main objective of identifying urban areas of illegal occupation, this work uses WorldView-2-sensor images and the InterIMAGE, an image interpretation software, based on knowledge, under...

Data set:
ieee
Keywords:
DATA MINING
SEMANTICS
SOFTWARE

INFONA - science communication portal
