The Infona portal uses cookies, i.e. strings of text saved by a browser on the user's device. The portal can access those files and use them to remember the user's data, such as their chosen settings (screen view, interface language, etc.), or their login data. By using the Infona portal the user accepts automatic saving and using this information for portal operation purposes. More information on the subject can be found in the Privacy Policy and Terms of Service. By closing this window the user confirms that they have read the information on cookie usage, and they accept the privacy policy and the way cookies are used by the portal. You can change the cookie settings in your browser.
We introduce a novel algorithm for mining temporal intervals from real-time system traces with linear complexity using passive, black-box learning. Our interest is in mining nfer specifications from spacecraft telemetry to improve human and machine comprehension. Nfer is a recently proposed formalism for inferring event stream abstractions with a rule notation based on Allen Logic. The problem of...
Currently, open source projects receive various kinds of issues daily, because of the extreme openness of Issue Tracking System (ITS) in GitHub. ITS is a labor-intensive and time-consuming task of issue categorization for project managers. However, a contributor is only required a short textual abstract to report an issue in GitHub. Thus, most traditional classification approaches based on detailed...
Developing new ideas and algorithms or comparing new findings in the field of requirements engineering and management implies a dataset to work with. Collecting the required data is time consuming, tedious, and may involve unforeseen difficulties. The need for datasets often forces re-searchers to collect data themselves in order to evaluate their findings. However, comparing results with other publications...
Software engineers are using a variety of social platforms, where they participate in open source software projects and respond to other developers that ask for help on specific issues. This presence of developers in different platforms is a mirror of their hands-on experience and expertise in different technologies and programming languages and a useful source of information for their own use but...
In any sufficiently complex software system there are experts, having a deeper understanding of parts of the system than others. However, it is not always clear who these experts are and which particular parts of the system they can provide help with. We propose a framework to elicit the expertise of developers and recommend experts by analyzing complexity measures over time. Furthermore, teams can...
This technical briefing provides an overview of how quantitative empirical research methods can be combined with qualitative ones generating the family of empirical software engineering approaches known as mixed-methods. The ultimate aim of such mixed-methods is supporting cause-effect claims combining multiple data types, sources and analyses that provide software practitioners and academicians solid...
A main difficulty to study the evolution and quality of real-life software systems is the effect of moderator factors, such as: programming skill, type of maintenance task, and learning effect. Experimenters must account for moderator factors to identify the relationships between the variables of interest. In practice, controlling for moderator factors in realistic (industrial) settings is expensive...
Researchers perform empirical studies in industry to gain qualitative insights into a real-world problem. However, common critics are the diversity and selection process of participants. To address these issues, we propose to improve the integration of question-answering systems into empirical study. In this paper, we i) describe approaches to conduct studies in such systems, ii) exemplify corresponding...
Open-source projects rely on attracting new and retaining old contributors for achieving sustainable success. One may suspect that adopting new development practices like Continuous Integration (CI) should improve the attractiveness of a project. However, little is known about the impact that adoption of CI has on developer attraction and retention. To bridge this gap, we study how the introduction...
Integrating code from different sources can be an error-prone and effort-intensive process. While an integration may appear statically sound, unexpected errors may still surface at run time. The industry practice of continuous integration aims to detect these and other run-time errors through an extensive pipeline of successive tests. Using data from a continuous integration service, Travis CI, we...
In a large, long-lived project, an effective code review process is key to ensuring the long-term quality of the code base. In this work, we study code review practices of a large, open source project, and we investigate how the developers themselves perceive code review quality. We present a qualitative study that summarizes the results from a survey of 88 Mozilla core developers. The results provide...
Code reuse is a common practice among software developers,whether novices or experts. Developers often rely on onlineresources in order to find code to reuse. For Python, thePython Package Index (PyPI) contains all packages developedfor the community and is the largest catalog of reusable, opensource packages developers can consult. While a valuableresource, the state of the art PyPI search has very...
Many books and papers describe how to do data science. While those texts are useful, it can also be important to reflect on anti-patterns; i.e. common classes of errors seen when large communities of researchers and commercial software engineers use, and misuse data mining tools. This technical briefing will present those errors and show how to avoid them.
ABSTRACTIssue tracking systems store valuable data for testing hy-potheses concerning maintenance, building statistical pre-diction models and (recently) investigating developer affec-tiveness. For the latter, issue tracking systems can be minedto explore developers emotions, sentiments and politeness, affects for short. However, research on affect detection insoftware artefacts is still in its early...
Performance regressions, such as a higher CPU utilization than in the previous version of an application, are caused by software application updates that negatively affect the performance of an application.Although a plethora of mining software repository research has been done to detect such regressions, research tools are generally not readily available to practitioners. Application Performance...
In this paper we present a novel software analytics infrastructure supporting for a combination of three requirements to serve software practitioners in utilising data-driven decision making: (1) Real-time insight: streaming software analytics unify static historical and current event-stream data enabling for immediate, nearly real-time insight into software quality, processes and users; (2) Query...
Crashing of program is an annoying experience for users. Whenever a program crashes, an event log is generated. Sometimes built in crash reporting programs send crash reports automatically to developing site whereas sometimes, user is presented with an option to report the crash himself. This reporting is often useful for the development team to diagnose and fix the problem. It happens quite often...
It is crucial for Internet company to provide highly reliable web-based services. The web-based services always have many components running in the large-scale infrastructure with complex interactions. As an indispensable part of high reliability, the diagnosis remains to be a thorny problem. With the growth of system scale and complexity, it becomes even more difficult. In this paper, we propose...
Open Source Software (OSS) is distributed and maintained collaboratively by developers all over the world. However, frequent personnel turnover and lack of organizational management makes it difficult to capture the actual development effort. Various OSS maintenance effort estimation approaches have been developed to provide a way to understand and estimate development effort. The goal of this study...
Predictive models for software projects' characteristics have been traditionally based on project-level metrics, employing only little developer-level information, or none at all. In this work we suggest novel metrics that capture temporal and semantic developer-level information collected on a per developer basis. To address the scalability challenges involved in computing these metrics for each...
Set the date range to filter the displayed results. You can set a starting date, ending date or both. You can enter the dates manually or choose them from the calendar.