Network attack graphs are a type of analysis tool that can be used to determine the impact that security vulnerabilities have on the network. It is important, then, for attack graphs to be able to represent enough information to aid this analysis. Moreover, they must be able to handle and integrate new vulnerabilities that are being discovered by the security community. We developed a prototype tool...
The future of food innovation lies in the art and science of designing an interactive, connected, intelligent device that can detect how we feel and display content suitable for individual consumers. We designed a smart dining table and chairs that can detect, sense, and analyze consumers' satisfaction and interact with consumers. A team comprising a furniture designer, software engineers, a mechanical engineer,...
App store reviews are currently the main source of information for analyzing different aspects of app development and evolution. However, app users' feedback does not occur only on the app store. In fact, a large quantity of posts about apps are made daily on social media. In this paper, we study how Twitter can provide complementary information to support mobile app development. By analysing a total...
With the goal of helping software engineering researchers understand how to improve their papers, Mary Shaw presented "Writing Good Software Engineering Research Papers" in 2003. Shaw analyzed the abstracts of the papers submitted to the 2002 International Conference on Software Engineering (ICSE) to determine trends in research question type, contribution type, and validation approach....
Learning to use existing or new software libraries is a difficult task for software developers, which can impede their productivity. Much existing work has provided different techniques to mine API usage patterns from client programs in order to help developers understand and use existing libraries. However, these techniques produce incomplete patterns, i.e., without temporal properties, or...
App developers naturally want to know which of their releases are successful and which are unsuccessful. Such information can help with release planning and requirements prioritisation and elicitation. To address this problem, I performed causal analysis on 52 weeks of popular app releases from Google Play and Windows Phone Store. The results reveal properties of successful releases in multiple app...
Nowadays, software development projects produce a large number of software artifacts including source code, execution traces, end-user feedback, as well as informal documentation such as developers' discussions, change logs, StackOverflow, and code reviews. Such data embeds rich and significant knowledge about software projects, their quality and services, as well as the dynamics of software development...
Many books and papers describe how to do data science. While those texts are useful, it can also be important to reflect on anti-patterns, i.e., common classes of errors seen when large communities of researchers and commercial software engineers use, and misuse, data mining tools. This technical briefing will present those errors and show how to avoid them.
Test cases are an essential tool in software quality assurance: they ensure that code behaves as specified in the requirements. However, writing test cases does not bring only benefits; it comes with a cost: the programmer has to formulate the test cases and maintain them when the tested source code changes. Such costs can become prohibitive, particularly for start-ups or small enterprises, which often prefer...
Stack Overflow is one of the most popular question-and-answer sites for programmers. However, there are a great number of duplicate questions that are expected to be detected automatically in a short time. In this paper, we introduce two approaches to improve the detection accuracy: splitting body into different types of data and using word-embedding to treat word ambiguities that are not contained...
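The abstract above mentions using word embeddings to handle word ambiguities when detecting duplicate questions. As a minimal illustration of that general idea (not the paper's actual pipeline, which also splits the question body into different data types), the sketch below averages hand-made toy word vectors for two question titles and compares them with cosine similarity; the embeddings, vocabulary, and threshold are all assumptions for illustration, whereas a real system would train vectors on a Stack Overflow corpus.

```python
# Toy sketch of embedding-based duplicate detection: average the word
# vectors of each question title, then compare with cosine similarity.
# The tiny hand-made EMBEDDINGS dict is purely illustrative.
import math

EMBEDDINGS = {
    "sort":   [0.90, 0.10, 0.00],
    "order":  [0.85, 0.20, 0.00],  # near-synonym of "sort"
    "list":   [0.10, 0.90, 0.10],
    "array":  [0.15, 0.85, 0.10],  # near-synonym of "list"
    "python": [0.00, 0.10, 0.90],
}

def sentence_vector(words):
    """Average the vectors of the words we have embeddings for."""
    vecs = [EMBEDDINGS[w] for w in words if w in EMBEDDINGS]
    dim = len(next(iter(EMBEDDINGS.values())))
    if not vecs:
        return [0.0] * dim
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def is_duplicate(q1, q2, threshold=0.95):
    """Flag two question titles as duplicates if their vectors are close."""
    return cosine(sentence_vector(q1.split()),
                  sentence_vector(q2.split())) >= threshold

print(is_duplicate("sort list python", "order array python"))  # True
print(is_duplicate("sort list python", "python"))              # False
```

Because "order"/"sort" and "array"/"list" have nearby vectors, the two paraphrased titles score as duplicates even though they share only one literal word; that is the ambiguity-handling benefit embeddings provide over exact-keyword matching.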
Context: GitHub, nowadays the most popular social coding platform, has become the reference for mining Open Source repositories, a growing research trend aiming at learning from previous software projects to improve the development of new ones. In recent years, a considerable number of research papers have been published reporting findings based on data mined from GitHub. As the community continues...
Many kinds of real-world data can be modeled by a heterogeneous information network (HIN), which consists of multiple types of objects. Clustering plays an important role in mining knowledge from HINs. Several HIN clustering algorithms have been proposed in recent years. However, these algorithms suffer from one or more of the following problems: (1) inability to model general HINs, (2) inability to...
Open Source Software (OSS) hosted in Repositories such as GitHub can be valuable as a source of information for requirements engineers, especially in the apprentice phase of a new application. In this context, we propose a strategy to speed up the discovery of valuable information, since manual search may be time consuming in the vast dataset of GitHub projects. Our strategy is based on the identification...
A task at the beginning of the software development process is the creation of a requirements specification. The requirements specification is usually created by a software engineering expert. We try to substitute this expert by a domain expert (the user) and formulate the problem of creating requirements specifications as a search-based software engineering problem. The domain expert provides only...
Reviews for software products contain much information about users' requirements and preferences, which can be very useful to the requirements engineer. However, taking advantage of this information is not easy due to the large and overwhelming number of reviews that are posted in various channels. Machine learning and opinion mining techniques have therefore been used to process the reviews automatically...
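To make the idea of automatic review processing concrete, here is a deliberately trivial rule-based sketch that routes app reviews to the requirements engineer as bug reports or feature requests. The keyword lists and labels are assumptions for illustration only; the abstract above refers to machine learning and opinion mining techniques, which are far more sophisticated than this.

```python
# Illustrative rule-based review triage (an assumption for this sketch,
# not the classifiers used in the literature): match a review's words
# against small keyword sets to assign a coarse category.
BUG_WORDS = {"crash", "error", "freeze", "bug"}
FEATURE_WORDS = {"wish", "add", "please", "would"}

def classify(review: str) -> str:
    """Assign a review to a coarse category via keyword overlap."""
    words = set(review.lower().split())
    if words & BUG_WORDS:
        return "bug report"
    if words & FEATURE_WORDS:
        return "feature request"
    return "other"

print(classify("App crash after update"))   # bug report
print(classify("Please add dark mode"))     # feature request
print(classify("Great experience"))         # other
```

In practice a learned classifier replaces the keyword sets, precisely because reviews are too numerous and too varied for hand-written rules, which is the motivation the abstract describes.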
Big Data technologies enable new possibilities to analyze historical data generated by process plants. One possible application is the development of new types of operator support systems (OSS), which could help plant operators during operations in identifying and dealing with critical situations. The project FEE has the objective to develop such support functions based on Big Data analytics of historical...
In spite of the principle of good programming practice which stipulates that a commit should include only modifications belonging to one task, programmers submit tangled commits consisting of modifications related to two or more distinct tasks. Some studies show that between 11% and 39% of bug-fix commits are tangled and that at least 16.6% of all commits are incorrectly associated with bug reports...
Classification and regression algorithms require a flat mining table as input, which in most cases is built manually by summarizing relational data into propositional features. This task is not only time consuming, but it also inhibits the discovery of new knowledge, because only a small portion of the possible features will be built. Dataconda, a software program available on www.dataconda.net, makes this...
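The manual feature construction that tools like Dataconda aim to automate can be sketched as follows: one-to-many relational data is summarized into a single flat table via aggregate features. The table names, columns, and aggregates below are illustrative assumptions, not Dataconda's actual schema or output.

```python
# Minimal sketch of "propositionalization": flattening relational data
# (customers with many orders) into one flat mining table by computing
# aggregate features per customer. All names here are illustrative.
customers = [{"id": 1, "age": 34}, {"id": 2, "age": 51}]
orders = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 35.0},
    {"customer_id": 2, "amount": 80.0},
]

def flatten(customers, orders):
    """Build one row per customer with COUNT/SUM/AVG order aggregates."""
    rows = []
    for c in customers:
        amts = [o["amount"] for o in orders if o["customer_id"] == c["id"]]
        rows.append({
            "id": c["id"],
            "age": c["age"],
            "n_orders": len(amts),                                  # COUNT
            "total_spent": sum(amts),                               # SUM
            "avg_order": sum(amts) / len(amts) if amts else 0.0,    # AVG
        })
    return rows

table = flatten(customers, orders)
print(table[0])  # {'id': 1, 'age': 34, 'n_orders': 2, 'total_spent': 55.0, 'avg_order': 27.5}
```

Doing this by hand means choosing which aggregates to compute in advance; automating the search over join paths and aggregate functions is what lets many more candidate features be generated, which is the bottleneck the abstract describes.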
ToMaR provides a flexible application for integrating existing software into data-flow applications that execute on top of a MapReduce-based environment. The application supports a Linux-inspired pipes-and-filter based syntax, the execution of existing applications using file and stream based IO, and the efficient integration with existing data-flow frameworks like Apache Pig.
DIBBs Brown Dog is a recent cyberinfrastructure effort which aims to create two new services to aid users in searching, accessing, and using digital data, and to provide these services in a manner that is as broadly and easily accessible as possible. At its lowest level, the Data Access Proxy (DAP) provides file format conversion capabilities and the Data Tilling Service (DTS) provides content...