In 2015, the District of Columbia framed a Vision Zero mission and action plan aimed at reducing roadway deaths to zero by 2024. Automated traffic enforcement (ATE) features prominently among Vision Zero strategies. The paper analyzes data derived from DC's speed and red light cameras and discusses a framework of how ATE can help DC reach its Vision Zero goals by fine-tuning...
Sampling through crawling is an important research topic in social network analysis. However, there is very little existing work on sampling through crawling in directed networks. In this paper we present a new method of sampling a directed network, with the objective of maximizing the node coverage. Our proposed method, Predicted Max Degree (PMD) Sampling, works by predicting which k open nodes are...
Big data processing has introduced new ideas in the applications of bacterial analysis in recent years. This paper aims to develop an effective framework to automatically extract quantitative knowledge relating to bacterial motility through processing a sequence of large-scale microscopic images of bacterial movements. It was hypothesized that motile bacteria move according to a conceptual model referred...
Multi-relational data, like knowledge graphs, are generated from multiple data sources by extracting entities and their relationships. We often want to include inferred, implicit or likely relationships that are not explicitly stated, which can be viewed as link-prediction in a graph. Tensor decomposition models have been shown to produce state-of-the-art results in link-prediction tasks. We describe...
Performance and functional testing of a target management system requires simulating tactical moving objects such as combat planes, naval vessels, and submarines. Data on tactical moving objects under various circumstances cannot be collected because of military security. To solve this problem, in this paper we propose a generator of test data sets for tactical moving...
Convolutional Neural Networks (CNNs) are useful for identifying previously unknown embedded patterns in images. Several object recognition, facial recognition, and image segmentation tasks have benefited from the non-linear abstraction of hybrid features using CNNs. This work presents a novel CNN model parametrization workflow developed on the cloud-computing platform of Microsoft Azure...
The amount of data that businesses collect and analyze has been rapidly increasing, which has triggered an increase in big data teams. With the growth of both the number and size of big data teams, specialized roles are starting to be defined. One such role is the data engineer, who focuses on ensuring that the data is easily available for advanced analytics. Via a case study, this paper explores...
Data analysis tools such as R, SQL, Excel and others exist, but they are still insufficient to cope with today's big data analysis needs. The author proposes a CUI (Character User Interface) toolset with dozens of functions to neatly handle tabular data in TSV (Tab Separated Values) files. It implements many basic and useful functions that have not been implemented in existing software...
Applications deployed in the Cloud usually come with dedicated performance and availability requirements. These can be met by replicating data across several sites and/or by partitioning data. Data replication allows read requests to be parallelized and thus decreases data access latency, but induces significant overhead for the synchronization of updates. Partitioning, in contrast, is highly beneficial...
As hardware and software technologies have improved, our definition of a “manageable amount of data” has increased in its scope dramatically. The term “big data” can be applied to any of several different projects and technologies sharing the ultimate goal of supporting analysis on these large, heterogeneous, and evolving data sets. The term “data science” refers to the statistical, technical, and...
E-commerce plays a key role in business success nowadays. Therefore, the performance of E-commerce websites is critical. E-commerce websites generate a large amount of data that is often used for performance evaluation. Many website evaluation methods have been proposed, but the social media factor is usually not taken into consideration. In this paper, Twitter data is utilized for big data analytics...
In this paper we present Digree, an experimental middleware system that can execute graph pattern matching queries over databases hosting voluminous graph datasets. First, we formally present the employed data model and the processes of re-writing a query into an equivalent set of subqueries and subsequently combining the partial results into the final result set. Our framework guarantees the correctness...
The Polystore architecture revisits the federated approach to accessing and querying standalone, independent databases in a uniform and optimized fashion, but this time in the context of heterogeneous data and specialized analyses. In light of this architectural philosophy, and in light of the major data architecture development efforts at the US Department of Veterans Affairs (VA),...
The CloudMdsQL polystore provides integrated access to multiple heterogeneous data stores, such as RDBMS, NoSQL or even HDFS through a big data analytics framework such as MapReduce or Spark. The CloudMdsQL language is a functional SQL-like query language with a flexible nested data model. A major capability is to exploit the full power of each of the underlying data stores by allowing native queries...
Solar car race competitions offer realistic conditions to test and demonstrate state-of-the-art technologies in multidisciplinary fields. In such races the solar panels mounted on the car produce the energy required to power the vehicle. A simulator run during the race determines the optimal race speed based on the predicted availability of solar energy and other parameters as well as road conditions...
With no limit on time and location [1], the number of users attracted by massive open online courses (MOOCs) has increased rapidly, and many platforms have been built to provide a variety of courses. All of these trigger an explosive growth in data volume. As we know, people have met big data in many areas and proposed many techniques and methods to deal with them. However, people still have no sense...
Information related to geographic areas is being produced continuously. However, there is currently no technique that can handle large spatial data. For this reason, we developed a spatial big data platform, ORANGE, based on Apache Hadoop. ORANGE can load vector and raster data based on HDFS, and it manages metadata and creates index data using Apache Hive. These improvements made...
This poster presents the problem of 3D contact measurements from two co-registered volumetric images (z-stacks). The 3D contact measurement consists of (a) segmenting an object of interest in each z-stack, (b) computing the relative spatial positions of the detected objects to detect contacts, (c) validating the accuracy of segmentation, and (d) visually verifying correct contact detection. The 3D...
In recent years the flow of Saudi dialect big data in social media has driven the use of different sentiment analysis techniques to identify the trends of Saudi users towards different issues and events. Current techniques analyze this data offline, which cannot support real-time decision making on critical issues. Real-time analytics on stream data have been given attention...
The Atmospheric Radiation Measurement (ARM) Climate Research Facility (www.arm.gov) provides atmospheric observations from diverse climatic regimes around the world. Currently, ARM archives over 22 million user-accessible data files, primarily stored in NetCDF file format, with total data volumes close to one petabyte. In this paper, we will discuss how ARM is currently storing, distributing, cataloging...