Traditionally, the time-to-fill metric is used as a scorecard for past performance. An organization may use time to fill to assess the performance of its internal recruiting team, or as a way to set service level agreements with outsourced recruiting partners. By first developing a set of quantifiable job features and then applying survival analysis to historical time-to-fill data, we build a predictor...
Finding the best candidates to match a set of job requirements can be viewed as both an art and a science. In this paper, we conduct an empirical study using actual job candidates and job applicants. We compare the ranked lists generated by executive recruiting experts with the lists generated by three search strategies: one using crowdworkers in a gamified environment, a second using information retrieval-based...
According to a report online [34], more than 200 million unique users search for jobs online every month. This incredibly large and fast-growing demand has enticed software giants such as Google and Facebook to enter this space, which was previously dominated by companies such as LinkedIn, Indeed, Dice and CareerBuilder. Recently, Google released their “AI-powered Jobs Search Engine”, “Google For Jobs”...
Multilayer network analysis has become a vital tool for understanding different relationships and their interactions in a complex system, where each layer in a multilayer network depicts the topological structure of a group of nodes corresponding to a particular relationship. The interactions among different layers reflect how the interplay of different relations affects the topology of each layer. For a...
Networks are models representing relationships between entities. Often these relationships are explicitly given, or we must learn a representation which generalizes and predicts observed behavior in underlying individual data (e.g. attributes or labels). Whether given or inferred, choosing the best representation affects subsequent tasks and questions on the network. This work focuses on model selection...
Reliable uncertainty estimation for time series prediction is critical in many fields, including physics, biology, and manufacturing. At Uber, probabilistic time series forecasting is used for robust prediction of the number of trips during special events, driver incentive allocation, and real-time anomaly detection across millions of metrics. Classical time series models are often used in conjunction...
Presents the introductory welcome message from the conference proceedings. May include the conference officers' congratulations to all involved with the conference event and publication of the proceedings record.
Highly imbalanced datasets continue to be a challenge in many data mining applications. It is surprising that state-of-the-art techniques countering class imbalances are usually very computationally expensive and therefore unscalable. Most research effort has been directed into enhancing those techniques, e.g., by focusing on borderline examples or combining multiple techniques. This is usually accompanied...
Deep learning algorithms have recently produced state-of-the-art accuracy in many classification tasks, but this success is typically dependent on access to many annotated training examples. For domains without such data, an attractive alternative is to train models with light, or distant supervision. In this paper, we introduce a deep neural network for the Learning from Label Proportion (LLP) setting,...
In this paper, we propose and evaluate the application of unsupervised machine learning to anomaly detection for a Cyber-Physical System (CPS). We compare two methods: Deep Neural Networks (DNN) adapted to time series data generated by a CPS, and one-class Support Vector Machines (SVM). These methods are evaluated against data from the Secure Water Treatment (SWaT) testbed, a scaled-down but fully...
This paper presents a detailed anomaly detection evaluation on operational time-series data of Internet of Things (IoT) based household devices in general, and Heating, Ventilation and Air Conditioning (HVAC) systems in particular. Due to the number of issues observed during evaluation of widely used distance-based, statistical-based, and cluster-based anomaly detection techniques, we also present a pattern-based...
The hashtag recommendation problem addresses recommending (suggesting) one or more hashtags to explicitly tag a post made on a given social network platform, based upon the content and context of the post. In this work, we propose a novel methodology for hashtag recommendation for microblog posts, specifically Twitter. The methodology, EmTaggeR, is built upon a training-testing framework that builds...
Public entities such as companies and politicians increasingly use online social networks to communicate directly with their constituencies. Often, this public messaging is aimed at aligning the entity with a particular cause or issue, such as the environment or public health. However, as a consumer or voter, it can be difficult to assess an entity’s true commitment to a cause based on public messaging...
Social media allows people to post widely and share the posted online items. Such items gain popularity through the amount of attention they receive. Thus, studies on modeling the arrival process of attention to an individual item have recently attracted a great deal of interest. In this paper, we propose, by combining a Dirichlet process with a Hawkes process in a novel way, a probabilistic model, called...
Even while engaged in an attention-consuming activity such as watching TV, social media users often end up paying attention to one or more social media. This is an example of a behavioral phenomenon called Continuous Partial Attention (CPA). Quantification of user attention can be a valuable metric in understanding user behavior under scenarios where their attention is divided. In this study, we propose...
The recent rise in the use of social networks has resulted in an abundance of online information on different aspects of everyday social activities. When analyzing and identifying information originating from social networks, and especially Twitter, an important aspect is the geographic coordinates, i.e., geolocalisation, of the relevant information. Geolocalized...
With the rapid development of data mining in social computing systems (e.g., crowdsourcing systems), statistical inference from crowdsourced data serves as a powerful tool to provide diversified services. To support critical applications (e.g., recommendation), in this paper we focus on collaborative ranking problems and construct a system whose input is crowdsourced pairwise comparisons...
Contrast patterns are itemsets that frequently occur in one dataset while not in another. These patterns have been successfully applied to many data mining domains, such as prediction, classification and clustering. However, none of the previous studies has considered extracting contrast patterns from different types of datasets. In this paper, we introduce a new type of contrast pattern, Conditional...
Forecasting models that utilize multiple predictors are gaining popularity in a variety of fields. In some cases they allow constructing more precise forecasting models, leveraging the predictive potential of many variables. Unfortunately, in practice we do not know which observed predictors have a direct impact on the target variable. Moreover, adding unrelated variables may diminish the quality...