This paper proposes a new system of categorization and classification using data mining techniques based on certain criteria/topics. We describe the design and implementation of the proposed system, which automatically categorizes a restaurant as good or bad, using data mining techniques, based on users' reviews. For this study we took a data set consisting of approximately 9,000 reviews for 2,355...
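The abstract does not specify the authors' data-mining pipeline; as an illustration only, labeling a restaurant from the net polarity of its reviews can be sketched with a toy sentiment lexicon (the word lists and threshold below are hypothetical):

```python
# Hypothetical polarity lexicon; a real system would mine these terms
# from the review corpus rather than hard-code them.
POSITIVE = {"good", "great", "delicious", "friendly", "excellent"}
NEGATIVE = {"bad", "slow", "rude", "bland", "terrible"}

def classify_restaurant(reviews):
    """Label a restaurant 'good' or 'bad' from the net polarity of its reviews."""
    score = 0
    for review in reviews:
        words = review.lower().split()
        score += sum(w in POSITIVE for w in words)
        score -= sum(w in NEGATIVE for w in words)
    return "good" if score >= 0 else "bad"
```

A real classifier would of course weigh many more features (ratings, n-grams, negation), but the aggregate-then-threshold shape is the same.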
The growing number of users actively sharing information and interacting with others on online social networks has, often unconsciously, produced a wealth of data that can serve as research material for various purposes. Hence, data crawling is a critical first gate to accessing information in a social network. This study aims to develop data-crawler software by using...
There are about 3 billion indexed websites on the WWW. Not all websites belonging to a particular topic are indexed by a search engine such as google.com; there are online platforms where other users help a person asking for a Uniform Resource Locator (URL) that contains topical information. To verify the authenticity and validity of the URL, an empirical methodology and its...
Social emotion analysis of online users has become an important task for mining public opinions, which aims at detecting the readers' emotions evoked by online news articles. In this paper, we focus on building a social emotion analysis system (SEAS) for online news. The system implements a text data crawler for mainstream online news websites, along with modules for document preprocessing, document...
Social Network Analysis (SNA) is a field of study that focuses on analyzing user profiles and participation on social network channels in order to model relationships between people and to predict certain behaviors or knowledge. To achieve their goals, researchers interested in SNA have to extract content and structure from the numerous social networks available today. Existing tools, which help...
Modern web users are exposed to a browser security threat called drive-by-download attacks that occur by simply visiting a malicious Uniform Resource Locator (URL) that embeds code to exploit web browser vulnerabilities. Many web users tend to click such URLs without considering the underlying threats. URL blacklists are an effective countermeasure to such browser-targeted attacks. URLs are frequently...
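The blacklist countermeasure described above reduces, at its core, to a lookup of a URL's host against a maintained deny-list. A minimal sketch (the blacklist entries here are placeholders; real deployments consume frequently refreshed feeds such as Google Safe Browsing or PhishTank):

```python
from urllib.parse import urlsplit

# Hypothetical blacklist; in practice this set is rebuilt from an
# external feed on a short refresh cycle, since malicious URLs churn fast.
BLACKLIST = {"malicious.example.com", "drive-by.example.net"}

def is_blacklisted(url):
    """Return True if the URL's host appears on the blacklist."""
    host = urlsplit(url).hostname or ""
    return host.lower() in BLACKLIST
```

Matching on the normalized hostname rather than the full URL string avoids trivial evasion via path or query-string changes, though real systems also match URL patterns and page content.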
Automatic 3D neuron reconstruction for very large 3D light microscopy images remains a challenge in neuroscience. Few existing neuron tracing algorithms can be used with commonly available computers (laptops, desktops, or workstations) to efficiently and accurately reconstruct a neuron in image stacks that are tens of gigabytes or greater. We introduce a new automatic tracing algorithm called...
The number of Internet users and uses is growing tremendously these days, which causes considerable trouble and effort on the user's side to find web pages that are relevant to the user's requirements. Generally, users search for web pages from a large available hierarchy of concepts, or use a query to browse web pages through an available search engine and receive results based on the search pattern...
Social networks, as corpora of valuable data, have attracted much attention from researchers in various fields in recent years, especially in the subject of big data analytics. However, as the foundation, efficient and accurate data collection has not received much focus in past published works. As the amount of data on the web increases rapidly, this article identifies two major...
The Universal Communication Research Institute (UCRI), NICT conducts research and development on universal communication technologies: multi-lingual machine translation, spoken dialogue, information analysis and ultra-realistic interaction technologies, through which people can truly interconnect, anytime, anywhere, about any topic, and by any method, transcending the boundaries of language, culture,...
The web HITS algorithm is based on data acquisition modules for large-scale data collection. The algorithm typically uses Web link-structure mining to establish data bindings between page links in order to improve the link structure. This paper suggests another acquisition module to improve the HITS algorithm, which has also been implemented and applied on a government website platform.
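The underlying HITS iteration scores each page as a hub (points to good authorities) and an authority (pointed to by good hubs). A minimal sketch over a toy link graph (node names are illustrative), with scores L2-normalized each round as in Kleinberg's formulation:

```python
import math

def hits(links, iterations=50):
    """links: dict mapping page -> list of pages it links to."""
    nodes = set(links) | {t for ts in links.values() for t in ts}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of pages linking in.
        auth = {n: sum(hub[s] for s, ts in links.items() if n in ts)
                for n in nodes}
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub score: sum of authority scores of pages linked to.
        hub = {n: sum(auth[t] for t in links.get(n, [])) for n in nodes}
        norm = math.sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth

links = {"a": ["b", "c"], "b": ["c"], "c": []}
hub, auth = hits(links)
```

Here "c", linked by both other pages, ends up with the top authority score, while "a", linking to both, is the top hub.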
Based on analysis and research of the serial formation of swarm robots' movement, a strategy for keeping swarm-robot formation during the processes of schlepping and turning is proposed, building on a discussion of the robot kinematic model. Under wireless communication, the master and slave robots can control their speed based on environmental information in order to keep the spacing between...
Detecting P2P swarms and analyzing their distribution is a challenging task that has not received the attention it deserves. In this paper, we demonstrate an active measurement methodology to continuously trace real-world BitTorrent and eMule/eDonkey swarms over the Internet from a stub network and over a long period of time. Our measurements achieve real-time scanning of the online...
The number of files stored on a personal computer is increasing very quickly, so it is difficult for users to find the information they want. A desktop search engine named SoDesktop is proposed in this paper, which is composed of four modules: Data crawler, Task scheduler, Data indexer, and Data searcher. The implementations of these four modules are described in detail, and the implementation...
Nowadays people use search engines all the time to retrieve documents from the Web. Web crawling is the process by which a search engine gathers pages from the Web in order to index them and support search. Web crawlers are the heart of search engines. Web crawlers continuously crawl the web, finding any new web pages that have been added and pages that have been removed...
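The crawling process described above is essentially a graph traversal over hyperlinks from a seed URL. A minimal breadth-first sketch, with fetching abstracted behind a `fetch(url)` callback so the traversal logic stands alone (a real crawler's fetch would download and parse the page, respect robots.txt, and throttle requests):

```python
from collections import deque

def crawl(seed, fetch, limit=100):
    """Breadth-first crawl from seed; fetch(url) returns the outgoing links."""
    seen = {seed}
    frontier = deque([seed])
    order = []
    while frontier and len(order) < limit:
        url = frontier.popleft()
        order.append(url)
        for link in fetch(url):
            if link not in seen:  # skip already-discovered pages
                seen.add(link)
                frontier.append(link)
    return order

# Toy in-memory link graph standing in for live pages.
pages = {"/": ["/a", "/b"], "/a": ["/b", "/c"], "/b": [], "/c": ["/"]}
visited = crawl("/", lambda u: pages.get(u, []))
```

The `seen` set is what lets the crawler also notice removed pages on a recrawl: URLs recorded in a previous pass that no longer resolve can be dropped from the index.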
Studies report that about 40% of current Internet traffic and bandwidth consumption is due to the web crawlers that retrieve pages for indexing by the different search engines. As the size of the web continues to grow, searching it for useful information has become increasingly difficult. Centralized crawling techniques are unable to cope with the constantly growing web. This paper presents...
In order to apply web-crawler technology to the construction of video semantic information, this paper proposes a video information collection architecture based on a Web crawler, after analyzing the basic principles, key technologies, and problems of current crawler technology. The system framework is divided into two parts: the background part mainly manages the Web crawler; the foreground part mainly...
The expansion of the World Wide Web has led to a state where a vast number of Internet users face, and have to overcome, the major problem of discovering desired information. Inevitably, hundreds of web pages and weblogs are generated or changed on a daily basis. The main problem that arises from this continuous generation and alteration of web pages is the discovery of useful information,...
An FTP search engine is one of the most important tools in network applications. This paper presents a design of an FTP search engine system based on Lucene, which develops a multithreaded spider as an extension of Lucene and improves Chinese word segmentation with a maximum matching algorithm over the Lucene documents. Finally, the main functions and run-time examples of this system are shown.
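The maximum matching algorithm mentioned above greedily takes, at each position, the longest dictionary word that matches, falling back to a single character when nothing matches. A minimal forward-maximum-matching sketch (the dictionary here is a toy example, not the one the system uses):

```python
def max_match(text, dictionary, max_len=4):
    """Forward maximum matching: greedily take the longest dictionary
    word at each position, falling back to single characters."""
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first, shrinking until one matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                words.append(piece)
                i += size
                break
    return words

# Toy dictionary: "搜索" (search), "引擎" (engine), "搜索引擎" (search engine).
DICT = {"搜索", "引擎", "搜索引擎"}
```

Because the longest entry wins, "搜索引擎" segments as one word rather than as "搜索" + "引擎", which is exactly the behavior that improves indexing of Chinese filenames.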
A web search engine is designed to search for information on the World Wide Web (WWW). Crawlers are software that traverse the Internet and retrieve web pages by following hyperlinks. In the face of numerous spam websites, traditional web crawlers cannot cope well with this problem. Focused crawlers utilize semantic web technologies to analyze the semantics of hyperlinks and web documents. The...