Search results for: Wei Wang

Items from 1 to 4 out of 4 results

article

VChunkJoin: An Efficient Algorithm for Edit Similarity Joins

Wei Wang, Jianbin Qin, Chuan Xiao, Xuemin Lin, et al.

IEEE Transactions on Knowledge and Data Engineering, 2013, vol. 25, no. 8, pp. 1916-1929

Similarity joins play an important role in many application areas, such as data integration and cleaning, record linkage, and pattern recognition. In this paper, we study efficient algorithms for similarity joins with an edit distance constraint. Currently, the most prevalent approach is based on extracting overlapping grams from strings and considering only strings that share a certain number of...

chapter

Efficient Graph Similarity Joins with Edit Distance Constraints

Xiang Zhao, Chuan Xiao, Xuemin Lin, Wei Wang

2012 IEEE 28th International Conference on Data Engineering (ICDE 2012), pp. 834-845

Graphs are widely used to model complicated data semantics in many applications in bioinformatics, chemistry, social networks, pattern recognition, etc. A recent trend is to tolerate noise arising from various sources, such as erroneous data entry, and find similarity matches. In this paper, we study the graph similarity join problem that returns pairs of graphs such that their edit distances are...

chapter

Top-k Set Similarity Joins

Chuan Xiao, Wei Wang, Xuemin Lin, Haichuan Shang

2009 IEEE 25th International Conference on Data Engineering (ICDE 2009), pp. 916-927

Similarity join is a useful primitive operation underlying many applications, such as near duplicate Web page detection, data integration, and pattern recognition. Traditional similarity joins require a user to specify a similarity threshold. In this paper, we study a variant of the similarity join, termed top-k set similarity join. It returns the top-k pairs of records ranked by their similarities,...

chapter

Fuzzy Multi-Dimensional Search in the Wayfinder File System

C. Peery, Wei Wang, A. Marian, T.D. Nguyen

2008 IEEE 24th International Conference on Data Engineering (ICDE '08), pp. 1588-1591

With the explosion in the amount of semi-structured data users access and store, there is a need for complex search tools to retrieve often very heterogeneous data in a simple and efficient way. Existing tools usually index text content, allowing for some IR-style ranking on the textual part of the query, but only consider structure (e.g., file directory) and metadata (e.g., date, file type) as filtering...


