Guoliang Li

chapter

K-Join: Knowledge-Aware Similarity Join

Zeyuan Shang, Yaxiao Liu, Guoliang Li, Jianhua Feng

2017 IEEE 33rd International Conference on Data Engineering (ICDE) > 23 - 24

2017 IEEE 33rd International Conference on Data Engineering (ICDE)

Similarity join is a fundamental operation in data cleaning and integration. Existing similarity-join methods utilize the string similarity to quantify the relevance but neglect the knowledge behind the data, which plays an important role in understanding the data. Thanks to public knowledge bases, e.g., Freebase and Yago, we have an opportunity to use the knowledge to improve similarity join. To...

chapter

Cleaning Relations Using Knowledge Bases

Shuang Hao, Nan Tang, Guoliang Li, Jian Li

2017 IEEE 33rd International Conference on Data Engineering (ICDE) > 933 - 944

2017 IEEE 33rd International Conference on Data Engineering (ICDE)

We study the data cleaning problem of detecting and repairing wrong relational data, as well as marking correct data, using well curated knowledge bases (KBs). We propose detective rules (DRs), a new type of data cleaning rules that can make actionable decisions on relational data, by building connections between a relation and a KB. The main invention is that, a DR simultaneously models two opposite...

chapter

A Novel Cost-Based Model for Data Repairing

Shuang Hao, Nan Tang, Guoliang Li, Jian He, et al.

2017 IEEE 33rd International Conference on Data Engineering (ICDE) > 49 - 50

2017 IEEE 33rd International Conference on Data Engineering (ICDE)

Integrity constraint (IC) based data repairing is typically an iterative process consisting of two parts: detecting and grouping errors that violate given ICs, and modifying values inside each group such that the modified database satisfies those ICs. However, most existing automatic solutions treat the process of detecting and grouping errors straightforwardly (e.g., violations of functional dependencies...

article

K-Join: Knowledge-Aware Similarity Join

Zeyuan Shang, Yaxiao Liu, Guoliang Li, Jianhua Feng

IEEE Transactions on Knowledge and Data Engineering > 2016 > 28 > 12 > 3293 - 3308

Similarity join is a fundamental operation in data cleaning and integration. Existing similarity-join methods utilize the string similarity to quantify the relevance but neglect the knowledge behind the data, which plays an important role in understanding the data. Thanks to public knowledge bases, e.g., Freebase and Yago, we have an opportunity to use the knowledge to improve similarity join. To...

chapter

Fast-join: An efficient method for fuzzy token matching based string similarity join

Jiannan Wang, Guoliang Li, Jianhua Feng

2011 IEEE 27th International Conference on Data Engineering > 458 - 469

2011 27th IEEE International Conference on Data Engineering (ICDE 2011)

String similarity join that finds similar string pairs between two string sets is an essential operation in many applications, and has attracted significant attention recently in the database community. A significant challenge in similarity join is to implement an effective fuzzy match operation to find all similar string pairs which may not match exactly. In this paper, we propose a new similarity...

INFONA - science communication portal

Search results for: Guoliang Li
