Clean up similar text in Excel with Power Query fuzzy clustering

David Benaim

This brilliant feature lets you clean up any column containing similar text, such as misspelt names or different ways of writing the same thing. It uses Power Query, but the video is suitable even for Power Query novices. It works in either Excel or Power BI using the Table.AddFuzzyClusterColumn function.
Ideally one should use data validation at the point of entry, but where that was not done, this method could save hours of data clean-up work. I also show how it can be used in dataflows.

I show how to use the code in a table, see the similarity % for each match and set the minimum level, override some matches, and even use a transformation table to say, for example, that England should change to the UK.
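
For readers who want to try this before watching, here is a minimal sketch of such a query in Power Query M. It is not the code from the video: the sample rows, the column names "Country", "Country_Cleaned" and "Similarity", and the step names are all assumptions made for illustration. The option names (IgnoreCase, IgnoreSpace, Threshold, SimilarityColumnName, TransformationTable) are the ones Table.AddFuzzyClusterColumn accepts, and the transformation table needs columns named From and To.

```powerquery
let
    // Hypothetical sample data (assumption): a column with inconsistent spellings.
    Source = Table.FromRecords(
        {
            [ID = 1, Country = "England"],
            [ID = 2, Country = "england "],
            [ID = 3, Country = "UK"],
            [ID = 4, Country = "France"],
            [ID = 5, Country = "Frnace"]
        },
        type table [ID = number, Country = text]
    ),

    // Transformation table (assumption): treat "England" as "UK" when matching.
    // The columns must be named From and To.
    Mappings = Table.FromRecords(
        {[From = "England", To = "UK"]},
        type table [From = text, To = text]
    ),

    // Add a cleaned column next to the original one.
    Clustered = Table.AddFuzzyClusterColumn(
        Source,
        "Country",            // column to clean
        "Country_Cleaned",    // new column holding the cluster value
        [
            IgnoreCase = true,                    // "england" matches "England"
            IgnoreSpace = true,                   // ignore stray spaces
            Threshold = 0.8,                      // minimum similarity, 0 to 1
            SimilarityColumnName = "Similarity",  // adds the similarity score per row
            TransformationTable = Mappings
        ]
    )
in
    Clustered
```

Lowering Threshold groups more aggressively and raising it groups less; which value ends up representing each cluster depends on the data, so check the Similarity column before trusting the result.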

Example files can be found here: www.xlconsultingasia.com/youtubefiles

00:00 Introduction
01:13 Get data into Power Query
02:46 Fuzzy cluster column function
05:44 Similarity thresholds, case, spacing
09:07 Transformation table & manual adjustments
11:22 Troubleshooting issues
12:36 Add fuzzy match to existing table
14:44 Power Query Online UI
15:28 Fuzzy grouping

posted by Russin8w