Microsoft Excel has just released a cool new add-in called Transform Data by Example to help make cleaning and transforming data easier. This add-in will allow you to quickly standardize time and dates, names, postal addresses, email addresses and much more! All you need to do is provide a few examples and Excel will find the transformation function for you. It does this by searching its large index of sophisticated transformation functions from services such as GitHub, Stack Overflow, .NET and Bing Maps to find the best match.
How to Install the Add-In?
The add-in is only available for Excel 2016, so if you don’t have this version you will need to upgrade before you can install it. If you do have Excel 2016 you can install it from here.
Follow the link above and then click on the Add button.
Then click on Open in Excel.
You will then need to add this to your trusted add-ins. You will see this notification in the right hand add-in window and you will need to press the Trust this add-in button.
Your add-in should be installed now and ready to be activated from your My Add-ins in the ribbon.
- Go to the Insert tab in the ribbon.
- Under the Add-ins section, press the My Add-ins button.
- You will see the Transform Data by Example add-in in the Office Add-ins pop up window. Select this add-in then press the Add button.
Your add-in is now ready to be used.
- Go to the Data tab in the ribbon.
- Transform Data by Example is found in its own section called Data Wrangling.
How to Use the Add-In?
Microsoft has provided some sample data to play with and give you an idea of what the add-in is capable of doing for you. You can get the sample data here.
In this example we are going to transform a list of names in various formats into a list of Last Name, First Name.
- In column B we have a list of names in various formats.
- Some have prefixes like Sir, Mr., Mrs, and Dr.
- Some have suffixes like M.D., Sr., Jr. and III.
- Some have middle names or initials.
- Some are already in Last, First format.
- In column C we have typed out a few examples of what we want our data to look like and our active cell cursor is now just below our examples.
- Go to the Data tab in the ribbon.
- Select Transform Data by Example.
A Transform Data by Example window will appear on the right. Press the Get Transformations button.
- A list of transformations from the search will be returned.
- Hover your mouse cursor over any of the transformations returned to preview the results.
- You can see a live preview of the transformation results in your data.
Compare different transformations.
- Press the Pin button in any transformation to compare it against any other transformation.
- Hover your mouse cursor over any other transformation to compare it against your pinned transformation.
- The live preview in your data will show the differences between the two transformations.
- Blue cells are where the pinned transformation and other transformation have the same results.
- Yellow cells are where the other transformation has different results.
In this case it looks like the CSharpNameParser transformation does a better job and has the results we are looking for. Select it by pressing the Lightning icon in the top right corner of the tranformation. Transformed values will now appear in your data.
How is This Different From Flash Fill?
You might be thinking this sounds very familiar. Excel already has a feature called Flash Fill (found in the Data Tools section of the Data ribbon) that transforms your data automatically based on a few examples. This feature can transform your data based on basic patterns, like in our example if all the names were of the form First Last then Flash Fill could easily transform the data to the required Last, First format. Our example with many different prefixes, suffixes, middle names and initials was just too complex for Flash Fill. Transform Data by Example is like the more advanced version and uses sophisticated algorithms built by others to transform your data.