When importing data from CSV files created by various sales or accounting systems, you might find they sometimes contain metadata at the top of the file.
Metadata is just data about the file such as when the report was created by the system or the time period of the reporting. In the case of CSV files, this will likely be in the first couple of rows of data before the report data begins.
When importing the data with power query, you would remove the top rows that contain the metadata and then promote the first row of the report data to column headers.
What if the metadata contains data you need to keep in your report like an account name?
In this example, the reporting data doesn’t have the account name as a column in the data. If we remove the top rows of metadata, then we will lose this information.
In this post, we’ll explore how we can import a CSV file and add the metadata to a column in our report with power query (also called Get & Transform).
Import the CSV File
We will first need to import our CSV file. Go to the Data tab and press the From Text/CSV command to create a new import query. We will then need to navigate to the location of our CSV report and then select it.
Now we will see a preview of the data and we can press the Edit button. This will open the query editor where we can add the account name into the data and then remove the top metadata rows.
If the first row of data in a CSV file contains column headings, Excel will automatically promote the first row into column headings. Notice the columns in our data are given generic column headings of Column1, Column2, and Column3.
Add a Custom Column with the Metadata Value
We need to add a new column to the data to bring in the account name.
When the first column of our data contains the value “Account”, then the second column will contain the name of our account.
We’ll create a new column that contains the account name when our first column contains “Account” and is null otherwise.
Go to the Add Column tab in the query editor and press the Custom Column command.
Name the new column as Account Value and enter the following formula into the formula editor and press the OK button.
if [Column1]="Account" then [Column2] else null
This will create a column with the account value from the second column in one row and null values everywhere else.
We can then use the fill down command to fill this value down the column. Right click on the Account Value column heading and select Fill then select Down.
This will fill any non-null values down a column until the next non-null value appears. In our column, there is only one non-null value so this will fill in the account name to the end row of the table.
Add a Custom Column with the Metadata Label
We are not quite done.
If we were to remove the top 6 rows now and promote the remaining row to column headers, then our account column would get renamed to the specific account name Some Company Inc.
We could rename the column from Some Company Inc to Account Name, but then our query would throw an error if we ever tried to import a file with a different account name in the metadata.
We need to add in another column that will have the column heading we want in the specific row which contains all the other column headings and then the actual account name in all the other columns.
Go to the Data tab and add another Custom Column.
We can name it Account and then enter the following formula and press the OK button.
if [Column1]="Product" then "Account" else [Account Value]
This will create an almost duplicate column as our previous custom column, the only difference is it will have the value “Account” in the row that contains all the other column headings.
Remove Temporary Column
The Account Value column was only a temporary column needed as an intermediary step in our transformation process and can be removed now. Right click on the column heading and select Remove.
Remove Top Rows with Metadata and Promote Headers
Since we’ve now got our account name into a column in the data, we no longer need the top rows of metadata and can get rid of them. Go to the Home tab and press the Remove Rows command and select Remove Top Rows from the menu.
A dialog box will pop up and ask you to specify how many rows to remove from the top. Our data has 6 rows before the report data begins, so we can enter 6 and press the OK button.
Now we can promote the first row to our column headings. Go to the Home tab and press the Use First Row as Headers command.
We are now finished with our query and we can Close & Load it.
We now have a query which will take the account name found in the metadata section of our CSV file and fill a new column with that information.
This will work regardless of what account name we have as long as it’s labelled in the first column as Account. We can import different files using the same query and keep track of which account it is for.
We can repeat the same process to add other columns from the metadata.