In this blog, we share some useful features of Einstein Analytics. Today, we discuss how to append and merge multiple datasets into one dataset.
We will use the dataflow editor for this: we will create a new dataflow and use it to combine the datasets. A dataflow in Einstein Analytics is a JSON file that contains the collection of operations to perform when creating a dataset. You can find this file in the Data Manager.
Note that you need the Dataflow Replication permission to create or edit a dataflow.
Follow the steps below to append the datasets.
- Log in to your org, open the Data Manager, and create a new dataflow.
Give the new dataflow a name and click the Create button.
After clicking Create, you will be redirected to the dataflow editor page. It will look as shown below:
I have three datasets in my org, named A, B and D.
Each has three columns: name, age and salary. Dataset A has records with names A to D, dataset B has records with names E to H, and dataset D has records with names I to L.
After appending them, we will get the following records in a single dataset.
Note: The datasets to be appended need to have identical columns; otherwise, you will get an error when you run your dataflow.
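The append rule described above can be pictured with a small Python sketch. It only illustrates the semantics and is not how Einstein Analytics implements the append transformation. The `make_dataset` and `append_datasets` helpers are hypothetical, and the age and salary values are placeholders invented for the example. Only the names A to L and the three column names come from this post.

```python
def make_dataset(names):
    # Age and salary are placeholder values invented for this illustration.
    return [{"name": n, "age": 30, "salary": 1000} for n in names]


def append_datasets(*datasets):
    """Stack the rows of all datasets, insisting on identical columns."""
    expected = None
    appended = []
    for rows in datasets:
        for row in rows:
            columns = set(row)
            if expected is None:
                expected = columns
            elif columns != expected:
                raise ValueError(
                    f"Column mismatch: {sorted(columns)} != {sorted(expected)}"
                )
            appended.append(row)
    return appended


dataset_a = make_dataset("ABCD")
dataset_b = make_dataset("EFGH")
dataset_d = make_dataset("IJKL")

appended = append_datasets(dataset_a, dataset_b, dataset_d)
print(len(appended))  # 12
print([row["name"] for row in appended])
```

Passing a dataset with a different set of columns (for example, name and city only) makes the helper raise a `ValueError`, which mirrors the error the dataflow would report.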
- Now we will append datasets A, B and D into a new dataset named AppendedDataset. For this, we will use the edgemart transformation to bring the existing datasets from the org into the dataflow builder.
- Next, we need to append these three datasets into one dataset, so we will use the append transformation.
- Our datasets have been appended. Now we only need to register the resulting dataset so that we can view it and use it to build charts. For this, we add an sfdcRegister transformation to the dataflow builder.
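Because a dataflow is stored as JSON, the three steps above (edgemart, append, sfdcRegister) correspond roughly to a definition like the one built below. This is a sketch rather than the exact file the editor generates. The node names (`Extract_A`, `Append_ABD`, `Register_Appended`) are hypothetical, and it assumes the dataset API names are simply `A`, `B` and `D`. The parameter names follow the dataflow JSON format as I understand it, so compare them with the JSON downloaded from your own dataflow.

```python
import json

# Assumed API names (aliases) of the three source datasets.
source_aliases = ["A", "B", "D"]

dataflow = {}

# One edgemart node per existing dataset.
for alias in source_aliases:
    dataflow[f"Extract_{alias}"] = {
        "action": "edgemart",
        "parameters": {"alias": alias},
    }

# The append node stacks the rows of all listed source nodes.
dataflow["Append_ABD"] = {
    "action": "append",
    "parameters": {"sources": [f"Extract_{a}" for a in source_aliases]},
}

# The sfdcRegister node publishes the result as AppendedDataset.
dataflow["Register_Appended"] = {
    "action": "sfdcRegister",
    "parameters": {
        "alias": "AppendedDataset",
        "name": "AppendedDataset",
        "source": "Append_ABD",
    },
}

print(json.dumps(dataflow, indent=2))
```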
Update and run the dataflow. You will see the new dataset being created under the Monitor tab of the Data Manager, as shown in the image.
Your dataset has been created and will be visible in Analytics Studio. Let's verify this.
We have a dataset named AppendedDataset. This is the name we gave in the sfdcRegister transformation.
As discussed earlier, the final dataset has a total of 12 rows (4 rows from each dataset), as the following screenshot shows.
That was all about adding rows from one dataset to another. Now we will discuss how to merge two datasets that share a key column.
Merging datasets is another useful feature of Salesforce Einstein Analytics. We can join two datasets that have a column in common.
In this example, we have AppendedDataset, which contains the name, age and salary information.
The second dataset has name, city and state columns.
We are going to create a new dataset that shows the city and state corresponding to each person's name.
- Create a new dataflow and use the edgemart transformation to bring in both datasets.
Now we will use the augment transformation to merge them into one dataset. Drag the augment transformation into the dataflow builder and fill in the required details.
Your dataflow builder will now look like this:
We are just one step away from merging the two datasets. Any guesses as to what remains? Yes, the last step is to register the dataset so that we can use it in our charts.
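As a rough picture of what the merge dataflow contains, the sketch below builds an equivalent JSON definition in Python. Several things here are assumptions: the node names, the `PersonLocation` alias for the second dataset (the post does not name it), the `Location` relationship label, and the lowercase column names `name`, `city` and `state`. The augment parameter names follow the dataflow JSON format as I understand it, so check them against your own downloaded dataflow.

```python
import json

dataflow = {
    # Left side: the dataset created by the append dataflow.
    "Extract_Appended": {
        "action": "edgemart",
        "parameters": {"alias": "AppendedDataset"},
    },
    # Right side: the dataset with name, city and state.
    "Extract_Location": {
        "action": "edgemart",
        "parameters": {"alias": "PersonLocation"},  # hypothetical alias
    },
    # Augment looks up city and state for each left row by matching on name.
    "Augment_Location": {
        "action": "augment",
        "parameters": {
            "left": "Extract_Appended",
            "left_key": ["name"],
            "right": "Extract_Location",
            "right_key": ["name"],
            "relationship": "Location",  # hypothetical label for added columns
            "right_select": ["city", "state"],
        },
    },
    # Register the augmented result as MergedDataset.
    "Register_Merged": {
        "action": "sfdcRegister",
        "parameters": {
            "alias": "MergedDataset",
            "name": "MergedDataset",
            "source": "Augment_Location",
        },
    },
}

print(json.dumps(dataflow, indent=2))
```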
Update and run your dataflow.
Open the Datasets tab in Analytics Studio and you will find a dataset named MergedDataset.
Open the dataset, switch to the values table, and you will find the following table:
You can see that we have a single dataset that contains records from both datasets. We merged the datasets on the name column. In the table, you will also see that the City and State columns are blank for some records. This is because our second dataset has no records for the names C, D and J.
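The blank cells follow from the lookup-style behavior of the merge: every row of the left dataset is kept, and the right-hand columns are filled in only when a matching name exists. The Python sketch below imitates that behavior. The `augment` function is a hypothetical helper and not a Salesforce API, the city and state values are invented placeholders, and the age and salary columns are left out for brevity.

```python
def augment(left_rows, right_rows, key, right_select):
    """Keep every left row; add the selected right columns, or None if no match."""
    lookup = {row[key]: row for row in right_rows}
    merged = []
    for row in left_rows:
        match = lookup.get(row[key], {})
        extra = {col: match.get(col) for col in right_select}
        merged.append({**row, **extra})
    return merged


# Left dataset: names A to L, as in AppendedDataset.
left = [{"name": n} for n in "ABCDEFGHIJKL"]

# Right dataset: no records for C, D and J. Values are placeholders.
right = [
    {"name": n, "city": f"City-{n}", "state": f"State-{n}"}
    for n in "ABEFGHIKL"
]

merged = augment(left, right, key="name", right_select=["city", "state"])
blank = [row["name"] for row in merged if row["city"] is None]
print(len(merged))  # 12
print(blank)        # ['C', 'D', 'J']
```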
In this blog, we discussed how to add rows to a dataset from other datasets that have the same schema. We then merged two datasets that share an identical column.
These two features of Einstein Analytics can be very useful when working with datasets.