Appending & Merging Datasets in Einstein Analytics

In this blog, we share some useful features of Einstein Analytics. Today, we discuss how to append and merge multiple datasets into a single dataset.

We will use the dataflow editor to achieve this: we create a new dataflow and use it to merge the datasets. A dataflow in Einstein Analytics is a JSON file that holds the collection of operations to be performed to create a dataset, and you can find this file in the Data Manager.

Note that you need the Dataflow Replication permission to create or edit a dataflow.

merging datasets image1

Follow the steps below to append the datasets.

1. Log in to your org, open the Data Manager and create a new dataflow.

merging datasets image2

2. Give the new dataflow a name and click the Create button.

merging datasets image3

3. After clicking Create, you will be redirected to the dataflow editor page, which looks like this:

merging datasets image4

4. I have three datasets in my org, named A, B and D.

merging datasets image5

Each has three columns: name, age and salary. Dataset A holds the records for names A to D, dataset B holds the records for names E to H, and dataset D holds the records for names I to L.

merging datasets image6

After appending them, we will get the following records in a single dataset.

merging datasets image7

Note: The datasets to be appended must have identical columns; otherwise you will get an error when running your dataflow.
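To make the identical-columns rule concrete, here is a minimal Python sketch of what appending does conceptually. It is not how Einstein Analytics implements the append transformation; it is only a toy model. The helper names (make_dataset, append_datasets) and the age and salary values are invented for illustration; only the dataset names and the name ranges come from this post.

```python
def make_dataset(names):
    """Build a toy dataset with name, age and salary columns (values are invented)."""
    rows = [(name, 30 + i, 1000 * (i + 1)) for i, name in enumerate(names)]
    return {"columns": ["name", "age", "salary"], "rows": rows}


def append_datasets(datasets):
    """Stack the rows of several datasets after checking that their columns match."""
    reference = next(iter(datasets.values()))["columns"]
    combined = []
    for label, dataset in datasets.items():
        if set(dataset["columns"]) != set(reference):
            raise ValueError(
                f"Dataset {label!r} has columns {dataset['columns']}, "
                f"expected {reference}"
            )
        # Reorder each row to the reference column order before stacking.
        positions = [dataset["columns"].index(col) for col in reference]
        combined.extend(tuple(row[p] for p in positions) for row in dataset["rows"])
    return {"columns": list(reference), "rows": combined}


dataset_a = make_dataset("ABCD")  # names A to D
dataset_b = make_dataset("EFGH")  # names E to H
dataset_d = make_dataset("IJKL")  # names I to L

appended = append_datasets({"A": dataset_a, "B": dataset_b, "D": dataset_d})
print(len(appended["rows"]))  # 12 rows: 4 from each dataset
```

Passing a dataset with a different column set (for example name and city only) makes this sketch raise a ValueError, which mirrors the error the note above describes.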

5. Now we will append datasets A, B and D into a new dataset named AppendedDataset. For this, we use the edgemart transformation to bring the existing datasets from the org into the dataflow builder.

merging datasets image8

6. Next, we need to append these three datasets into one dataset, so we use the append transformation.

merging datasets image9

7. Our datasets have been appended; now we only need to register the result so that we can view it and use it to build charts. So we add an sfdcRegister transformation to the dataflow.

merging datasets image10

8. Update and run the dataflow. You will see the new dataset being created under the Monitor tab of the Data Manager, as shown in the image.

merging datasets image11

9. Your dataset has been created and will be visible in Analytics Studio. Let's verify this.

merging datasets image12

We now have a dataset named AppendedDataset, which is the name we gave in the sfdcRegister transformation.

10. As discussed earlier, the final dataset contains a total of 12 rows (4 rows from each dataset), as the following screenshot shows.

merging datasets image13
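Since a dataflow is a JSON file, the steps above can also be pictured as a dataflow definition. The Python sketch below builds one as a dictionary and prints it as JSON. Treat it as an approximation: the node names (Load_A, Append_All, Register_Appended) and the helper build_append_dataflow are hypothetical, and the parameter keys reflect my understanding of the dataflow JSON format, so compare them with the JSON the dataflow editor generates in your own org.

```python
import json


def build_append_dataflow(source_aliases, target_alias):
    """Return a dataflow definition that loads, appends and registers datasets."""
    nodes = {}
    # One edgemart node per existing dataset (node names are hypothetical).
    for alias in source_aliases:
        nodes[f"Load_{alias}"] = {
            "action": "edgemart",
            "parameters": {"alias": alias},
        }
    # The append node lists the upstream nodes whose rows should be stacked.
    nodes["Append_All"] = {
        "action": "append",
        "parameters": {"sources": [f"Load_{alias}" for alias in source_aliases]},
    }
    # sfdcRegister publishes the result under the given alias and display name.
    nodes["Register_Appended"] = {
        "action": "sfdcRegister",
        "parameters": {
            "alias": target_alias,
            "name": target_alias,
            "source": "Append_All",
        },
    }
    return nodes


append_dataflow = build_append_dataflow(["A", "B", "D"], "AppendedDataset")
print(json.dumps(append_dataflow, indent=2))
```

Each node refers to its input by node name, which is how the editor's boxes and arrows end up as plain JSON.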

That was all about adding the rows of one dataset to another. Now we will discuss how to merge two datasets that share a key column.

Merging datasets is another useful feature of Salesforce Einstein Analytics. We can join two datasets that have a column in common.

In this example, we have the AppendedDataset, which contains the name, age and salary information.

merging datasets image14

The second dataset has name, city and state columns.

merging datasets image15

We are going to create a new dataset that shows the city and state corresponding to each person's name.
The steps are given below:

1. Create a new dataflow and use the edgemart transformation to load both datasets.

merging datasets image16

2. Now we use the augment transformation to merge them into one dataset. Drag the augment transformation into the dataflow builder and fill in the required details.

merging datasets image17

Your dataflow builder will now look like this:

merging datasets image18

3. We are just one step away from merging the two datasets. Any guesses what remains? Yes, the last step is to register the dataset so that we can use it in our charts.

merging datasets image19

4. Update and run your dataflow.

merging datasets image20

5. Open the Datasets tab in Analytics Studio and you will find a dataset named MergedDataset.

merging datasets image21

6. Open the dataset, switch to the values table and you will find the following table:

merging datasets image22
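The merge steps above can likewise be sketched as a dataflow definition. As before, this is an illustration under assumptions rather than the exact JSON from my org: the node names, the second dataset's alias (LocationDataset), the relationship name (Location) and the helper missing_sources are all hypothetical, and the lowercase column names are taken from the prose of this post. The small helper only checks that every node points at a node that exists, which is an easy mistake to make when editing dataflow JSON by hand.

```python
import json

augment_dataflow = {
    "Load_Appended": {
        "action": "edgemart",
        "parameters": {"alias": "AppendedDataset"},
    },
    "Load_Location": {
        "action": "edgemart",
        "parameters": {"alias": "LocationDataset"},  # hypothetical alias
    },
    "Augment_Location": {
        "action": "augment",
        "parameters": {
            "left": "Load_Appended",       # every left row is kept
            "left_key": ["name"],
            "right": "Load_Location",      # lookup side
            "right_key": ["name"],
            "right_select": ["city", "state"],
            "relationship": "Location",    # hypothetical relationship name
        },
    },
    "Register_Merged": {
        "action": "sfdcRegister",
        "parameters": {
            "alias": "MergedDataset",
            "name": "MergedDataset",
            "source": "Augment_Location",
        },
    },
}


def missing_sources(dataflow):
    """List (node, reference) pairs where a node points at an undefined node."""
    missing = []
    for node_name, node in dataflow.items():
        params = node["parameters"]
        refs = [params[key] for key in ("source", "left", "right") if key in params]
        refs.extend(params.get("sources", []))
        missing.extend((node_name, ref) for ref in refs if ref not in dataflow)
    return missing


print(json.dumps(augment_dataflow, indent=2))
print(missing_sources(augment_dataflow))  # an empty list means all references resolve
```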

You can see that we now have a single dataset containing records from both datasets, merged on the name column. You will also see that the City and State columns are blank for some records. This is because our second dataset has no records for the names C, D and J.
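The blanks are easier to understand if you think of augment as a lookup from the left dataset into the right one: every left row is kept, and the right columns are filled only when a matching key exists. The Python sketch below models that behaviour. It is a conceptual toy, not the actual engine. The function augment_rows, the relationship prefix Location and all age, salary, city and state values are invented; only the names A to L and the missing names C, D and J come from this post.

```python
def augment_rows(left_rows, right_rows, key, right_select, relationship):
    """Keep every left row and add the selected right columns when the key matches."""
    lookup = {}
    for row in right_rows:
        # Keep one match per key (here, the first one seen).
        lookup.setdefault(row[key], row)
    merged = []
    for row in left_rows:
        match = lookup.get(row[key])
        out = dict(row)
        for col in right_select:
            # Assumption: added columns are prefixed with the relationship name.
            out[f"{relationship}.{col}"] = match[col] if match else None
        merged.append(out)
    return merged


# Left side: the 12 appended rows (age and salary values are invented).
people = [
    {"name": n, "age": 30 + i, "salary": 1000 * (i + 1)}
    for i, n in enumerate("ABCDEFGHIJKL")
]
# Right side: no rows for C, D and J (city and state values are invented).
locations = [
    {"name": n, "city": f"City-{n}", "state": f"State-{n}"}
    for n in "ABEFGHIKL"
]

merged_rows = augment_rows(people, locations, "name", ["city", "state"], "Location")
blank_names = [r["name"] for r in merged_rows if r["Location.city"] is None]
print(len(merged_rows))  # 12
print(blank_names)       # ['C', 'D', 'J']
```

The row count stays at 12 because the left dataset drives the result; unmatched names simply carry empty city and state values.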

In this blog, we discussed how to add rows to a dataset from other datasets that have the same schema. We then merged two datasets that share an identical column.

These are two useful features of Einstein Analytics that can be very important when working with datasets.