An Overview of Data Streams in Salesforce Data Cloud
A Data Stream in Salesforce Data Cloud is a data ingestion pipeline that brings data from a source system into the Data Cloud environment. It defines how data is connected, imported, and stored in the Data Cloud data lake, creating a structured path through which records from source systems enter the platform.
When a Data Stream is created, Salesforce ingests the source data into Data Source Objects (DSOs), then transforms it and stores it in Data Lake Objects (DLOs). This ingested data can later be mapped to the standard Customer 360 data model, enabling analytics, segmentation, and data activation.
Data Ingestion in Salesforce Data Cloud
Data Ingestion in Salesforce Data Cloud is the process of bringing data from multiple source systems into Data Cloud so it can be stored, processed, unified, and used for analytics and activation.
During ingestion, data is collected from source systems, stored in Data Lake Objects (DLOs), and prepared for mapping to the Customer 360 data model. Data ingestion is the foundation of Salesforce Data Cloud, as all downstream features depend on it.
Common Ways to Ingest Data into Salesforce Data Cloud
Salesforce Data Cloud supports multiple data ingestion methods. The following are the most commonly used and important sources:
1. Salesforce CRM
- Uses Data Streams for ingestion
- Supports standard and custom objects
- Can connect multiple Salesforce orgs
- Supports incremental and full refresh
2. Amazon S3
- Ingests data stored in Amazon S3 buckets
- Supports large-scale and cloud-based data ingestion
- Commonly used for enterprise data lakes and analytics platforms
Additional Ingestion Options
Salesforce Data Cloud continues to support and introduce additional ingestion methods and connectors. While many options are available, the sources listed above represent the most widely used and important ingestion methods in real-world implementations.
Data Ingestion Using Another Salesforce Org
Salesforce Data Cloud supports ingesting data from another Salesforce org by establishing a secure connection between the orgs and using Data Streams. This approach is commonly used when data needs to be centralized from multiple Salesforce environments into a single Data Cloud instance.
Ingesting data from another Salesforce org enables unified reporting, analytics, and customer profiling across different Salesforce implementations.
Prerequisite: Connect a Salesforce Org to Data Cloud
Before creating a Data Stream, it is mandatory to connect a Salesforce org to Salesforce Data Cloud.
Data Streams can only ingest data from connected source systems. Without establishing this connection, Salesforce Data Cloud cannot access objects or records from another Salesforce org.
Connecting a Salesforce org enables Data Cloud to:
- Securely access data from the source org
- Read objects and fields available for ingestion
- Support data transfer from multiple Salesforce orgs into a single Data Cloud environment
This step is required when data needs to be transferred or ingested from another Salesforce org.
Steps to Connect a Salesforce Org to Data Cloud
To connect a Salesforce org:
- Open Data Cloud Setup
- Select Salesforce CRM Connections
- Click New Connection
- Choose Connect Another Salesforce Org
- Authenticate using the source Salesforce org credentials
- Complete the connection setup and verify successful connection status
Once the connection is established, the connected Salesforce org becomes available for selection when creating Data Streams.
Data Stream Operations
Creating a Data Stream
To create a Data Stream from Salesforce CRM:
- Navigate to Data Cloud → Data Streams
- Click New
- Select the data source (for example, Salesforce CRM)
- Choose the connected Salesforce org and required objects
- Configure settings such as category and refresh options
- Save and activate the Data Stream
Note: While configuring the Data Stream, you can also change the object category and create new formula fields.
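The creation steps, together with the connection prerequisite described earlier, can be pictured with a small conceptual model. This is only a sketch in plain Python and does not call any Salesforce API: `DataStreamDraft`, `create_data_stream`, `activate`, the org name `sales-org`, and the default category and refresh mode are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataStreamDraft:
    """Hypothetical stand-in for the choices made while creating a Data Stream."""
    source_org: str
    objects: list
    category: str
    refresh_mode: str
    active: bool = False


def create_data_stream(connected_orgs, source_org, objects,
                       category="Profile", refresh_mode="incremental"):
    """Build a draft stream, enforcing the 'connect the org first' prerequisite."""
    if source_org not in connected_orgs:
        raise ValueError(f"{source_org!r} is not connected to Data Cloud")
    if not objects:
        raise ValueError("select at least one object to ingest")
    return DataStreamDraft(source_org, list(objects), category, refresh_mode)


def activate(stream):
    """Mirror the final 'save and activate' step."""
    stream.active = True
    return stream


# Assumed outcome of the connection steps: one org is already connected.
connected_orgs = {"sales-org"}
stream = activate(
    create_data_stream(connected_orgs, "sales-org", ["Account", "Contact"])
)
print(stream)
```

Asking for an org that is not in `connected_orgs` raises a `ValueError`, which mirrors the rule that a Data Stream can only ingest from a connected source.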
Refresh Modes & Settings
Refresh Modes
When data changes at the source system, Salesforce Data Cloud updates the Data Lake using refresh mechanisms. The commonly used refresh modes are:
Incremental Refresh
- Processes only new or updated records
- More efficient for regular data updates
Full Refresh
- Reloads all records from the source system
- Useful when source logic changes (for example, formula field updates)
- Ensures complete data consistency when incremental refresh is insufficient
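The difference between the two modes can be illustrated with a toy model. This is a minimal sketch and not how Data Cloud implements refresh internally: the record shape, the `LastModified` timestamp used as a change marker, and the function names are assumptions made for the example.

```python
from datetime import datetime


def full_refresh(source_records):
    """Rebuild the lake copy from every source record."""
    return {rec["Id"]: dict(rec) for rec in source_records}


def incremental_refresh(lake, source_records, last_run):
    """Upsert only records modified after the previous run."""
    updated = dict(lake)
    for rec in source_records:
        if rec["LastModified"] > last_run:
            updated[rec["Id"]] = dict(rec)
    return updated


source = [
    {"Id": "001", "Name": "Acme", "LastModified": datetime(2024, 1, 1)},
    {"Id": "002", "Name": "Globex", "LastModified": datetime(2024, 1, 5)},
]
lake = full_refresh(source)

# Later, one record changes and one new record appears at the source.
source[1] = {"Id": "002", "Name": "Globex Corp", "LastModified": datetime(2024, 2, 1)}
source.append({"Id": "003", "Name": "Initech", "LastModified": datetime(2024, 2, 2)})
lake = incremental_refresh(lake, source, last_run=datetime(2024, 1, 10))

# A source-side logic change that does not touch the timestamp
# (assumed here to behave like a recalculated formula field).
source[0]["Name"] = "Acme Inc"
stale = incremental_refresh(lake, source, last_run=datetime(2024, 2, 3))
fresh = full_refresh(source)
```

In this model `stale` still holds the old name for record `001` because nothing marked it as changed, while `fresh` picks up the new value. That is the situation in which a full refresh is the safer choice.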
Types of Data Streams (Categories)
While Salesforce documentation commonly refers to these as categories, Data Streams are typically classified into the following types during creation:
Profile Data
- Contains identity or descriptive attributes
- Used for building unified customer profiles
Engagement Data
- Contains event-based interaction data
- Requires a DateTime field to capture event timing
Other Data
- Includes data that does not fit into Profile or Engagement categories
- Commonly used for transactional or miscellaneous records
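The category rules above can be expressed as a simple pre-check. This is an illustrative sketch only: `validate_category` and the sample records are hypothetical, and the check merely encodes the rule stated above that Engagement data needs a DateTime field.

```python
from datetime import datetime

CATEGORIES = {"Profile", "Engagement", "Other"}


def validate_category(category, sample_record):
    """Return a list of problems; an empty list means the choice looks valid."""
    problems = []
    if category not in CATEGORIES:
        problems.append(f"unknown category: {category}")
    elif category == "Engagement":
        # Engagement records must carry a timestamp for the event.
        has_datetime = any(isinstance(v, datetime) for v in sample_record.values())
        if not has_datetime:
            problems.append("Engagement data needs a DateTime field for event timing")
    return problems


# Hypothetical sample rows: a click event and a contact profile.
click = {"ContactId": "003A", "Url": "/pricing", "ClickedAt": datetime(2024, 3, 1, 9, 30)}
contact = {"ContactId": "003A", "Email": "a@example.com"}

print(validate_category("Engagement", click))    # no problems reported
print(validate_category("Engagement", contact))  # reports the missing DateTime field
print(validate_category("Profile", contact))     # no problems reported
```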
Conclusion
This article walked through the main pieces of working with Data Streams in Salesforce Data Cloud: connecting a Salesforce org, creating a Data Stream, choosing a refresh mode, and selecting a category. Together these steps form the basis for real-world data integration scenarios.
Have questions? Learn more about our services at support@astreait.com or visit astreait.com to schedule a consultation.