
Generating AppFlow flows using CloudFormation templates

Salesforce is a great tool for managing, keeping in touch with, and monitoring our members. Our researchers use its data to build models such as logistic regressions or neural networks and to validate existing models.
To query the data in Salesforce, you can either use SOQL (which is less suitable for research use) or fetch the data into your own storage and query it with more suitable tools such as Athena.

We at Assured Allies use AWS AppFlow, a fully managed integration service that lets you securely exchange data between SaaS applications such as Salesforce and AWS services such as S3 and Redshift.
At Assured Allies we use AppFlow to fetch SF objects and store them as Parquet files on S3; later, our ETL/ELT processes clean, transform, and enrich the files so we can serve them to the researchers.
I'm not going to dive into ETLs or Parquet in this post for lack of time, but I would love to touch on the first link in this chain: fetching multiple SF objects using AppFlow.

Creating one scheduled flow manually in AppFlow's console isn't very hard, and it will fetch one object from SF. But if you fully utilize SF as we do, you'll need to create many flows (in our case ~60 objects * 2 environments), so we looked for a better, automatic way to create them.
There are many examples of how to programmatically create flows, but most of them either:

  • create "on-demand" flows rather than scheduled ones,
  • rely on boto3, or
  • force you to know all the fields of the object in order to fetch them.

In this post, I will share snippets showing how to generate a general CloudFormation template that anyone can use to create scheduled flows.
You'll be able to find the template and related code in AA's public GitLab repo.

AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: Dumping SF object to S3
Parameters:
  ObjectName:
    Type: String
  ScheduleStartTime:
    Type: String
  S3Bucket:
    Type: String
  S3Prefix:
    Type: String
  Connector:
    Type: String

The template receives five parameters:

S3Bucket and S3Prefix are the bucket name and key prefix under which the results are stored.

ObjectName is the SalesForce Object we want to fetch.

ScheduleStartTime is the scientific notation of the unixtime for the first occurrence of the flow.
For example, for 2022-04-11 00:00:00+00:00, ScheduleStartTime will be 1.64962440E9; the repo has a small Python script for calculating ScheduleStartTime.
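
The repo's script isn't reproduced here, but a minimal sketch of the conversion could look like this (the function name and the eight-digit formatting are my assumptions, not necessarily what the repo does):

from datetime import datetime, timezone

def schedule_start_time(dt: datetime) -> str:
    # timestamp() returns the unixtime as a float (seconds since the epoch)
    unixtime = dt.astimezone(timezone.utc).timestamp()
    # Render with eight decimal digits, e.g. 1.64963520E+09, then drop
    # the '+0' to match the form the template expects
    return f"{unixtime:.8E}".replace("E+09", "E9")

# The exact output depends on the timezone of the datetime you pass in
print(schedule_start_time(datetime(2022, 4, 11, tzinfo=timezone.utc)))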

Connector is the name of the connector profile we will use to connect to SF; the easiest way to get one is to create a connection manually in the AppFlow console.
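
Once a connection exists, you can also look up its name programmatically; here's a quick boto3 sketch (assuming your AWS credentials and region are already configured):

import boto3

# List the Salesforce connector profiles registered in AppFlow
appflow = boto3.client("appflow")
response = appflow.describe_connector_profiles(connectorType="Salesforce")
for profile in response["connectorProfileDetails"]:
    print(profile["connectorProfileName"])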

Resources:
  GenericFlow:
    Type: AWS::AppFlow::Flow
    Properties:
      Description:
        Fn::Join:
        - ''
        - - 'App Flow for '
          - Ref: ObjectName
          - ' object'
      DestinationFlowConfigList:
      - ConnectorType: S3
        DestinationConnectorProperties:
          S3:
            BucketName:
              Ref: S3Bucket
            BucketPrefix:
              Ref: S3Prefix
            S3OutputFormatConfig:
              AggregationConfig:
                AggregationType: None
              FileType: PARQUET
              PrefixConfig:
                PrefixType: PATH_AND_FILENAME
                PrefixFormat: DAY

Please note that the Parquet files will be saved under S3Prefix/year/month/day; for example, a run on 2022-04-11 would land under s3://<S3Bucket>/<S3Prefix>/2022/04/11/.

      FlowName:
        Ref: ObjectName
      SourceFlowConfig:
        ConnectorProfileName:
          Ref: Connector
        ConnectorType: Salesforce
        SourceConnectorProperties:
          Salesforce:
            EnableDynamicFieldUpdate: true

EnableDynamicFieldUpdate makes AppFlow check on every run whether the SF object's fields have changed and update the flow accordingly.

            IncludeDeletedRecords: false
            Object:
              Ref: ObjectName
      Tasks:

Map_all and the empty EXCLUDE_SOURCE_FIELDS_LIST array are where the magic really is. Without these two you'd need to map every field of the object one by one, and whenever a new field was added to the object you'd have to change the template and redeploy it; Map_all saves us the trouble.

      - TaskType: Map_all
        SourceFields: []
        TaskProperties:
        - Key: EXCLUDE_SOURCE_FIELDS_LIST
          Value: '[]'
        ConnectorOperator:
          Salesforce: NO_OP
      TriggerConfig:
        TriggerType: Scheduled
        TriggerProperties:
          DataPullMode: Complete
          ScheduleExpression: rate(1days)
          ScheduleOffset: 0
          ScheduleStartTime:
            Ref: ScheduleStartTime

ScheduleExpression states the recurrence rate of the flow (in my example it's a daily recurrence).

The template can be deployed using sam deploy, and the flow needs to be activated once (using boto3, the AWS CLI, or a REST request).
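
For instance, here's a rough Python sketch of deploying one stack per object and activating each flow with boto3's start_flow; the stack names, parameter values, and object list are placeholders, not our actual setup:

import subprocess
import boto3

objects = ["Account", "Contact"]  # placeholder list of SF objects to fetch

for obj in objects:
    # Deploy one stack per object; --resolve-s3 lets sam manage the bucket
    # it needs for uploading the packaged template
    subprocess.run(
        [
            "sam", "deploy",
            "--stack-name", f"appflow-{obj.lower()}",
            "--resolve-s3",
            "--no-confirm-changeset",
            "--capabilities", "CAPABILITY_IAM",
            "--parameter-overrides",
            f"ObjectName={obj}",
            "ScheduleStartTime=1.64962440E9",
            "S3Bucket=my-flows-bucket",
            "S3Prefix=salesforce",
            "Connector=my-sf-connector",
        ],
        check=True,
    )
    # Scheduled flows are created deactivated; start_flow activates them
    boto3.client("appflow").start_flow(flowName=obj)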

You should create at least one flow manually to understand the different configurations; once your manual flow is set up, you can copy its configuration using tools such as boto3, the AWS CLI, or even AWS Explorer in your favorite IDE.
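
With boto3, for instance, a sketch like this dumps the sections of a manually created flow that map onto the template's properties (the flow name is a placeholder):

import json
import boto3

appflow = boto3.client("appflow")
flow = appflow.describe_flow(flowName="my-manual-flow")  # placeholder name
# Print the response sections that correspond to the template's properties
for key in ("sourceFlowConfig", "destinationFlowConfigList", "tasks", "triggerConfig"):
    print(key, json.dumps(flow[key], indent=2, default=str))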

Sign off, and links to AA
Jobs: https://www.assuredallies.com/careers/


Original Link: https://dev.to/manicqin/generating-appflows-flows-using-cloud-formations-templates-2bcm
