App Architecture

SAP → Snowflake communication

https://datavard.atlassian.net/wiki/spaces/DATAVARD/pages/5110731989

Data Organization

In the application, data is structured into sources, each linked to a single storage connection from SNP Glue. Each source represents a schema and maintains its own instance of settings, allowing data to be managed separately across different environments, such as development, quality, and production.

Sub-Schemas for Logical Separation

A source can include logical sub-schemas, which place the generated merge objects in schemas separate from the staging tables. As a result, a staging table and its merge table can share the same name without colliding. These sub-schemas are:

  • Automatically created when tables are set up

  • Automatically removed during housekeeping if they remain empty
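The sub-schema lifecycle described above can be illustrated with a small in-memory model. This is only a sketch: the `Source` class, its method names, and the sub-schema names `STAGING` and `MERGE` are hypothetical and are not taken from the application.

```python
from typing import Dict, List, Set


class Source:
    """Toy model of a source and its logical sub-schemas (hypothetical names)."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Maps sub-schema name -> names of the tables it currently holds.
        self.sub_schemas: Dict[str, Set[str]] = {}

    def create_table(self, sub_schema: str, table: str) -> None:
        # Setting up a table creates its sub-schema on first use.
        self.sub_schemas.setdefault(sub_schema, set()).add(table)

    def drop_table(self, sub_schema: str, table: str) -> None:
        # Dropping a table leaves the (possibly empty) sub-schema in place.
        if sub_schema in self.sub_schemas:
            self.sub_schemas[sub_schema].discard(table)

    def housekeeping(self) -> List[str]:
        # Remove every sub-schema that has remained empty; return their names.
        empty = [name for name, tables in self.sub_schemas.items() if not tables]
        for name in empty:
            del self.sub_schemas[name]
        return empty
```

In this model, a table named `MARA` can exist in both `STAGING` and `MERGE` at the same time, and `MERGE` disappears only once its last table is dropped and `housekeeping()` has run.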

Controller Tasks

Each source includes controller tasks, which are serverless tasks that manage internal operations, e.g. triggering merge execution based on the size of the delta.
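As a rough illustration of the delta-size trigger mentioned above, a controller's decision could look like the following sketch. The `StagingState` type, the `tables_to_merge` function, and the threshold of 100,000 rows are assumptions made for this example; the actual controller logic and its settings are not described on this page.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class StagingState:
    table: str
    pending_delta_rows: int  # rows captured as delta but not yet merged


def tables_to_merge(states: Iterable[StagingState],
                    max_delta_rows: int = 100_000) -> List[str]:
    """Return the tables whose pending delta has reached the threshold.

    The threshold value is hypothetical; a real controller would read it
    from the settings of the source.
    """
    return [s.table for s in states if s.pending_delta_rows >= max_delta_rows]
```

Only tables at or above the threshold are returned, so small deltas stay in the stream until they are worth merging.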

Merge Objects

Each staging table is created together with four merge objects:

  • Merge table: Table with de-duplicated data

  • Stream: Captures delta on the staging table

  • View: Provides a real-time, de-duplicated result by combining the delta from the stream with the merge table

  • Task: Merges the delta from the stream into the merge table. Because the view already returns up-to-date data, the task should run as infrequently as possible.
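How the four merge objects interact can be sketched with a small in-memory simulation. This is not the application's implementation: the `MergeObjects` class, the `key` and `seq` columns used for de-duplication, and the "highest `seq` wins" rule are assumptions chosen for illustration, and deletions are ignored.

```python
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def latest_per_key(rows: List[Row],
                   base: Optional[Dict[str, Row]] = None) -> Dict[str, Row]:
    """De-duplicate by 'key', keeping the row with the highest 'seq'."""
    result = dict(base or {})
    for row in rows:
        current = result.get(row["key"])
        if current is None or row["seq"] >= current["seq"]:
            result[row["key"]] = row
    return result


class MergeObjects:
    """Toy model of a staging table and its merge objects."""

    def __init__(self) -> None:
        self.staging: List[Row] = []           # staging table, may hold duplicates
        self.merge_table: Dict[str, Row] = {}  # merge table, one row per key
        self._offset = 0                       # position of the first unmerged row

    def load(self, rows: List[Row]) -> None:
        self.staging.extend(rows)

    def stream(self) -> List[Row]:
        # Stream: the delta on the staging table since the last merge.
        return self.staging[self._offset:]

    def view(self) -> Dict[str, Row]:
        # View: merge table overlaid with the de-duplicated delta,
        # computed on read without modifying the merge table.
        return latest_per_key(self.stream(), self.merge_table)

    def run_task(self) -> int:
        # Task: persist the delta into the merge table and consume the stream.
        delta = self.stream()
        self.merge_table = latest_per_key(delta, self.merge_table)
        self._offset = len(self.staging)
        return len(delta)
```

In this model, `view()` returns the same result before and after `run_task()`. The task changes only where the de-duplicated rows are stored, which is why it can run rarely without readers of the view seeing stale data.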

Housekeeping

Housekeeping is responsible for the following operations: