Workflows

Build automated data pipelines using the dcupl Console's visual graph-based editor. Connect nodes, define data flow, and process data at scale.

What are Workflows?

Workflows are directed acyclic graphs (DAGs) where data flows through connected processing steps:

flowchart LR
  T[Trigger] --> F[Fetch Data]
  F --> TR[Transform]
  TR --> S[Save Results]
  F -->|error| E[Handle Error]

  • Nodes are execution units that process data
  • Edges define how data flows between nodes
  • Items are discrete data units moving through the graph
  • Ports are typed connection points (success, error, custom)

Creating a Workflow

  1. Open your project in the Console
  2. Navigate to Workflows in the sidebar
  3. Click Create Workflow
  4. Enter a name and key for your workflow

The visual workflow editor opens with an empty canvas ready for design.

Triggers

Every workflow starts with a trigger that defines how it's activated:

| Trigger   | Description                     | Use Case                                 |
| --------- | ------------------------------- | ---------------------------------------- |
| Manual    | Trigger from Console or API     | Testing, on-demand processing            |
| Webhook   | HTTP endpoint receives requests | External integrations, real-time events  |
| Scheduled | Runs on a cron schedule         | Daily syncs, periodic updates            |

Adding a Trigger

  1. From the node palette, drag a trigger onto the canvas
  2. Configure the trigger type and settings
  3. Connect it to your first processing node

Node Types

Drag nodes from the palette onto the canvas to build your workflow.

Request Node

Makes HTTP requests to external APIs:

  • Configure URL, method, headers, and body
  • Access workflow variables with {{variables.apiToken}}
  • Success data flows to the main port
  • Errors route to specialized ports (error, client-error, server-error, timeout)

Script Node

Executes JavaScript for data transformation:

  • Access input items with $items() or $json()
  • Return transformed items
  • Route to multiple output ports with $ports()
  • Use built-in utilities like fetch(), csvToJson(), jsonToCsv()
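
The Script Node behaviors above can be sketched as follows. `$items()` and `$ports()` are the built-in helpers named in the list; they are stubbed here so the sketch runs standalone — inside a real Script Node the runtime provides them:

```javascript
// Stubs standing in for the Script Node runtime helpers (illustrative only).
const $items = () => [
  { json: { name: "ada", active: true } },
  { json: { name: "bob", active: false } },
];
const routed = { valid: [], invalid: [] };
const $ports = (port, items) => routed[port].push(...items);

// Transform each input item, then route it to an output port by a condition.
const transformed = $items().map((item) => ({
  json: { ...item.json, name: item.json.name.toUpperCase() },
}));

$ports("valid", transformed.filter((i) => i.json.active));
$ports("invalid", transformed.filter((i) => !i.json.active));
```

The port names `valid` and `invalid` are hypothetical custom ports chosen for the example.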

File Nodes

Work with files from various storage backends:

| Node        | Purpose                                  |
| ----------- | ---------------------------------------- |
| dcupl-files | Read/write local project files           |
| git-files   | Clone, commit, push to Git repositories  |
| s3-files    | Upload/download from AWS S3              |
| azure-files | Upload/download from Azure Blob Storage  |

Data Mapper Node

Transform data structures with visual field mapping:

  • Map source fields to target fields
  • Rename properties
  • Flatten or nest structures

dcupl Instance Node

Query and update dcupl data directly in workflows:

  • Initialize dcupl instances
  • Run queries against models
  • Update data programmatically

Connecting Nodes with Edges

Edges define how data flows between nodes.

Creating Connections

  1. Click on a node's output port
  2. Drag to another node's input port
  3. Release to create the connection

Edge Conditions

Filter items that flow through an edge:

  1. Click on an edge to select it
  2. Add a condition expression (e.g., $json.status === 'active')
  3. Only items matching the condition pass through

Edge Transforms

Modify items as they flow through:

  1. Select an edge
  2. Add a transform expression
  3. Items are transformed before reaching the target node
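
Conceptually, an edge condition is a per-item filter and an edge transform is a per-item map. A minimal sketch, where the expression syntax matches the example above but `applyEdge` itself is an illustrative helper rather than a documented API:

```javascript
const items = [
  { json: { id: 1, status: "active" } },
  { json: { id: 2, status: "archived" } },
];

const condition = ($json) => $json.status === "active";  // edge condition
const transform = ($json) => ({ ...$json, seen: true }); // edge transform

// Hypothetical helper: filter items by the condition, then transform
// the survivors before they reach the target node.
const applyEdge = (items, condition, transform) =>
  items
    .filter((item) => condition(item.json))
    .map((item) => ({ json: transform(item.json) }));

const delivered = applyEdge(items, condition, transform);
// Only item 1 passes, and it arrives with seen: true.
```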

Data Flow Patterns

Linear Flow (ETL)

flowchart LR
  E[Extract] --> T[Transform] --> L[Load]

Simple sequential processing from source to destination.

Parallel Branches

flowchart LR
  S[Source] --> A[Process A]
  S --> B[Process B]
  A --> M[Merge]
  B --> M

Process data through multiple paths simultaneously, then combine results.

Conditional Routing

flowchart LR
  V[Validate] -->|valid| P[Process]
  V -->|invalid| E[Log Error]

Route items to different nodes based on conditions.

Merge Strategies

When multiple edges connect to a single node, configure how incoming data is combined:

| Strategy     | Behavior                  | Use Case                     |
| ------------ | ------------------------- | ---------------------------- |
| Append       | Concatenate all items     | Aggregating multiple sources |
| Merge        | Combine by index position | Parallel enrichment          |
| Merge by Key | Join by common field      | SQL-style joins              |
| Wait All     | Keep inputs separate      | Comparing datasets           |
| First        | Use first to complete     | Failover scenarios           |

Configuring Merge Strategy

  1. Select a node with multiple incoming edges
  2. Open the node configuration
  3. Choose the merge strategy
  4. For merge-by-key, specify the key field
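
To make the merge-by-key strategy concrete, here is a minimal sketch of joining items from two incoming edges on a shared `id` field. This is an illustration of the concept, not the runner's actual implementation:

```javascript
// Join two item streams on a common key field, SQL-style left join:
// every left item passes through, enriched with the matching right item.
const mergeByKey = (left, right, key) => {
  const byKey = new Map(right.map((item) => [item.json[key], item.json]));
  return left.map((item) => ({
    json: { ...item.json, ...(byKey.get(item.json[key]) ?? {}) },
  }));
};

const users = [
  { json: { id: 1, name: "Ada" } },
  { json: { id: 2, name: "Bob" } },
];
const orders = [{ json: { id: 1, total: 42 } }];

const merged = mergeByKey(users, orders, "id");
// → [{ json: { id: 1, name: "Ada", total: 42 } }, { json: { id: 2, name: "Bob" } }]
```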

Error Handling

Workflows treat errors as data that can be routed and processed.

Error Ports

Every node has an error output port that triggers when:

  • An exception is thrown
  • Validation fails
  • A timeout occurs

Connect error ports to handle failures gracefully:

flowchart LR
  A[API Call] -->|success| P[Process]
  A -->|error| E[Handle Error]

  • Error port connected: Workflow continues, error becomes data
  • Error port not connected: Workflow stops, error propagates

HTTP-Specific Error Ports

Request nodes provide specialized error routing:

| Port         | Status | Description                           |
| ------------ | ------ | ------------------------------------- |
| main         | 2xx    | Success responses                     |
| client-error | 4xx    | Bad request, not found, unauthorized  |
| server-error | 5xx    | Server problems (retry these)         |
| timeout      | —      | Request took too long                 |
| error        | —      | Network errors, DNS failures          |
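
The routing in the table can be sketched as a simple classifier. Port names match the table; the `result` shape (`status`, `timedOut`, `networkError`) is an assumption made for the example:

```javascript
// Pick a Request Node output port from an HTTP result (illustrative only).
function portForResult(result) {
  if (result.networkError) return "error";        // DNS failure, connection refused, …
  if (result.timedOut) return "timeout";          // request exceeded its time budget
  const s = result.status;
  if (s >= 200 && s < 300) return "main";         // success
  if (s >= 400 && s < 500) return "client-error"; // caller's fault, don't retry blindly
  if (s >= 500) return "server-error";            // server's fault, candidate for retry
  return "error";                                 // anything unexpected
}
```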

Error Handling Patterns

Fallback to backup:

flowchart LR
  P[Primary] -->|error| B[Backup]
  P -->|success| R[Result]
  B -->|success| R

Graceful degradation:

flowchart LR
  F[Fetch Live] -->|success| P[Process]
  F -->|error| D[Use Defaults]
  D --> P

Variables

Store configuration values and secrets that can be accessed throughout workflows.

Adding Variables

  1. Click Settings in the workflow toolbar
  2. Navigate to Variables
  3. Add key-value pairs
  4. Mark sensitive values as Secret

Using Variables

Reference variables in node configurations:

  • URLs: {{variables.apiBaseUrl}}/users
  • Headers: Bearer {{variables.apiToken}}

Variables resolve from (highest to lowest priority):

  1. Runtime variables (passed when triggering)
  2. Project-level variables
  3. Workflow template variables
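
A minimal sketch of how `{{variables.key}}` placeholders could resolve with that precedence (runtime over project over template). `resolveVariables` is an illustrative helper, not a documented function:

```javascript
// Resolve {{variables.key}} placeholders, preferring runtime variables,
// then project-level variables, then workflow template variables.
const resolveVariables = (template, { runtime = {}, project = {}, workflowTemplate = {} }) =>
  template.replace(/\{\{variables\.(\w+)\}\}/g, (_, key) =>
    runtime[key] ?? project[key] ?? workflowTemplate[key] ?? "");

const url = resolveVariables("{{variables.apiBaseUrl}}/users", {
  project: { apiBaseUrl: "https://api.example.com" },
});
// → "https://api.example.com/users"
```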

Response Configuration

Configure how your workflow returns results:

| Type   | Description                                |
| ------ | ------------------------------------------ |
| Status | Return execution status only               |
| Script | Custom response with transformed data      |
| Debug  | Full execution details (development only)  |

Deploying Workflows

Save and Deploy

  1. Click Save to save your workflow
  2. Click Deploy to deploy to a runner
  3. Select the target runner instance
  4. Wait for status to show Ready

Testing

  • Use the Run button for manual execution
  • Send requests to the webhook endpoint
  • View execution logs in the Console

Memory Efficiency

V3 workflows use item-based data flow for significant memory improvements:

| Dataset Size | Memory Usage |
| ------------ | ------------ |
| 10K items    | ~10MB        |
| 100K items   | ~10MB        |
| 1M items     | ~10MB        |

Previous node results are garbage collected as items flow forward.

Best Practices

Node Design

  • Keep nodes focused on single responsibility
  • Use descriptive names
  • Handle errors with dedicated error ports

Performance

  • Use stream mode for large datasets (100K+ items)
  • Set buffer sizes for backpressure control
  • Monitor memory in production

Testing

  • Test with realistic data volumes
  • Verify merge strategies work correctly
  • Test error paths thoroughly

What's Next?