Bidirectional Data Sync Patterns
Traditional data serving layers are read-only: they pull data from slow backends and serve it from a fast cache, but changes flow in only one direction. flAPI's write support enables bidirectional sync patterns where data flows both ways: reads come from the cache, and writes update both the cache and the backend.
This guide covers patterns for keeping your flAPI cache and backend data warehouse in sync.
The Problem with One-Way Sync
Traditional data architectures face challenges:
Warehouse → Cache → Consumers (one-way flow)
        ↓
stale data
lost updates
dual sources of truth
When consumers can only read from the cache:
- Updates to the backend aren't immediately reflected in the cache
- Updates to the cache have no path back to the warehouse
- You need separate systems to sync data back
- Risk of conflicting data between cache and warehouse
The Solution: Bidirectional Sync
With flAPI's write support, data can flow both ways:
Warehouse ⇄ Cache ⇄ Consumers
(read & write in both directions)
Benefits:
- Single source of truth - All reads and writes go through flAPI
- Immediate consistency - Updates reflected instantly
- Simplified architecture - No separate sync pipelines
- Transaction safety - ACID guarantees on all operations
- Audit trail - All changes logged in warehouse
Pattern 1: Write-Through
When: Need immediate consistency between cache and warehouse
How: Every write goes directly to the warehouse, then updates the cache
Consumer request to write
        ↓
Validate in flAPI
        ↓
Write to Warehouse
        ↓
Update Cache (if separate)
        ↓
Return result to consumer
Implementation
Endpoint configuration:
url-path: /api/customers/:customer_id
method: PUT
operation:
type: write
validate-before-write: true
returns-data: true
transaction: true
request:
- field-name: customer_id
field-in: path
required: true
validators:
- type: int
min: 1
- field-name: email
field-in: body
required: false
validators:
- type: string
regex: '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
template-source: customer-update.sql
connection: [snowflake] # Write directly to warehouse
SQL:
-- Direct write to Snowflake
UPDATE customers
SET
email = COALESCE(:email, email),
updated_at = CURRENT_TIMESTAMP()
WHERE customer_id = :customer_id
RETURNING *
Advantages:
- ✅ Immediate consistency
- ✅ No sync delay
- ✅ Simple to understand
Disadvantages:
- ❌ Slower than writing to local cache (warehouse latency)
- ❌ Network failures block the write
- ❌ Can't write if warehouse is unavailable
Pattern 2: Write-Back (Eventual Consistency)
When: Need fast writes with eventual consistency
How: Write to local cache immediately, sync to warehouse asynchronously
Consumer writes to flAPI
        ↓
Write to DuckDB Cache (fast)
        ↓
Return to consumer immediately
        ↓
[Async] Sync to Warehouse
        ↓
[Async] Handle conflicts
Implementation
Endpoint writes to DuckDB:
url-path: /api/orders/
method: POST
operation:
type: write
validate-before-write: true
returns-data: true
transaction: true
request:
- field-name: customer_id
required: true
validators:
- type: int
min: 1
- field-name: total_amount
required: true
validators:
- type: string
regex: '^\d+(\.\d{2})?$'
template-source: order-create.sql
connection: [local-duckdb] # Write to fast local cache
SQL:
INSERT INTO orders (
customer_id,
total_amount,
status,
created_at,
synced_to_warehouse
)
VALUES (
:customer_id,
:total_amount,
'pending',
CURRENT_TIMESTAMP(),
false
)
RETURNING *
Separate sync process:
Create a scheduled job to sync pending writes to the warehouse:
-- Sync pending orders to Snowflake, atomically with the flag update
BEGIN TRANSACTION;

INSERT INTO snowflake.orders
SELECT * FROM orders WHERE synced_to_warehouse = false;

-- Mark the pushed rows as synced
UPDATE orders SET synced_to_warehouse = true
WHERE synced_to_warehouse = false;

COMMIT;

-- If the same row also changed in the warehouse, resolve the conflict,
-- e.g. keep the version with the greater updated_at (see the strategies below)
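The schema-qualified snowflake.orders above assumes the warehouse is reachable from the cache database as an attached catalog. A minimal sketch of that setup using DuckDB's ATTACH, with a Postgres-compatible warehouse standing in for illustration (the connection string and alias are hypothetical, not flAPI configuration):
-- Hypothetical: attach the warehouse as a catalog named "snowflake"
-- so cross-database statements like INSERT INTO snowflake.orders resolve
INSTALL postgres;
LOAD postgres;
ATTACH 'dbname=warehouse host=wh.example.com' AS snowflake (TYPE postgres);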
Advantages:
- ✅ Ultra-fast writes (local cache only)
- ✅ Resilient to warehouse outages
- ✅ Great user experience
Disadvantages:
- ❌ Eventual consistency (temporary divergence)
- ❌ Requires handling sync failures (see the retry sketch below)
- ❌ Complex conflict resolution
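Handling sync failures can be as simple as a bounded retry counter on the cache table. A minimal sketch, assuming a hypothetical sync_attempts column on orders:
-- Bump the retry counter for rows whose push to the warehouse failed
UPDATE orders
SET sync_attempts = sync_attempts + 1
WHERE synced_to_warehouse = false;

-- Escalate rows that keep failing instead of retrying forever
SELECT order_id, sync_attempts
FROM orders
WHERE synced_to_warehouse = false
  AND sync_attempts >= 5;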
Pattern 3: Read-Through Write-Back Hybrid
When: Need balance between speed and consistency
How: Reads from cache, writes batch to warehouse on schedule
Consumers read
        ↓
Read from Cache (fast)

Consumers write
        ↓
Write to Cache (fast)
        ↓
[Scheduled] Batch sync to Warehouse
        ↓
Update cache on refresh
Implementation
Read endpoints (from cache):
url-path: /api/products/
method: GET
request: []
template-source: products-read.sql
connection: [local-duckdb] # Read from fast cache
Write endpoints (to cache with sync flag):
url-path: /api/products/:product_id
method: PUT
operation:
type: write
validate-before-write: true
returns-data: true
transaction: true
request:
- field-name: product_id
field-in: path
required: true
- field-name: unit_price
field-in: body
required: false
template-source: product-update.sql
connection: [local-duckdb]
SQL with sync tracking:
UPDATE products
SET
unit_price = COALESCE(:unit_price, unit_price),
needs_sync = true, -- Flag for batch sync
updated_at = NOW()
WHERE product_id = :product_id
RETURNING *
Scheduled batch sync job (every 5 minutes):
BEGIN TRANSACTION;
-- Copy changes to warehouse
INSERT INTO warehouse.products
SELECT * FROM products WHERE needs_sync = true
ON CONFLICT (product_id) DO UPDATE SET
unit_price = EXCLUDED.unit_price,
updated_at = EXCLUDED.updated_at;
-- Mark as synced
UPDATE products SET needs_sync = false
WHERE needs_sync = true;
COMMIT;
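After each run, a quick check confirms the batch drained; a backlog that persists across cycles suggests the job is failing:
-- Health check: the backlog should be empty (or near-empty) right after a sync
SELECT COUNT(*) AS still_pending
FROM products
WHERE needs_sync = true;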
Advantages:
- ✅ Fast reads from cache
- ✅ Fast writes to cache
- ✅ Batch efficiency
- ✅ Reasonable consistency window
Disadvantages:
- ❌ Sync delay (usually 5-15 minutes)
- ❌ Requires batch reconciliation logic
- ❌ Storage overhead for sync tracking
Pattern 4: Smart Cache Invalidation
When: Need to keep cache fresh after writes
How: Invalidate relevant cache entries, lazy-reload on next read
Write to Cache
        ↓
Invalidate Cache Entry
        ↓
Next Read
        ↓
Cache Miss
        ↓
Reload from Warehouse
        ↓
Serve to Consumer
Implementation
Write endpoint with cache invalidation:
url-path: /api/customers/:customer_id
method: PUT
operation:
type: write
validate-before-write: true
returns-data: true
transaction: true
# After successful write, invalidate related cache
cache-invalidation:
- pattern: "customer:*" # Clear all customer caches
- pattern: "customers:list" # Clear customer list cache
template-source: customer-update.sql
connection: [snowflake]
Cache-aware read endpoint:
url-path: /api/customers/:customer_id
method: GET
cache:
enabled: true
ttl: 3600 # 1 hour default TTL
key: "customer:{customer_id}"
template-source: customer-read.sql
connection: [snowflake]
Advantages:
- ✅ Automatic cache freshness
- ✅ Lazy reloading (only on demand)
- ✅ Reduced memory footprint
Disadvantages:
- ❌ First read after write is slow
- ❌ Requires a cache layer (Redis, Memcached)
- ❌ More complex setup
Conflict Resolution Strategies
When syncing writes from cache back to warehouse, conflicts can occur.
Strategy 1: Last-Write-Wins
Simplest approach: later timestamp wins.
INSERT INTO warehouse.data AS w
SELECT * FROM cache.data WHERE needs_sync = true
ON CONFLICT (id) DO UPDATE SET
  -- Take the value from the cache only if it is newer
  value = CASE
    WHEN w.updated_at < EXCLUDED.updated_at
    THEN EXCLUDED.value
    ELSE w.value
  END,
  updated_at = GREATEST(
    w.updated_at,
    EXCLUDED.updated_at
  );
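Last-write-wins only works if the cache host's clock and the warehouse's clock are comparable; skew can silently drop newer values. One hedged alternative is a monotonic counter that every write increments, sketched here with a hypothetical row_version column:
INSERT INTO warehouse.data AS w
SELECT * FROM cache.data WHERE needs_sync = true
ON CONFLICT (id) DO UPDATE SET
  -- Higher version wins, independent of wall-clock time
  value = CASE
    WHEN w.row_version < EXCLUDED.row_version THEN EXCLUDED.value
    ELSE w.value
  END,
  row_version = GREATEST(w.row_version, EXCLUDED.row_version);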
Strategy 2: Custom Merge Logic
Different rules for different columns.
-- For price: take max (don't lower prices accidentally)
-- For availability: take min (conservative estimate)
-- For notes: concatenate with timestamp
INSERT INTO warehouse.products AS w
SELECT * FROM cache.products WHERE needs_sync = true
ON CONFLICT (product_id) DO UPDATE SET
  unit_price = GREATEST(w.unit_price, EXCLUDED.unit_price),
  units_in_stock = LEAST(w.units_in_stock, EXCLUDED.units_in_stock),
  notes = w.notes || ' | ' || EXCLUDED.notes,
  updated_at = CURRENT_TIMESTAMP();
Strategy 3: Manual Conflict Resolution
Flag conflicts for human review.
-- Detect conflicts: both sides changed, and the cache copy is newer
WITH conflicting_updates AS (
  SELECT
    c.id,
    c.value AS cache_value,
    w.value AS warehouse_value
  FROM cache.data c
  INNER JOIN warehouse.data w
    ON c.id = w.id
    AND c.value != w.value
    AND c.updated_at > w.updated_at
)
INSERT INTO conflict_log (record_id, cache_value, warehouse_value, detected_at)
SELECT id, cache_value, warehouse_value, NOW()
FROM conflicting_updates;
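Until a reviewer resolves them, flagged rows should be held out of the automatic sync. A sketch, assuming a hypothetical held_for_review column on the cache table and a resolved_at timestamp on conflict_log:
-- Park conflicting rows so the batch sync skips them until they are resolved
UPDATE cache.data
SET needs_sync = false,
    held_for_review = true
WHERE id IN (
  SELECT record_id FROM conflict_log WHERE resolved_at IS NULL
);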
Best Practices
1. Choose Your Consistency Model
Decide between immediate consistency (write-through) and eventual consistency (write-back):
- High-stakes data (payments, inventory) → write-through
- User-generated content (profiles, preferences) → write-back
- Reporting data → write-back with batch sync
2. Always Validate Before Writing
Prevent invalid data from entering either the cache or the warehouse:
operation:
validate-before-write: true
3. Use Transactions
Ensure partial operations don't corrupt data:
operation:
transaction: true
4. Track Sync Status
Make it easy to identify what's synced:
CREATE TABLE products (
id INTEGER,
name STRING,
updated_at TIMESTAMP,
synced_at TIMESTAMP, -- NULL until synced
needs_sync BOOLEAN
);
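One hedged sketch of how the sync job might stamp these columns as it drains the backlog, so synced_at doubles as an audit trail:
-- After a successful push, record when each row was synced
UPDATE products
SET synced_at = CURRENT_TIMESTAMP,
    needs_sync = false
WHERE needs_sync = true;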
5. Monitor Divergence
Alert when cache and warehouse drift too far:
SELECT COUNT(*) AS diverged_records
FROM cache.products c
FULL OUTER JOIN warehouse.products w
  ON c.id = w.id
WHERE c.id IS NULL               -- row exists only in the warehouse
   OR w.id IS NULL               -- row exists only in the cache
   OR c.updated_at != w.updated_at
   OR c.needs_sync = true;
6. Set Reasonable TTLs
Don't hold stale cache forever:
cache:
ttl: 3600 # Refresh hourly
max_age: 86400 # Hard limit: 24 hours
Examples
Example 1: User Profile Sync
Fast writes to cache, sync to warehouse every 10 minutes:
# Endpoint: PUT /api/users/:user_id
operation:
type: write
returns-data: true
transaction: true
# Writes to local DuckDB for speed
connection: [local-duckdb]
# Scheduled job syncs to Snowflake every 10 minutes
cache-invalidation:
sync-interval: 600 # 10 minutes
Example 2: Order Processing
Immediate consistency: write directly to warehouse:
# Endpoint: POST /api/orders/
operation:
type: write
returns-data: true
transaction: true
# Write directly to Snowflake
connection: [snowflake]
# No sync delay needed
Example 3: Analytics Data
Batch updates with smart cache invalidation:
# Endpoint: PUT /api/metrics/:metric_id
operation:
type: write
returns-data: true
transaction: true
# Write to cache
connection: [local-duckdb]
# Invalidate related caches
cache-invalidation:
- pattern: "metrics:*"
- pattern: "dashboards:*"
# Batch sync to warehouse nightly
sync-interval: 86400 # 24 hours
When to Use Each Pattern
| Pattern | Speed | Consistency | Complexity | Use Case |
|---|---|---|---|---|
| Write-Through | Slow | Immediate | Low | Critical data (payments) |
| Write-Back | Very Fast | Eventual | High | User data, preferences |
| Hybrid | Fast | Delayed | Medium | Analytics, reporting |
| Cache Invalidation | Fast | TTL-bound (up to hours) | Medium | Public data, products |