
FileLog Migration to Snowflake

The FileLog package currently writes integration logs (Yodlee, Plaid, etc.) to S3 as JSON files. This approach has run into performance limits and architectural issues that warrant a strategic migration to Snowflake.

Sentry Issue: RET-API-SGC - S3 503 Slow Down errors

Statistics (as of 2026-03-16):

  • 1,735 occurrences since 2025-04-18
  • 462 users affected
  • Latest burst: 20 errors in 3 seconds

Current Write Flow:

```
User Request
  → Yodlee API Call
  → Connector::logRequestResponseToFile()
  → FileLog::write(['integration', 'yodlee', $advisor_id, $household_id], ...)
  → FileLog::handleStorageSaving() → dispatch()->afterResponse()
  → FileLog::savePendingLogs() (Laravel termination callback)
  → S3::writeStream() → rightcapital-prd bucket
```

Current Path Structure:

logs/integration/yodlee/{advisor_id}/{household_id}/{timestamp}_{endpoint}.json
Example: logs/integration/yodlee/16476/320463/20260315-222803_000_providerAccounts_PUT.json

AWS S3 Limits (per prefix):

  • PUT/DELETE: 3,500 req/s
  • GET/HEAD: 5,500 req/s

Problem: All logs for a household share the same S3 prefix → same partition → shared rate limit

Trigger Pattern:

Micro-burst: 6 concurrent requests in ~100ms
→ Exceeds S3 Token Bucket capacity
→ 503 Slow Down error
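
The micro-burst failure mode can be illustrated with a toy token-bucket model. The capacity and refill rate below are illustrative assumptions, not S3's real internal values:

```python
class TokenBucket:
    """Toy token bucket: holds at most `capacity` tokens, refilled at `rate` per second."""

    def __init__(self, capacity: float, rate: float) -> None:
        self.capacity = capacity
        self.rate = rate
        self.tokens = capacity
        self.last = 0.0

    def allow(self, t: float) -> bool:
        # Refill proportionally to the time elapsed since the last request
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # S3 would answer "503 Slow Down" here

# 6 requests spread over ~100 ms against a small bucket
bucket = TokenBucket(capacity=3, rate=5)
results = [bucket.allow(i * 0.02) for i in range(6)]
# The first requests drain the bucket; the later ones are throttled
```

A steady request rate below `rate` never fails; it is the burst arriving faster than the refill that trips the limit, which matches the 6-requests-in-100ms pattern observed in Sentry.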

Current State:

  • Logs stored as individual JSON files in S3
  • Requires downloading files to query
  • No indexing or search capabilities
  • Time-consuming for debugging

Typical Query Process:

```shell
# Need to download files manually
aws s3 ls s3://rightcapital-prd/logs/integration/yodlee/16476/320463/
aws s3 cp s3://rightcapital-prd/logs/.../file.json - | jq '.'
```

Code Issues (FileLog.php:354-374):

```php
foreach (static::$pending_logs as [$file_path, $temp]) {
    rescue(fn () => use_stream(
        $temp,
        fn ($stream) => static::getStorage()->writeStream($file_path, $stream),
    ));
}
```

Problems:

  • Silent failure with rescue()
  • No retry mechanism
  • No rate limiting between writes
  • Uses sync queue (blocks process)
  • Manual log cleanup required
  • No automatic archival
  • Difficult to enforce retention policies
  • S3 costs accumulate over time
Proposed Write Flow:

```
User Request
  → Yodlee API Call
  → Connector::logRequestResponseToFile()
  → SnowflakeFileLogger::write()
  → Snowflake INSERT (async via Snowpipe or direct)
  → INTEGRATION_LOGS table
```

  • No rate limiting issues (Snowflake scales horizontally)
  • Batch inserts supported
  • Async ingestion via Snowpipe

```sql
-- Direct SQL queries instead of S3 downloads
SELECT *
FROM integration_logs
WHERE advisor_id = 16476
  AND household_id = 320463
  AND endpoint LIKE '%providerAccounts%'
  AND timestamp >= '2026-03-15'
ORDER BY timestamp DESC;
```

  • Already in data warehouse
  • Join with other tables (advisors, households, accounts)
  • Use existing BI tools (Looker, Tableau)
  • Snowflake compression (5-10x better than S3)
  • Automatic data retention/archival
  • Pay for compute only when querying
  • Single source of truth
  • Schema enforcement
  • Version control for structure changes

```sql
CREATE TABLE integration_logs (
    -- Identifiers
    log_id VARCHAR PRIMARY KEY,
    integration_type VARCHAR NOT NULL, -- 'yodlee', 'plaid', etc.
    advisor_id INTEGER,
    household_id INTEGER,
    -- Request/Response
    endpoint VARCHAR NOT NULL,
    http_method VARCHAR,
    request_url VARCHAR,
    request_headers VARIANT,
    request_body VARIANT,
    response_status INTEGER,
    response_headers VARIANT,
    response_body VARIANT,
    -- Metadata
    timestamp TIMESTAMP_NTZ NOT NULL,
    duration_ms INTEGER,
    error_message VARCHAR,
    trace_id VARCHAR,
    -- Partitioning
    log_date DATE NOT NULL,
    -- Audit
    created_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
)
CLUSTER BY (log_date, integration_type, advisor_id);
```
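
Because payloads are stored as VARIANT, nested JSON can be queried with Snowflake path notation instead of downloading and piping files through `jq`. A sketch (the `errorCode` field inside `response_body` is hypothetical):

```sql
-- Pull structured error details straight out of the JSON payload
SELECT log_id,
       response_status,
       response_body:errorCode::STRING AS error_code
FROM integration_logs
WHERE log_date = '2026-03-15'
  AND integration_type = 'yodlee'
  AND response_status >= 400;
```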

Option 1: Direct Snowflake Insert (Synchronous)


Implementation:

```php
class SnowflakeFileLogger
{
    public function write(array $paths, string $postfix, mixed $data): void
    {
        DB::connection('snowflake')->table('integration_logs')->insert([
            'log_id' => Str::uuid(),
            'integration_type' => $paths[1], // 'yodlee'
            'advisor_id' => $paths[2],
            'household_id' => $paths[3],
            'endpoint' => $postfix,
            'request_url' => $data['url'],
            'http_method' => $data['method'], // matches the schema column name
            'request_body' => json_encode($data['body']),
            'response_status' => $data['response']['status'] ?? null,
            'response_body' => json_encode($data['response']['body'] ?? null),
            'timestamp' => now(),
            'log_date' => now()->toDateString(),
        ]);
    }
}
```

Pros:

  • Simple implementation
  • Immediate availability

Cons:

  • Blocks request termination
  • Network latency impact

Option 2: Queue-Based Async Insert (Recommended)

Implementation:

```php
class SaveIntegrationLogJob implements ShouldQueue
{
    public $queue = 'low-priority';
    public $tries = 5;
    public $backoff = [1, 5, 10, 30, 60];

    // Constructor added so $this->logData is populated when dispatched
    public function __construct(private array $logData)
    {
    }

    public function handle()
    {
        DB::connection('snowflake')
            ->table('integration_logs')
            ->insert($this->logData);
    }
}

class SnowflakeFileLogger
{
    public function write(array $paths, string $postfix, mixed $data): void
    {
        dispatch(new SaveIntegrationLogJob([
            // ... log data
        ]))->onQueue('low-priority');
    }
}
```

Pros:

  • Non-blocking
  • Built-in retry
  • Backpressure handling

Cons:

  • Requires queue infrastructure
  • Slightly delayed availability

Option 3: S3 + Snowpipe Auto-Ingestion

Implementation:

```php
// Phase 1: Write to S3 staging (minimal change)
FileLog::write($paths, $postfix, $data);
```

```sql
-- Phase 2: Snowpipe auto-ingests from S3
CREATE PIPE integration_logs_pipe
AS COPY INTO integration_logs
FROM @integration_logs_stage
FILE_FORMAT = (TYPE = JSON);
```
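
The pipe above references `@integration_logs_stage`, which would need to be created separately. A hedged sketch, assuming a storage integration named `s3_logs_int` (hypothetical; the bucket path comes from the current setup) and noting that event-driven loading also requires `AUTO_INGEST = TRUE` on the pipe plus S3 event notifications:

```sql
-- Assumed stage definition; s3_logs_int must be created by an admin first
CREATE STAGE integration_logs_stage
  URL = 's3://rightcapital-prd/logs/integration/'
  STORAGE_INTEGRATION = s3_logs_int
  FILE_FORMAT = (TYPE = JSON);

-- Event-driven variant of the pipe
CREATE PIPE integration_logs_pipe AUTO_INGEST = TRUE
AS COPY INTO integration_logs
FROM @integration_logs_stage;
```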

Pros:

  • Minimal code changes
  • Snowflake handles ingestion
  • Can keep S3 as backup

Cons:

  • Still requires S3
  • Ingestion delay (minutes)
  • Snowpipe costs

Goal: Validate Snowflake logging without disrupting existing system

Steps:

  1. Create Snowflake table schema
  2. Implement SnowflakeFileLogger class
  3. Enable dual-write (both S3 and Snowflake)
    FileLog::write($paths, $postfix, $data); // S3
    SnowflakeFileLogger::write($paths, $postfix, $data); // Snowflake
  4. Monitor for consistency and errors
  5. Fix any schema/data issues

Success Criteria:

  • 100% of logs written to both systems
  • Snowflake data matches S3 (spot-check 1000 samples)
  • No performance regression
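
The spot-check in the criteria above could be automated by normalizing records from both stores into a common shape before comparing. A sketch of the comparison logic only (fetching from S3/Snowflake is elided; the record shapes are assumptions):

```python
def normalize(record: dict) -> dict:
    """Reduce a log record to shared fields with canonical types so the
    S3 JSON copy and the Snowflake row can be compared field by field."""
    return {
        "advisor_id": int(record["advisor_id"]),
        "household_id": int(record["household_id"]),
        "endpoint": str(record["endpoint"]),
        "response_status": record["response_status"],
    }

# S3 JSON often carries IDs as strings; Snowflake returns integers
s3_record = {"advisor_id": "16476", "household_id": "320463",
             "endpoint": "000_providerAccounts_PUT", "response_status": 200}
snowflake_record = {"advisor_id": 16476, "household_id": 320463,
                    "endpoint": "000_providerAccounts_PUT", "response_status": 200}

match = normalize(s3_record) == normalize(snowflake_record)
```

Running this over ~1000 sampled log IDs would give the consistency signal the phase needs without a byte-for-byte diff.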

Goal: Validate Snowflake can replace S3 for queries

Steps:

  1. Update debugging tools to query Snowflake
  2. Train team on SQL queries
  3. Monitor query performance
  4. Optimize indexes/clustering if needed
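
For step 4, Snowflake can report clustering health directly; one way to check, using the table and clustering keys from the schema above:

```sql
-- Reports average clustering depth and partition overlap for the given keys
SELECT SYSTEM$CLUSTERING_INFORMATION(
  'integration_logs',
  '(log_date, integration_type)'
);
```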

Success Criteria:

  • All debugging queries use Snowflake
  • Query response time < 5 seconds for typical cases
  • No S3 downloads needed for log analysis

Goal: Remove S3 logging dependency

Steps:

  1. Disable S3 writes in FileLog
  2. Archive existing S3 logs to Glacier
  3. Remove FileLog S3 code (after 1 month buffer)

Success Criteria:

  • No S3 writes for integration logs
  • No Sentry errors related to S3 throttling
  • Team comfortable with Snowflake-only logging

Goal: Remove legacy code and optimize

Steps:

  1. Delete FileLog package (or mark deprecated)
  2. Update documentation
  3. Set up Snowflake data retention policies
  4. Create monitoring dashboards

At any phase, rollback is simple:

```php
// Disable Snowflake write
SnowflakeFileLogger::$enabled = false;

// Continue with S3 only
FileLog::write($paths, $postfix, $data);
```

Rollback Triggers:

  • Snowflake unavailability > 5 minutes
  • Data loss or corruption detected
  • Performance regression > 2x baseline

While planning the Snowflake migration, implement these immediate fixes:

1. Time-Based S3 Prefix Sharding ⭐ Immediate

```php
// Connector.php - Change paths structure
$now = CarbonImmutable::now();
$paths = [
    'integration',
    self::getVendorName(),
    $now->format('Y/m/d/H'), // Hourly partitioning
    $advisor->id ?? 'unknown',
    $household->id ?? 'unknown',
];
```
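
The resulting key layout can be sketched to show how the hourly segment splits writes across distinct S3 prefixes (the Python helper mirrors the PHP change and is illustrative only):

```python
from datetime import datetime, timedelta

def s3_key(vendor: str, advisor_id: int, household_id: int,
           now: datetime, filename: str) -> str:
    # Hourly shard segment mirrors the PHP paths change above
    shard = now.strftime("%Y/%m/%d/%H")
    return f"logs/integration/{vendor}/{shard}/{advisor_id}/{household_id}/{filename}"

t0 = datetime(2026, 3, 15, 22, 28)
k1 = s3_key("yodlee", 16476, 320463, t0, "a.json")
k2 = s3_key("yodlee", 16476, 320463, t0 + timedelta(hours=1), "b.json")
# k1 and k2 fall under different hourly prefixes, so bursts in different
# hours land on different S3 partitions
```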

Impact: Eliminates S3 throttling immediately

2. Adaptive S3 Retry Configuration

```php
// config/filesystems.php
's3' => [
    'options' => [
        'http' => [
            'retries' => [
                'mode' => 'adaptive',
                'max_attempts' => 10,
            ],
        ],
    ],
],
```

Impact: Handles transient S3 errors gracefully

3. Debounce Frontend Refresh Clicks

```typescript
const handleRefresh = useDebouncedCallback(
  async () => {
    await refreshYodleeAccount();
  },
  2000,
  { leading: true, trailing: false }
);
```

Impact: Reduces micro-bursts from user clicks

Assumptions:

  • ~1M integration logs per month
  • Average file size: 10KB
  • Storage: 10GB/month cumulative

Costs:

S3 Storage: $0.023/GB × 10GB = $0.23/month
S3 PUT: $0.005/1000 × 1M = $5.00/month
S3 GET: $0.0004/1000 × 100K = $0.04/month
Data Transfer: negligible (same region)
Total: ~$5.27/month (~$63/year)

Assumptions:

  • 1M rows per month
  • Compressed size: 2GB/month (5x compression)
  • Query frequency: 100 queries/day

Costs:

Storage: $40/TB/month × 0.002TB = $0.08/month
Compute: ~$2/hour × 1 hour/month = $2.00/month
(assuming XS warehouse for queries)
Total: ~$2.08/month (~$25/year)

Savings: ~$38/year + operational efficiency gains
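
The arithmetic behind these figures, for reference (unit prices are this document's assumptions, not quoted AWS/Snowflake list prices):

```python
# S3 monthly cost under the stated assumptions
s3_storage = 0.023 * 10                  # $/GB-month x 10 GB
s3_put = 0.005 / 1000 * 1_000_000        # $ per 1000 PUTs x 1M PUTs
s3_get = 0.0004 / 1000 * 100_000         # $ per 1000 GETs x 100K GETs
s3_total = s3_storage + s3_put + s3_get

# Snowflake monthly cost under the stated assumptions
sf_storage = 40 * 0.002                  # $/TB-month x 2 GB compressed
sf_compute = 2 * 1                       # $/hr XS warehouse x ~1 hr/month
sf_total = sf_storage + sf_compute

annual_savings = (s3_total - sf_total) * 12
```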

Note: Cost savings are minimal, but operational benefits are significant:

  • Faster debugging (saves engineer time)
  • Better analytics (business insights)
  • Reduced complexity (maintenance savings)

| Risk | Likelihood | Impact | Mitigation |
| --- | --- | --- | --- |
| Snowflake downtime | Low | High | Keep S3 dual-write for 1 month |
| Data loss during migration | Medium | High | Validate with checksums, dual-write period |
| Query performance issues | Medium | Medium | Pre-optimize schema, add clustering |
| Team learning curve | High | Low | Training sessions, SQL examples doc |
| Schema evolution needs | Medium | Medium | Use VARIANT for flexible fields |
  • ✅ 100% of logs in Snowflake
  • ✅ Zero S3 throttling errors
  • ✅ <5s p95 query latency
  • ✅ Team adoption (>80% use Snowflake for debugging)
  • ✅ <1 incident per month related to logging
  • ✅ Log retention policy enforced automatically
  • ✅ Integration log analytics dashboard in production

| Phase | Duration | Estimated Start | Owner |
| --- | --- | --- | --- |
| Quick Wins (S3 fixes) | 1 week | 2026-03-18 | TBD |
| Phase 1: Dual-Write | 4 weeks | 2026-04-01 | TBD |
| Phase 2: Read from Snowflake | 2 weeks | 2026-05-01 | TBD |
| Phase 3: Snowflake-Only | 1 week | 2026-05-15 | TBD |
| Phase 4: Cleanup | 1 week | 2026-05-22 | TBD |

Total Estimated Duration: ~9 weeks of work (roughly 10 calendar weeks from 2026-03-18, given the gaps between phase start dates)

  • Status: Proposed
  • Decision Date: TBD
  • Review Date: 2026-03-25 (engineering team meeting)
  • Next Steps:
    1. Present migration plan to team
    2. Get budget approval
    3. Implement quick wins (S3 fixes)
    4. Begin Phase 1 (dual-write) development

Last Updated: 2026-03-16 Document Owner: TBD