
Full-Stack Software Engineer • Oct 2025 – Present
Built AI-powered compliance automation systems processing 50k+ files annually for 200+ financial institutions
Joined OnFinance AI to modernize their enterprise compliance platform serving 200+ regulated financial institutions. Owned full-stack features from system design to production deployment, shipping 27 releases and 3,500+ lines of production code.
Compliance teams were drowning in manual data entry. Creating 10,000 regulatory tasks took 40+ hours per quarter, with a 12% error rate causing compliance violations. Customers demanded bulk upload, but the existing architecture couldn't handle it.
Designed and built a complete bulk ingestion system from the ground up: multi-step wizard frontend, FastAPI backend with 9 RESTful endpoints, AI-powered column mapping using GPT-4, and background workers for processing. The system handles Excel/CSV files with thousands of rows, automatically maps columns to the schema, and creates validated tasks in under 60 seconds.
Instead of sending full files to GPT-4 ($4.50/file), implemented sample-based analysis: send only column metadata + 10 sample rows. Added smart model selection (GPT-3.5 for standard schemas, GPT-4 for complex ones). Result: annual GPT cost cut from $225K to $22K.
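The sample-based approach can be sketched roughly as follows. This is a minimal illustration, not the production code: `build_mapping_payload`, `choose_model`, the `STANDARD_COLUMNS` set, and the exact model ID strings are assumptions made for the example, and the actual GPT call is left out.

```python
import csv
import io
from typing import Dict, List

SAMPLE_ROWS = 10  # the write-up sends column metadata plus 10 sample rows

# Hypothetical column names treated as a "standard" schema.
STANDARD_COLUMNS = {"title", "description", "due_date", "owner", "regulation"}


def build_mapping_payload(csv_text: str, sample_rows: int = SAMPLE_ROWS) -> Dict:
    """Return the column names and the first few rows instead of the full file."""
    reader = csv.DictReader(io.StringIO(csv_text))
    samples: List[Dict[str, str]] = []
    for row in reader:
        if len(samples) >= sample_rows:
            break
        samples.append(row)
    return {"columns": list(reader.fieldnames or []), "samples": samples}


def choose_model(columns: List[str]) -> str:
    """Pick the cheaper model when every column matches the known schema."""
    normalized = {c.strip().lower().replace(" ", "_") for c in columns}
    # Model IDs are assumed; the source only says "GPT-3.5" and "GPT-4".
    return "gpt-3.5-turbo" if normalized <= STANDARD_COLUMNS else "gpt-4"
```

The payload size stays roughly constant regardless of how many rows the uploaded file has, which is where the savings come from. At the 50k files per year cited above, $4.50 per file is the $225K baseline.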
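from typing import Dict, List

from pymongo.errors import BulkWriteError


def chunks(items: List[Dict], size: int):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


async def create_tasks_batch(tasks: List[Dict]):
    # `db` (an async MongoDB client, e.g. Motor) and `dlq_service`
    # are assumed to be configured elsewhere in the application.
    CHUNK_SIZE = 1000
    inserted_ids = []
    for chunk in chunks(tasks, CHUNK_SIZE):
        try:
            result = await db.actionables.insert_many(
                chunk, ordered=False  # Continue on failures
            )
            inserted_ids.extend(result.inserted_ids)
        except BulkWriteError as e:
            # Partial success - don't fail entire batch.
            # Each write error records the index of the failed doc in `chunk`.
            failed_idx = {err['index'] for err in e.details.get('writeErrors', [])}
            inserted_ids.extend(
                d['_id'] for i, d in enumerate(chunk) if i not in failed_idx
            )
            # Dead letter queue for failed docs
            failed_docs = [chunk[i] for i in sorted(failed_idx)]
            await dlq_service.publish(
                'task_creation_failures',
                failed_docs
            )
    return inserted_ids  # 90s → 5s (18× faster)
The admin dashboard was unusable, with 6.2s load times. The backend made 3 separate API calls with server-side aggregation (JOIN + GROUP BY), causing cache inconsistencies and data mismatches. Teams complained about stale role counts after organizational changes.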
Complete architecture refactor: eliminated 2 redundant endpoints, moved role count aggregation to client-side (O(n) JavaScript Map operations), implemented cascading cache invalidation, and rebuilt UI with modular tab components. Single API call now provides single source of truth.
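The single-pass role count can be sketched as below. The original ran in client-side JavaScript with a Map; this version is in Python only to match the snippet above, and the `roles` field on each person record is an assumed shape.

```python
from collections import Counter
from typing import Dict, Iterable


def count_roles(people: Iterable[Dict]) -> Dict[str, int]:
    """Tally how many people hold each role in one pass over the list."""
    counts: Counter = Counter()
    for person in people:
        # `roles` is a hypothetical field name for this example.
        for role in person.get("roles", []):
            counts[role] += 1
    return dict(counts)
```

Because the counts are derived from the same people list the UI already holds, they cannot drift from it the way a separately cached aggregate can.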
Implemented smart invalidation strategy: when departments change, automatically invalidate people + roles caches. When people update, role counts auto-recompute from fresh data. Result: zero consistency bugs post-deployment.
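One way to express this kind of cascading invalidation is a small dependency map walked on every invalidation. The sketch below is illustrative only: `CascadingCache`, the `DEPENDENTS` map, and the cache key names are assumptions, not the production implementation.

```python
from typing import Dict, Optional, Set

# Hypothetical dependency map: invalidating a key also invalidates its dependents.
DEPENDENTS: Dict[str, Set[str]] = {
    "departments": {"people", "roles"},
    "people": {"role_counts"},
}


class CascadingCache:
    """In-memory cache whose invalidation follows the DEPENDENTS map."""

    def __init__(self) -> None:
        self._store: Dict[str, object] = {}

    def set(self, key: str, value: object) -> None:
        self._store[key] = value

    def get(self, key: str) -> Optional[object]:
        return self._store.get(key)

    def invalidate(self, key: str) -> Set[str]:
        """Drop `key` and everything downstream of it; return the keys visited."""
        seen: Set[str] = set()
        stack = [key]
        while stack:
            current = stack.pop()
            if current in seen:
                continue  # guards against cycles in the map
            seen.add(current)
            self._store.pop(current, None)
            stack.extend(DEPENDENTS.get(current, ()))
        return seen
```

With this map, a department change clears `people` and `roles`, and clearing `people` in turn clears `role_counts`, so the next read recomputes them from fresh data.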
Critical bug blocked 850+ disclosure operations monthly. Data model only mapped parent tasks to communications, but users created disclosures from subtasks. System threw 404 errors, causing compliance workflow failures.
Designed parent resolution algorithm with intelligent fallback: try direct mapping (O(1) for parent tasks), then check subtask mapping collection, resolve parent ID, fetch parent's communication. Added background job for orphan detection and data integrity monitoring.
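The fallback order described above can be sketched with plain dictionaries standing in for the mapping collections. The function names and the dictionary shapes here are assumptions for illustration; the real system reads from the database.

```python
from typing import Dict, List, Optional


def resolve_communication(
    task_id: str,
    task_to_comm: Dict[str, str],
    subtask_to_parent: Dict[str, str],
) -> Optional[str]:
    """Return the communication ID for a task or subtask, or None if unresolved."""
    # 1. Direct mapping: O(1) lookup, succeeds for parent tasks.
    comm = task_to_comm.get(task_id)
    if comm is not None:
        return comm
    # 2. Fallback: treat the ID as a subtask and resolve its parent.
    parent_id = subtask_to_parent.get(task_id)
    if parent_id is None:
        return None
    # 3. Fetch the parent's communication.
    return task_to_comm.get(parent_id)


def find_orphans(
    subtask_to_parent: Dict[str, str],
    task_to_comm: Dict[str, str],
) -> List[str]:
    """List subtasks whose parent has no communication (for the background check)."""
    return sorted(
        sub for sub, parent in subtask_to_parent.items()
        if parent not in task_to_comm
    )
```

Returning `None` instead of raising lets the caller decide whether an unresolved ID is a 404 or a data-integrity issue to report.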
Optimize hot paths ruthlessly. Measure before/after. Users feel the difference between 6s and 1s.
Partial success > total failure. Circuit breakers, fallbacks, and DLQs ensure resilience.
Structured logging, metrics, and alerts. Debug production in minutes, not hours.
Build what's needed now. Avoid over-engineering. Simple scales better than complex.
Shipping production systems that move the needle on performance, cost, and user experience