OnFinance AI

Full-Stack Software Engineer • Oct 2025 – Present

Built AI-powered compliance automation systems processing 50k+ files annually for 200+ financial institutions

Impact at a Glance

40×
API Performance
5.6×
Faster Page Loads
90%
Cost Reduction
<60s
Time to Upload 10K Tasks

The Role

Joined OnFinance AI to modernize their enterprise compliance platform serving 200+ regulated financial institutions. Owned full-stack features from system design to production deployment, shipping 27 releases and 3,500+ lines of production code.

System 01

AI-Powered Bulk Upload Pipeline

The Challenge

Compliance teams were drowning in manual data entry. Creating 10,000 regulatory tasks took 40+ hours per quarter, with a 12% error rate causing compliance violations. Customers demanded bulk upload, but existing architecture couldn't handle it.

The Solution

Designed and built a complete bulk ingestion system from ground up: multi-step wizard frontend, FastAPI backend with 9 RESTful endpoints, AI-powered column mapping using GPT-4, and background workers for processing. System handles Excel/CSV files with thousands of rows, automatically maps columns to schema, and creates validated tasks in under 60 seconds.

FastAPI • GPT-4 • LangGraph • MongoDB • Redis • Celery • Next.js • Pandas

Technical Highlights

Performance Engineering: 180s → 42s

File Parsing (Pandas + lazy loading): 90s → 11s (8×)
DB Writes (chunked batch inserts): 90s → 5s (18×)
AI Processing (concurrent execution): 45s → 15s (3×)
Total end-to-end: 180s → 42s (4.3×)
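The lazy-loading parse step above can be sketched as streaming the file in fixed-size chunks rather than materializing it whole. This is a minimal illustration using pandas `read_csv` with `chunksize`; the function name and chunk size are illustrative, not the production values.

```python
import io

import pandas as pd

def parse_rows_lazily(file_obj, chunksize: int = 2000):
    """Stream a CSV in fixed-size chunks instead of loading it whole.

    Yields lists of row dicts so downstream steps (validation,
    batch inserts) can start before the full file is parsed.
    """
    for chunk in pd.read_csv(file_obj, chunksize=chunksize):
        yield chunk.to_dict(orient="records")

# In-memory file standing in for an uploaded CSV
csv_data = io.StringIO("task,owner\naudit,alice\nreview,bob\nfiling,carol\n")
rows = [row for batch in parse_rows_lazily(csv_data, chunksize=2)
        for row in batch]
```

Because each chunk is yielded as soon as it is parsed, memory stays flat regardless of file size and the batch-insert stage can overlap with parsing.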

AI Cost Optimization: 90% Reduction

Instead of sending full files to GPT-4 ($4.50/file), implemented sample-based analysis: send only column metadata + 10 sample rows. Added smart model selection (GPT-3.5 for standard schemas, GPT-4 for complex). Result: $225K → $22K annually.

Cache hit rate: 67% • 50K files/year processed
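The sampling and model-selection logic can be sketched as below. This is a simplified illustration of the idea, not the production code: the "standard schema" check and column names are hypothetical, and the actual LLM call is omitted.

```python
def build_mapping_request(columns, rows, sample_size=10):
    """Build a column-mapping request from metadata + a small sample.

    Sends only column names and `sample_size` rows rather than the
    full file, and picks a cheaper model when the schema looks
    standard. Returns the (model, payload) pair for the LLM call.
    """
    sample = rows[:sample_size]
    # Illustrative check: cheap model for schemas we recognize,
    # stronger model for anything unusual
    standard = {"task_name", "due_date", "owner"}
    model = "gpt-3.5-turbo" if standard.issubset(set(columns)) else "gpt-4"
    payload = {"columns": columns, "sample_rows": sample}
    return model, payload

# A 500-row file still produces a 10-row prompt
model, payload = build_mapping_request(
    ["task_name", "due_date", "owner", "notes"],
    [{"task_name": f"t{i}"} for i in range(500)],
)
```

The cost win comes from the payload being bounded by the sample size, not the file size, so a 50,000-row upload costs the same per call as a 50-row one.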

Code Sample: Batch Write with Graceful Degradation

from typing import Dict, List

from pymongo.errors import BulkWriteError

async def create_tasks_batch(tasks: List[Dict]):
    CHUNK_SIZE = 1000
    inserted_ids = []

    for chunk in chunks(tasks, CHUNK_SIZE):
        try:
            result = await db.actionables.insert_many(
                chunk, ordered=False  # Continue past individual failures
            )
            inserted_ids.extend(result.inserted_ids)
        except BulkWriteError as e:
            # Partial success - don't fail the entire batch. With
            # ordered=False, every doc except those listed in
            # writeErrors was inserted.
            failed = {err['index'] for err in
                      e.details.get('writeErrors', [])}
            inserted_ids.extend(
                d['_id'] for i, d in enumerate(chunk)
                if i not in failed
            )

            # Dead-letter queue for the failed docs
            failed_docs = [d for i, d in enumerate(chunk)
                           if i in failed]
            await dlq_service.publish(
                'task_creation_failures',
                failed_docs
            )

    return inserted_ids  # 90s → 5s (18× faster)

Business Impact

Time Savings
40+ hours eliminated per quarter
Error Rate
12% → <0.1%
Annual Cost Savings
$203K (AI optimization)
Processing Speed
10K tasks in <60s
System 02

Organization Admin Dashboard Refactor

The Challenge

Admin dashboard was unusable with 6.2s load times. Backend made 3 separate API calls with server-side aggregation (JOIN + GROUP BY), causing cache inconsistencies and data mismatches. Teams complained about stale role counts after organizational changes.

The Solution

Complete architecture refactor: eliminated 2 redundant endpoints, moved role count aggregation to client-side (O(n) JavaScript Map operations), implemented cascading cache invalidation, and rebuilt UI with modular tab components. Single API call now provides single source of truth.

React • TypeScript • TanStack Query • Next.js

Technical Highlights

Architecture Decision: Server vs Client Aggregation

Before: 3 API calls → server-side JOIN + GROUP BY → potential cache mismatches → 6.2s load time
After: 1 API call → client-side O(n) Map aggregation → single source of truth → 1.1s load time
Why it works: P99 org size is 4,200 people (well under 10K threshold). JavaScript Map operations are blazing fast for this scale, and we eliminate an entire class of consistency bugs.
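The client-side pass can be sketched as a single counting loop over the people list. The production code is TypeScript (a JavaScript `Map` built while rendering); this Python version is for illustration only, with hypothetical field names.

```python
from collections import Counter

def role_counts(people):
    """One O(n) pass over the people list replaces the server-side
    JOIN + GROUP BY: the same response that renders the people
    table also yields the role counts, so they cannot disagree."""
    return Counter(p["role"] for p in people)

counts = role_counts([
    {"name": "a", "role": "admin"},
    {"name": "b", "role": "analyst"},
    {"name": "c", "role": "analyst"},
])
```

Deriving the counts from the one payload the page already fetches is what removes the consistency bug class: there is no second data source to drift out of sync.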

Cascading Cache Invalidation

Implemented smart invalidation strategy: when departments change, automatically invalidate people + roles caches. When people update, role counts auto-recompute from fresh data. Result: zero consistency bugs post-deployment.
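The cascading rule can be modeled as a small dependency map: invalidating a key also invalidates everything derived from it. The key names and recursive helper below are illustrative, not the production cache keys (the real implementation uses TanStack Query invalidation).

```python
# Hypothetical dependency map: editing a key invalidates every
# cache entry derived from it
DEPENDENTS = {
    "departments": ["people", "roles"],
    "people": ["roles"],
}

def invalidate(cache: dict, key: str) -> None:
    """Drop a cache entry and, recursively, all entries derived from it."""
    cache.pop(key, None)
    for child in DEPENDENTS.get(key, []):
        invalidate(cache, child)

cache = {"departments": [...], "people": [...], "roles": {...}}
invalidate(cache, "departments")  # cascades to people, then roles
```

Encoding the dependencies in one place means a new derived view only needs one map entry to pick up correct invalidation behavior.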

Business Impact

Page Load Time
6.2s → 1.1s (5.6× faster)
API Endpoints
3 → 1 (66% reduction)
Cache Hit Rate
42% → 87%
Consistency Bugs
12/month → 0
System 03

Subtask Parent Resolution System

The Challenge

Critical bug blocked 850+ disclosure operations monthly. Data model only mapped parent tasks to communications, but users created disclosures from subtasks. System threw 404 errors, causing compliance workflow failures.

The Solution

Designed parent resolution algorithm with intelligent fallback: try direct mapping (O(1) for parent tasks), then check subtask mapping collection, resolve parent ID, fetch parent's communication. Added background job for orphan detection and data integrity monitoring.

Technical Highlights

Resolution Algorithm

1. Try direct communication mapping (fast path: O(1) index lookup)
2. If not found, check the subtask mapping collection
3. Extract the parent task ID from the mapping
4. Fetch the parent's communication mapping (two indexed queries total)
5. Validate data integrity; throw meaningful errors if orphaned
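The steps above can be sketched as a single fallback function. The dict arguments stand in for the real indexed MongoDB collections, and all names here are illustrative.

```python
def resolve_communication(task_id, comm_by_task, parent_of):
    """Resolve a task's communication, falling back to its parent.

    `comm_by_task` maps task id -> communication id (the direct
    mapping); `parent_of` maps subtask id -> parent task id.
    """
    # Fast path: direct mapping (parent tasks resolve immediately)
    comm = comm_by_task.get(task_id)
    if comm is not None:
        return comm

    # Fallback: resolve through the subtask -> parent mapping
    parent_id = parent_of.get(task_id)
    if parent_id is None:
        raise LookupError(
            f"task {task_id} has no communication and no parent (orphan)")

    comm = comm_by_task.get(parent_id)
    if comm is None:
        raise LookupError(
            f"parent {parent_id} of {task_id} has no communication")
    return comm

# A subtask resolves through its parent's mapping
comm = resolve_communication(
    "sub-7", {"task-1": "comm-9"}, {"sub-7": "task-1"})
```

Raising distinct errors for the two orphan cases is what feeds the integrity monitoring: each failure mode is separately countable.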

Business Impact

Operations Unblocked
850+ per month
Failures Post-Deploy
Zero
P95 Resolution Time
8.7ms (subtasks)
Orphaned Subtasks
0 (monitoring active)

How I Work

Performance First

Optimize hot paths ruthlessly. Measure before/after. Users feel the difference between 6s and 1s.

Graceful Degradation

Partial success > total failure. Circuit breakers, fallbacks, and DLQs ensure resilience.

Observability Built-In

Structured logging, metrics, and alerts. Debug production in minutes, not hours.

YAGNI Ruthlessly

Build what's needed now. Avoid over-engineering. Simple scales better than complex.

Overall Impact

27
Production Deployments
3,500+
Lines of Code Shipped
200+
Institutions Served

Shipping production systems that move the needle on performance, cost, and user experience