Workflow SystemsEngineering Log March 20, 2025 11 min read

Financial Intelligence: Keeping AI Away From the Ledger

Clearworth keeps the financial ledger under deterministic backend control while using AI for language, routing, and structured action proposals. The model can explain and propose, but money records still move through typed schemas, guarded mutations, and normal database rules.

Backend ArchitectureFinancial SystemsFastAPISQLAlchemyIndexedDBLLM GatewayStructured ActionsBudget ControlsForecastingAnomaly Detection

The dangerous part was letting the model touch the ledger.

This started as a private build for someone in my network who wanted a personal finance workflow they could own. The goal was not to beat every App Store budgeting app on feature count. The goal was to streamline their own finance routine, keep the ledger controllable, and test what a personalized AI assistant could do when it had access to structured records.

That changed the cost model.

A subscription app hides infrastructure cost inside the subscription and keeps the workflow inside its own product rules. This build exposed the cost directly: hosted frontend, backend service, model calls, storage, and maintenance.

That was acceptable for this use case because the point was control, customization, and learning what a private AI finance assistant could safely handle.

So I treated hosting and model usage as part of the architecture, not an afterthought.

A Vercel frontend, Railway backend, and Together AI calls are cheap at low volume, but they are still real operating costs. The system needed hard boundaries around provider calls, request loops, and accidental usage spikes.

That is why the LLM gateway had a spend cap, Redis-backed usage tracking with memory fallback, provider timeouts, and a concurrency limit. If the assistant became expensive or unstable, the backend had to fail before the bill became the failure.

The model handles language. The backend handles money.

The first version of the architecture made the service boundary explicit:

browser UI
→ local IndexedDB storage
→ API service layer
→ FastAPI backend
→ Pydantic validation
→ SQLAlchemy models
→ deterministic finance services
→ LLM gateway for language and action routing

Clearworth finance architecture showing browser storage, API service layer, FastAPI backend, validation layer, SQLAlchemy database models, deterministic summary, forecast, and anomaly services, and a separate LLM gateway. — Clearworth finance architecture

I called the project Clearworth for the client, mostly to give the system a clean name while building it. The important part was the architecture.

The model handled language.

The backend handled money.

That rule shaped the whole system.

The ledger starts as structured data

The FastAPI backend owns the financial records through SQLAlchemy models. The core table is a financial log with indexed fields for type, amount, date, expense category, and income category. Monthly goals, budgets, balances, performance metrics, and system metrics sit beside it as separate tables.

That shape matters because the assistant can only be useful if the ledger is structured first.

FinancialLog
* logType
* description
* amount
* currency
* date
* expenseCategory
* incomeCategory
* expectedSavings
* createdAt
* updatedAt

MonthlySavingsGoal
* month
* projected_amount
* actual_amount

MonthlyBudget
* month
* category
* projected_amount
* day_of_month

MonthlyBalance
* month
* balance_type
* account_type
* amount

SQLAlchemy financial models for logs, monthly savings goals, budgets, balances, performance metrics, and system metrics. — SQLAlchemy finance models

The finance assistant sits on top of that structure.

It doesn’t replace it.

Questions stream. Mutations go through JSON mode.

The agent endpoint has two paths.

Normal questions stream back as text.

Mutation requests go through strict JSON mode.

That split became the main safety boundary.

For a normal question, the backend builds a compact context packet. It does not dump the entire database into the model. If the user asks for a summary, the backend calculates the monthly income, expenses, net flow, savings rate, and goal progress in Python first.

The model receives the result as context.

It explains numbers that already exist.

It does not calculate them.

That was the right tradeoff because LLMs can sound confident while doing bad arithmetic. In finance, a wrong total is not a harmless hallucination. It changes the user’s trust in the whole system.

The summary path uses date ranges instead of database-specific month functions. That was deliberate. SQLite and Postgres handle date formatting differently. I wanted the backend to work locally and on Railway without fragile SQL dialect assumptions.

The flow became:

summary trigger
→ query current month by date range
→ aggregate income and expenses through SQLAlchemy
→ calculate net flow in Python
→ calculate savings rate in Python
→ pass compact summary to the model
→ stream explanation to the UI

Monthly summary function that aggregates income and expenses by date range, calculates net flow and savings rate in Python, and passes only compact context to the LLM. — Deterministic monthly summary

Database writes need stricter handling

The mutation path is stricter.

If the user says “delete the transaction from Jan 18” or “change that lunch to $20,” the request does not go straight from model output to database write.

It goes through intent detection first.

add
create
delete
remove
update
edit
change
modify
record

Those words push the request into the mutation path. The backend switches the model into JSON mode and asks for a structured action.

The expected shape is not prose. It is an action object.

action_type
entity
summary
payload

Then the backend validates the action before execution.

For updates and deletes, the backend tries to resolve the target through an exact ID first. If there is no ID, it can fall back to date or description. If fuzzy description matching finds more than one record, it refuses to bulk delete and asks for a more specific target.

That guardrail is small, but important.

A personal finance assistant should not delete five transactions because the word “coffee” matched too many rows.

The mutation flow looks like this:

user request
→ mutation intent detected
→ LLM forced into JSON mode
→ action parsed and normalized
→ Pydantic action schema validates shape
→ backend resolves target record
→ SQLAlchemy executes write
→ commit or rollback
→ UI receives refresh action

Agent mutation flow showing intent detection, strict JSON action, schema validation, ID, date, and description resolution, guarded database write, commit or rollback, and UI refresh. — Guarded finance mutation flow

This is where the system started feeling less like an AI demo and more like a backend application with an AI interface.

The assistant can suggest or structure the action.

The backend decides whether that action is safe enough to run.

There is also a separate path for normal assistant replies. Those use Server-Sent Events so the UI can start receiving text before the full response finishes. That matters for perceived latency. Finance questions often involve summaries and explanations, and waiting for one large response makes the app feel heavier than it is.

So the agent has two runtime modes:

analysis/chat
→ streaming response
→ no database mutation

mutation/tooling
→ synchronous JSON
→ validated database write

The same endpoint supports both, but the execution model changes based on risk.

The LLM gateway has cost and runtime controls

The LLM gateway has its own controls.

Together AI sits behind a small backend client instead of being called directly from the frontend. That gave me one place to enforce provider timeouts, concurrency limits, model routing, and budget controls.

The gateway checks spend before requests. It uses Redis if available and falls back to in-memory tracking if Redis is not configured. It also uses a semaphore to cap concurrent provider calls, and a 30-second timeout so requests do not hang forever.

The basic control loop is:

check budget
→ acquire concurrency slot
→ call Together AI
→ track token usage
→ update spend counter
→ return response or controlled error

LLM gateway wrapper with Together AI request handling, hard budget cap, Redis-backed spend tracking, in-memory fallback, concurrency semaphore, timeout, and streaming support. — LLM gateway budget and concurrency guard

This mattered because public demos and private tools can both get abused in boring ways. Duplicate requests, accidental loops, impatient double-clicks, and reload storms can burn API calls without adding user value.

The agent endpoint adds another guardrail before the LLM gateway. It hashes the prompt and rejects exact duplicates sent within two seconds from the same session.

That is not sophisticated distributed rate limiting. It is a practical safety mechanism.

The backend also keeps short in-memory conversation history with a session TTL. It trims the context window to avoid dragging old conversation state into every call. The README is honest about this being single-instance behavior. If the system needed multi-worker or serverless scale, that state should move to Redis.

For this stage, Railway runs the backend as one worker. SQLite is used for plug-and-play deployment and demo seeding. The deployment starts by seeding data, then runs Uvicorn. That is fine for a controlled reference app. It is not pretending to be a multi-tenant banking platform.

That distinction matters.

A good engineering log should say where the system is strong and where it is intentionally scoped.

Local-first storage keeps the UI responsive

The local-first side is handled in the browser with IndexedDB through idb-keyval. Logs can be saved, updated, deleted, cleared, and bulk-added locally. There is also an offline queue through localStorage for pending logs, plus a sync manager that can push device operations to the backend using an API key.

That gave the app a practical offline-friendly shape:

browser event
→ write to IndexedDB
→ update React state immediately
→ queue pending operation if needed
→ sync through backend when available

IndexedDB log storage layer with save, getAll, update, delete, clear, and bulk-add behavior. — IndexedDB finance log storage

The sync path is still simple. It stores operations per device in memory on the backend. That is useful for the reference architecture, but I would not sell that as final production sync. Production sync would need persistent operation logs, conflict resolution, replay safety, and probably per-user isolation.

For this build, the useful part is the boundary:

local writes should feel instant
backend sync should be explicit
AI should not be the source of truth

Forecasting and anomaly detection stay deterministic

Forecasting and anomaly detection follow the same rule.

They run as separate deterministic API endpoints.

Forecasting uses Prophet against monthly expense totals. The backend fetches raw expense logs, groups them by month in Pandas, validates that there are at least two months of data, fits the model, and returns a bounded forecast payload.

It also returns MAE and RMSE when requested.

The endpoint clamps negative spending predictions to zero and adds cache headers so the frontend can avoid refetching expensive results too aggressively.

forecast request
→ validate periods
→ fetch expense logs
→ group by month in Pandas
→ fit Prophet
→ calculate metrics
→ return forecast bounds
→ cache response

Forecast endpoint using Pandas monthly aggregation, Prophet fitting, MAE and RMSE metrics, forecast bounds, negative value clamping, and cache headers. — Deterministic forecast endpoint

Anomaly detection uses IsolationForest. It pulls the latest expense records, builds features from amount, day of week, and day of month, then returns unusual transactions with a plain reason.

It also handles the low-data case safely. Fewer than ten logs returns an empty result instead of forcing the model to pretend.

anomaly request
→ fetch recent expense logs
→ require enough records
→ build date-based features
→ run IsolationForest
→ return flagged transactions
→ safe empty result on failure

Anomaly endpoint using IsolationForest with amount, day-of-week, and day-of-month features, low-data handling, and safe error response. — Deterministic anomaly endpoint

The system became safer when the AI had less authority

This is the backend pattern I wanted across the whole app.

The model can help the user talk to the system.

The system still needs normal engineering mechanisms:

Typed schemas
Sanitized prompts
Normalized log types
Indexed database fields
Date-range queries
Server-side aggregation
Action validation
Guarded deletes
Commit and rollback behavior
Provider timeouts
Budget caps
Concurrency limits
Cache headers
Health checks
Deployment constraints written down honestly

The client wanted a personal AI finance tracker.

The actual build became a finance backend with an AI interface.

That difference matters because money workflows punish loose boundaries. A model can classify a phrase, draft a response, or translate intent into a structured action. It should not be trusted as the ledger, the calculator, the database, the sync engine, or the deployment strategy.

The system became safer when the AI had less authority.

Not less value. Less authority.

That is the architecture I’d keep carrying forward: use the model where language is messy, then hand off to deterministic services where the result has to be correct.

Onto the next one. Let’s keep sharpening that edge!

First written on March 20, 2025.

Want to implement this architecture in your business?

Discuss Your Project