Did you actually fire your bookkeeper?

Yes, after nine months of running an AI bookkeeping stack in parallel and confirming the outputs matched. The bookkeeper was doing $650 a month of reconciliation work across five businesses. The AI stack covers the same scope for about $35 a month in tool costs plus one to two hours of human review per month. The CPA remains. Bookkeeper and CPA are not the same role.

What does the AI bookkeeper actually do?

It pulls transactions daily from Stripe, Amazon Seller Central, Shopify, PayPal, and a business bank account via Plaid. Claude categorizes each transaction using a 40-category taxonomy with a confidence score. A Python reconciliation script matches deposits to settlement totals and flags mismatches over $5. At month end, a consolidated P&L lands in Telegram. Anything under 85% confidence gets flagged for a five-minute human review.

Is AI bookkeeping audit-safe?

No, not on its own. AI categorization reasoning is not an audit defense. If you are audited by the IRS, your CPA will need human-level documentation and professional judgment. The AI stack handles the day-to-day transaction flow. A human CPA is still required for year-end tax prep, quarterly estimated tax, multi-state sales tax compliance, and any audit situation.

What tools do you use for AI bookkeeping?

The stack is n8n for pipeline orchestration (self-hosted on an $8 VPS), Claude API for transaction categorization, Plaid for bank feed access, and Notion as the data store. Amazon settlements come in via SP-API, Shopify and Stripe via their native APIs. Total out-of-pocket cost is about $35 a month.

Should every small business replace their bookkeeper with AI?

No. This approach works for solo founders or small businesses with $100K to $3M revenue, simple revenue streams across one to three payment processors, and no employees or payroll to manage. It does not work well for businesses with complex multi-entity structures, inventory or COGS accounting, employees (use Gusto or Rippling instead), or regulated industries. Keep your CPA regardless.

Replaced a Bookkeeper With AI: 9-Month Production Log (2026)

May 5, 2026 · 9 min read · by Dmytro Negodiuk

Nine months ago I was paying a bookkeeper $650 a month to keep five small businesses reconciled. Good person, did the work, but every month I was paying to move numbers between systems that already had APIs.

I decided to test an AI replacement. This post is the honest 9-month log. Savings are real. Pain points are also real. I still have a CPA because I am not an idiot. Read to the end before you tell your bookkeeper anything.

What the bookkeeper was doing

Five-business bookkeeping across: Amazon Seller (Mozabrik), Shopify (OD Granite), Stripe (Negodiuk.ai consulting retainers), PayPal (miscellaneous international), and a Notion-based consulting tracker.

Monthly output:

Reconciled bank statements across 3 business accounts
Categorized all transactions in QuickBooks
Matched Amazon settlement reports to deposits
Flagged anything weird to me
Produced a basic P&L per business and a consolidated view

Time it took her: about 8-10 hours a month. Fee: $650 a month. Equivalent hourly: $65-$80, which is the going rate for a competent US bookkeeper on a small portfolio.

The AI replacement (what I actually built)

Not "a single AI tool." It is a small stack of automations that produce 90% of what the bookkeeper produced, plus a human (me) who handles the 10% that matters most.

Component 1: Transaction pull

A daily n8n cron pulls transactions from:

Stripe via API
Amazon Seller Central via SP-API (settlement reports every 14 days)
Shopify via API
PayPal via API
Business bank account via Plaid (read-only)

All raw transactions land in a single Notion database with a consistent schema.
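
A minimal sketch of what a "consistent schema" could look like, assuming one flat record per transaction. The `Transaction` fields and the `normalize_stripe_charge` helper are illustrative assumptions, not the actual Notion schema. The Stripe fields it reads (`id`, `amount` in cents, `currency`, `created` as a Unix timestamp, `description`) are the usual charge-object fields.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    """One normalized row, regardless of which source it came from."""
    source: str            # "stripe", "amazon", "shopify", "paypal", "bank"
    external_id: str       # the source's own ID, used to avoid duplicate rows
    posted_at: datetime    # always stored in UTC
    amount: Decimal        # positive = money in, negative = money out
    currency: str          # ISO 4217 code, upper case
    description: str
    category: Optional[str] = None      # filled in later by categorization
    confidence: Optional[float] = None  # 0.0-1.0, filled in later


def normalize_stripe_charge(charge: dict) -> Transaction:
    """Map a Stripe charge payload (amount in cents) onto the shared schema."""
    return Transaction(
        source="stripe",
        external_id=charge["id"],
        posted_at=datetime.fromtimestamp(charge["created"], tz=timezone.utc),
        amount=Decimal(charge["amount"]) / Decimal(100),
        currency=charge["currency"].upper(),
        description=charge.get("description") or "",
    )


# Sample payload with made-up values.
row = normalize_stripe_charge({
    "id": "ch_example",
    "amount": 125000,
    "currency": "usd",
    "created": 1767225600,
    "description": "Consulting retainer",
})
print(row.source, row.amount, row.currency, row.posted_at.date())
```

Each other source would get its own small normalizer that returns the same `Transaction` shape, so everything downstream only has to understand one format.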

Component 2: Categorization via Claude

Each night, Claude reads uncategorized rows and assigns a category based on merchant name, memo, and amount. Uses a 40-category taxonomy I built once. Confidence score per categorization. Anything under 85% confidence gets flagged to a "review" view.

Categorization accuracy after 3 months of prompt tuning: about 94% on recurring transactions, 72% on novel ones. Human review takes 5 minutes a week, not 5 hours a month.
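
The routing step can be sketched without any API call. This assumes the model was asked to reply with a JSON object holding `category` and `confidence`; that reply shape, the `route` function, and the three sample categories are assumptions for illustration. Only the 85% threshold comes from the setup described above.

```python
import json

REVIEW_THRESHOLD = 0.85  # under 85% confidence goes to the review view

# Stand-in for the 40-category taxonomy; these three names are hypothetical.
TAXONOMY = {"Software subscriptions", "Payment processing fees", "Advertising"}


def route(model_reply: str) -> dict:
    """Turn one model reply into a row update, flagging low-confidence results."""
    parsed = json.loads(model_reply)
    category = parsed["category"]
    confidence = float(parsed["confidence"])
    # A category outside the taxonomy is treated like low confidence,
    # whatever score the model reported.
    needs_review = confidence < REVIEW_THRESHOLD or category not in TAXONOMY
    return {"category": category, "confidence": confidence, "needs_review": needs_review}


confident = route('{"category": "Advertising", "confidence": 0.97}')
unsure = route('{"category": "Advertising", "confidence": 0.61}')
off_list = route('{"category": "Pet grooming", "confidence": 0.99}')
print(confident["needs_review"], unsure["needs_review"], off_list["needs_review"])
```

Checking the category against the taxonomy is a cheap extra guard: a self-reported confidence score says nothing about whether the label is one the books actually use.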

Component 3: Reconciliation

Simple Python script compares bank statement deposits to Stripe / Amazon / Shopify / PayPal settlement totals. Flags any mismatch over $5. Most mismatches are timing differences (deposit was Nov 30, Amazon settlement was Nov 28). Flags ones that stay unresolved for 14+ days as "investigate."
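
A minimal sketch of that matching logic, not the actual script. The $5 tolerance and the 14-day "investigate" rule come from the description above; the 5-day timing window, the class names, and the sample figures are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

TOLERANCE = Decimal("5.00")   # mismatches over $5 get flagged
TIMING_WINDOW_DAYS = 5        # assumed: how far a deposit may lag its settlement
INVESTIGATE_AFTER_DAYS = 14   # unresolved this long becomes "investigate"


@dataclass
class Settlement:
    source: str
    settled_on: date
    total: Decimal


@dataclass
class Deposit:
    received_on: date
    amount: Decimal


def reconcile(settlements, deposits, today):
    """Pair each settlement with the first unused deposit that matches within tolerance."""
    unused = list(deposits)
    results = []
    for s in settlements:
        match = next(
            (d for d in unused
             if abs(d.amount - s.total) <= TOLERANCE
             and 0 <= (d.received_on - s.settled_on).days <= TIMING_WINDOW_DAYS),
            None,
        )
        if match is not None:
            unused.remove(match)
            status = "matched"
        elif (today - s.settled_on).days >= INVESTIGATE_AFTER_DAYS:
            status = "investigate"
        else:
            status = "pending"  # probably a timing difference, check again tomorrow
        results.append((s.source, s.settled_on, status))
    return results


# Made-up sample data: one settlement with a deposit two days later, one with none.
settlements = [
    Settlement("amazon", date(2025, 11, 28), Decimal("1840.12")),
    Settlement("stripe", date(2025, 11, 10), Decimal("500.00")),
]
deposits = [Deposit(date(2025, 11, 30), Decimal("1840.12"))]
results = reconcile(settlements, deposits, today=date(2025, 12, 1))
for line in results:
    print(line)
```

The "pending" state is what keeps ordinary timing differences from generating noise: a settlement only escalates once it has gone unmatched for the full 14 days.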

Component 4: Monthly report

End of month, n8n runs a summary and drops a Telegram message:

Mar 2026 consolidated:
• Revenue: $X (breakdown per business)
• Expenses: $Y (top 10 categories)
• Net: $Z
• Unusual: 2 transactions over $1K not recurring, review needed
• All accounts reconciled

That's it. Monthly close that used to take the bookkeeper 8-10 hours is now a 30-second Telegram notification plus maybe 15 minutes of my time reviewing flagged items.
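
The month-end message above could be assembled along these lines. This is a sketch under assumptions: the row shape (`business`, `category`, `amount`, `recurring`), the function name, and the sample figures are all hypothetical, and the sending step (n8n's Telegram node or similar) is left out.

```python
from collections import defaultdict
from decimal import Decimal


def monthly_summary(label, rows, unusual_threshold=Decimal("1000")):
    """Build the month-end message text from categorized rows.

    Each row is a dict with "business", "category", "amount"
    (positive = income, negative = expense) and "recurring" (bool).
    """
    revenue_by_business = defaultdict(Decimal)
    expenses_by_category = defaultdict(Decimal)
    unusual = 0
    for r in rows:
        if r["amount"] >= 0:
            revenue_by_business[r["business"]] += r["amount"]
        else:
            expenses_by_category[r["category"]] += -r["amount"]
        if abs(r["amount"]) > unusual_threshold and not r["recurring"]:
            unusual += 1
    revenue = sum(revenue_by_business.values(), Decimal(0))
    expenses = sum(expenses_by_category.values(), Decimal(0))
    # Largest expense categories first, capped at ten.
    top = sorted(expenses_by_category.items(), key=lambda kv: kv[1], reverse=True)[:10]
    lines = [
        f"{label} consolidated:",
        f"• Revenue: ${revenue:,.2f} ("
        + ", ".join(f"{b} ${v:,.2f}" for b, v in sorted(revenue_by_business.items())) + ")",
        f"• Expenses: ${expenses:,.2f} ("
        + ", ".join(f"{c} ${v:,.2f}" for c, v in top) + ")",
        f"• Net: ${revenue - expenses:,.2f}",
        f"• Unusual: {unusual} transactions over ${unusual_threshold:,.0f} not recurring",
    ]
    return "\n".join(lines)


# Made-up rows for two placeholder businesses.
sample_rows = [
    {"business": "Shop A", "category": "Sales", "amount": Decimal("3000"), "recurring": True},
    {"business": "Shop B", "category": "Sales", "amount": Decimal("1200"), "recurring": False},
    {"business": "Shop A", "category": "Advertising", "amount": Decimal("-400"), "recurring": True},
    {"business": "Shop B", "category": "Equipment", "amount": Decimal("-1500"), "recurring": False},
]
message = monthly_summary("Mar 2026", sample_rows)
print(message)
```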

What it actually costs

Item	Monthly
Claude API for categorization	~$15
Plaid (free for my transaction volume)	$0
n8n self-hosted on $8 VPS	$8 (shared across 50+ other automations)
Notion (paid plan for database)	$10
My time, 15-30 min/week on review	~1-2 hours/mo
Out-of-pocket	~$35

Before: $650/month bookkeeper.
After: $35/month stack + 1-2 hours of my time.
Monthly saving: $615. Over 9 months: $5,535.
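
The arithmetic behind those numbers, as a quick check. The line items sum to $33, which the table rounds up to about $35. The last line is an added calculation, not a figure from the table: it shows the hourly value of review time at which the saving would be used up, assuming the full 2 hours a month.

```python
tool_costs = {"Claude API": 15, "Plaid": 0, "n8n VPS": 8, "Notion": 10}

stack_exact = sum(tool_costs.values())   # 33
stack_rounded = 35                       # the "~$35" used in the comparison
bookkeeper_monthly = 650

monthly_saving = bookkeeper_monthly - stack_rounded   # 615
nine_month_saving = monthly_saving * 9                # 5535

# Assumption: review takes the upper estimate of 2 hours a month.
break_even_hourly = monthly_saving / 2                # 307.5

print(stack_exact, monthly_saving, nine_month_saving, break_even_hourly)
# prints: 33 615 5535 307.5
```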

What broke (and the fixes)

Break 1 (month 2): Amazon settlement complexity

Amazon settlements are not simple. They include sales, fees, reimbursements, FBA storage fees, tax withholdings, and chargebacks. My first categorization pass lumped them all together as "Amazon revenue," which inflated gross revenue by ~30%.

Fix: Built a parser that splits settlements by line item before categorization. Extra 2 hours of setup. Permanent fix.
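
A minimal sketch of that kind of splitter. The line-item shape (`type`, `amount`) is a simplification, and the type names and ledger categories in the mapping are hypothetical; a real settlement report has more columns and more line types.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical mapping from a settlement line's type to a ledger category.
LINE_TYPE_TO_CATEGORY = {
    "sale": "Amazon sales",
    "fee": "Amazon selling fees",
    "fba_storage": "FBA storage fees",
    "reimbursement": "Amazon reimbursements",
    "tax_withheld": "Sales tax withheld",
    "chargeback": "Chargebacks",
}


def split_settlement(lines):
    """Sum a settlement's line items per ledger category instead of one lump."""
    totals = defaultdict(Decimal)
    for line in lines:
        # Unknown line types get their own bucket so they show up in review.
        category = LINE_TYPE_TO_CATEGORY.get(line["type"], "Amazon uncategorized")
        totals[category] += Decimal(line["amount"])
    return dict(totals)


# Made-up settlement with four line items.
settlement = [
    {"type": "sale", "amount": "1000.00"},
    {"type": "fee", "amount": "-150.00"},
    {"type": "fba_storage", "amount": "-40.00"},
    {"type": "reimbursement", "amount": "25.00"},
]
split = split_settlement(settlement)
net_deposit = sum(split.values())
print(split["Amazon sales"], net_deposit)
```

The per-category totals still add up to the net deposit, so reconciliation against the bank keeps working while the P&L sees sales, fees, and reimbursements separately.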

Break 2 (month 3): PayPal edge cases

Currency conversion on international PayPal payments showed up as two transactions (FX fee + actual transfer). Claude categorized the FX fee as "miscellaneous," which buried about $3 a month of fees in the wrong bucket.

Fix: Added a rule that routes these to a dedicated "PayPal FX fee" category. 10 minutes.
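
One way to express a rule like this is a deterministic pre-pass that runs before the model sees the row. The regex pattern, the `apply_rules` name, and the sample descriptions are illustrative assumptions; the real PayPal memo text may look different.

```python
import re

# Deterministic rules run before the model; first match wins.
# The pattern and category name are illustrative assumptions.
RULES = [
    (re.compile(r"paypal.*(currency conversion|fx fee)", re.IGNORECASE), "PayPal FX fee"),
]


def apply_rules(description: str):
    """Return a category if a hard-coded rule matches, else None (send to the model)."""
    for pattern, category in RULES:
        if pattern.search(description):
            return category
    return None


fx = apply_rules("PayPal currency conversion fee EUR->USD")
other = apply_rules("PayPal transfer from client")
print(fx, other)
```

Rows that match a rule never reach the model, which also trims a little off the API bill.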

Break 3 (month 5): Owner's personal card charged to business

One month I accidentally put a personal dentist charge on the business card. Claude categorized it as "Medical, employee benefit." My tax person noticed, and I had to reclassify it.

Fix: Added weekly human review step. Claude can't catch personal-vs-business when the merchant is ambiguous. Human skim takes 5 min/week.

Break 4 (month 7): Claude API format change

Anthropic rolled out a minor output format change. My parser broke for one day. Categorizations stopped.

Fix: Added a fallback to raw text parse, plus error alert via Telegram. Took an hour. Hasn't broken since.
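
A sketch of what a fallback parse could look like, assuming the model is normally expected to return a JSON object with `category` and `confidence`. The reply shape, the regexes, and the `parse_reply` name are assumptions. The alert itself is left to the caller, which would catch the `ValueError` and send the Telegram message.

```python
import json
import re

CATEGORY_RE = re.compile(r'category"?\s*[:=]\s*"?([^",\n}]+)', re.IGNORECASE)
CONFIDENCE_RE = re.compile(r'confidence"?\s*[:=]\s*"?([0-9]*\.?[0-9]+)', re.IGNORECASE)


def parse_reply(text: str) -> dict:
    """Parse a model reply as JSON, falling back to a tolerant regex scan."""
    try:
        data = json.loads(text)
        return {
            "category": str(data["category"]),
            "confidence": float(data["confidence"]),
            "fallback": False,
        }
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        pass  # not clean JSON in the expected shape; try the raw text
    cat = CATEGORY_RE.search(text)
    conf = CONFIDENCE_RE.search(text)
    if cat is None or conf is None:
        # Nothing usable: raise so the caller can alert instead of failing silently.
        raise ValueError(f"unparseable model reply: {text[:80]!r}")
    return {
        "category": cat.group(1).strip(),
        "confidence": float(conf.group(1)),
        "fallback": True,
    }


clean = parse_reply('{"category": "Advertising", "confidence": 0.9}')
loose = parse_reply("Here is the result:\nCategory: Advertising\nConfidence: 0.9")
print(clean["fallback"], loose["fallback"], loose["category"])
```

Recording whether the fallback path was used makes it easy to count how often the primary format fails, which is an early warning that the prompt or parser needs attention.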

What I still use humans for

Important: I replaced my bookkeeper. I did NOT replace my CPA. Those are different roles.

Year-end tax prep. CPA. $1,200/yr. Worth every dollar. Keeps me out of trouble with the IRS.
Quarterly estimated tax calculation. CPA reviews my numbers, calculates. I don't trust AI for this: the stakes are too high when being wrong costs 10% penalties.
Sales tax compliance (multi-state). I use TaxJar for this, not AI. Automated but it's a dedicated tool, not a Claude prompt.
Ambiguous transaction review. Me, weekly, 5 min. Always a human in the loop.
Any transaction flagged "investigate." Me, within 48 hours. Average 1-2 per month.

When AI bookkeeping will bite you

If you have a complex multi-entity structure. LLCs, S-corps, parent-subsidiary relationships. AI categorization is not at the level of knowing which entity gets which deduction. Get a CPA.
If you have inventory or COGS accounting. This is where bookkeepers earn their keep. AI is okay at simple FIFO but loses on weighted average or specific identification.
If you have employees / payroll. Use Gusto or Rippling. Not a Claude prompt. Payroll errors cost more than bookkeeping fees.
If you are audited. Your CPA will want human-level documentation. AI categorization reasoning is not an audit defense.

Who should actually try this

Solo founder / small business with $100K-$3M revenue
Simple revenue streams (1-3 payment processors max)
Comfortable with Python / n8n / some basic ops setup (or hiring someone to set it up)
Has a CPA already. DO NOT DO THIS WITHOUT A CPA.
Not publicly traded or regulated (financial services, healthcare)

How to start (if you want to)

Keep your current bookkeeper for 1 month while you set up the AI stack in parallel.
Reconcile against their output. Goal: 95%+ match.
If match is good, switch to the AI stack for 1 month WITH weekly human review. Keep the bookkeeper as a consultant on speed dial.
If month 2 still matches, cancel the bookkeeper. Keep the CPA.
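
The "95%+ match" check can be made concrete. This sketch assumes the match is measured per transaction, by comparing the category the bookkeeper assigned with the one the AI stack assigned; measuring it on P&L totals instead would be an equally reasonable reading. The function name and sample data are hypothetical.

```python
def match_rate(bookkeeper: dict, ai_stack: dict) -> float:
    """Share of the bookkeeper's transactions where the AI stack chose the same category.

    Both arguments map a transaction ID to a category name.
    Transactions the AI stack missed entirely count as mismatches.
    """
    if not bookkeeper:
        raise ValueError("nothing to compare against")
    same = sum(1 for tx_id, cat in bookkeeper.items() if ai_stack.get(tx_id) == cat)
    return same / len(bookkeeper)


# Made-up parallel run: 20 transactions, one disagreement.
bookkeeper_books = {f"tx{i}": "Advertising" for i in range(20)}
ai_books = dict(bookkeeper_books)
ai_books["tx7"] = "Miscellaneous"

rate = match_rate(bookkeeper_books, ai_books)
print(f"{rate:.0%}", "ready to switch" if rate >= 0.95 else "keep tuning")
```

The disagreements are worth reading one by one, not just counting: a single miscategorized large transaction matters more than several small ones.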

Budget 20-30 hours of setup time. Or hire someone like me to build it for you. See below.

Want this built for your business?

I set up this exact stack for $1-10M SMBs as part of the Fractional AI Officer Sprint ($5,000-$8,000). Turnaround: 3-4 weeks. Includes the stack plus a 2-hour handoff training. Book a 30-min call to see if it fits.
