Statement Parser API

v1.0

Turn any bank or credit card statement into structured, categorized transaction data.

Overview

The Statement Parser API extracts every transaction from an uploaded bank or credit card statement and returns clean, categorized JSON. Each transaction is classified using the Plaid taxonomy and assigned a normalized merchant name, and the statement math is reconciled automatically.

🏦 Full Extraction

Every transaction, balance, and metadata field, from any statement format

🏷️ Auto-Categorized

Plaid-standard categories with deterministic matching for known brands

✅ Reconciled

Automatic math verification: opening + credits - debits = closing

💡 Perfect For

  • Accounting firms and bookkeeping automation
  • Personal finance and budgeting apps
  • Real estate transaction tracking
  • Expense management and categorization
  • Financial reconciliation workflows

Quick Start

Base URL

https://api.cparse.com/statement/v1

Authentication

http
X-API-Key: YOUR_API_KEY

Get your API key from the dashboard. New accounts include free credit — no credit card required.

Quick Example

cURL

bash
curl --request POST \
  --url https://api.cparse.com/statement/v1/parse \
  --header 'X-API-Key: YOUR_API_KEY' \
  --form 'file=@statement.pdf'

Endpoints

GET /health

Health check endpoint. Returns service status.

Response
json
{ "status": "ok", "service": "statement-parser", "version": "0.1.0" }

POST /parse

Extract structured transaction data from a bank statement.

Request
  • Content-Type: multipart/form-data
  • file (required): Bank statement file (PDF, DOCX, JPEG, PNG, ZIP, 7z)
Response
json
{
  "success": true,
  "file_name": "statement.pdf",
  "pages_processed": 1,
  "ocr_used": false,
  "data": {
    "document_metadata": {
      "document_id": "doc_a1b2c3d4",
      "institution": "ACME BANK",
      "account_number_last4": "4829",
      "account_currency": "GBP",
      "statement_period_start": "2026-03-01",
      "statement_period_end": "2026-03-31"
    },
    "summary": {
      "opening_balance": 1250.50,
      "closing_balance": 1685.53,
      "total_debits": 64.97,
      "total_credits": 500.00,
      "transaction_count": 4,
      "math_reconciliation_status": "verified",
      "reconciliation_difference": 0.0
    },
    "transactions": [
      {
        "transaction_id": "tx_001",
        "date": "2026-03-04",
        "amount": -42.99,
        "currency": "GBP",
        "description_raw": "AMZN MKTPLACE AMZN.CO.UK GBR",
        "description_normalised": "Amazon Marketplace",
        "merchant_name": "Amazon",
        "classification": {
          "category": "general_merchandise",
          "category_detailed": "general_merchandise_online_marketplaces",
          "method": "deterministic",
          "confidence_score": 1.0
        },
        "flags": {
          "is_subscription": false,
          "is_recurring": false,
          "is_duplicate_candidate": false
        }
      },
      {
        "transaction_id": "tx_002",
        "date": "2026-03-07",
        "amount": -10.99,
        "currency": "GBP",
        "description_raw": "NETFLIX.COM BEVERLY HILLS CA",
        "description_normalised": "Netflix",
        "merchant_name": "Netflix",
        "classification": {
          "category": "entertainment",
          "category_detailed": "entertainment_tv_and_movies",
          "method": "deterministic",
          "confidence_score": 1.0
        },
        "flags": {
          "is_subscription": true,
          "is_recurring": true,
          "is_duplicate_candidate": false
        }
      }
    ]
  }
}

Response Fields

Document Metadata

Field                   Description
document_id             Unique identifier for this parsed document
institution             Bank or financial institution name
account_number_last4    Last 4 digits of the account number
account_currency        3-letter ISO currency code (e.g. GBP, USD)
statement_period_start  Statement start date (YYYY-MM-DD)
statement_period_end    Statement end date (YYYY-MM-DD)

Summary

Field                       Description
opening_balance             Starting balance for the statement period
closing_balance             Ending balance for the statement period
total_debits                Sum of all debit transactions (positive number)
total_credits               Sum of all credit transactions (positive number)
transaction_count           Total number of extracted transactions
math_reconciliation_status  "verified" if the balances add up, "discrepancy" if they do not, "not_checked" if required data is missing
reconciliation_difference   Difference between expected and actual closing balance
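
The reconciliation rule (opening + credits - debits = closing) can be reproduced client-side as a sanity check. The sketch below recomputes the expected closing balance from the summary fields of the sample response. The helper name check_reconciliation, the 0.01 tolerance, and the sign convention for the difference (actual minus expected) are assumptions made for illustration, not documented API behaviour.

```python
from decimal import Decimal


def check_reconciliation(summary, tolerance=Decimal("0.01")):
    """Recompute the closing balance and compare it with the reported one.

    Returns (is_consistent, difference). The difference is computed as
    actual minus expected, which is an assumption of this sketch.
    """
    # Convert via str() so binary float noise does not leak into the sums.
    opening = Decimal(str(summary["opening_balance"]))
    credits = Decimal(str(summary["total_credits"]))
    debits = Decimal(str(summary["total_debits"]))
    closing = Decimal(str(summary["closing_balance"]))

    expected = opening + credits - debits
    difference = closing - expected
    return abs(difference) <= tolerance, difference


# Summary values taken from the sample response above.
summary = {
    "opening_balance": 1250.50,
    "closing_balance": 1685.53,
    "total_debits": 64.97,
    "total_credits": 500.00,
}

is_consistent, difference = check_reconciliation(summary)
print(is_consistent, difference)
```

Decimal is used rather than float so that cent-level comparisons are not affected by binary rounding.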

Transaction Fields

Field                             Description
transaction_id                    Sequential ID (e.g. tx_001)
date                              Transaction date (YYYY-MM-DD)
amount                            Signed amount: negative for debits, positive for credits
currency                          3-letter ISO currency code
description_raw                   Original text as it appears on the statement
description_normalised            Cleaned, human-readable version
merchant_name                     Identified merchant or entity
classification.category           Plaid primary category (e.g. food_and_drink)
classification.category_detailed  Plaid detailed category (e.g. food_and_drink_groceries)
classification.method             "deterministic" for known brands, "ai_inference" for AI-classified
classification.confidence_score   1.0 for deterministic, 0.0-0.99 for AI inference
flags.is_subscription             Whether this is a subscription payment
flags.is_recurring                Whether this charge recurs on a schedule
flags.is_duplicate_candidate      True if same date + amount + description appears more than once
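
As an illustration of how these fields combine, the sketch below totals debits per primary category and lists AI-classified transactions whose confidence falls under a threshold. The helper names, the 0.8 threshold, and the third transaction (tx_003) are hypothetical; only the first two transactions mirror the sample response.

```python
from collections import defaultdict


def spend_by_category(transactions):
    """Sum debit amounts (negative values) per primary category."""
    totals = defaultdict(float)
    for tx in transactions:
        if tx["amount"] < 0:  # debits are negative, per the field table
            totals[tx["classification"]["category"]] += -tx["amount"]
    return {category: round(total, 2) for category, total in totals.items()}


def needs_review(transactions, min_confidence=0.8):
    """Return IDs of AI-classified transactions below the chosen threshold."""
    return [
        tx["transaction_id"]
        for tx in transactions
        if tx["classification"]["method"] == "ai_inference"
        and tx["classification"]["confidence_score"] < min_confidence
    ]


transactions = [
    {"transaction_id": "tx_001", "amount": -42.99,
     "classification": {"category": "general_merchandise",
                        "method": "deterministic", "confidence_score": 1.0}},
    {"transaction_id": "tx_002", "amount": -10.99,
     "classification": {"category": "entertainment",
                        "method": "deterministic", "confidence_score": 1.0}},
    # Hypothetical AI-classified transaction, not part of the sample response.
    {"transaction_id": "tx_003", "amount": -11.00,
     "classification": {"category": "food_and_drink",
                        "method": "ai_inference", "confidence_score": 0.62}},
]

print(spend_by_category(transactions))
print(needs_review(transactions))
```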

Error Handling

Status  Error                            Solution
400     No file provided                 Include a file in the multipart request
401     Authentication required          Pass X-API-Key header with a valid key
402     Insufficient balance             Top up your account at cparse.com/dashboard
413     File too large / too many pages  Reduce file size or split the document
415     Unsupported file type            Use PDF, DOCX, JPEG, PNG, ZIP, or 7z
500     Internal error                   Retry or contact support
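
One way a client might act on these statuses is sketched below: retry on 500 with exponential backoff, and fail fast with the matching hint otherwise. Everything here is an assumption layered on the table: the send callable (which should perform the upload and return a (status_code, parsed_body) pair), the StatementParserError class, the retry counts, and the choice of 200 as the success status. The demo uses a stub instead of a real request.

```python
import time

# Hints copied from the error table, keyed by HTTP status.
ERROR_HINTS = {
    400: "Include a file in the multipart request",
    401: "Pass X-API-Key header with a valid key",
    402: "Top up your account at cparse.com/dashboard",
    413: "Reduce file size or split the document",
    415: "Use PDF, DOCX, JPEG, PNG, ZIP, or 7z",
    500: "Retry or contact support",
}
RETRYABLE = {500}  # assumption: only 500 is worth retrying automatically


class StatementParserError(Exception):
    """Hypothetical client-side error carrying the HTTP status."""

    def __init__(self, status):
        self.status = status
        hint = ERROR_HINTS.get(status, "Unexpected status")
        super().__init__(f"HTTP {status}: {hint}")


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() until it succeeds, retrying only retryable statuses."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status == 200:  # assumed success status
            return body
        if status not in RETRYABLE or attempt == max_attempts:
            raise StatementParserError(status)
        # Exponential backoff: base_delay, 2x, 4x, ...
        sleep(base_delay * 2 ** (attempt - 1))


# Demo with a stubbed sender: one 500, then a success. No network involved.
stub_responses = iter([(500, None), (200, {"success": True})])
result = call_with_retry(lambda: next(stub_responses), sleep=lambda seconds: None)
print(result)
```

Client errors such as 401 or 415 are raised immediately because repeating the same request would not change the outcome.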

Tips

📌 Best Practices

  • Text-based PDFs work best. For scanned statements, ensure the PDF has selectable text, or pre-process with an OCR tool before uploading.
  • Check the reconciliation status. A "verified" status means the statement math adds up. A "discrepancy" may indicate missing transactions or fees not shown as line items.
  • Use deterministic classifications with confidence. Transactions classified with method: deterministic and confidence_score: 1.0 are matched via regex against known global brands.
  • Review duplicate candidates. The is_duplicate_candidate flag highlights transactions with identical date, amount, and description within the same document.
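
For the duplicate review step, a minimal sketch is to group flagged transactions by the fields the flag is based on, so each group can be inspected together. It assumes the "description" used for matching is description_raw, which the documentation does not specify; the helper name and the sample data are hypothetical.

```python
from collections import defaultdict


def group_duplicate_candidates(transactions):
    """Group flagged transactions by (date, amount, description_raw)."""
    groups = defaultdict(list)
    for tx in transactions:
        if tx["flags"]["is_duplicate_candidate"]:
            key = (tx["date"], tx["amount"], tx["description_raw"])
            groups[key].append(tx["transaction_id"])
    return dict(groups)


# Hypothetical transactions for illustration only.
transactions = [
    {"transaction_id": "tx_003", "date": "2026-03-12", "amount": -4.50,
     "description_raw": "COFFEE SHOP LONDON",
     "flags": {"is_duplicate_candidate": True}},
    {"transaction_id": "tx_004", "date": "2026-03-12", "amount": -4.50,
     "description_raw": "COFFEE SHOP LONDON",
     "flags": {"is_duplicate_candidate": True}},
    {"transaction_id": "tx_005", "date": "2026-03-13", "amount": -20.00,
     "description_raw": "FUEL STATION",
     "flags": {"is_duplicate_candidate": False}},
]

for key, ids in group_duplicate_candidates(transactions).items():
    print(key, ids)
```

A flagged pair is not necessarily an error: two identical purchases on the same day are legitimate, so the groups are best treated as a review queue.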

🔒 Security & Privacy

  • ✅ Statements are processed in memory and never stored
  • ✅ All requests over HTTPS
  • ✅ API key authentication required on every request
  • ✅ No transaction data is logged or retained

Code Examples

Python

python
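import requests

url = "https://api.cparse.com/statement/v1/parse"
headers = {"X-API-Key": "YOUR_API_KEY"}

# Upload the statement as multipart/form-data under the "file" field.
# The 120-second timeout is an arbitrary choice; adjust it to your documents.
with open("statement.pdf", "rb") as f:
    response = requests.post(url, files={"file": f}, headers=headers, timeout=120)

# Raise on 4xx/5xx responses (see Error Handling) before reading the body.
response.raise_for_status()

data = response.json()["data"]
print(f"Institution: {data['document_metadata']['institution']}")
print(f"Reconciliation: {data['summary']['math_reconciliation_status']}")

# One line per transaction: date, right-aligned amount, cleaned description.
for tx in data["transactions"]:
    print(f"{tx['date']}  {tx['amount']:>10.2f}  {tx['description_normalised']}")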

JavaScript (Node.js)

javascript
import FormData from 'form-data';
import fs from 'fs';
import axios from 'axios';

const form = new FormData();
form.append('file', fs.createReadStream('statement.pdf'));

const response = await axios.post(
  'https://api.cparse.com/statement/v1/parse',
  form,
  {
    headers: {
      ...form.getHeaders(),
      'X-API-Key': 'YOUR_API_KEY',
    },
  }
);

const { document_metadata, summary, transactions } = response.data.data;
console.log(`${transactions.length} transactions extracted`);
console.log(`Reconciliation: ${summary.math_reconciliation_status}`);