Posted 2026-01-13

title: How to build a PDF Autofiller Agent?
tags: [agent, pdf]
categories: [agent]
date: [2026-01-15 17:15:00]
index_img: /img/agent.png
cover: /img/agent.png
thumbnail: /img/agent.png
excerpt: Notes

How to build a PDF Autofiller Agent?

Requirements:

Design a Copilot chatbox that does the following: the user uploads a PDF form with fillable fields and gives the chatbox a command describing what to fill in. The AI parses the command, identifies the target fields and their values, writes those values into the form, and returns the filled PDF to the user.

Existing Tools

Tool	Printed Select	Printed Edit	Scanned Select	Scanned Edit	Comments
Adobe Acrobat PDF	✅	❓	✅	❓	Needs Pro subscription to edit
ABBYY FineReader PDF					Can’t install on Mac
PDFfiller	✅	✅	❌	❌
LuminPDF	✅	❓	✅	❓	Needs Pro subscription to edit

Workflow Overview

PDF form Template
      ↓
PDF form parsing
      ↓
PDF manifest Generation
      ↓
LLM API
      ↓
PDF filling engine
      ↓
Generate filled PDF

Tech Stack (Client-side, Serverless)

PDF Form Parsing

Input: PDF raw data

12 0 obj
<<
  /Type /Annot
  /Subtype /Widget
  /FT /Tx             % Field Type: Text
  /T (Age_Field)      % Field Name (Key)
  /V ( )              % Value: Empty
  /Rect [100 100 200 120] % Position on page
  /AP << /N 13 0 R >> % Appearance Stream (how it looks)
>>
endobj

Inputs and labels are not linked in the PDF's data structure, unlike HTML, where a label can reference its input by id. The only way to pair a label with its input is to compare their coordinates.

Web Browser

pdf.js: parses the raw PDF in the browser.

Output: a map from each object (input or label) to its coordinates.

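// `annot` is one entry from pdf.js `await page.getAnnotations()`
const [x, y, xMax, yMax] = annot.rect; // PDF points, origin at bottom-left
const entry = {
    id: annot.fieldName || "unknown_id", // The internal ID (e.g., "txt_01")
    type: annot.subtype,                 // "Widget" (usually)
    inputType: annot.fieldType,          // "Tx" (Text), "Btn" (Button/Checkbox)
    rect: {
      x: Math.round(x),
      y: Math.round(y),
      width: Math.round(xMax - x),
      height: Math.round(yMax - y)
    }
};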

PDF Manifest Generation

Given the coordinates of each object, pair every input field with its matching label and return the pairs as JSON.

A nested for-loop is sufficient: compute the Euclidean distance between every input and every label, then keep the nearest label for each input.
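As a rough formalization of the distance comparison (one possible choice, not the only one), each rectangle can be reduced to its center point and the label with the smallest center-to-center distance wins. The symbols c, d, f and ℓ are introduced here for illustration only.

```latex
c(r) = \left( \frac{x_0 + x_1}{2},\; \frac{y_0 + y_1}{2} \right), \qquad r = [x_0, y_0, x_1, y_1]

d(f, \ell) = \sqrt{\left(c_x(f) - c_x(\ell)\right)^2 + \left(c_y(f) - c_y(\ell)\right)^2}

\mathrm{label}(f) = \arg\min_{\ell} \, d(f, \ell)
```

For the Age_Field widget shown earlier, the rect [100 100 200 120] has center (150, 110). A hypothetical label box [40 100 90 120] on the same line has center (65, 110), so d = sqrt(85² + 0²) = 85.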

LLM Agent

Use the Vercel AI SDK to orchestrate the “Reasoning-Action” loop. The LLM does not modify the file directly; it acts as a router that decides which client-side tool to call.

Framework: Next.js App Router + ai (Vercel AI SDK).
Tool Calling: Define a tool schema (using Zod) that describes the form fields. The LLM outputs structured JSON matching this schema instead of plain text.
Client-Side Execution: Use the useChat hook to intercept the LLM’s tool call. When the LLM requests fill_fields, the browser executes the JavaScript logic to update the PDF.

PDF Filling Engine

Library: pdf-lib (Client-side JavaScript).
Logic:
1. Load the PDF Uint8Array in memory.
2. Locate fields using the IDs provided by the LLM tool call.
3. Execute Write:
  - form.getTextField(id).setText(value)
  - form.getCheckBox(id).check()
4. Update Appearance: Run form.updateFieldAppearances() to ensure text is rendered visibly (generating the /AP stream).
5. Output: Generate a new Blob for user download.

Tech Stack (Server-side)

PDF Form Parsing

Input: PDF raw data (bytes).

As in the JS version, inputs (widgets) and visual labels (text) are disconnected in the PDF structure, so they must be extracted separately.

Tool: PyMuPDF (import fitz)

Why: One library returns both widget rectangles (page.widgets()) and word-level text coordinates (page.get_text("words")), and it is fast in practice.

Output: A map between each object and its coordinates.

Python

# Built per widget from page.widgets(); label text comes from page.get_text("words")
{
    "id": widget.field_name,           # Internal ID (e.g., "txt_01")
    "type": widget.field_type_string,  # "Text", "CheckBox", etc. (field_type is the integer code)
    "rect": list(widget.rect)          # Bounding box [x0, y0, x1, y1]
}
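A minimal sketch of the extraction step, assuming PyMuPDF is installed. The function names page_to_records and extract_pdf are hypothetical; the PyMuPDF calls used are page.widgets(), page.get_text("words"), and the widget attributes field_name, field_type_string and rect.

```python
def page_to_records(page):
    """Collect form widgets and words from one PyMuPDF page.

    `page` is expected to behave like a PyMuPDF Page: `page.widgets()`
    yields widgets with `field_name`, `field_type_string` and `rect`;
    `page.get_text("words")` yields (x0, y0, x1, y1, word, ...) tuples.
    """
    widgets = []
    for widget in page.widgets():
        r = widget.rect
        widgets.append({
            "id": widget.field_name or "unknown_id",
            "type": widget.field_type_string,
            "rect": [r.x0, r.y0, r.x1, r.y1],
        })
    words = [
        {"text": w[4], "rect": [w[0], w[1], w[2], w[3]]}
        for w in page.get_text("words")
    ]
    return {"widgets": widgets, "words": words}


def extract_pdf(path):
    """Run page_to_records over every page of the PDF at `path`."""
    import fitz  # PyMuPDF, imported lazily so page_to_records has no hard dependency

    with fitz.open(path) as doc:
        return [page_to_records(page) for page in doc]
```

Note that get_text("words") returns single words, so a label such as "Date of Birth" arrives as three entries. Grouping words into label phrases is left out of this sketch.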

PDF Manifest Generation

Logic: Spatial Matching (Euclidean Distance). Given the coordinates of widgets and text blocks, find the matching pair.

Algorithm: For each input field, calculate distance to all text blocks. Find the text that is closest (Top/Left priority) to the input field.
Result: A clean JSON list linking field_id to label_text (e.g., {"id": "t1", "label": "Date of Birth"}).
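The matching described above could be sketched as follows. This is one possible heuristic, not a definitive implementation: it compares box centers, assumes a top-left origin (as in PyMuPDF), and assumes label words are already grouped into phrases. The names build_manifest, center and misplaced_penalty are hypothetical.

```python
import math


def center(rect):
    """Center point of an [x0, y0, x1, y1] box."""
    x0, y0, x1, y1 = rect
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def build_manifest(widgets, labels, misplaced_penalty=2.0):
    """Link each widget to its nearest label, preferring labels to the left or above.

    With a top-left origin, "above" means a smaller y. A label whose center is
    neither left of nor above the widget center has its distance multiplied by
    `misplaced_penalty`, a rough way to express the Top/Left priority.
    """
    manifest = []
    for widget in widgets:
        wx, wy = center(widget["rect"])
        best_label, best_score = None, math.inf
        for label in labels:
            lx, ly = center(label["rect"])
            score = math.hypot(wx - lx, wy - ly)  # Euclidean distance
            if not (lx < wx or ly < wy):
                score *= misplaced_penalty
            if score < best_score:
                best_label, best_score = label["text"], score
        manifest.append({"id": widget["id"], "label": best_label})
    return manifest


widgets = [
    {"id": "t1", "rect": [150, 100, 300, 120]},
    {"id": "t2", "rect": [150, 140, 300, 160]},
]
labels = [
    {"text": "Name", "rect": [50, 100, 140, 120]},
    {"text": "Date of Birth", "rect": [50, 140, 140, 160]},
]
manifest = build_manifest(widgets, labels)
```

Center-to-center distance is the simplest choice, but it can mislead for very wide inputs, whose center sits far from a label at the left edge. Measuring from the nearest edge instead is a possible refinement.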

LLM Agent

Tool: LangChain + Pydantic

Use Pydantic to define a strict schema for the LLM output (structured output), so no free-text response has to be parsed.

Workflow:

Context Injection: Inject the PDF Manifest JSON directly into the System Prompt.
Reasoning: LLM maps User Command -> Field IDs.
Output: LLM returns a Pydantic object (JSON) containing the fill plan.

Python

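from pydantic import BaseModel

class FieldUpdate(BaseModel):
    field_id: str  # widget ID taken from the manifest
    value: str     # text to write into that field

class FillPlan(BaseModel):
    updates: list[FieldUpdate]  # one command may touch several fields

# chat_model is any LangChain chat model that supports structured output;
# the LLM is forced to return this structure
structured_llm = chat_model.with_structured_output(FillPlan)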
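The context-injection step from the workflow above could look roughly like this. The prompt wording and the function name build_system_prompt are assumptions made for illustration; only the idea of embedding the manifest JSON in the system prompt comes from the workflow.

```python
import json


def build_system_prompt(manifest):
    """Embed the manifest as JSON so the model can only pick known field ids."""
    manifest_json = json.dumps(manifest, indent=2, ensure_ascii=False)
    return (
        "You fill PDF forms. Map the user's command to the fields below.\n"
        "Use only ids that appear in this manifest and do not invent values.\n\n"
        f"Form fields:\n{manifest_json}"
    )


prompt = build_system_prompt([{"id": "t1", "label": "Date of Birth"}])
```

The resulting string would be passed as the system message and the user's command as the human message when invoking structured_llm.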

PDF Filling Engine

Tool: pypdf

Why: Robust support for writing AcroForms and updating appearance streams.

Action:

Load PDF bytes using PdfReader.
Map the LLM’s Pydantic output to a dictionary: { "field_id": "value" }.

Execute filling:

Python

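from io import BytesIO
from pypdf import PdfReader, PdfWriter

writer = PdfWriter()
writer.append(PdfReader(BytesIO(pdf_bytes)))  # copy pages and the AcroForm
for page in writer.pages:  # fields may sit on any page
    writer.update_page_form_field_values(
        page, fields_dict,
        auto_regenerate=True,  # sets /NeedAppearances so viewers redraw the text
    )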

Return the BytesIO stream to the user.
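The mapping step in the list above (Pydantic output to dictionary) could be sketched as below, assuming the LLM output is available as a list of FieldUpdate objects. The function name plan_to_fields_dict is hypothetical, and rejecting unknown ids is an added safeguard rather than something the steps above require.

```python
from pydantic import BaseModel


class FieldUpdate(BaseModel):
    field_id: str
    value: str


def plan_to_fields_dict(updates, manifest):
    """Turn a list of FieldUpdate objects into {field_id: value}.

    Updates whose field_id is not in the manifest are returned separately
    instead of being written, since the model may produce an id that does
    not exist in the form.
    """
    known_ids = {entry["id"] for entry in manifest}
    fields_dict, rejected = {}, []
    for update in updates:
        if update.field_id in known_ids:
            fields_dict[update.field_id] = update.value
        else:
            rejected.append(update.field_id)
    return fields_dict, rejected


manifest = [{"id": "t1", "label": "Date of Birth"}]
updates = [
    FieldUpdate(field_id="t1", value="1990-01-01"),
    FieldUpdate(field_id="t9", value="x"),
]
fields_dict, rejected = plan_to_fields_dict(updates, manifest)
```

Here fields_dict is what gets passed to update_page_form_field_values, and rejected can be reported back to the user in the chat.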

http://example.com/2026/01/13/PDF_Autofiller/

Author

John Doe

Posted on

2026-01-13

Updated on

2026-02-02
