😫 The Pain Point
Duplex (2-sided) scanners often scan the back of single-sided documents, producing a “Content - Blank - Content - Blank” page sequence. You then have to delete the blank pages from the PDF by hand.
🚀 Agentic Solution
Visual Inspector: Checks the “Ink Density” of each page. No ink = Blank.
Key Features:
- Sensitivity: Adjustable threshold (because scanned paper is never 100% #FFFFFF white).
⚔️ Phase 1: Commander (Quick Fix)
For a quick cleanup.
Prompt:
“Iterate through the pages of scan.pdf. Convert each page to a grayscale image. Calculate the percentage of dark (non-white) pixels. If it’s less than 0.5%, remove the page. Save the result as clean.pdf.”
Result: A clean PDF.
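What the agent writes for this prompt could look roughly like the sketch below. It is one possible version, not the only one: it assumes the PyMuPDF library (`fitz`) is installed, and the names `ink_ratio`, `remove_blank_pages`, and the `white_cutoff` value of 200 are illustrative choices, not part of the prompt.

```python
def ink_ratio(gray_bytes, white_cutoff=200):
    """Fraction of pixels darker than white_cutoff (0 = black, 255 = white)."""
    if not gray_bytes:
        return 0.0
    dark = sum(1 for value in gray_bytes if value < white_cutoff)
    return dark / len(gray_bytes)


def remove_blank_pages(src="scan.pdf", dst="clean.pdf",
                       max_ink=0.005, white_cutoff=200):
    """Save a copy of src as dst, dropping pages with less than max_ink ink."""
    # Assumption: PyMuPDF is available. Imported here so ink_ratio
    # can be used and tested without it.
    import fitz

    doc = fitz.open(src)
    keep = []
    for index, page in enumerate(doc):
        # Render as 8-bit grayscale without alpha: one byte per pixel.
        pix = page.get_pixmap(colorspace=fitz.csGRAY, alpha=False)
        if ink_ratio(pix.samples, white_cutoff) >= max_ink:
            keep.append(index)
    if not keep:
        doc.close()
        raise ValueError("every page looks blank; nothing to save")
    doc.select(keep)  # keep only the listed pages, in order
    doc.save(dst)
    doc.close()
    return keep
```

Calling `remove_blank_pages()` with the defaults matches the prompt: `scan.pdf` in, `clean.pdf` out, 0.5% cutoff.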
🏗️ Phase 2: Architect (Permanent Tool)
For Office Admins.
Engineering Prompt:
**Role:** Python PDF Developer
**Task:** Create a "Blank Page Remover".
**Requirements:**
1. **GUI:**
   * Select PDF.
   * Slider: "Sensitivity" (Noise Box).
   * "Clean" button.
2. **Logic:**
   * Convert page to image.
   * Count non-white pixels.
   * Compare vs Threshold.
   * Build new PDF with only valid pages.
3. **Deliverables:** `remove_blank.py`, `run.bat` (Windows), `run.sh` (Mac).
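For orientation, the GUI half of `remove_blank.py` might come back looking something like this Tkinter sketch. Everything here is an assumption for illustration: the 0-100 slider range, the 2% upper limit (`MAX_INK_FRACTION`), the `_clean.pdf` output name, and the `clean_fn(src, dst, max_ink)` callback that stands in for the page-filtering logic.

```python
import os

MAX_INK_FRACTION = 0.02  # hypothetical top of the slider: 2% ink


def sensitivity_to_threshold(slider_value):
    """Map a 0-100 slider position to an ink-fraction threshold."""
    clamped = max(0, min(100, slider_value))
    return clamped / 100 * MAX_INK_FRACTION


def launch(clean_fn):
    """Open the window; clean_fn(src, dst, max_ink) does the actual work."""
    import tkinter as tk
    from tkinter import filedialog, messagebox

    root = tk.Tk()
    root.title("Blank Page Remover")
    chosen = tk.StringVar(value="")

    def pick_pdf():
        path = filedialog.askopenfilename(filetypes=[("PDF files", "*.pdf")])
        if path:
            chosen.set(path)

    def clean():
        src = chosen.get()
        if not src:
            messagebox.showwarning("Blank Page Remover", "Select a PDF first.")
            return
        base, _ = os.path.splitext(src)
        dst = base + "_clean.pdf"
        clean_fn(src, dst, sensitivity_to_threshold(slider.get()))
        messagebox.showinfo("Blank Page Remover", f"Saved {dst}")

    tk.Button(root, text="Select PDF", command=pick_pdf).pack(fill="x")
    tk.Label(root, textvariable=chosen).pack(fill="x")
    slider = tk.Scale(root, from_=0, to=100, orient=tk.HORIZONTAL,
                      label="Sensitivity")
    slider.set(25)  # 25 maps to 0.5%, the Phase 1 cutoff
    slider.pack(fill="x")
    tk.Button(root, text="Clean", command=clean).pack(fill="x")
    root.mainloop()
```

In this sketch a higher slider value means a higher threshold, so more pages are treated as blank.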
🧠 Prompt Decoding
- Threshold: Real-world paper has dust and grain, so a strict “is completely white?” check will fail on pages that are actually blank. The logic checks “is mostly white?” instead, which tolerates that noise.
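The difference between the two checks is easy to see on a simulated page. This is a minimal sketch using a plain list of grayscale values (0 = black, 255 = white); the paper tone of 245, the dust value of 90, and the `white_cutoff` of 200 are made-up illustration values.

```python
def is_blank_strict(pixels):
    """Blank only if every pixel is pure white."""
    return all(value == 255 for value in pixels)


def is_blank_tolerant(pixels, white_cutoff=200, max_ink=0.005):
    """Blank if fewer than max_ink of the pixels are darker than white_cutoff."""
    dark = sum(1 for value in pixels if value < white_cutoff)
    return dark / len(pixels) < max_ink


# Simulated 100x100 blank scan: off-white paper with 20 dust specks.
blank_scan = [245] * 10_000
for i in range(20):
    blank_scan[i * 500] = 90

# Same paper, but with 300 dark pixels of "text" (3% ink).
text_scan = [245] * 10_000
for i in range(300):
    text_scan[i] = 30

print(is_blank_strict(blank_scan))    # False: the paper is not pure white
print(is_blank_tolerant(blank_scan))  # True: only 0.2% of pixels are dark
print(is_blank_tolerant(text_scan))   # False: 3% ink is above the 0.5% cutoff
```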
🛠️ Instructions
- Copy Prompt -> Paste -> Run.
- Select PDF -> Clean.