📄

Extract Images from Docs

Mining original image files embedded inside Word or PDF documents without losing quality.

Document ⭐ Beginner ⏱️ 3 minutes

😫 The Pain Point

A client sends a Word report with 50 site photos inside. You need those photos as separate JPGs to upload to your system. Right-click -> Save as Picture… 50 times? No thanks.

🚀 Agentic Solution

Deep Extraction: Digs into the file's internal structure (the ZIP container of a .docx, or the embedded image objects of a PDF) to retrieve the original assets.

Key Features:

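  • Original Quality: Retrieves the image files exactly as they are stored in the document, with no screenshotting or re-encoding during extraction (Word itself may downsample pictures on save unless its image compression option is turned off).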
  • Bulk: Process an entire folder of reports.

⚔️ Phase 1: Commander (Quick Fix)

For extracting from a single file.

Prompt:

“I have report.docx. Since docx is a zip file, extract all media images inside it to an ‘Images’ folder. Or use a Python library to do it.”

Result: All images extracted instantly.
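
For reference, the kind of script this prompt tends to produce can be sketched with only the Python standard library. This is a minimal sketch, not a definitive implementation: the function name `extract_docx_images` and the default `Images` output folder are illustrative assumptions, and it assumes the pictures sit under `word/media/` as they do in a typical .docx.

```python
import zipfile
from pathlib import Path


def extract_docx_images(docx_path, out_dir="Images"):
    """Copy every file under word/media/ in a .docx into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    with zipfile.ZipFile(docx_path) as archive:
        for member in archive.namelist():
            # Embedded pictures live under word/media/ inside the archive.
            if not member.startswith("word/media/") or member.endswith("/"):
                continue
            # Keep only the base name so nothing is written outside out_dir.
            target = out / Path(member).name
            target.write_bytes(archive.read(member))
            saved.append(target)
    return saved
```

Calling `extract_docx_images("report.docx")` would write files such as `image1.jpeg` into the `Images` folder and return their paths. The bytes are copied as stored, so no re-encoding takes place.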

🏗️ Phase 2: Architect (Permanent Tool)

For Archivists/Designers.

Engineering Prompt:

**Role:** Python Document Developer
**Task:** Create an "Asset Extractor for Docs".
**Requirements:**
1.  **GUI:**
    *   Select File Type (Word or PDF).
    *   Select Input Folder.
    *   "Extract Images" button.
2.  **Logic:**
    *   **Word:** unzip the `.docx` and copy files from `word/media` (Fastest & Best quality).
    *   **PDF:** Use `fitz` (PyMuPDF) to iterate pages, list images with `page.get_images()`, and pull the raw bytes with `doc.extract_image(xref)`.
    *   Save to Output Folder.
3.  **Deliverables:** `doc_img_extract.py`, `run.bat` (Windows), `run.sh` (Mac).
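
The PDF branch of the logic above could look roughly like the sketch below. It assumes PyMuPDF is installed (`pip install pymupdf`). The names `extract_pdf_images` and `image_filename`, and the output naming scheme, are illustrative assumptions rather than part of the prompt. The GUI and the Word branch are left out.

```python
from pathlib import Path


def image_filename(pdf_stem, page_number, xref, ext):
    """Build a unique output name, e.g. report_p3_x17.png (hypothetical scheme)."""
    return f"{pdf_stem}_p{page_number}_x{xref}.{ext}"


def extract_pdf_images(pdf_path, out_dir):
    """Save every image referenced by the pages of one PDF into out_dir."""
    import fitz  # PyMuPDF; imported lazily so Word-only runs do not need it

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(pdf_path).stem
    saved, seen = [], set()
    doc = fitz.open(pdf_path)
    try:
        for page in doc:
            for img in page.get_images(full=True):
                xref = img[0]  # first tuple item is the image's xref number
                if xref in seen:  # the same image can be reused on several pages
                    continue
                seen.add(xref)
                data = doc.extract_image(xref)  # dict with "image" bytes and "ext"
                name = image_filename(stem, page.number + 1, xref, data["ext"])
                target = out / name
                target.write_bytes(data["image"])
                saved.append(target)
    finally:
        doc.close()
    return saved
```

Tracking `seen` xrefs avoids saving a logo or letterhead once per page. Each file is named after the first page on which the image appears.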

🧠 Prompt Decoding

  • Word as Zip: A .docx file is a ZIP archive of XML files and images. Treating it as a zip file is a “hacker” trick that makes extraction fast and robust without needing MS Word installed.
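
To see the "Word as Zip" idea for yourself, you can list what a document contains before extracting anything. The sketch below uses only the standard library. The helper name `list_docx_media` is an illustrative assumption.

```python
import zipfile


def list_docx_media(docx_path):
    """Return (name, size_in_bytes) for each embedded media file in a .docx."""
    if not zipfile.is_zipfile(docx_path):
        # Legacy binary .doc files, for example, are not ZIP archives.
        raise ValueError(f"{docx_path} is not a ZIP-based document")
    with zipfile.ZipFile(docx_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if info.filename.startswith("word/media/") and not info.is_dir()
        ]
```

A quick call such as `list_docx_media("report.docx")` shows how many pictures are inside and how large each one is, which is a cheap check before running a bulk job.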

🛠️ Instructions

  1. Copy Prompt -> Paste -> Run.
  2. Select Doc/PDF -> Extract.
