πŸ–ΌοΈ

Advanced Deduplicator

Find visually similar images using AI-powered perceptual hashing with customizable similarity thresholds.

Image ⭐⭐⭐ Advanced ⏱️ 5 minutes

😫 The Pain Point

You have the same photo in multiple resolutions, crops, and slight edits. Simple hash comparison won’t catch these β€œnear duplicates” because the bytes are different.

πŸš€ Agentic Solution

An AI-Powered Similarity Detector that finds visually similar images, not just identical ones.

Key Features:

  • Multiple Algorithms: Average hash, Perceptual hash, Difference hash.
  • Adjustable Threshold: Control how similar images need to be to match.
  • Visual Comparison: Side-by-side preview of similar pairs.

βš”οΈ Phase 1: Commander (Quick Fix)

For finding similar images.

Prompt:

β€œI have a folder photos with near-duplicate images. Write a Python script using imagehash to:

  1. Hash Method: Use perceptual hash (phash).
  2. Similarity: Find images with Hamming distance < 10 (adjustable).
  3. Report: Group similar images and print paths.
  4. Dry Run: List only; --delete to remove extras (keep largest file).

Print progress. Show similarity scores. Handle corrupt images gracefully.”

Result: A truly clean photo library with no visual duplicates.

πŸ—οΈ Phase 2: Architect (Permanent Tool)

For Professional Photographers.

Engineering Prompt:

**Role:** Python GUI Developer (PyQt6 Specialist)
**Task:** Create "Advanced Image Similarity Finder" Desktop App

**Objective:** A standalone desktop tool to find and clean near-duplicate images using visual similarity algorithms.

**Tech Stack:**
* Language: Python 3.10+
* GUI Library: PyQt6 (Cross-platform coverage for Windows/macOS)
* Image Processing: imagehash, Pillow
* Packaging: PyInstaller

**Functional Requirements:**
1.  **UI Layout (PyQt6):**
    *   **Top Bar:** Folder Selection, Scan Button.
    *   **Settings Panel:** Algorithm Dropdown (aHash, pHash, dHash), Threshold Slider (0-60).
    *   **Main View:** Side-by-side comparison of found duplicates with "Keep Left" / "Keep Right" buttons.
    *   *Constraint:* Use `QScrollArea` for the list of duplicates.

2.  **Core Logic:**
    *   Calculate hashes not just bytes (Perceptual Hash).
    *   Compare all images (O(n^2) or optimized).
    *   **Threading:** Use `QThread` for the scanning process to ensure the UI remains responsive.

3.  **Deliverables:**
    *   `main.py`: Complete source code.
    *   `requirements.txt`: Dependencies.
    *   **Build Instructions:**
        *   Windows: `pyinstaller --onefile --noconsole main.py`
        *   macOS: `pyinstaller --windowed --noconsole main.py`

🧠 Prompt Decoding

  • Hamming Distance: The number of bits that differ between two hashes. Lower = more similar. 0 = identical.

πŸ› οΈ Instructions

  1. Install: pip install imagehash
  2. Copy Prompt β†’ Run.
  3. Adjust threshold to catch more or fewer matches.

Related Workflows

Explore other categories

πŸ“¬

Get Started with Agentic Working

Subscribe to receive updates from AgenticWorking.io

πŸ“– Free eBook Guide πŸ“¦ 7 Ready-to-use Scripts πŸ”” Weekly Tips

No spam, unsubscribe anytime. Join 1,000+ subscribers.