π« The Pain Point
You have the same photo in multiple resolutions, crops, and slight edits. Simple hash comparison wonβt catch these βnear duplicatesβ because the bytes are different.
π Agentic Solution
An AI-Powered Similarity Detector that finds visually similar images, not just identical ones.
Key Features:
- Multiple Algorithms: Average hash, Perceptual hash, Difference hash.
- Adjustable Threshold: Control how similar images need to be to match.
- Visual Comparison: Side-by-side preview of similar pairs.
βοΈ Phase 1: Commander (Quick Fix)
For finding similar images.
Prompt:
βI have a folder
photoswith near-duplicate images. Write a Python script using imagehash to:
- Hash Method: Use perceptual hash (phash).
- Similarity: Find images with Hamming distance < 10 (adjustable).
- Report: Group similar images and print paths.
- Dry Run: List only;
--deleteto remove extras (keep largest file).Print progress. Show similarity scores. Handle corrupt images gracefully.β
Result: A truly clean photo library with no visual duplicates.
ποΈ Phase 2: Architect (Permanent Tool)
For Professional Photographers.
Engineering Prompt:
**Role:** Python GUI Developer (PyQt6 Specialist)
**Task:** Create "Advanced Image Similarity Finder" Desktop App
**Objective:** A standalone desktop tool to find and clean near-duplicate images using visual similarity algorithms.
**Tech Stack:**
* Language: Python 3.10+
* GUI Library: PyQt6 (Cross-platform coverage for Windows/macOS)
* Image Processing: imagehash, Pillow
* Packaging: PyInstaller
**Functional Requirements:**
1. **UI Layout (PyQt6):**
* **Top Bar:** Folder Selection, Scan Button.
* **Settings Panel:** Algorithm Dropdown (aHash, pHash, dHash), Threshold Slider (0-60).
* **Main View:** Side-by-side comparison of found duplicates with "Keep Left" / "Keep Right" buttons.
* *Constraint:* Use `QScrollArea` for the list of duplicates.
2. **Core Logic:**
* Calculate hashes not just bytes (Perceptual Hash).
* Compare all images (O(n^2) or optimized).
* **Threading:** Use `QThread` for the scanning process to ensure the UI remains responsive.
3. **Deliverables:**
* `main.py`: Complete source code.
* `requirements.txt`: Dependencies.
* **Build Instructions:**
* Windows: `pyinstaller --onefile --noconsole main.py`
* macOS: `pyinstaller --windowed --noconsole main.py`
π§ Prompt Decoding
- Hamming Distance: The number of bits that differ between two hashes. Lower = more similar. 0 = identical.
π οΈ Instructions
- Install:
pip install imagehash - Copy Prompt β Run.
- Adjust threshold to catch more or fewer matches.