free-vision
Community
Accurately analyze and describe images
Author: thanhtunguet
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
Reduce the friction of interpreting images by invoking local AI CLIs to produce structured visual summaries, explicit object lists, and verbatim visible text so users do not need to manually inspect or transcribe images.
Core Features & Use Cases
- CLI-first analysis with fallback: Prefer Gemini for structured image analysis and automatically fall back to Qwen Code CLI if Gemini fails or returns generic output.
- Structured output: Return a one-sentence summary, enumerated key objects, verbatim visible text extraction, and notable visual details for each image.
- Path validation and error handling: Require absolute paths, confirm file existence, and handle CLI errors or empty responses to prompt for alternative inputs.
- Use Case: Convert screenshots, photos, or diagrams into concise, machine-readable descriptions and extract any visible text for downstream tasks like documentation or accessibility summaries.
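The path validation and fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function names (`validate_image_path`, `make_cli_analyzer`, `analyze_with_fallback`), the `is_generic` check, and the way a CLI is invoked are all assumptions made for the example. The exact command line for each CLI is deliberately left to the caller.

```python
import os
import subprocess
from typing import Callable, Sequence

# An analyzer takes an image path and returns the raw text description.
Analyzer = Callable[[str], str]


def validate_image_path(path: str) -> str:
    """Return the path unchanged if it is absolute and points at an existing file."""
    if not os.path.isabs(path):
        raise ValueError(f"expected an absolute path, got: {path!r}")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"no such file: {path!r}")
    return path


def make_cli_analyzer(argv_prefix: Sequence[str], prompt: str) -> Analyzer:
    """Wrap a CLI as an analyzer.

    argv_prefix is whatever invokes the CLI in non-interactive mode on your
    machine (hypothetical example: ["gemini", "-p"]); check each tool's own
    help output rather than relying on this sketch.
    """
    def analyze(path: str) -> str:
        result = subprocess.run(
            [*argv_prefix, f"{prompt}\nImage: {path}"],
            capture_output=True,
            text=True,
            timeout=120,
            check=True,  # a non-zero exit status raises CalledProcessError
        )
        return result.stdout

    return analyze


def analyze_with_fallback(
    path: str,
    analyzers: Sequence[Analyzer],
    is_generic: Callable[[str], bool] = lambda text: False,
) -> str:
    """Try each analyzer in order; return the first non-empty, non-generic output."""
    validate_image_path(path)
    failures = []
    for index, analyze in enumerate(analyzers):
        try:
            output = analyze(path).strip()
        except (OSError, subprocess.SubprocessError) as exc:
            failures.append(f"analyzer {index}: {exc}")
            continue
        if output and not is_generic(output):
            return output
        failures.append(f"analyzer {index}: empty or generic output")
    raise RuntimeError("all analyzers failed: " + "; ".join(failures))
```

In this sketch the preferred CLI would be passed first and the fallback second. A `RuntimeError` signals that the caller should ask the user for an alternative input.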
Quick Start
Ask the skill to analyze the image at /absolute/path/to/image.jpg and provide a one-sentence summary, list key objects, extract any visible text verbatim, and note notable details.
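For scripted use, the same request can be assembled programmatically. The helper below is a hypothetical illustration (the name `build_analysis_prompt` and the numbered layout are assumptions, not part of the skill); it simply restates the four items from the Quick Start request.

```python
def build_analysis_prompt(image_path: str) -> str:
    """Build a request asking for the four parts of the structured analysis."""
    requests = [
        "a one-sentence summary",
        "a list of key objects",
        "any visible text, verbatim",
        "notable visual details",
    ]
    numbered = "\n".join(
        f"{number}. {item}" for number, item in enumerate(requests, start=1)
    )
    return f"Analyze the image at {image_path} and provide:\n{numbered}"


if __name__ == "__main__":
    print(build_analysis_prompt("/absolute/path/to/image.jpg"))
```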
Dependency Matrix
Required Modules
None required
Components
Standard package
💻 Claude Code Installation
Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.
Please help me install this Skill:
Name: free-vision
Download link: https://github.com/thanhtunguet/agent-skills/archive/main.zip#free-vision
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.