free-vision

Community

Accurately analyze and describe images

Author: thanhtunguet
Version: 1.0.0
Installs: 0

System Documentation

What problem does it solve?

Interpreting images by hand is slow. This skill invokes local AI CLIs to produce a structured visual summary, an explicit object list, and a verbatim transcription of visible text, so users do not need to inspect or transcribe images manually.

Core Features & Use Cases

  • CLI-first analysis with fallback: Prefer the Gemini CLI for structured image analysis, and automatically fall back to the Qwen Code CLI if Gemini fails or returns generic output.
  • Structured output: For each image, return a one-sentence summary, an enumerated list of key objects, the visible text extracted verbatim, and notable visual details.
  • Path validation and error handling: Require an absolute path and confirm that the file exists. If a CLI reports an error or returns an empty response, ask the user for an alternative input.
  • Use Case: Convert screenshots, photos, or diagrams into concise, machine-readable descriptions and extract any visible text for downstream tasks like documentation or accessibility summaries.
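
The validation and fallback behavior described above can be sketched roughly as follows. This is a minimal illustration and not the skill's actual implementation: the command lines in DEFAULT_COMMANDS, the GENERIC_MARKERS heuristic, the prompt wording, and all function names are assumptions made for the example. Check each CLI's own help output before relying on any flag.

```python
import os
import subprocess
from typing import Callable, Optional, Sequence

# Assumption: both CLIs accept a one-shot prompt after a "-p" flag.
DEFAULT_COMMANDS = (("gemini", "-p"), ("qwen", "-p"))

# Hypothetical heuristic for spotting a "generic" reply that ignored the image.
GENERIC_MARKERS = ("i cannot see", "i can't see", "unable to view")


def validate_image_path(path: str) -> str:
    """Require an absolute path that points at an existing file."""
    if not os.path.isabs(path):
        raise ValueError(f"Path must be absolute: {path}")
    if not os.path.isfile(path):
        raise FileNotFoundError(f"No such file: {path}")
    return path


def run_cli(command: Sequence[str], prompt: str) -> Optional[str]:
    """Run one CLI and return its stdout, or None on any failure."""
    try:
        result = subprocess.run(
            [*command, prompt], capture_output=True, text=True, timeout=120
        )
    except (OSError, subprocess.TimeoutExpired):
        return None  # CLI not installed, not executable, or too slow
    return result.stdout if result.returncode == 0 else None


def is_usable(output: Optional[str]) -> bool:
    """Reject missing, empty, or generic-looking output."""
    if output is None or not output.strip():
        return False
    lowered = output.lower()
    return not any(marker in lowered for marker in GENERIC_MARKERS)


def analyze_image(
    path: str,
    commands: Sequence[Sequence[str]] = DEFAULT_COMMANDS,
    runner: Callable[[Sequence[str], str], Optional[str]] = run_cli,
) -> str:
    """Try each CLI in order and return the first usable answer."""
    validate_image_path(path)
    # Assumption: the CLI can open a file that is named in the prompt text.
    prompt = (
        f"Analyze the image at {path}. Give a one-sentence summary, list key "
        "objects, extract visible text verbatim, and note notable details."
    )
    for command in commands:
        output = runner(command, prompt)
        if is_usable(output):
            return output.strip()
    raise RuntimeError("All CLIs failed; ask the user for an alternative input.")
```

Passing the runner as a parameter keeps the fallback logic testable without either CLI installed, and the final RuntimeError marks the point where the skill would ask the user for a different image.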

Quick Start

Ask the skill to analyze the image at /absolute/path/to/image.jpg and provide a one-sentence summary, list key objects, extract any visible text verbatim, and note notable details.
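
For downstream tasks, the four requested parts can be kept in a small record. The sketch below assumes a hypothetical reply layout with the labels "Summary:", "Objects:", "Text:", and "Details:"; the skill's real output format is not documented here, so build_prompt, ImageAnalysis, and parse_reply are illustrative names only.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical section labels; the skill's real output layout may differ.
SECTIONS = ("Summary", "Objects", "Text", "Details")


def build_prompt(image_path: str) -> str:
    """Ask for the four parts named in the Quick Start, under fixed labels."""
    return (
        f"Analyze the image at {image_path}. Reply with four labeled sections: "
        "'Summary:' (one sentence), 'Objects:' (one per line), "
        "'Text:' (visible text, verbatim), 'Details:' (notable visual details)."
    )


@dataclass
class ImageAnalysis:
    summary: str = ""
    objects: List[str] = field(default_factory=list)
    text: str = ""
    details: str = ""


def parse_reply(reply: str) -> ImageAnalysis:
    """Split a labeled reply into its sections; unknown lines are ignored."""
    buckets: Dict[str, List[str]] = {name: [] for name in SECTIONS}
    current: Optional[str] = None
    for line in reply.splitlines():
        stripped = line.strip()
        header = next((n for n in SECTIONS if stripped.startswith(n + ":")), None)
        if header is not None:
            current = header
            stripped = stripped[len(header) + 1:].strip()
        if current is not None and stripped:
            buckets[current].append(stripped)
    return ImageAnalysis(
        summary=" ".join(buckets["Summary"]),
        objects=[item.lstrip("-•* ").strip() for item in buckets["Objects"]],
        text="\n".join(buckets["Text"]),
        details=" ".join(buckets["Details"]),
    )
```

One limitation of this simple parser: if the image's own text contains a line that starts with one of the labels, that line is treated as a new section.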

Dependency Matrix

Required Modules

None required

Components

Standard package

💻 Claude Code Installation

Recommended: Let Claude install it automatically. Copy the text below and paste it into Claude Code.

Please help me install this Skill:
Name: free-vision
Download link: https://github.com/thanhtunguet/agent-skills/archive/main.zip#free-vision

Please download this .zip file, extract it, and install it in the .claude/skills/ directory.
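
The extract-and-install step can also be sketched by hand. The example assumes the downloaded archive contains a folder named free-vision somewhere in its tree; that layout is not confirmed by this page, and install_skill is a hypothetical helper. Downloading the .zip is left out.

```python
import zipfile
from pathlib import Path, PurePosixPath
from typing import List


def install_skill(zip_source, skill_name: str, skills_dir: str) -> List[Path]:
    """Copy the files under `skill_name` from the archive into skills_dir/skill_name.

    Assumption: the archive holds a directory called `skill_name` at some depth.
    `zip_source` may be a file path or a binary file-like object.
    """
    target_root = Path(skills_dir) / skill_name
    written: List[Path] = []
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            parts = PurePosixPath(info.filename).parts
            if info.is_dir() or skill_name not in parts:
                continue
            relative = parts[parts.index(skill_name) + 1:]
            # Skip the folder entry itself and anything trying to escape the target.
            if not relative or ".." in relative:
                continue
            destination = target_root.joinpath(*relative)
            destination.parent.mkdir(parents=True, exist_ok=True)
            destination.write_bytes(archive.read(info))
            written.append(destination)
    if not written:
        raise FileNotFoundError(f"No '{skill_name}' folder found in the archive")
    return written
```

A call such as install_skill("main.zip", "free-vision", ".claude/skills") would then place the skill's files under .claude/skills/free-vision, provided the assumed layout holds.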