6.3 KiB
MCP Image Recognition Server
An MCP (Model Context Protocol) server that provides AI-powered image analysis tools for AI assistants.
Features
- describe_image: Analyze images from base64 encoded data using OpenAI's Vision API
- describe_image_from_file: Analyze images from file paths using OpenAI's Vision API
- Automatic fallback to basic metadata if OpenAI API is not configured
- Automatic Kilocode configuration on installation
- Portable and distributable via PyPI
Quick Installation (Recommended)
Install from PyPI (once published):
pip install image-recognition-mcp
The server will automatically configure itself in Kilocode during installation! 🎉
If automatic configuration doesn't work, you can manually run:
image-recognition-mcp-install
Local Development Setup
For local development or if you want to run from source:
cd /home/enne2/Sviluppo/tetris-sdl/mcp-image-server
./run.sh
The script will automatically:
- ✅ Create virtual environment if it doesn't exist
- ✅ Install dependencies if needed
- ✅ Activate the virtual environment
- ✅ Start the server
Configuration
After installation, you need to add your OpenAI API key:
-
Open Kilocode's MCP settings:
~/.config/VSCodium/User/globalStorage/kilocode.kilo-code/settings/mcp_settings.json -
Find the
image-recognitionserver entry -
Replace
"your-openai-api-key-here"with your actual OpenAI API key -
Restart Kilocode
Available Tools
1. describe_image
Analyzes an image from base64 encoded data using OpenAI's GPT-4 Vision.
Parameters:
image_data(string, required): Base64 encoded image datamime_type(string, optional): MIME type of the image (default: 'image/jpeg')
Returns: Detailed AI-generated description of the image including objects, colors, composition, and visible text
Fallback: If OpenAI API is not configured, returns basic image metadata (size, mode, format)
2. describe_image_from_file
Analyzes an image from a file path using OpenAI's GPT-4 Vision.
Parameters:
file_path(string, required): Path to the image file
Returns: Detailed AI-generated description of the image
Supported formats: JPEG, PNG, GIF, WebP (automatically detected from file extension)
Example Usage
Once configured in Kilocode with a valid OpenAI API key:
Can you analyze the image at /path/to/image.jpg?
The AI will use the describe_image_from_file tool to provide a detailed description.
Installation Methods
Method 1: PyPI (Recommended - Once Published)
pip install image-recognition-mcp
Automatically configures Kilocode! ✨
Method 2: From Source
git clone https://github.com/yourusername/image-recognition-mcp.git
cd image-recognition-mcp
pip install -e .
Method 3: Using uvx (Portable)
uvx image-recognition-mcp
No installation needed! Works like npx for Python.
Kilocode Configuration
The server automatically adds this configuration:
{
"mcpServers": {
"image-recognition": {
"command": "uvx",
"args": ["image-recognition-mcp"],
"env": {
"OPENAI_API_KEY": "your-openai-api-key-here"
}
}
}
}
Files Structure
mcp-image-server/
├── run.sh # Local startup script
├── requirements.txt # Python dependencies
├── setup.py # Package setup (with auto-config)
├── pyproject.toml # Modern Python packaging
├── README.md # This file
├── PUBLISHING.md # Publishing guide
├── LICENSE # MIT License
├── MANIFEST.in # Package manifest
├── image_server.log # Server logs
├── venv/ # Virtual environment (auto-created)
└── image_recognition_server/
├── __init__.py
├── server.py # Main server implementation
└── install.py # Auto-configuration script
Commands
After installation, these commands are available:
image-recognition-mcp- Start the MCP serverimage-recognition-mcp-install- Configure Kilocode (runs automatically on install)
Dependencies
- fastmcp: FastMCP framework for building MCP servers
- pillow: Python Imaging Library for image processing
- openai: OpenAI API client for Vision API
Logs
Server logs are written to:
/home/enne2/Sviluppo/tetris-sdl/mcp-image-server/image_server.log (local)
Or when installed via pip:
~/.local/share/image-recognition-mcp/logs/ (system-wide)
How It Works
-
With OpenAI API Key:
- Images are encoded to base64
- Sent to OpenAI's GPT-4o-mini Vision model
- Returns detailed AI-generated descriptions
-
Without OpenAI API Key:
- Falls back to basic image metadata
- Returns size, color mode, and format information
- Includes a note about configuring the API key
Troubleshooting
Server won't start
- Check that Python 3.8+ is installed:
python3 --version - Verify installation:
pip show image-recognition-mcp - Check logs for errors
Automatic configuration failed
- Run manually:
image-recognition-mcp-install - Or configure manually (see PUBLISHING.md)
No AI descriptions
- Verify your OpenAI API key is correctly set in MCP settings
- Check that the key is valid and has credits
- Review logs for API errors
- The server will show a warning on startup if no valid API key is detected
Image not found
- Ensure the file path is absolute
- Check file permissions
- Verify the file exists:
ls -la /path/to/image.jpg
Development
To modify the server:
- Clone the repository
- Install in development mode:
pip install -e . - Make changes to
image_recognition_server/server.py - Test locally:
image-recognition-mcp
Publishing
See PUBLISHING.md for instructions on publishing to PyPI.
License
MIT License - see LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Future Enhancements
- Support for batch image processing
- Image comparison tools
- Custom vision models
- Image generation capabilities
- Support for more image formats
- Caching for repeated image analyses
- Web interface for testing