Nano Banana Image Generation

Decisions

Pending: Which Nano Banana model/tier to use (free vs Pro)?
Pending: Use via Gemini API directly or through a wrapper?
Pending: What are the primary feature requests (vault illustrations, social media, personal)?

User Tasks

Summary

Integrate Google’s Nano Banana (Gemini image generation) into Opus for creating and editing images from text prompts.

Problem / Motivation

Opus currently has no image generation capability. Nano Banana (Google’s Gemini-powered image model) offers text-to-image generation and chat-based image editing, which could be used to create visuals, illustrations, and other creative content.

Proposed Solution

Integrate the Nano Banana / Gemini image generation API into Opus as a skill, a standalone script, or an MCP tool, so that images can be created from text prompts within the workflow.

Open Questions

1. Access Method

Question: How should we access Nano Banana?

Option	Description
A) Gemini API (recommended)	Direct API integration via Google’s Gemini API with Nano Banana 2 model
B) Pixlr / third-party wrapper	Use a third-party service that wraps Nano Banana
C) MCP server	Build or use an MCP server for image generation

Recommendation: Option A. The direct API gives the most control and is free-tier friendly.

Decision:

2. Primary Feature Requests

Question: What will we mainly use image generation for?

Option	Description
A) Personal creative projects	Wallpapers, social media, fun
B) Vault illustrations	Generate visuals for vault notes
C) Photo editing	Edit existing photos with AI
D) All of the above	Full integration

Decision:

Phase Overview

Phase	Description	Status
Phase 1	API setup and basic text-to-image generation	—
Phase 2	Image editing and advanced features	—
Phase 3	Opus integration (skill/MCP)	—

Phase 1: API Setup & Basic Generation —

Goal: Get Nano Banana working with a simple Python script that generates images from text prompts.

File / Feature	Details	Owner	Status
Gemini API key	Obtain and configure API access	opus	—
`src/tools/image_gen.py`	Basic text-to-image script	opus	—
Output folder	Save generated images to a designated folder	opus	—
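
A minimal sketch of what `src/tools/image_gen.py` could look like, assuming the `google-genai` Python SDK and an API key in a `GEMINI_API_KEY` environment variable. The model ID default, the `NANO_BANANA_MODEL` variable, the `generated_images` folder, and all function names are illustrative assumptions, not decisions made in this document. The identifier for the Nano Banana 2 model in particular should be confirmed against the current Gemini API documentation.

```python
import os
import re
from pathlib import Path

# Assumption: placeholder model ID, overridable via a hypothetical env variable.
MODEL_ID = os.environ.get("NANO_BANANA_MODEL", "gemini-2.5-flash-image")
# Assumption: hypothetical designated output folder.
OUTPUT_DIR = Path("generated_images")


def slugify(prompt: str, max_len: int = 40) -> str:
    """Turn a prompt into a short, filesystem-safe file name stem."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")
    return slug[:max_len].rstrip("-") or "image"


def extract_image_bytes(response) -> list:
    """Collect raw image bytes from every inline-data part of a response."""
    images = []
    for candidate in getattr(response, "candidates", None) or []:
        content = getattr(candidate, "content", None)
        for part in getattr(content, "parts", None) or []:
            inline = getattr(part, "inline_data", None)
            if inline is not None and inline.data:
                images.append(inline.data)
    return images


def save_images(images, prompt: str, out_dir=OUTPUT_DIR, ext: str = "png") -> list:
    """Write each image to out_dir and return the created paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, data in enumerate(images):
        # Assumption: images are saved as PNG regardless of the returned MIME type.
        path = out_dir / f"{slugify(prompt)}-{index}.{ext}"
        path.write_bytes(data)
        paths.append(path)
    return paths


def generate(prompt: str, client=None, out_dir=OUTPUT_DIR) -> list:
    """Generate images for a text prompt and save them to the output folder."""
    if client is None:
        # Imported lazily so the helpers above work without the SDK installed.
        from google import genai  # pip install google-genai
        client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(model=MODEL_ID, contents=[prompt])
    return save_images(extract_image_bytes(response), prompt, out_dir)
```

Passing the client in as a parameter keeps the script testable without network access; calling `generate("a watercolor fox")` with no client would use the real API.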

Phase 2: Image Editing —

Goal: Add chat-based image editing (upload + modify with natural language).

File / Feature	Details	Owner	Status
Image upload support	Accept existing images for editing	opus	—
Edit prompts	Natural language editing commands	opus	—
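
One possible shape for the editing step, sketched under the same assumptions as Phase 1 (the `google-genai` SDK, a client and model ID supplied by the caller). The function names and the extension-to-MIME mapping are hypothetical; the set of image formats the API accepts should be checked against the Gemini documentation.

```python
from pathlib import Path

# Assumption: a conservative mapping of file extensions to MIME types.
MIME_BY_EXTENSION = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
}


def load_image(path):
    """Read an image file and return (bytes, mime_type), rejecting unknown types."""
    path = Path(path)
    mime = MIME_BY_EXTENSION.get(path.suffix.lower())
    if mime is None:
        raise ValueError(f"unsupported image type: {path.suffix!r}")
    return path.read_bytes(), mime


def edit_image(path, instruction: str, client, model: str) -> list:
    """Send an existing image plus a natural-language edit instruction.

    Returns the raw bytes of every image part in the response.
    """
    # Imported lazily so load_image works without the SDK installed.
    from google.genai import types

    data, mime = load_image(path)
    image_part = types.Part.from_bytes(data=data, mime_type=mime)
    response = client.models.generate_content(
        model=model,
        contents=[image_part, instruction],
    )
    return [
        part.inline_data.data
        for part in response.candidates[0].content.parts
        if part.inline_data is not None
    ]
```

This sketch performs a single edit per call. Multi-turn, chat-style editing would need to carry the conversation history between calls, which is left open here.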

Phase 3: Opus Integration —

Goal: Make image generation accessible as a skill or MCP tool within Claude Code.

File / Feature	Details	Owner	Status
Skill or MCP server	Integrate into Opus workflow	opus	—
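
If the MCP route is chosen, the integration could be a thin wrapper around the Phase 1 script. The sketch below assumes the official MCP Python SDK (`pip install mcp`) and a `generate` callable that takes a prompt and returns saved file paths; the server name `image-gen`, the tool name, and the result shape are all hypothetical.

```python
from typing import Callable, Sequence


def generate_image_tool(prompt: str, generate: Callable[[str], Sequence[object]]) -> dict:
    """Validate the prompt, run the generator, and return a JSON-friendly result."""
    prompt = prompt.strip()
    if not prompt:
        return {"ok": False, "error": "prompt must not be empty", "files": []}
    try:
        files = [str(path) for path in generate(prompt)]
    except Exception as exc:
        # Report API or I/O failures to the caller as data instead of crashing.
        return {"ok": False, "error": str(exc), "files": []}
    return {"ok": True, "error": None, "files": files}


def build_server(generate: Callable[[str], Sequence[object]]):
    """Create an MCP server exposing a single image generation tool."""
    # Imported lazily so generate_image_tool works without the MCP SDK installed.
    from mcp.server.fastmcp import FastMCP

    server = FastMCP("image-gen")  # hypothetical server name

    @server.tool()
    def generate_image(prompt: str) -> dict:
        """Generate images from a text prompt and return the saved file paths."""
        return generate_image_tool(prompt, generate)

    return server
```

A server built this way could be started with `build_server(my_generate).run()`, where `my_generate` stands for whatever generation function Phase 1 ends up providing. Keeping the validation logic in a plain function means the same code could back a skill instead of an MCP tool.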

Test

Manual tests

Test	Expected	Actual	Last
…	…	pending	-

AI-verified tests

Scenario	Expected behavior	Verification method
…	…	…

E2E tests

Scenario	Assertion
…	…

Integration tests

Component	Coverage
…	…

Unit tests

Component	Tests	Coverage
…	…	…

History

Date	Event	Details
2026-03-04	Created	Initial feature request
