Decisions

  • Pending: Which Nano Banana model/tier to use (free vs Pro)?
  • Pending: Use via Gemini API directly or through a wrapper?
  • Pending: What are the primary feature requests (vault illustrations, social media, personal)?

User Tasks


Summary

Integrate Google’s Nano Banana (Gemini image generation) into Opus for creating and editing images from text prompts.

Problem / Motivation

Opus currently has no image generation capability. Nano Banana (Google’s Gemini-powered image model) offers text-to-image generation and chat-based image editing, which could be used to create visuals, illustrations, or other creative content.

Proposed Solution

Integrate the Nano Banana / Gemini image generation API into Opus as a skill, a standalone script, or an MCP tool, so that images can be created from text prompts within the workflow.


Open Questions

1. Access Method

Question: How should we access Nano Banana?

| Option | Description |
| A) Gemini API (recommended) | Direct API integration via Google’s Gemini API with Nano Banana 2 model |
| B) Pixlr / third-party wrapper | Use a third-party service that wraps Nano Banana |
| C) MCP server | Build or use an MCP server for image generation |

Recommendation: Option A. Direct API access gives the most control and is free-tier friendly.
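
For Option A, a direct call needs little more than an HTTP POST. The sketch below builds such a request with the Python standard library only. It assumes the public Gemini REST endpoint (`generateContent` under `v1beta`) and key-based authentication through the `x-goog-api-key` header. The function name `build_generate_request` is illustrative, and the model ID is left as a parameter because the model/tier decision is still pending.

```python
import json
import urllib.request

# Assumption: base URL of the public Gemini REST API (v1beta).
API_BASE = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(prompt: str, model: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    url = f"{API_BASE}/models/{model}:generateContent"
    # A single user turn containing one text part.
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the request easy to inspect without spending quota.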

Decision:

2. Primary Feature Requests

Question: What will we mainly use image generation for?

| Option | Description |
| A) Personal creative projects | Wallpapers, social media, fun |
| B) Vault illustrations | Generate visuals for vault notes |
| C) Photo editing | Edit existing photos with AI |
| D) All of the above | Full integration |

Decision:


Phase Overview

| Phase | Description | Status |
| Phase 1 | API setup and basic text-to-image generation | |
| Phase 2 | Image editing and advanced features | |
| Phase 3 | Opus integration (skill/MCP) | |

Phase 1: API Setup & Basic Generation

Goal: Get Nano Banana working with a simple Python script that generates images from text prompts.

| File / Feature | Details | Owner | Status |
| Gemini API key | Obtain and configure API access | opus | |
| src/tools/image_gen.py | Basic text-to-image script | opus | |
| Output folder | Save generated images to a designated folder | opus | |
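
The output-folder step could look roughly like the sketch below. It assumes the API response has already been parsed into a dict and that image bytes arrive base64-encoded in `inlineData` parts under `candidates` → `content` → `parts`, which is how the Gemini REST API is understood to return them. The helper name `save_images`, the `EXTENSIONS` map, and the file-naming scheme are hypothetical.

```python
import base64
from pathlib import Path

# Assumption: file extensions for the MIME types an image model might return.
EXTENSIONS = {"image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp"}


def save_images(response: dict, out_dir: Path, stem: str = "image") -> list[Path]:
    """Decode every inline image part in a response and write it to out_dir."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for candidate in response.get("candidates", []):
        for part in candidate.get("content", {}).get("parts", []):
            # REST responses use camelCase keys; snake_case is accepted defensively.
            inline = part.get("inlineData") or part.get("inline_data")
            if not inline:
                continue  # text parts carry no image bytes
            mime = inline.get("mimeType") or inline.get("mime_type") or "image/png"
            path = out_dir / f"{stem}_{len(saved) + 1}{EXTENSIONS.get(mime, '.bin')}"
            path.write_bytes(base64.b64decode(inline["data"]))
            saved.append(path)
    return saved
```

Files are numbered per call, so a timestamp or prompt slug passed as `stem` would keep later runs from overwriting earlier ones.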

Phase 2: Image Editing

Goal: Add chat-based image editing (upload + modify with natural language).

| File / Feature | Details | Owner | Status |
| Image upload support | Accept existing images for editing | opus | |
| Edit prompts | Natural language editing commands | opus | |
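
An edit request differs from plain generation mainly in that the image travels alongside the instruction. The sketch below builds a single-turn request body that pairs the two, assuming the Gemini REST API accepts an `inline_data` part with a `mime_type` and base64 `data`. The function name `build_edit_body` is illustrative, and true multi-turn (chat-style) editing would additionally need the conversation history, which is not shown here.

```python
import base64
import mimetypes
from pathlib import Path


def build_edit_body(image_path: Path, instruction: str) -> dict:
    """Build a generateContent JSON body pairing an edit instruction with an image."""
    mime_type, _ = mimetypes.guess_type(image_path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"not a recognised image file: {image_path}")
    encoded = base64.b64encode(image_path.read_bytes()).decode("ascii")
    return {
        "contents": [{
            "parts": [
                {"text": instruction},  # the natural-language edit command
                {"inline_data": {"mime_type": mime_type, "data": encoded}},
            ]
        }]
    }
```

The MIME type is guessed from the file extension alone, so a stricter version might inspect the file's leading bytes instead.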

Phase 3: Opus Integration

Goal: Make image generation accessible as a skill or MCP tool within Claude Code.

| File / Feature | Details | Owner | Status |
| Skill or MCP server | Integrate into Opus workflow | opus | |
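
Whichever integration route is chosen, a small command-line entry point gives both a skill and an MCP tool something stable to call. The sketch below shows one possible argument surface. The program name, the flag names (`--input`, `--out-dir`), and the default folder are all hypothetical, and the sketch only parses arguments without calling any API.

```python
import argparse
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the command line for a hypothetical image_gen entry point."""
    parser = argparse.ArgumentParser(
        prog="image_gen",
        description="Generate or edit an image from a text prompt.",
    )
    parser.add_argument("prompt", help="text prompt or edit instruction")
    parser.add_argument("--input", type=Path, default=None,
                        help="existing image to edit (omit to generate a new one)")
    parser.add_argument("--out-dir", type=Path, default=Path("generated"),
                        help="folder where images are saved")
    return parser.parse_args(argv)


def mode_for(args: argparse.Namespace) -> str:
    """Return 'edit' when an input image was supplied, otherwise 'generate'."""
    return "edit" if args.input is not None else "generate"
```

For example, `parse_args(["make it night", "--input", "photo.png"])` selects edit mode, while a bare prompt selects generation.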

Test

Manual tests

| Test | Expected | Actual | Last |
| | | pending | - |

AI-verified tests

| Scenario | Expected behavior | Verification method |

E2E tests

| Scenario | Assertion |

Integration tests

| Component | Coverage |

Unit tests

| Component | Tests | Coverage |

History

| Date | Event | Details |
| 2026-03-04 | Created | Initial feature request |

References