Google this week unveiled Gemini CLI, an AI terminal that expands its coding agent product line, with support for images and video as a major selling point.
Gemini CLI is generally available now. A free tier for individual users, accessible with a personal Gmail account, supports up to 1,000 Gemini 2.5 Pro model requests per day for a single associated agent. Existing Gemini Code Assist and Vertex AI users get Gemini CLI with support for multiple large language models (LLMs), multi-agent workflows, governance policies, audit logging and data residency, but they pay standard token-based pricing.
Gemini CLI is designed for lightweight developer terminals -- or command-line interfaces -- that use text-based commands. That's unlike integrated development environments (IDEs), which provide more visibility into codebases and include advanced features such as built-in linting. Tools such as package managers and cloud-native infrastructure frameworks, including Kubernetes and infrastructure as code, use CLIs.
"CLI scripting is a quick and easy method for automating simple tasks that do not need a complete application," said Torsten Volk, an analyst at Enterprise Strategy Group, now part of Omdia. "These are modular pieces that you can easily plug into a pipeline for scheduled or event-driven automation, or you can use them ad hoc to automate tasks like 'Add a new EC2 server instance to my environment' or 'Add a database' or just 'Find me all Python files I wrote last week across projects.'"
Adding AI to the CLI is essentially the same as adding it to an IDE, Volk said, but it is potentially interesting for DevOps and other operations roles that are not as well versed in everyday coding.
Agents embedded in the IDE can already interact with the terminal and run CLI commands in a variety of products, said Devin Dickerson, an analyst at Forrester Research.
"That said, the user experience can be uneven, sometimes with the agent failing to connect to the terminal or run the commands successfully," he said.
Gemini CLI could improve the user experience of developing alongside Gemini-based AI and agents in IDEs, Dickerson said.
"Even when developing alongside agents, deployment, testing, quality checks and debugging still often happen in the CLI," he said. "These types of products give more coverage to the full developer experience of writing, debugging and deploying code."
Google catches up with multimodal twist
Gemini CLI is not the first AI terminal -- products from specialized vendors such as Aider, Claude Code, OpenAI Codex CLI, Warp and Wave are already available. Cloud competitors also provide AI terminals: Microsoft offers GitHub Copilot CLI, and AWS offers Amazon Q Developer. Typically, AI terminals offer flexibility and natural language capabilities to make rigid command-line syntax more accessible. AWS offers a free tier for all its cloud services, including Q Developer, which is free with monthly usage limits for personal users and public CLI completions; Warp's free tier includes 150 AI requests per month with a choice of AI models and support for shared sessions, notebooks and workflows.
Google officials claim that Gemini CLI's multimodal support also sets it apart from the competition. During a press briefing Tuesday, they demonstrated Gemini CLI's ability to generate images and videos from text prompts.
"It can do a lot more than just code," said Taylor Mullen, senior staff software engineer at Google, during the briefing presentation. He played a recorded demo showing Gemini CLI using an included Model Context Protocol server to generate images and video via Google's Veo text-to-video and Chirp speech-to-text models.
"That's one of the biggest things that has really resonated with us [internally at Google], where people are using it for coding, content generation, slide generation, marketing material -- everything from totally non-coding to completely coding-related," Mullen said.
It's unclear how widespread demand is for such workflows in CLI tools, but there are obvious potential use cases, according to Volk.
"You could write scripts doing interesting stuff like 'Automatically identify all people included in the video files stored in a certain folder,'" he said.
Users might want to create a graph to visualize a data set or a word cloud of responses to a question, or automatically build a cover page with images based on data within a report, said Andrew Cornwall, an analyst at Forrester Research.
"However, I suspect the main thinking behind it wasn't that CLI users want multimodal AI, but rather, 'Hey, we have the capability, we might as well enable it on the command line as well,'" Cornwall said. "Users will think up uses that the developers never intended. It's also easier to have consistent documentation if everything in the IDE is available on the command line and vice versa."
Beth Pariseau, a senior news writer for Informa TechTarget, is an award-winning veteran of IT journalism covering DevOps. Have a tip? Email her or reach out @PariseauTT.