Time was, you’d have been stuffed. But today you whip out your phone, open Gemini (with your US$20 Pro subscription) and activate the new, improved Live feature. “What are they saying?” is a question you can ask Google Translate. But you want more: what are they saying to each other? Are they colluding? The rental guy points at a clause in the contract – you point your camera at it. What does it say? Is it valid? Should you call the police or try to pay them off? The nice Gemini lady calmly talks you through it – and talks to them – acting as an informed interpreter with eyes and ears by your side.
Back home, you want to increase transfer limits on your banking app. You hit Gemini Live’s Share Screen function and open the app. Gemini sits at the top of the screen as a small icon. You ask, “Where’s the menu for that?”
Emboldened, you decide it’s time to achieve your potential as a social media star. You ask Gemini what app to use to edit your Vietnam videos. You open the app and say: “I’ve no idea what I’m doing. Show me how to make myself beautiful, stitch these together and post them on Instagram.” Gemini sees everything on your screen and talks you through the process with infinite patience. Imagine asking your kids to do that. Scary face emoji!
Digital savant
In America, though not yet in Thailand, you can also share your laptop screen, bringing previously too-hard activities like coding within reach. It’s one thing to upload screenshots to your chatbot and home in on your answer, prompt by laborious prompt. It’s quite another to have a rolling, interactive conversation with a digital savant where you can pivot, backtrack and get context and detail in real time. It’s a completely different experience.
Even mundane tasks like writing a document (remember that?) are transformed. With Gemini Live open as you compose, you can ask: “Is this too verbose? How will this land with the boss? Am I exposing myself legally here?” If you’re cooking and your hands are full, you can tell it to read the next step.
This, at least, is the promise. Your correspondent tried all these tasks and managed to increase his bank limits and avoid getting banged up in Vietnam. However, Gemini Live (and its human) still stumbled over the video editing (“Choose the video you want to upload.” “I can only upload photos.” “The app supports uploading videos.” “No it doesn’t.” “Yes it does.”).
Gemini on my Mac claimed that I would be able to schedule tasks with Gemini Live, like rounding up the morning’s AI news and giving me a brief. Gemini on my phone disagreed – I would have to prompt it every time, it said.
The feature is not yet agentic, either. It would be truly game-changing to ask Gemini Live to open an app, apply effects, and post the video for you. It can’t do that – yet. But it is a big step up from previous AI chat features.
Google has improved how you can interrupt the AI and request more succinct or detailed answers. It is a much less irritating interlocutor than ChatGPT’s companion, which still lacks integrated screen-sharing features.
Gemini Live is a preview of what it will be like to have an always-on AI companion with eyes and ears to help you navigate the real and digital realms.
Joe Smith is Founder of the AI consultancy 2Sigma Consultants. He studied AI at Imperial College Business School and is researching AI’s effects on cognition at Chulalongkorn University. He is author of The Optimized Marketer, a book on how to use AI to promote your business and yourself. Contact joe@2Sigmaconsultants.com.


