Update (Dec 15, 2024): Apple has since published developer documentation titled “Making onscreen content available to Siri and Apple Intelligence”:
https://developer.apple.com/documentation/appintents/making-onscreen-content-available-to-siri-and-apple-intelligence
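
If you’re curious what that looks like in practice, here’s a minimal sketch of how I read that doc: describe the onscreen thing as an AppEntity, make it Transferable so the system can read its content, and tag the view’s NSUserActivity with the entity’s identifier. The BookEntity/ReaderView names are my own stand-ins, and the exact EntityIdentifier initializer is my best reading of the docs, not verified code.

```swift
import AppIntents
import SwiftUI

// Hypothetical example: BookEntity, BookQuery, and ReaderView are my own names.
struct BookEntity: AppEntity, Transferable {
    static let typeDisplayRepresentation: TypeDisplayRepresentation = "Book"
    static let defaultQuery = BookQuery()

    var id: String
    var title: String
    var fullText: String

    var displayRepresentation: DisplayRepresentation {
        DisplayRepresentation(title: "\(title)")
    }

    // Transferable is how Siri/Apple Intelligence actually reads the content.
    static var transferRepresentation: some TransferRepresentation {
        ProxyRepresentation(exporting: \.fullText)
    }
}

struct BookQuery: EntityQuery {
    func entities(for identifiers: [BookEntity.ID]) async throws -> [BookEntity] {
        [] // resolve identifiers against your own store
    }
}

struct ReaderView: View {
    let book: BookEntity

    var body: some View {
        ScrollView { Text(book.fullText) }
            // Tag the screen's user activity so "what's on my screen"
            // questions can resolve to this entity.
            .userActivity("com.example.reading") { activity in
                activity.title = book.title
                activity.appEntityIdentifier = EntityIdentifier(
                    for: BookEntity.self, identifier: book.id)
            }
    }
}
```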
I believe Apple is purposefully slowing their AI releases because the impacts would be too disruptive.

Apple is sitting on a Large Action Model that could “use your phone for you”. It could use apps, navigate interfaces, and take actions… possibly within the OS, without having to “open” anything.
Most people are still talking about using “AI” as writing assistants or content aggregators.
Few people are thinking about the future of apps and browsers. Agents – AIs that do things for you – are finally getting a bit of traction. This week Anthropic announced that its frontier model Claude can use a computer, and TED AI held a panel on agents in San Francisco. Occasionally, the “death of page views” shows up on the radar of publishers.
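
To make the Anthropic news concrete: “computer use” ships as an ordinary API call, and the model doesn’t click anything itself – it returns actions for your code to execute. Here’s a rough one-turn sketch against the beta as announced in October 2024 (the prompt and screen size are placeholders; field names are those of that beta):

```swift
import Foundation

// One turn of the computer-use loop: Claude returns tool_use blocks
// describing actions; your own code performs them and reports back.
let body: [String: Any] = [
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "tools": [[
        "type": "computer_20241022",      // virtual screen/mouse/keyboard tool
        "name": "computer",
        "display_width_px": 1280,
        "display_height_px": 800
    ]],
    "messages": [[
        "role": "user",
        "content": "Open the weather site and tell me tomorrow's forecast."
    ]]
]

var request = URLRequest(url: URL(string: "https://api.anthropic.com/v1/messages")!)
request.httpMethod = "POST"
request.setValue(ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"],
                 forHTTPHeaderField: "x-api-key")
request.setValue("2023-06-01", forHTTPHeaderField: "anthropic-version")
request.setValue("computer-use-2024-10-22", forHTTPHeaderField: "anthropic-beta")
request.setValue("application/json", forHTTPHeaderField: "Content-Type")
request.httpBody = try JSONSerialization.data(withJSONObject: body)

// The reply's content will include actions such as {"action": "screenshot"}
// or {"action": "left_click", "coordinate": [x, y]}; an agent loop executes
// each one and posts the result back as a tool_result message.
let (data, _) = try await URLSession.shared.data(for: request)
print(String(data: data, encoding: .utf8) ?? "")
```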
I think browsers and apps themselves are going to disappear. In April, I cautioned:
“Stop worrying about LLMs crawling the web. Start worrying about LLMs learning how to use computers and eating the entire concept of user interfaces. There will be no way to “block” AI because AI will be driving the operating systems. Entire new industries and disciplines are coming.”
Think about the iPhone. For it to be the magic it is today, it needs a lot of “parts” that add up to a sum that’s greater than any one thing. Wi-Fi connectivity. Digital rights management. Content creators. Apps. A camera. Speech-to-text. Along the way, the phone ate everything on our desk. See this photo (below)? That’s what’s going to happen to everything we’re using now.

In addition to gobbling up hardware and software, the iPhone enabled entirely new businesses to appear. Uber could not work without mobile phones, an app store… and, most importantly, the addition of GPS with the iPhone 3G. Uber is an entire business with a market cap of $121 billion, and it depends on that sum of parts.
We need to start thinking of AI as a sum rather than the parts: a language model is not a writing assistant; it’s a plain-language interface. Object tracking, segmentation, and depth estimation are not cool tricks for tracking objects; they are an interface between AI and real life.
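
That interface is already sitting in the SDKs. As a small illustration (the photo path is a placeholder), Apple’s Vision framework turns a photo into a machine-readable segmentation mask in a few lines – the same kind of output an agent would act on:

```swift
import Foundation
import Vision
import CoreVideo

// Person segmentation: real life in, a per-pixel mask out. The "cool trick"
// and the "interface between AI and real life" are the same call.
let photoURL = URL(fileURLWithPath: "photo.jpg")   // placeholder path

let request = VNGeneratePersonSegmentationRequest()
request.qualityLevel = .balanced

let handler = VNImageRequestHandler(url: photoURL, options: [:])
try handler.perform([request])

if let mask = request.results?.first?.pixelBuffer {
    // `mask` marks person vs. background, pixel by pixel – a map of the
    // physical scene that downstream models can reason over and act on.
    print("Got a segmentation mask \(CVPixelBufferGetWidth(mask))px wide")
}
```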
First, we’ll chat with the bots. Then, the bots will recognize things in images and videos. Then the bots will understand real life physical objects and context. Then they’ll be embodied and we’ll be talking to them in plain language. Our phones, a drone, a car, a robot… they will all be the same thing.
It won’t be long until you can just talk out loud and get what you want, with continued conversations everywhere you go. It’s already like this in your pocket, but you have to take the phone out.
It’s critical to look at all elements of AI as pieces that will come together to replace phones and laptops… and to be embodied in speakers, screens, robots, and cars.
A year ago, in October 2023, Apple released an open model called Ferret that could identify and ground objects in an image – essentially segmentation.

In November 2023 I wrote an article, published in January 2024, called “The AI Future: Exploring the Adjacent Possible with Emerging AI Solutions”. In it I wrote a section called The Future of Interfaces:
The Future of Interfaces
“The ‘content’ of any medium is always another medium. The content of writing is speech, just as the written word is the content of print, and print is the content of the telegraph.” – Marshall McLuhan, 1964
Each new medium both contains and can emulate the media it replaces. The internet contains and emulates film, radio, television, publishing, and retail. The content of AI will include… the Internet.
Language models communicate through conversations, and if we gather and refine information through dialog, we’re not visiting websites. If we need to see, hear, or watch something, the agent can deliver it.
Bill Gates predicted agents in 1995. In the November 2023 edition of “Gates Notes,” he reiterated: “You won’t have different apps for different tasks. You’ll simply tell your device, in everyday language, what you want to do… Agents are not only going to change how everyone interacts with computers. They’re going to upend the software industry.”
As we converse with our tools using plain, intuitive language, they blend into our lives and depart from the constructs of laptops, browsers, and phones. If you use an Amazon Echo or Apple Siri (early agents) to get what you need, you won’t need to open your laptop or pick up the phone.
In April 2024, Apple published a paper called “Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs”.

I believe Apple is sitting on a Large Action Model that could “use your phone for you” right now. It could use apps, navigate interfaces, and take actions… possibly within the OS, without having to “open” anything.
From Apple in April:
Recent advancements in multimodal large language models (MLLMs) have been noteworthy, yet, these general-domain MLLMs often fall short in their ability to comprehend and interact effectively with user interface (UI) screens. In this paper, we present Ferret-UI, a new MLLM tailored for enhanced understanding of mobile UI screens, equipped with referring, grounding, and reasoning capabilities.
https://huggingface.co/papers/2404.05719
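
Put referring, grounding, and reasoning together and the loop of a Large Action Model almost writes itself. Here’s a deliberately hypothetical sketch – every type and function name below is invented for illustration, not an Apple API:

```swift
import Foundation
import CoreGraphics

// Invented types: what a Ferret-UI-style model hands back to an agent.
struct UIElement {
    let label: String          // e.g. "Pay Now button"
    let box: CGRect            // grounded screen coordinates
}

enum AgentAction {
    case tap(CGPoint)
    case type(String)
    case done(String)
}

protocol ScreenGroundingModel {
    // Screenshot + instruction in, referred-and-grounded elements out.
    func ground(screenshot: Data, instruction: String) async throws -> [UIElement]
}

func runAgent(model: ScreenGroundingModel,
              goal: String,
              takeScreenshot: () -> Data,
              perform: (AgentAction) -> Void) async throws {
    for _ in 0..<20 {                       // cap the loop; agents must halt
        let screen = takeScreenshot()
        let elements = try await model.ground(screenshot: screen, instruction: goal)
        guard let target = elements.first else {
            perform(.done("No grounded element found for: \(goal)"))
            return
        }
        // Tap the center of the grounded element; the next screenshot shows
        // the model what changed, closing the act-observe loop.
        perform(.tap(CGPoint(x: target.box.midX, y: target.box.midY)))
    }
}
```

The point isn’t the code; it’s that every piece in that loop – screen understanding, grounding, acting – already shows up in Apple’s published research.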
The sooner we start to see the parts combining into the sum, the better we can prepare for what’s coming our way… I predict that for a lot of people it’s going to be as sudden as the scene in Braveheart.
Postscript: I share all of my posts with AI to see what it thinks (and get myself into the training data). GPT cracks me up. I gave it the URL of my article and it replied “They may take our home screens, but they’ll never take our… ecosystem!” SOLID answer!!!