Been using the Dia browser for a couple of days now and realizing it’s become more of a hassle to navigate to ChatGPT or Perplexity. The deep integration with an LLM changes the experience of using a browser and navigating the internet. The browser wars are about to begin. https://x.com/alecdewitz/status/1935420754226790842

In the works already. Team moving at a pace that’s fast even for Perplexity standards. https://x.com/AravSrinivas/status/1945537471540072888

Perplexity is now the #1 overall app on App Store in India, ahead of ChatGPT. https://x.com/AravSrinivas/status/1945960772091433081

Copilot on Windows: Vision Desktop Share begins rolling out to Windows Insiders | Windows Insider Blog https://blogs.windows.com/windows-insider/2025/07/15/copilot-on-windows-vision-desktop-share-begins-rolling-out-to-windows-insiders/

Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence. We’re building multimodal AI that works with how you naturally interact with the world – through conversation, through sight, through the messy way we collaborate. We’re https://x.com/miramurati/status/1945166365834535247

Thinking Machines Lab Raises a Record $2 Billion, Announces Cofounders | WIRED https://www.wired.com/story/thinking-machines-lab-mira-murati-funding/

ChatGPT may soon edit Excel and PowerPoint files natively, challenging Microsoft Office: Report | Mint https://www.livemint.com/technology/tech-news/chatgpt-may-soon-edit-excel-and-powerpoint-files-natively-challenging-microsoft-office-report-11752665586822.html

A new agentic browser just shipped from Perplexity and it’s pretty wild. Watch this video of @PerplexityComet taking over my LinkedIn tab and taking actions on my part. Interesting UX where the tab glows blue as it’s taking actions. I like the integration of agentic actions https://x.com/ryancarson/status/1942962447369036201

AI-powered browsers like Perplexity’s Comet promise to do your web surfing for you. But do they really save time, or just add more noise? 🌐 https://x.com/fdaudens/status/1945121374063698080

Ask Comet to book a meeting or send an email. Comet transforms entire sessions into single, seamless interactions. https://x.com/PerplexityComet/status/1943026179960873207

asked @PerplexityComet to load up our brand colors in @MeetGamma then shifted my focus to building the actual content of the deck https://x.com/jennysvng/status/1943074383091671529

Been using @PerplexityComet, and there are soo many new use cases for it, but this has got to be one of my favs: I received a verification link sent to my Gmail, and I asked Comet Assistant to click it and verify me on my behalf. And it did it! Simple yet useful ^_^ https://x.com/_Matskuu/status/1942977239974400170

BREAKING 🚨: Comet Browser can now control an open web page from a sidecar! Now it can simply take it over and click around. Making Comet to publish a blog post for me 👀 https://x.com/testingcatalog/status/1928546603448562087

Browse at the speed of thought. https://x.com/PerplexityComet/status/1942968195419361290

Comet browser applying for a job for me 👀 Soon, you will be able to execute such things on a schedule. https://x.com/testingcatalog/status/1926043202684854674

Comet has become a natural extension of all my workflows, ideas, and content since I started using it. I can easily recall any saved information and connect to all of my personal knowledge management tools. Effortless networked intelligence. Proud of this team! https://x.com/camerontstow/status/1943047355944833153

Comet… is nuts. I asked it to go find the subreddits that people would ask cooking questions on. Then, find common questions and come up with ad angles for those questions for Hexclad. For kicks, I asked it to make a static ad for me with my fav angle Results. Are. Insane. https://x.com/NathanSnell/status/1943095214932943291

cool query on my comet browser for handling my X addiction. https://x.com/AravSrinivas/status/1912592179291385896

First test of Perplexity’s new agentic browser, Comet 👇 Comet authenticates into your accounts (e.g. email, calendar) to take actions on your behalf. It pulled a list of all my email newsletters, and unsubscribed from the specific ones I asked it to 🤯 https://x.com/omooretweets/status/1943078090718220653

Hooolllyyy crap. Perplexity’s comet browser is insane. Operator was a total dud. Manus is better but meh. Videos coming. I asked it to duplicate a meta campaign for me. No problem. All automated. Anyone want me to try anything specific? https://x.com/NathanSnell/status/1943062637656338805

How to watch YouTube on Comet https://x.com/AravSrinivas/status/1946240617031606672

I feel like I’m living in the future right now. Been using the new browser called Comet from @perplexity_ai (thanks @AravSrinivas for getting me access!) Like millions of others, I spend hours and hours a day in a browser. Specifically, Chrome. And, Chrome hasn’t https://x.com/dharmesh/status/1943084541733933189

Let Comet handle the customer support reps for you. Customer support is already a lot of AI anyway. So let your AI talk to the other AIs while you watch YouTube or do some work :-) https://x.com/AravSrinivas/status/1944778316323717437

Memory is magic when it works. Comet is “memory-native” – the closest approximation of truly understanding the user there is. https://x.com/AravSrinivas/status/1944078543324844077

Perplexity Comet https://comet.perplexity.ai/

Perplexity Comet vs ChatGPT Agent https://x.com/AravSrinivas/status/1946076236683624616

PERPLEXITY COMET WORKS ON DUNE FOR CONTENT IDEATION!!!! SO COOL! https://x.com/0xDataWolf/status/1943265415322595630

Perplexity is testing new feature with Comet browser which will be able to just go out there and do things for you via prompts. Exciting times ahead https://x.com/AIProductPM/status/1940108252559081764

Prime Day Shopping with Comet. User saves $280 in less than 5 minutes by asking Comet to compare prices. https://x.com/AravSrinivas/status/1944183680915714548

RT @itsPaulAi: Perplexity Comet can automate any task in your browser This is the first time you REALLY have an AI agent working autonomou… https://x.com/denisyarats/status/1945321982725382170

RT @PerplexityComet: Clean up your inbox. Ask Comet to unsubscribe you from spam and unwanted emails. https://x.com/AravSrinivas/status/1945232153609978273

RT @rowancheung: Perplexity Comet is not like other agents I’ve been testing it all week, and it’s starting to actually *stick* Having in… https://x.com/AravSrinivas/status/1945620938068037633

The Cursor for Web Browsing, is here. And it’s better than Comet at turning your open tabs and bookmarks into a codebase. Here is a full breakdown of how i’m using @diabrowser Exploring the Future of Browsing with DIA Browser: Essential Features for Content Creators & https://x.com/rileybrown_ai/status/1943041778304847889

The most interesting thing about Perplexity Comet is that it can actually do things in Cal / Gmail Ex. I asked it to reschedule a 1:1 – it moved the invite and sent an email Neither Google nor OpenAI have done this in their agents…maybe for safety reasons, but it’s limiting 🤔 https://x.com/omooretweets/status/1943116119243416009

The TAM for Comet is bigger than Perplexity because it appeals to people who don’t even want AI. Just the best core browser in the market at the end of the day. https://x.com/AravSrinivas/status/1946035102150238475

USE CASE 2: Cross-tab product comparison If you’re looking for a new product or looking for flights, Comet can compare tabs in real time It’s surprisingly fast and analyzes the reviews of the tabs too https://x.com/rowancheung/status/1945524017915674879

USE CASE 3: Summarize any YT video with a click You can summarize + chat with any long YT video and get key moments This is also possible in Gemini, but having it in the browser means you can watch the video AND chat/learn with Comet in the side tab at the same time https://x.com/rowancheung/status/1945524019681480992

Vibe coding with @PerplexityComet – asked the browser agent to build me a simple (locally run) yt-dlp wrapper. It navigated to github,created the repo, wrote/committed/pushed the code. You can even make changes to your code from the sidecar, feels like an AI IDE lmao 😂 https://x.com/killuaz0ldyck07/status/1942976067075281248

When you’re on Comet, you’re operating at an abstraction above which AI to use and how to pull in relevant context. Agents are powerful and operate like a human would to complete the task. You go from chat turns to end-to-end workflows. https://x.com/AravSrinivas/status/1944024356138758367

New AI features in Google Search: Call a business or do research https://blog.google/products/search/deep-search-business-calling-google-search/

We’re bringing Gemini 2.5 Pro to AI Mode: giving you access to our most intelligent AI model, right in @Google Search. With its advanced reasoning capabilities, watch how it can tackle incredibly difficult math problems, with links to learn more ↓ https://x.com/GoogleDeepMind/status/1945515683451736246

🎥 Want the text from any YouTube video? Now you can — no plugins, no installs. Just drop the link, and our YouTube MCP turns it into text instantly. Try it now with this Agent: https://x.com/OmniMCP/status/1942855673324397021

It looks like scale + tool use + multimodal remains the chosen path forward. https://x.com/emollick/status/1943169759312322604

GPT Agent Now rolled out to 100% of Pro users. Due to higher than expected demand, Plus and Team users will begin getting access Monday. https://x.com/OpenAI/status/1946024465214935279

SpatialTrackerV2: unified, end-to-end 3D point tracking model which simultaneously estimates Camera Motion, Consistent Geometry and Pixel-wise 3D Trajectories. https://x.com/bilawalsidhu/status/1945154158782505345

DAViD: Data-efficient and Accurate Vision Models from Synthetic Data https://microsoft.github.io/DAViD/

Mistral AI: Introducing the world’s best (and open) speech recognition models! https://t.co/tUnPcdCrbZ https://x.com/MistralAI/status/1945130173751288311

Le Chat dives deep. | Mistral AI https://mistral.ai/news/le-chat-dives-deep

RT @MistralAI: Introducing the world’s best (and open) speech recognition models! https://x.com/ClementDelangue/status/1945233605745135754

RT @MistralAI: Try Voxtral with an API call: https://x.com/ClementDelangue/status/1945233623164006523

the best part about the mistral release is that the models don’t lose as much on text – this has been the biggest pain point for audioLMs for a long while https://x.com/reach_vb/status/1945140430288417007

Voxtral | Mistral AI https://mistral.ai/news/voxtral

SoulDance https://xjli360.github.io/SoulDance/

StreamME Project Page https://songluchuan.github.io/StreamME/

SceneScript treats 3D reconstruction as a language problem rather than a geometry one. The model watches a video of a room and just learns to write a script for it. It autoregressively spits out text commands like make_wall(…) or make_bbox(…) that define the scene. https://x.com/bilawalsidhu/status/1944760878831923522

Fine-tune Gemma3n on videos with audios inside with Colab A100 🔥 Just dropped the notebook where you can learn how to fine-tune Gemma3n on images+audio+text at the same time! https://x.com/mervenoyann/status/1945481841298813403

RT @reach_vb: Lets GOOO! @NVIDIAAIDev just dropped Canary Qwen 2.5 – SoTA on Open ASR Leaderboard, CC-BY licensed 🔥 > Works in both ASR an… https://x.com/reach_vb/status/1946087224346313175

[2507.06261] Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities https://arxiv.org/abs/2507.06261

Building a Multi-Agent Deep Researcher with Gemini 2.5 Pro 🧑‍🔬📑 We’re excited to collaborate with @_philschmid and the @googleaidevs team on a brand-new tutorial 🧑‍🏫 : Build a multi-agent system with a researcher, writer, review agent that can search the web, record input, https://x.com/jerryjliu0/status/1944882346731430127

Google engineers shifted to a sparse mixture‑of‑experts transformer that picks only the needed mini‑networks per token, so compute stays low while total capacity rises. --- Paper – arxiv.org/abs/2507.06261 Paper “Gemini 2.5: Pushing the Frontier with Advanced https://x.com/rohanpaul_ai/status/1944022179869241354
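
For readers who have not seen sparse mixture-of-experts routing before, here is a minimal sketch of the general idea described in the post above: a router scores every expert for a token, only the top few experts run, and their outputs are blended. This is a generic top-k illustration, not Gemini's actual implementation. The function names, the toy experts, and the choice of `top_k=2` are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Turn raw router scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_scores, top_k=2):
    """Return [(expert_index, weight)] for the top_k highest-scoring experts.

    Weights are renormalized so they sum to 1 over the chosen experts.
    """
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


def moe_layer(token, experts, router_scores, top_k=2):
    """Blend the outputs of the selected experts; unselected experts never run."""
    return sum(w * experts[i](token) for i, w in route_token(router_scores, top_k))


# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, scale=s: scale * x for s in (1.0, 2.0, 3.0, 4.0)]
scores = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token

routing = route_token(scores, top_k=2)
output = moe_layer(1.0, experts, scores, top_k=2)
```

With four experts and `top_k=2`, only half of the expert functions are called for this token, which is the sense in which compute stays low while total capacity grows with the number of experts.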

Google’s Gemini 2.5 paper has 3295 authors https://x.com/hardmaru/status/1944385851435205035

New Guide! Learn how to build a multi-agent “Deep Research” system with Gemini 2.5 and @llama_index. It dynamically searches the web, takes notes, and writes a comprehensive research report with a feedback loop 🚀 🔍 Search the web with google 📝 Take notes with a dedicated https://x.com/_philschmid/status/1944835088039977124

Today we are rolling out our first Gemini Embedding model, which ranks #1 on the MTEB leaderboard, as a generally available stable model. It is priced at $0.15 per million tokens and ready for at scale production use! https://x.com/OfficialLoganK/status/1944806630979461445
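
To put the quoted price in perspective, here is a small back-of-the-envelope sketch. The $0.15 per million tokens figure comes from the announcement above. The corpus size and average document length are made-up numbers for illustration, and the helper name is hypothetical.

```python
PRICE_PER_MILLION_TOKENS_USD = 0.15  # figure quoted in the announcement above


def embedding_cost_usd(total_tokens):
    """Estimated cost in USD of embedding total_tokens input tokens."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS_USD


# Assumed workload: 10,000 documents averaging 500 tokens each.
documents = 10_000
average_tokens_per_document = 500
cost = embedding_cost_usd(documents * average_tokens_per_document)
print(f"Estimated cost: ${cost:.2f}")
```

Under those assumptions the workload is 5 million tokens, so the estimate comes to about $0.75.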

Tried out Google’s ADK(agent development kit) and legit, the inbuilt UI with Gemini Free API is wild 🤯 So easy to use and looks sick! #ADK #GoogleAI #GeminiAPI https://x.com/027_Priyanshu/status/1934106038632153243

What if you had a smart personal assistant living in your watch that could share info and manage tasks for you when your hands are full? 🧠 You’re about to find out. Meet Gemini, rolling out now on Wear OS 4+ watches: https://x.com/WearOSbyGoogle/status/1942961942693359894

Build your first AI agent + MCP Server in Python. Here is everything you need to build your first AI agent in less than 20 minutes. About the code you’ll see here: 1. I used Google ADK with Gemini Flash to power the agent 2. The agent connects to an MCP server 3. It also https://x.com/svpino/status/1929881755915366772

Gemini CLI can automate your computer using MCP 🔥 Add Windows MCP (or macOS MCP) to Gemini CLI and you can tell it what to do autonomously. Gemini then takes control of your entire system to achieve the goal you’ve set. Links below https://x.com/itsPaulAi/status/1940903613888696776

Someone vibe coded an AI Agent that can use your phone on its own. He outlines using this as a ChatGPT-like interface, except things actually get done automatically. The person built this using Google ADK and the Gemini API 💀 Credits: Tyrange-D via r/singularity https://x.com/DigestibleAICo/status/1924218874678960504

RT @OfficialLoganK: Today we are rolling out our first Gemini Embedding model, which ranks #1 on the MTEB leaderboard, as a generally avail… https://x.com/demishassabis/status/1944870402251219338

Gemini-CLI is bad compared to Claude code in very fixable ways. codex-cli is bad in odd ways. Feels unfriendly, unlike the GUI version of Codex and unusual for product-strong OpenAI https://x.com/kylebrussell/status/1945242558487044118

18/ How are you using comet? Any use cases that I missed? Follow @AtomSilverman and @AgentOpsAI for everything AI agent-related Have you tried the @AgentOpsAI MCP server? Link in bio. Last week’s thread: https://x.com/AtomSilverman/status/1944456541169762363

a fresh batch of comet invites just went out https://x.com/AravSrinivas/status/1945669970618421699

Looks like Grok 4 is 10^27 FLOPs given their graphs? HLE score is 26% without tools, Gemini 2.5 is 21.6% without tools. Curious what the tool piece is. https://x.com/emollick/status/1943162710725657055

LLMs for IMO 2025: gemini-2.5-pro (31.55%), o3 high (16.67%), Grok 4 (11.90%). https://x.com/denny_zhou/status/1945887753864114438

Gemini generates the best prompts for Veo 3. Full code below. ```python import time from google import genai from google.genai import types client = genai.Client() operation = client.models.generate_videos( model="veo-3.0-generate-preview", prompt="""{ "character_name": https://x.com/_philschmid/status/1945898590821584989

Generate videos with Veo 3  |  Gemini API  |  Google AI for Developers https://ai.google.dev/gemini-api/docs/video

Start building with Veo 3: our state-of-the-art video generation model now available in paid public preview via the Gemini API and @Google AI Studio. 🎨 Here’s how to try it → https://x.com/GoogleDeepMind/status/1945886603328778556

RT @GeminiApp: A new Gemini feature just dropped and everything is alive?! Now you can turn photos into videos with sound in Gemini. https://x.com/demishassabis/status/1944939563170062804

merve: GLM-4.1V-9B-Thinking is the BEST thinking vision LM out there 😍 it’s now served in @huggingface Inference Providers through @novita_labs 🤝 https://t.co/XUwXBLmoz7 https://x.com/mervenoyann/status/1945432520339734647

Today in the journal Science: BioEmu from Microsoft Research AI for Science. This generative deep learning method emulates protein equilibrium ensembles – key for understanding protein function at scale. https://x.com/MSFTResearch/status/1943373860012744737

Big vision language models waste compute and still miss rare corner cases while driving. MoSE fixes both by turning on only the skills needed for perception, prediction, then planning. It builds a routing table that tags every training example with human style skills like https://x.com/rohanpaul_ai/status/1944010854737080566

all modality RAG 🔥 ColQwen-Omni is a new multimodal retrieval model that can retrieve anything (videos, audios, documents and more!) use with transformers 🤗 here’s a smol demo on video retrieval ↙️ https://x.com/mervenoyann/status/1945862039022207386

RT @ManuelFaysse: Introducing ColQwen-Omni, a 3B omnimodal retriever that extends the ColPali concept of multimodal retrieval with late int… https://x.com/andersonbcdefg/status/1945855681976021268
