Been using the Dia browser for a couple of days now and realizing it’s become more of a hassle to navigate to ChatGPT or Perplexity. The deep integration with an LLM changes the experience of using a browser and navigating the internet. The browser wars are about to begin. https://x.com/alecdewitz/status/1935420754226790842
In the works already. Team moving at a pace that’s fast even for Perplexity standards. https://x.com/AravSrinivas/status/1945537471540072888
Perplexity is now the #1 overall app on App Store in India, ahead of ChatGPT. https://x.com/AravSrinivas/status/1945960772091433081
Copilot on Windows: Vision Desktop Share begins rolling out to Windows Insiders | Windows Insider Blog https://blogs.windows.com/windows-insider/2025/07/15/copilot-on-windows-vision-desktop-share-begins-rolling-out-to-windows-insiders/
Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence. We’re building multimodal AI that works with how you naturally interact with the world – through conversation, through sight, through the messy way we collaborate. We’re https://x.com/miramurati/status/1945166365834535247
Thinking Machines Lab Raises a Record $2 Billion, Announces Cofounders | WIRED https://www.wired.com/story/thinking-machines-lab-mira-murati-funding/
ChatGPT may soon edit Excel and PowerPoint files natively, challenging Microsoft Office: Report | Mint https://www.livemint.com/technology/tech-news/chatgpt-may-soon-edit-excel-and-powerpoint-files-natively-challenging-microsoft-office-report-11752665586822.html
A new agentic browser just shipped from Perplexity and it’s pretty wild. Watch this video of @PerplexityComet taking over my LinkedIn tab and taking actions on my part. Interesting UX where the tab glows blue as it’s taking actions. I like the integration of agentic actions https://x.com/ryancarson/status/1942962447369036201
AI-powered browsers like Perplexity’s Comet promise to do your web surfing for you. But do they really save time, or just add more noise? 🌐 https://x.com/fdaudens/status/1945121374063698080
Ask Comet to book a meeting or send an email. Comet transforms entire sessions into single, seamless interactions. https://x.com/PerplexityComet/status/1943026179960873207
asked @PerplexityComet to load up our brand colors in @MeetGamma then shifted my focus to building the actual content of the deck https://x.com/jennysvng/status/1943074383091671529
Been using @PerplexityComet, and there are soo many new use cases for it, but this has got to be one of my favs: I received a verification link sent to my Gmail, and I asked Comet Assistant to click it and verify me on my behalf. And it did it! Simple yet useful ^_^ https://x.com/_Matskuu/status/1942977239974400170
BREAKING 🚨: Comet Browser can now control an open web page from a sidecar! Now it can simply take it over and click around. Making Comet to publish a blog post for me 👀 https://x.com/testingcatalog/status/1928546603448562087
Browse at the speed of thought. https://x.com/PerplexityComet/status/1942968195419361290
Comet browser applying for a job for me 👀 Soon, you will be able to execute such things on a schedule. https://x.com/testingcatalog/status/1926043202684854674
Comet has become a natural extension of all my workflows, ideas, and content since I started using it. I can easily recall any saved information and connect to all of my personal knowledge management tools. Effortless networked intelligence. Proud of this team! https://x.com/camerontstow/status/1943047355944833153
Comet… is nuts. I asked it to go find the subreddits that people would ask cooking questions on. Then, find common questions and come up with ad angles for those questions for Hexclad. For kicks, I asked it to make a static ad for me with my fav angle Results. Are. Insane. https://x.com/NathanSnell/status/1943095214932943291
cool query on my comet browser for handling my X addiction. https://x.com/AravSrinivas/status/1912592179291385896
First test of Perplexity’s new agentic browser, Comet 👇 Comet authenticates into your accounts (e.g. email, calendar) to take actions on your behalf. It pulled a list of all my email newsletters, and unsubscribed from the specific ones I asked it to 🤯 https://x.com/omooretweets/status/1943078090718220653
Hooolllyyy crap. Perplexity’s comet browser is insane. Operator was a total dud. Manus is better but meh. Videos coming. I asked it to duplicate a meta campaign for me. No problem. All automated. Anyone want me to try anything specific? https://x.com/NathanSnell/status/1943062637656338805
How to watch YouTube on Comet https://x.com/AravSrinivas/status/1946240617031606672
I feel like I’m living in the future right now. Been using the new browser called Comet from @perplexity_ai (thanks @AravSrinivas for getting me access!) Like millions of others, I spend hours and hours a day in a browser. Specifically, Chrome. And, Chrome hasn’t https://x.com/dharmesh/status/1943084541733933189
Let Comet handle the customer support reps for you. Customer support is already a lot of AI anyway. So let your AI talk to the other AIs while you watch YouTube or do some work :-) https://x.com/AravSrinivas/status/1944778316323717437
Memory is magic when it works. Comet is “memory-native” – the closest approximation of truly understanding the user there is. https://x.com/AravSrinivas/status/1944078543324844077
Perplexity Comet https://comet.perplexity.ai/
Perplexity Comet vs ChatGPT Agent https://x.com/AravSrinivas/status/1946076236683624616
PERPLEXITY COMET WORKS ON DUNE FOR CONTENT IDEATION!!!! SO COOL! https://x.com/0xDataWolf/status/1943265415322595630
Perplexity is testing new feature with Comet browser which will be able to just go out there and do things for you via prompts. Exciting times ahead https://x.com/AIProductPM/status/1940108252559081764
Prime Day Shopping with Comet. User saves $280 in less than 5 minutes by asking Comet to compare prices. https://x.com/AravSrinivas/status/1944183680915714548
RT @itsPaulAi: Perplexity Comet can automate any task in your browser This is the first time you REALLY have an AI agent working autonomou… https://x.com/denisyarats/status/1945321982725382170
RT @PerplexityComet: Clean up your inbox. Ask Comet to unsubscribe you from spam and unwanted emails. https://x.com/AravSrinivas/status/1945232153609978273
RT @rowancheung: Perplexity Comet is not like other agents I’ve been testing it all week, and it’s starting to actually *stick* Having in… https://x.com/AravSrinivas/status/1945620938068037633
The Cursor for Web Browsing, is here. And it’s better than Comet at turning your open tabs and bookmarks into a codebase. Here is a full breakdown of how i’m using @diabrowser Exploring the Future of Browsing with DIA Browser: Essential Features for Content Creators & https://x.com/rileybrown_ai/status/1943041778304847889
The most interesting thing about Perplexity Comet is that it can actually do things in Cal / Gmail Ex. I asked it to reschedule a 1:1 – it moved the invite and sent an email Neither Google nor OpenAI have done this in their agents…maybe for safety reasons, but it’s limiting 🤔 https://x.com/omooretweets/status/1943116119243416009
The TAM for Comet is bigger than Perplexity because it appeals to people who don’t even want AI. Just the best core browser in the market at the end of the day. https://x.com/AravSrinivas/status/1946035102150238475
USE CASE 2: Cross-tab product comparison If you’re looking for a new product or looking for flights, Comet can compare tabs in real time It’s surprisingly fast and analyzes the reviews of the tabs too https://x.com/rowancheung/status/1945524017915674879
USE CASE 3: Summarize any YT video with a click You can summarize + chat with any long YT video and get key moments This is also possible in Gemini, but having it in the browser means you can watch the video AND chat/learn with Comet in the side tab at the same time https://x.com/rowancheung/status/1945524019681480992
Vibe coding with @PerplexityComet – asked the browser agent to build me a simple (locally run) yt-dlp wrapper. It navigated to github,created the repo, wrote/committed/pushed the code. You can even make changes to your code from the sidecar, feels like an AI IDE lmao 😂 https://x.com/killuaz0ldyck07/status/1942976067075281248
When you’re on Comet, you’re operating at an abstraction above which AI to use and how to pull in relevant context. Agents are powerful and operate like a human would to complete the task. You go from chat turns to end-to-end workflows. https://x.com/AravSrinivas/status/1944024356138758367
New AI features in Google Search: Call a business or do research https://blog.google/products/search/deep-search-business-calling-google-search/
We’re bringing Gemini 2.5 Pro to AI Mode: giving you access to our most intelligent AI model, right in @Google Search. With its advanced reasoning capabilities, watch how it can tackle incredibly difficult math problems, with links to learn more ↓ https://x.com/GoogleDeepMind/status/1945515683451736246
🎥 Want the text from any YouTube video? Now you can — no plugins, no installs. Just drop the link, and our YouTube MCP turns it into text instantly. Try it now with this Agent: https://x.com/OmniMCP/status/1942855673324397021
It looks like scale + tool use + multimodal remains the chosen path forward. https://x.com/emollick/status/1943169759312322604
GPT Agent Now rolled out to 100% of Pro users. Due to higher than expected demand, Plus and Team users will begin getting access Monday. https://x.com/OpenAI/status/1946024465214935279
SpatialTrackerV2: unified, end-to-end 3D point tracking model which simultaneously estimates Camera Motion, Consistent Geometry and Pixel-wise 3D Trajectories. https://x.com/bilawalsidhu/status/1945154158782505345
DAViD: Data-efficient and Accurate Vision Models from Synthetic Data https://microsoft.github.io/DAViD/
Mistral AI on X: “Introducing the world’s best (and open) speech recognition models! https://t.co/tUnPcdCrbZ” https://x.com/MistralAI/status/1945130173751288311
Le Chat dives deep. | Mistral AI https://mistral.ai/news/le-chat-dives-deep
RT @MistralAI: Introducing the world’s best (and open) speech recognition models! https://x.com/ClementDelangue/status/1945233605745135754
RT @MistralAI: Try Voxtral with an API call: https://x.com/ClementDelangue/status/1945233623164006523
the best part about the mistral release is that the models don’t lose as much on text – this has been the biggest pain point for audioLMs for a long while https://x.com/reach_vb/status/1945140430288417007
Voxtral | Mistral AI https://mistral.ai/news/voxtral
SoulDance https://xjli360.github.io/SoulDance/
StreamME Project Page https://songluchuan.github.io/StreamME/
SceneScript treats 3D reconstruction as a language problem rather than a geometry one. The model watches a video of a room and just learns to write a script for it. It autoregressively spits out text commands like make_wall(…) or make_bbox(…) that define the scene. https://x.com/bilawalsidhu/status/1944760878831923522
Fine-tune Gemma3n on videos with audios inside with Colab A100 🔥 Just dropped the notebook where you can learn how to fine-tune Gemma3n on images+audio+text at the same time! https://x.com/mervenoyann/status/1945481841298813403
RT @reach_vb: Lets GOOO! @NVIDIAAIDev just dropped Canary Qwen 2.5 – SoTA on Open ASR Leaderboard, CC-BY licensed 🔥 > Works in both ASR an… https://x.com/reach_vb/status/1946087224346313175
[2507.06261] Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities https://arxiv.org/abs/2507.06261
Building a Multi-Agent Deep Researcher with Gemini 2.5 Pro 🧑🔬📑 We’re excited to collaborate with @_philschmid and the @googleaidevs team on a brand-new tutorial 🧑🏫 : Build a multi-agent system with a researcher, writer, review agent that can search the web, record input, https://x.com/jerryjliu0/status/1944882346731430127
Google engineers shifted to a sparse mixture‑of‑experts transformer that picks only the needed mini‑networks per token, so compute stays low while total capacity rises. Paper – arxiv.org/abs/2507.06261 Paper “Gemini 2.5: Pushing the Frontier with Advanced https://x.com/rohanpaul_ai/status/1944022179869241354
Google’s Gemini 2.5 paper has 3295 authors https://x.com/hardmaru/status/1944385851435205035
New Guide! Learn how to build a multi-agent “Deep Research” system with Gemini 2.5 and @llama_index. It dynamically searches the web, takes notes, and writes a comprehensive research report with a feedback loop 🚀 🔍 Search the web with google 📝 Take notes with a dedicated https://x.com/_philschmid/status/1944835088039977124
Today we are rolling out our first Gemini Embedding model, which ranks #1 on the MTEB leaderboard, as a generally available stable model. It is priced at $0.15 per million tokens and ready for at scale production use! https://x.com/OfficialLoganK/status/1944806630979461445
Tried out Google’s ADK(agent development kit) and legit, the inbuilt UI with Gemini Free API is wild 🤯 So easy to use and looks sick! #ADK #GoogleAI #GeminiAPI https://x.com/027_Priyanshu/status/1934106038632153243
What if you had a smart personal assistant living in your watch that could share info and manage tasks for you when your hands are full? 🧠 You’re about to find out. Meet Gemini, rolling out now on Wear OS 4+ watches: https://x.com/WearOSbyGoogle/status/1942961942693359894
Build your first AI agent + MCP Server in Python. Here is everything you need to build your first AI agent in less than 20 minutes. About the code you’ll see here: 1. I used Google ADK with Gemini Flash to power the agent 2. The agent connects to an MCP server 3. It also https://x.com/svpino/status/1929881755915366772
Gemini CLI can automate your computer using MCP 🔥 Add Windows MCP (or macOS MCP) to Gemini CLI and you can tell it what to do autonomously. Gemini then takes control of your entire system to achieve the goal you’ve set. Links below https://x.com/itsPaulAi/status/1940903613888696776
Someone vibe coded an AI Agent that can use your phone on its own. He outlines using this as a ChatGPT-like interface, except things actually get done automatically. The person built this using Google ADK and the Gemini API 💀 Credits: Tyrange-D via r/singularity https://x.com/DigestibleAICo/status/1924218874678960504
RT @OfficialLoganK: Today we are rolling out our first Gemini Embedding model, which ranks #1 on the MTEB leaderboard, as a generally avail… https://x.com/demishassabis/status/1944870402251219338
Gemini-CLI is bad compared to Claude Code in very fixable ways. codex-cli is bad in odd ways. Feels unfriendly, unlike the GUI version of Codex and unusual for product-strong OpenAI https://x.com/kylebrussell/status/1945242558487044118
18/ How are you using comet? Any use cases that I missed? Follow @AtomSilverman and @AgentOpsAI for everything AI agent-related Have you tried the @AgentOpsAI MCP server? Link in bio. Last week’s thread: https://x.com/AtomSilverman/status/1944456541169762363
a fresh batch of comet invites just went out https://x.com/AravSrinivas/status/1945669970618421699
Looks like Grok 4 is 10^27 FLOPs given their graphs? HLE score is 26% without tools, Gemini 2.5 is 21.6% without tools. Curious what the tool piece is. https://x.com/emollick/status/1943162710725657055
LLMs for IMO 2025: gemini-2.5-pro (31.55%), o3 high (16.67%), Grok 4 (11.90%). https://x.com/denny_zhou/status/1945887753864114438
Gemini generates the best prompts for Veo 3. Full code below. ```python import time from google import genai from google.genai import types client = genai.Client() operation = client.models.generate_videos( model="veo-3.0-generate-preview", prompt="""{ "character_name": https://x.com/_philschmid/status/1945898590821584989
Generate videos with Veo 3 | Gemini API | Google AI for Developers https://ai.google.dev/gemini-api/docs/video
Start building with Veo 3: our state-of-the-art video generation model now available in paid public preview via the Gemini API and @Google AI Studio. 🎨 Here’s how to try it → https://x.com/GoogleDeepMind/status/1945886603328778556
RT @GeminiApp: A new Gemini feature just dropped and everything is alive?! Now you can turn photos into videos with sound in Gemini. https://x.com/demishassabis/status/1944939563170062804
merve on X: “GLM-4.1V-9B-Thinking is the BEST thinking vision LM out there 😍 it’s now served in @huggingface Inference Providers through @novita_labs 🤝 https://t.co/XUwXBLmoz7” https://x.com/mervenoyann/status/1945432520339734647
Today in the journal Science: BioEmu from Microsoft Research AI for Science. This generative deep learning method emulates protein equilibrium ensembles – key for understanding protein function at scale. https://x.com/MSFTResearch/status/1943373860012744737
Big vision language models waste compute and still miss rare corner cases while driving. MoSE fixes both by turning on only the skills needed for perception, prediction, then planning. It builds a routing table that tags every training example with human style skills like https://x.com/rohanpaul_ai/status/1944010854737080566
all modality RAG 🔥 ColQwen-Omni is a new multimodal retrieval model that can retrieve anything (videos, audios, documents and more!) use with transformers 🤗 here’s a smol demo on video retrieval ↙️ https://x.com/mervenoyann/status/1945862039022207386
RT @ManuelFaysse: Introducing ColQwen-Omni, a 3B omnimodal retriever that extends the ColPali concept of multimodal retrieval with late int… https://x.com/andersonbcdefg/status/1945855681976021268




