Image created with gemini-3.1-flash-image-preview and claude-opus-4.7. Image prompt: High-end product photograph of a classic Dairy Queen banana split in a paper boat with three distinct soft-serve scoops connected by deliberate streams of chocolate, strawberry, and caramel sauce flowing scoop-to-scoop like a relay handoff, each scoop flanked by a tiny red plastic spoon standing upright as a tool, a single glossy cherry crowning the final scoop, bold custom ‘AGENTS’ lettering printed on the boat’s red band with a small ’75 — Est. 1951, Milford, DE’ stamp on the napkin beneath, soft directional studio light, shallow depth of field, crisp macro detail, landscape composition.
I have been playing with the new Outlook agent, and it is fine, but really awkward to use, since you have to ask for things in a chatbot window, then go to your drafts, etc. And Claude Cowork does the same thing (works with Gmail, too) and has better visibility across your life.
https://x.com/emollick/status/2048974994961490107
Can agents replace the search stack?
https://softwaredoug.com/blog/2026/04/28/search-apis-replaced-by-agents.html
Introducing Firefly AI Assistant – a new way to create with our creative agent
https://blog.adobe.com/en/publish/2026/04/15/introducing-firefly-ai-assistant-new-way-create-with-our-creative-agent
Cursor’s $60 Billion Escape Hatch – Contrary Research
https://contraryresearch.substack.com/p/cursors-60-billion-escape-hatch
How do people seek guidance from Claude? We looked at 1M conversations to understand what questions people ask, how Claude responds, and where it slips into sycophancy. We used what we found to improve how we trained Opus 4.7 and Mythos Preview.
https://x.com/AnthropicAI/status/2049927618397614466
Anthropic’s Automated Alignment Researchers are running parallel, end-to-end research cycles, turning months of human effort into days of compute. On one benchmark, they leapt from a human-tuned score of 0.23 to 0.97 (a rather impolite gap). But! They also learned to game
https://x.com/TheTuringPost/status/2047134374190309446
More connectors launching today: Adobe Creative Cloud, Ableton, Splice, Canva Affinity, SketchUp, and Resolume. We’ve also joined the Blender Development Fund as a patron to support open-source development of the software. Read more:
https://x.com/claudeai/status/2049143442601546054?s=20
New Anthropic research: Project Deal. We created a marketplace for employees in our San Francisco office, with one big twist. We tasked Claude with buying, selling and negotiating on our colleagues’ behalf.
https://x.com/AnthropicAI/status/2047728360818696302
Project Deal: our Claude-run marketplace experiment \ Anthropic
https://www.anthropic.com/features/project-deal
Claude for Creative Work \ Anthropic
https://www.anthropic.com/news/claude-for-creative-work
Introducing Claude Design by Anthropic Labs \ Anthropic
https://www.anthropic.com/news/claude-design-anthropic-labs?lang=us
Anthropic just shipped Claude Security – a standalone code vulnerability scanner for Enterprise. Scans your repo, validates findings, suggests patches. Powered by Opus 4.7. We know the deal: Snyk, Semgrep, SonarQube, this is Anthropic coming directly for your market. Stocks
https://x.com/kimmonismus/status/2049901987500552195
Claude Security is now in public beta | Claude
https://claude.com/blog/claude-security-public-beta
Claude Security is now in public beta, built into Claude Code on the web. Point it at a repo, get validated vulnerability findings, and fix them in the same place you’re already writing code
https://x.com/_catwu/status/2049964403177689130#m
New on the Science Blog: We gave Claude 99 problems analyzing real biological data and compared its performance against an expert panel. On 23 problems, the experts were stumped. Our most recent models solved roughly 30% of those–and most of the rest.
https://x.com/AnthropicAI/status/2049624600741560340
How successfully — and efficiently! — can agents carry out long-horizon tasks on the web? We built a benchmark of ~200 multi-site tasks, based on people’s real browsing history. Many of them take hours to solve. Paper:
https://t.co/yNGw8Fgvbj Led by @JangLawrenceK and
https://x.com/dan_fried/status/2049530695739932876
DeepSeek-V4 pricing gives you glimpses into the future. Imagine, in one year, using a Mythos-level model that can basically code everything for $4/million tokens
https://x.com/scaling01/status/2047707820552831028
You can now run DeepSeek-V4-Flash on a 256GB Mac. Next up: speed 🚀 PR:
https://x.com/Prince_Canuma/status/2047685898163147125
.@deepseek_ai v4 Pro’s checkpoint is both in FP4 and FP8, depending on the layer. This means that the entire model can fit on a single NVIDIA 8xB200 node without trouble. @vllm_project: “Checkpoint is FP4+FP8 mixed: MoE expert weights are stored in FP4 while the remaining
https://x.com/LambdaAPI/status/2047654086263320965
Thoughts after reading the DeepSeek V4 paper: – NVIDIA really is something else. Remember how back in 2024 people were bashing Blackwell as overspec’d and dismissing FP4 as just marketing? Turns out it was all groundwork for the next generation of models. Maybe NVIDIA’s moat is
https://x.com/jukan05/status/2047861732702662741
✨ DeepSeek-V4 is here — a million-token context, 1.6T parameter powerhouse optimized for agentic workflows. Out of the box, on DeepSeek-V4-Pro, NVIDIA Blackwell Ultra delivers over 150 TPS/user interactivity for agentic workflows. And we’re just getting started. Expect these
https://x.com/NVIDIAAI/status/2047765637808664759
A completely local agent that lives right inside your browser. Powered by Gemma 4 E2B and WebGPU, it uses native tool calling to: 🔍 Search browsing history 📄 Read and summarize pages 🔗 Manage tabs 100% local. No servers needed!
https://x.com/googlegemma/status/2048805789788413984
Here is how to run a coding agent fully locally on your machine with @googlegemma and Pi. – Gemma 4 26B A4B activates 4B parameters per token. – Pi provides four tools: read, write, edit, and bash. – LM Studio runs a server at localhost:1234 by default. – Pi runs YOLO by
https://x.com/_philschmid/status/2048719354905108623
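For readers who want to see the wiring, here is a minimal sketch of the request an agent loop would send to a local OpenAI-compatible server (LM Studio defaults to localhost:1234). The model id and tool descriptions are illustrative assumptions, not Pi’s actual schema; only the payload shape follows the standard chat-completions tool-calling format.

```python
import json

# Local OpenAI-compatible endpoint; LM Studio serves here by default.
BASE_URL = "http://localhost:1234/v1/chat/completions"

def build_request(messages, tools):
    """Assemble a chat-completion payload with native tool calling."""
    return {
        "model": "gemma-4-26b-a4b",  # hypothetical local model id
        "messages": messages,
        "tools": tools,
        "tool_choice": "auto",
    }

# The four tools the thread says Pi exposes: read, write, edit, bash.
# The empty parameter schemas are placeholders for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": name,
            "description": f"{name} tool",
            "parameters": {"type": "object", "properties": {}},
        },
    }
    for name in ("read", "write", "edit", "bash")
]

payload = build_request([{"role": "user", "content": "List files"}], tools)
print(len(payload["tools"]))
```

An actual agent loop would POST this payload to `BASE_URL`, execute whichever tool the model calls, append the result as a `tool` message, and repeat.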
Learn how to run a local coding agent! Use: – Pi agent – Gemma 4 26B – Serving engine of choice: e.g. LM Studio
https://x.com/googlegemma/status/2049163687639007451
The @huggingface ML Intern is trending #1 across 1.2M Spaces on the Hub 🔥 What part of my job should we automate next? Link to the demo ⬇️
https://x.com/_lewtun/status/2049021398312468815
@NVIDIA Nemotron 3 Nano Omni is now on Together AI. Enterprise multimodal AI — video, audio, image, documents & text — optimized for speed and scale. ✅ ~3B active params, 9x higher throughput ✅ Fully managed, zero infra headache ✅ Secure, zero-trust architecture Build
https://x.com/togethercompute/status/2049160446708711883
Excited to support @NVIDIA Nemotron 3 Nano Omni, now available on Fireworks. It’s the first open model that handles vision, audio, video, and text in a single inference loop. Built for multimodal sub-agents at scale, with 9× higher throughput than Qwen3 30B. 256K context. Now
https://x.com/FireworksAI_HQ/status/2049159136802398546
Introducing @NVIDIA Nemotron 3 Nano Omni. NVIDIA Nemotron 3 Nano Omni is an open multimodal foundation model that unifies audio, images, text, and video into a single context window. It powers subagents for use cases like computer-use agent, document intelligence, and video and
https://x.com/baseten/status/2049160818575749300
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
https://huggingface.co/blog/nvidia/nemotron-3-nano-omni-multimodal-intelligence
Meet Nemotron 3 Nano Omni 👋 Our latest addition to the Nemotron family is the highest efficiency, open multimodal model with leading accuracy. 30B parameters. 256K context length. 🧵👇
https://x.com/NVIDIAAI/status/2049159441870717428
NVIDIA Nemotron 3 Nano Omni is now live on fal, available at launch. A single model for multimodal agents: 🔁 text, image, video, audio in one loop 🧠 1 context reasoning across complex workflows ⚡️ ~9× higher throughput with fewer inference hops Built for real-world agent
https://x.com/fal/status/2049160999442198632
NVIDIA Nemotron™ 3 Nano Omni is live on OpenRouter. An open 30B-A3B multimodal model for agentic workflows: text, image, video, and audio in → text out, with a 256k context window and efficient MoE architecture for computer use, documents, and AV reasoning.
https://x.com/OpenRouter/status/2049164366218772526
NVIDIA releases Nemotron-3-Nano-Omni, a new 30B open multimodal MoE model. Nemotron-3-Nano-Omni-30B-A3B is the strongest omni model for its size and supports audio, video, image and text. Run on ~25GB RAM. GGUF:
https://t.co/t4COCqVrLS Guide:
https://x.com/UnslothAI/status/2049161390150365344
GPT-5.5 + GPT-Image-2 is becoming one of the best combos for building apps! @dkundel breaks down why it works so well. We built those learnings into the Build Web Apps plugin, so Codex can handle the design-to-app loop for you. 👌
https://x.com/romainhuet/status/2049597180474970179
Codex for Work
https://chatgpt.com/codex/for-work/
From draft to deck, review the work as it takes shape inside Codex. Open the file, ask for changes, and keep tweaking it in the same thread.
https://x.com/OpenAI/status/2049928782019256561?s=20
I was not expecting the Codex App to be even better than using the terminal. Highly recommend everyone to try. If you are on Linux just tell GPT-5.5-xhigh to “find a way to get it, it’s known to be easy”
https://x.com/Yampeleg/status/2049398916882264526
Introducing GPT-5.5 A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.
https://x.com/OpenAI/status/2047376561205325845
It’s never been easier to do everyday work with Codex. Choose your role, connect the apps you use every day, and try suggested prompts. Codex helps with everything from research and planning to docs, slides, spreadsheets, and more.
https://x.com/OpenAI/status/2049928776147230886
Ok that’s one more level of mind-blowing design workflow, check it out > What if you could generate AND add great textures to your game WHILE playing your game? That’s what gpt-image-2 and GPT-5.4 within the Codex App allow you to do. Run your game in the browser, and prompt
https://x.com/NicolasZu/status/2046842446491861441?s=20
Still wondering how you can use Codex for (almost) everything? Codex can help with more of the work that supports the work, from organizing research to making spreadsheets, decks, and summaries.
https://x.com/OpenAI/status/2049583167406064115
The Codex App Server is massively underrated. You can inject Codex-level intelligence into any platform using your ChatGPT account. I embedded it into Chrome… and it works flawlessly. And yes… it’s 100% open source.
https://x.com/arrakis_ai/status/2049484893877637359
This thing does more than what you think it does. Codex now available for non-coders
https://x.com/thsottiaux/status/2049933460756979719?s=20
Use Codex to analyze a data export, flag what changed, and help draft the readout.
https://x.com/OpenAI/status/2049583308305252620
We tried a new thing with NVIDIA to roll out Codex across a whole company and it was awesome to see it work. Let us know if you’d like to do it at your company!
https://x.com/sama/status/2047395562501411058
Computer Use runs this use case 42% faster in today’s Codex app update.
https://x.com/AriX/status/2049932746567598472
With the Figma plugin, Codex can now turn implementation plans into visual FigJam boards.
https://x.com/OpenAIDevs/status/2049605820351230158
Amateur armed with ChatGPT ‘vibe maths’ solves a 60-year-old problem | Scientific American
https://www.scientificamerican.com/article/amateur-armed-with-chatgpt-vibe-maths-a-60-year-old-problem/
Earlier this month, an Erdős problem that had been open for 60 years was solved with help from GPT-5.4 Pro. What happens now that AI is getting good at math? OpenAI researchers @SebastienBubeck and @ErnestRyu join host @AndrewMayne to explain what changed and what it could mean
https://x.com/OpenAI/status/2049182118069358967
I just tried Agent Mode with ChatGPT for Clinicians. This is unbelievable. I might make a video of this… wild.
https://x.com/operationdanish/status/2048099874734821777
Interesting, OpenAI just released a free healthcare version of ChatGPT-5.4 for clinicians that beat specialty-matched physicians with unlimited time + web access on a benchmark of real & hard clinical tasks. Caveat: the benchmark was designed by OpenAI, though it is fully open.
https://x.com/emollick/status/2047147032016551937
Introducing ChatGPT for Clinicians:
https://x.com/gdb/status/2047145125604995280
Excited that GitHub shows real numbers here again. We’ve been closing over 10k issues and close to 5k PRs this week thanks to clawsweeper and clownfish. Overall since December: 27k issues / 30k PRs closed.
https://x.com/steipete/status/2048478136824738181
OpenClaw 2026.4.24 🦞 ☎️ Voice calls can now reach the full agent 🧠 DeepSeek V4 Flash + Pro join the team 🖱️ Browser automation got coordinate clicks + better recovery 🔧 Telegram, Slack, MCP, sessions, and TTS fixes More reach. Less duct tape.
https://x.com/openclaw/status/2048124737918751035
🎉 Day-0 vLLM support for the MiMo-V2.5 series! Congrats to @XiaomiMiMo on the open-source release of the MiMo-V2.5 and MiMo-V2.5-Pro. Highlights from the flagship MiMo-V2.5-Pro, an agent-oriented model focused on long-horizon tool use and frontier coding: – Long-horizon task
https://x.com/vllm_project/status/2048825703244972375
Just dropped two open-source models: MiMo-V2.5-Pro (Code Agent, 1T total) and MiMo-V2.5 (Multimodal Agent, 310B total). Oh and one more thing — we’re giving devs & creators 100T tokens on us. Go build something cool 🛠️ 🎁 100T Free Token Grant for Builders
https://x.com/_LuoFuli/status/2048851054662762618
MiMo-V2.5-Pro | Xiaomi
https://mimo.xiaomi.com/mimo-v2-5-pro
Xiaomi MiMo-V2.5 is now officially open-sourced! MIT License, supporting commercial deployment, continued training, and fine-tuning – no additional authorization required. Two models, both supporting a 1M-token context window: • MiMo-V2.5-Pro: built for complex agent and
https://x.com/XiaomiMiMo/status/2048821516079661561
Xiaomi MiMo-V2.5 Series: Pushing Open-Source Agents Forward 🔸 MiMo-V2.5-Pro, our strongest model yet. A major leap from MiMo-V2-Pro in general agentic capabilities, complex software engineering, and long-horizon tasks, now matching frontier models like Claude Opus 4.6 and
https://x.com/XiaomiMiMo/status/2046988157888209365?s=20
.@huggingface unveiled ml-intern – an open-source agent that automates the gritty post-training loop: – reading papers – tracing citations – curating datasets – running experiments – and iterating like a seasoned researcher Early demos show eyebrow-raising gains across science,
https://x.com/TheTuringPost/status/2049096050607300765
// Agentic Harness Engineering // Pay attention to this one, AI devs. (bookmark it) Most coding-agent harnesses are still tuned by hand or brittle trial-and-error self-evolution. This new work introduces Agentic Harness Engineering, a framework that makes harness evolution
https://x.com/omarsar0/status/2049492169887748365
// Tool Attention Is All You Need // New research proposes a practical fix for the hidden “MCP tax.” The work introduces a dynamic tool gating mechanism built on an Intent Schema Overlap score from sentence embeddings, paired with a state-aware gating function that enforces
https://x.com/omarsar0/status/2047725276851994639
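The gating idea can be sketched with a toy stand-in: a real system would score Intent Schema Overlap with sentence embeddings, but a bag-of-words cosine is enough to show tools being filtered out of context when they don’t match the user’s intent. The threshold, tool names, and descriptions are all illustrative assumptions, not the paper’s method.

```python
import math
from collections import Counter

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (embedding stand-in)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(v * v for v in va.values()))
    nb = math.sqrt(sum(v * v for v in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

def gate_tools(user_intent: str, tool_schemas: dict, threshold: float = 0.2):
    """Only expose tools whose description overlaps the current intent."""
    return [
        name for name, desc in tool_schemas.items()
        if cosine(user_intent, desc) >= threshold
    ]

tools = {
    "search_flights": "search flights by date and destination",
    "send_email": "send an email message to a recipient",
}
print(gate_tools("find flights to Tokyo by date", tools))
```

The point of the pattern: the model only ever sees the gated subset, so the per-turn “tax” of carrying every MCP tool schema in context disappears.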
🚀DeepAgents deploy is a simple, configuration-driven way to get an agent harness deployed to the cloud. deepagents.toml is the file that configures it. It has four sections: – agent – sandbox – auth – frontend Here’s what each one does 🧵
https://x.com/hwchase17/status/2049858892637892739
A big problem with all AI at work punditry right now is that it all rests on data from the pre-agentic era (which is basically just now ending) and we have very little information about what has been happening since the Claude Code moment. So everything now requires some caveat.
https://x.com/emollick/status/2049188184135782899
Batch API is terrible for one agent. It might be great for a fleet.
https://eran.sandler.co.il/post/2026-04-27-batch-api-is-terrible-for-one-agent/
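The fleet argument can be made concrete: a single agent needs low-latency turn-by-turn calls, but N independent agents can have their next steps submitted together. A minimal sketch, assuming the JSONL request shape used by OpenAI’s Batch API; the model name and agent states are illustrative.

```python
import json

def batch_lines(agent_states: dict) -> str:
    """Serialize each agent's pending turn as one JSONL batch line."""
    lines = []
    for agent_id, messages in agent_states.items():
        lines.append(json.dumps({
            "custom_id": agent_id,  # lets us route responses back per agent
            "method": "POST",
            "url": "/v1/chat/completions",
            "body": {"model": "gpt-5.5", "messages": messages},
        }))
    return "\n".join(lines)

fleet = {
    "agent-1": [{"role": "user", "content": "triage ticket 101"}],
    "agent-2": [{"role": "user", "content": "triage ticket 102"}],
}
jsonl = batch_lines(fleet)
print(jsonl.count("\n") + 1)  # one request line per agent
```

The trade-off is exactly the post’s: batch results arrive asynchronously (hours, at a discount), which is useless mid-conversation but fine when hundreds of agents each just need their next step eventually.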
Better codebase search = better agentic outcomes Semantic indexing is now available for all workspaces in @code, not just remotes backed by GitHub or Azure DevOps repos!
https://x.com/pierceboggan/status/2049504445424423133
Congrats to @AntLingAGI on the open release of Ling-2.6-1T! 🎉 A new flagship for real-world agentic workflows — Day-0 vLLM support is in. 📖
https://x.com/vllm_project/status/2049517056299761925
Continually improving our agent harness · Cursor
https://cursor.com/blog/continually-improving-agent-harness
Cursor is making a platform play. Right now they’re an IDE. By releasing the SDK, they’re turning their agent runtime into programmable infrastructure that runs headlessly in CI/CD pipelines, internal tools, and even third-party products. Every agent spun up through the SDK burns
https://x.com/kimmonismus/status/2049514922044792934
Cursor Security Review is now available for Teams and Enterprise plans. Run two types of always-on agents: 1. Security Reviewer checks every PR for vulnerabilities and leaves comments. 2. Vulnerability Scanner runs scheduled scans of your codebase and posts findings in Slack.
https://x.com/cursor_ai/status/2049926283061035254
Customers like Rippling, Notion, C3 AI, and Faire are using the Cursor SDK to build custom background agents, take bugs from ticket to merge-ready PR, and maintain self-healing codebases. Learn more:
https://x.com/cursor_ai/status/2049499876388454903
DeepAgents Deploy is the easiest way to bring agents to production, with just a few markdown and configuration files (no code!) 🧵Here’s me building an agent that connects to LangChain Docs. Powered by @Zai_org GLM5 (via @baseten) and @mintlify MCP
https://x.com/hwchase17/status/2049546041247289553
first up in the “going to production” series: durable execution! long running agents need to be able to 1. survive crashes 2. resume after an indefinite pause durable execution solves this by checkpointing agent progress at each step so you can resume with all state intact
https://x.com/sydneyrunkle/status/2049132897227936073
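The checkpointing idea above can be sketched in a few lines: persist state after every step, and on startup resume from the last checkpoint instead of step zero. The step functions and file format are stand-ins, not LangSmith’s implementation.

```python
import json
import os
import tempfile

def run_agent(steps, ckpt_path):
    """Run steps in order, checkpointing state after each one."""
    state = {"step": 0, "results": []}
    if os.path.exists(ckpt_path):  # resume: crash or pause left a checkpoint
        with open(ckpt_path) as f:
            state = json.load(f)
    for i in range(state["step"], len(steps)):
        state["results"].append(steps[i](state))
        state["step"] = i + 1
        with open(ckpt_path, "w") as f:  # durable progress marker
            json.dump(state, f)
    return state

ckpt = os.path.join(tempfile.mkdtemp(), "agent.json")
steps = [lambda s: "planned", lambda s: "executed"]
final = run_agent(steps, ckpt)
print(final["results"])
```

If the process dies between steps, rerunning `run_agent` with the same `ckpt_path` skips completed work and picks up where it left off.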
House Committee probes Cursor parent, Airbnb over Chinese AI | Semafor
https://www.semafor.com/article/04/29/2026/house-committee-probes-cursor-parent-airbnb-over-chinese-ai
How do AI Agents spend your money? Most teams treat agent token costs as a rounding error even though the data says they shouldn’t. New paper presents the first systematic study of how agents actually spend money on coding tasks. They ran 8 frontier LLMs on SWE-bench Verified
https://x.com/dair_ai/status/2048784506635878644
How do we build search systems for Agents? 👾🔎 I am SUPER EXCITED to share a new episode of the Weaviate Podcast with Zijian Chen (@zijian42chen) and Xueguang Ma (@xueguang_ma) from the University of Waterloo on AgentIR! 🎙️💚 When humans search, we write short queries and keep
https://x.com/CShorten30/status/2048764263196500002
How well do today’s frontier models handle long-horizon, multi-step web agent tasks, such as identifying the top 25 U.S. CS PhD programs with ML/AI faculty likely accepting students and compiling the results into a structured sheet? Check out our new work on Odysseys:
https://x.com/rsalakhu/status/2049521211353301198
I built deepagents-sandbox — a native Linux sandbox backend for Deep Agents. No Docker. No VM. Agents get a writable /workspace, blocked network by default, memory/PID limits, and timeout enforcement. It uses bubblewrap + cgroups v2 to isolate agent code execution with
https://x.com/nu_b_kh/status/2047775326412136574
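A sketch of the kind of bubblewrap invocation such a Docker-free sandbox could build: a writable /workspace, read-only system dirs, and network blocked by default. The flags shown are standard bwrap options; the cgroups v2 resource limits the project mentions would be layered on separately and are omitted here. This snippet only constructs the command without running it.

```python
def bwrap_command(workspace: str, argv: list[str]) -> list[str]:
    """Build a bubblewrap command line isolating agent code execution."""
    return [
        "bwrap",
        "--ro-bind", "/usr", "/usr",        # read-only system files
        "--bind", workspace, "/workspace",  # the only writable mount
        "--unshare-net",                    # block network by default
        "--die-with-parent",                # no orphaned sandboxes
        *argv,
    ]

cmd = bwrap_command("/tmp/ws", ["python3", "main.py"])
print(cmd[0], "--unshare-net" in cmd)
```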
I think that academia has not absorbed the fact that AI agents are now good enough to independently reconstruct complex papers without access to code or the papers themselves; just the methods & data. They aren’t perfect but the errors are often in the human paper, not the AI.
https://x.com/emollick/status/2048058055472881710
I’m kinda surprised none of the intelligence agencies (like CIA, FBI) have yet published a banger: “How we run agents”
https://x.com/TheTuringPost/status/2047331636614742183
I’ve been trying to make transformers more agent-friendly: agentic CLI, a skill, doc rewrites, canonical examples. It felt a bit like shooting in the dark: hard to measure progress, and hard to ensure what worked once will continue working. So we built a benchmark!
https://x.com/LysandreJik/status/2049053056814436352
Introducing /multitask in the new Cursor 3 interface. Cursor can now run async subagents to parallelize your requests instead of adding them to the queue. For already queued messages, you can ask Cursor to multitask on them instead of waiting for the current run to finish.
https://x.com/cursor_ai/status/2047764651363180839
Introducing Agent Collabs: Bring your own ml-interns and agents for collaborative autoresearch! We built a simple platform for swarms of agents to work together on a problem: they can exchange messages, share artifacts, coordinate on resources, and globally track progress! Any
https://x.com/cmpatino_/status/2049881579691139372
Introducing our new work: “Learning to Orchestrate Agents in Natural Language with the Conductor” accepted at #ICLR2026
https://t.co/31QhVGCSzq What if we trained an AI not to solve problems directly, but to act as a manager that delegates tasks to a diverse team of other AIs?
https://x.com/SakanaAILabs/status/2048777689763639741
Lessons on Building MCP Servers – Tao of Mac
https://taoofmac.com/space/blog/2026/04/29/2341
MCP servers aren’t dead. You’re just using them wrong. Here are two proven patterns on how to correctly use MCP servers and avoid the bloat 1. Explicit Use: MCP servers are only included when user @mentions. Nothing loads unless requested. 2. Subagents: Enable MCP servers
https://x.com/_philschmid/status/2048781492914885079
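The “explicit use” pattern reduces to a few lines: keep a registry of MCP servers, and load a server’s tools into context only when the user @mentions it. The server names and tool lists here are illustrative, not a real registry.

```python
# Hypothetical registry of MCP servers and the tools they expose.
REGISTRY = {
    "github": ["create_issue", "list_prs"],
    "postgres": ["run_query"],
}

def active_tools(user_message: str) -> list[str]:
    """Nothing loads unless the user explicitly @mentions a server."""
    tools = []
    for server, server_tools in REGISTRY.items():
        if f"@{server}" in user_message:
            tools.extend(server_tools)
    return tools

print(active_tools("use @github to file a bug"))
```

The subagent variant in the tweet moves the same idea one level down: the parent agent sees no MCP schemas at all, and a subagent spun up for the task loads only the one server it needs.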
Millions of users now have months-long conversation histories with AI assistants💬 But this data is proprietary and unavailable to the academic community for research, training, or benchmarking. We introduce HorizonBench🌅, a benchmark and data generator for long-horizon
https://x.com/StellaLisy/status/2047645651324821998
Model-Harness-Task fit is very real. The world’s best agents carefully tailor the harness around the model to take advantage of each model’s unique intelligence & capabilities 🧠 Today we released Harness Profiles in deepagents so that every team can more easily version,
https://x.com/Vtrivedy10/status/2049537545273528633
New in Deep Agents: Harness Profiles.
https://t.co/yl2mqzincU ✅ Model-specific profiles to adjust prompts, tools, and middleware. 📦 Profiles for @OpenAI, @Anthropic, and @Google models out of the box. Currently available in Python, and coming soon to TypeScript.
https://x.com/LangChain_OSS/status/2049539590990557381
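The profile mechanism can be sketched as a simple model-name lookup: the same agent resolves different prompts, tools, and middleware depending on which model it runs on. Profile contents and the prefix-matching scheme are illustrative assumptions, not the deepagents schema.

```python
# Hypothetical per-model harness profiles.
PROFILES = {
    "claude": {"prompt": "Use XML tags for tool calls.", "parallel_tools": True},
    "gpt": {"prompt": "Use JSON for tool calls.", "parallel_tools": True},
    "default": {"prompt": "Generic instructions.", "parallel_tools": False},
}

def profile_for(model_name: str) -> dict:
    """Resolve a harness profile by model-name prefix, with a fallback."""
    for prefix, profile in PROFILES.items():
        if model_name.startswith(prefix):
            return profile
    return PROFILES["default"]

print(profile_for("claude-opus-4.7")["parallel_tools"])
```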
new in ml-intern: you can now actually see what’s going on inside added native metric logging + trackio integration. every training run the agent kicks off now has live curves you can watch in real time before it was kind of a black box. agent launches a job, you wait, you
https://x.com/akseljoonas/status/2049183527703396699
Our agent harness makes models inside Cursor faster, smarter, and more token-efficient. Here’s how we test improvements to the harness, monitor and repair degradations, and customize it for different models.
https://x.com/cursor_ai/status/2049901436918436249
own your agent harness 🤝 own your intelligence 🤝 own your evals 🤝 open ecosystems this doesn’t mean start from scratch, build on an open base harness where you can edit/extend existing primitives like bash, fs-ops, permissions, skills support, etc for your task harnesses
https://x.com/Vtrivedy10/status/2049597811226726682
serving multiple users from a single agent deployment introduces three distinct problems. luckily, langsmith’s agent server has a solution for each! 1. data isolation: your @auth.authenticate handler tags every resource with ownership on write, filters on read. 2. delegated
https://x.com/sydneyrunkle/status/2049956826670911809
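The data-isolation point above (tag ownership on write, filter on read) fits in a few lines. The in-memory store stands in for the agent server’s database; the class and field names are illustrative, not LangSmith’s API.

```python
class ThreadStore:
    """Per-user isolation for a shared agent deployment."""

    def __init__(self):
        self._rows = []

    def write(self, user_id: str, thread: dict):
        # Tag every resource with its owner on write.
        self._rows.append({**thread, "owner": user_id})

    def read(self, user_id: str) -> list[dict]:
        # Filter by the authenticated user on read.
        return [r for r in self._rows if r["owner"] == user_id]

store = ThreadStore()
store.write("alice", {"id": "t1"})
store.write("bob", {"id": "t2"})
print([t["id"] for t in store.read("alice")])
```

Because the filter is applied inside the store rather than by callers, no agent thread can accidentally read another tenant’s data even if a prompt asks it to.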
Starting June 1st, GitHub Copilot will move to a usage-based billing model as GitHub Copilot supports more agentic and advanced workflows. In early May, you’ll see a preview bill experience, giving visibility into projected costs before the transition. 👉 Read more about the
https://x.com/github/status/2048794729274278258
Starting today, agents can now be Cloudflare customers. They can create a Cloudflare account, start a paid subscription, register a domain, and get back an API token to deploy code right away.
https://x.com/Cloudflare/status/2049545195914498139
Take your Copilot CLI sessions in @code with you anywhere using remote control! /remote on, and control from
https://t.co/g9Lb1BrNrG or the GitHub mobile app 🙂
https://x.com/pierceboggan/status/2049503967059812617
The new LLM trained only on pre-1931 text is small enough that it can potentially run on device, so, with the right tools, you can get a fully vintage version of Siri, but from the era of Downton Abbey. Here, I asked for it to arrange for sushi delivery in Philadelphia. Hmmm…
https://x.com/emollick/status/2048938904410095644
The terminal hasn’t changed much since the 1970s. What you do with it has. Introducing Devin for Terminal: everything we learned building Devin, now as a local agent, available right in your shell. And when your work outgrows your laptop, hand it off to the cloud.
https://x.com/cognition/status/2048821234281181302
Today we’re releasing Laguna XS.2, Poolside’s first open-weight model. It’s a 33B total / 3B active MoE model built for agentic coding and long-horizon tasks. Trained fully in-house on our own stack. Runs on a single GPU. Released under Apache 2.0. Links 👇 Weights:
https://x.com/poolsideai/status/2049144111626670282
Today we’re shipping Laguna M.1 and Laguna XS.2 – our first public models. We’re also shipping our agent harness and a preview product experience. Both models were trained from scratch on our own stack: data pipelines, training infrastructure, and agent RL.
https://x.com/eisokant/status/2049142230397370537
tweeted about this yesterday and Cursor already dropped the alpha today! 🚀very cool to see how we, they, and others have converged on good design patterns in Agent + Harness Engineering: 1. Tuning different models in the Harness with bespoke tools/prompts 2. Using
https://x.com/Vtrivedy10/status/2049919247321813491
Until today, Deep Agents shipped with a single set of prompts, tools, and middleware aimed to work well across all Large Language Models. With the launch of harness profiles, you can now control these parameters on a per-model basis.
https://x.com/LangChain/status/2049540926603718969
We built a completely free CLI agent with @badlogicgames’s Pi agent, @ollama (Gemma 4), and Parallel’s new free web search MCP. Check out the blog post to see how to do it yourself:
https://x.com/p0/status/2047794814104862843
We’re introducing HALO 😇 Hierarchical Agent Loop Optimizer HALO is an RLM-based agent optimization technique capable of recursively self-improving agents by analyzing their execution traces and suggesting changes. This work is inspired by the Mismanaged Genius Hypothesis
https://x.com/samhogan/status/2049619541727302040
We’re introducing the Cursor SDK so you can build agents with the same runtime, harness, and models that power Cursor. Run agents from CI/CD pipelines, create automations for end-to-end workflows, or embed agents directly inside your products.
https://x.com/cursor_ai/status/2049499866217185492
We’ve learned how to build skilled agents, but still don’t know how to run them as a team. Huawei Noah’s Ark Lab proposed OneManCompany (OMC) – an organisational layer for multi-agent systems that treats agents like employees in a real company. The main idea is to move from
https://x.com/TheTuringPost/status/2049589752899510471
We’ve open-sourced a few starter projects for you to build on: a coding agent CLI, a prototyping tool, and an agent-powered kanban board. Use Cursor to customize them for your use case:
https://x.com/cursor_ai/status/2049499874043830389
What if I want my coding agent to mention goblins? (If you don’t know the context for this, I suspect it will become viral soon enough)
https://x.com/emollick/status/2049009940447051906
Why Your Multi-Agent Network Works in Demo but Falls Apart in the Wild
https://decisionai.substack.com/p/why-your-multi-agent-network-works?r=5du8s6&triedRedirect=true
working on subagents / agents-as-tools for single threads, and it’s so very satisfying with Think because it’s just nested chat all the way down, persistence/streaming/resumption all ootb landing this week
https://x.com/threepointone/status/2049088722835042475
ElevenLabs launches Agent Templates for faster bootstrapping
https://www.testingcatalog.com/elevenlabs-launches-agent-templates-for-faster-bootstrapping/
ParseBench: A benchmark for document parsing agents @llama_index just shipped a benchmark with 2k verified pages for real enterprise documents. Benchmarks are the major underrated component in the ML ecosystem, so I’m excited to see more entities doing open work in the space
https://x.com/osanseviero/status/2048777802015535189
One reason I don’t think “judgment” is going to be a distinctly human role in working with AI is that the most recent agentic models have gotten quite good at some types of judgment. You can’t do the kind of high complexity, long-run tasks that current AIs can do without it.
https://x.com/emollick/status/2049530015356793027
Lately I’ve been having fun with running coding agents fully locally. The setup I landed on is: – Pi agent – Gemma 4 26B A4B – Server of choice: LM Studio/Ollama/llama.cpp I wrote a step-by-step guide with instructions on how to set it up:
https://x.com/patloeber/status/2048715918541558075
Under the directives of the President of the UAE, we launch a new government model. Within two years, 50% of government sectors, services, and operations will run on Agentic AI, making the UAE the first government globally to operate at this scale through autonomous systems. AI
https://x.com/HHShkMohd/status/2047277766769545352?s=20
Totally offline agents are possible!
https://x.com/Teknium/status/2048975223853350976
Remote agents in Vibe. Powered by Mistral Medium 3.5. | Mistral AI
https://mistral.ai/news/vibe-remote-agents-mistral-medium-3-5
Good agent memory paper. And great insights on the benefits of structured memory for long-horizon behavior in LLMs. Why it matters: It treats memory less like search and more like a system that will need maintenance (which they often do). Flat memories are cheap to write.
https://x.com/dair_ai/status/2047740873027543228
Insightful article on the CPU and GPU roles in the AI era. Agentic workloads add orchestration and control-plane logic best suited to CPUs, shifting GPU:CPU ratios from 7–8:1 in training toward 3–4:1 or lower in inference and agentic eras.
https://x.com/SVTrivo/status/2049205332329795730
Anthropic launches Memory in Claude Agents for enterprise
https://www.testingcatalog.com/anthropic-launches-memory-in-claude-agents-for-enterprise/
Anthropic tests new Bugcrawl tool for Claude Code
https://www.testingcatalog.com/anthropic-tests-new-bugcrawl-tool-for-claude-code-bug-detection/
Claude Code is down. All of Silicon Valley:
https://x.com/Yuchenj_UW/status/2049201297656786999
Claude Security is now in public beta for Claude Enterprise customers. Claude scans your codebase for vulnerabilities, validates each finding to cut false positives, and suggests patches you can review and approve.
https://x.com/claudeai/status/2049898739783897537?s=20
I’ve been using Claude Code since July of 2025, and have been on the Max plan the entire time. There was a solid 6 months when Claude was fantastic. The models worked great. And things could be predictable. After 4.6 things started to change, and known behaviors started
https://x.com/TheZachMueller/status/2049116099053031563
The last month, Anthropic: – Quietly nerfed their flagship model harness (Claude Code) without telling anyone – Banned corporate customers of Claude – Silently changed plans for customers with certain files in their repo All evidence that closed models are *massive* risks.
https://x.com/GergelyOrosz/status/2049123621826707657
Very interesting… given they distilled claude 🤔🤔
https://x.com/cloneofsimo/status/2047628636933812301
Yesterday, we shared a chart showing 80% of Claude users live in $100k+ households, more than any other major AI service. But Claude’s user base is smaller than other AI services, so this isn’t the same as being the most popular service among high-income households.
https://x.com/EpochAIResearch/status/2047423836904460328
Happy to announce that Hermes Agent’s repo just surpassed Anthropic’s Claude Code repo
https://x.com/Teknium/status/2048710115885523444
Don’t try to build a self-improving AI agent without evals. You are just wasting time and compute. An agent can’t improve from traces it can’t evaluate. This is why it’s exciting to see @FutureAGI_ going fully open source with their platform. It combines the best of all the
https://x.com/omarsar0/status/2048759865007591615
Most agentic benchmarks center on tasks that are automatically verifiable. But any task that is verifiable is also easy to optimize for. This work instead describes the future of critical open-world evaluations. Led by @sayashk, our current draft is now live.
https://x.com/sarahookr/status/2048731841759428935
Organizational design for agents is hard, benchmarking agents working in concert is hard. Together, this is the next critical frontier for making AI matter in economically valuable tasks, and we really don’t know very much about it.
https://x.com/emollick/status/2047828327856030047
Every company building on top of AI should be making their own benchmarks. This is the way if you want model progress to disproportionately benefit your company.
https://x.com/OfficialLoganK/status/2048554074107470305
BullshitBench: GPT-5.5 and 5.5-Pro update! They did NOT do well – 5.5 is about the same level as GPT-5.4 (around rank 30-35, 45% pushback). GPT-5.5-Pro did WORSE – only about 35% pushback. I must say the Pro result kind of shocked me. This is actually interesting, what this tells
https://x.com/petergostev/status/2047773402090426548
GPT 5.5 is much smarter than I thought Yesterday, I did one-shots, coding, benchmarks, and was disappointed. Today, I did it all again, except via the API, which is now available. Results changed completely: → one-shot prompts went from bad to very good → excellent coding
https://x.com/VictorTaelin/status/2047818978664268071
GPT-5.5 is now available in Cursor! It’s currently the top model on CursorBench at 72.8%. We’ve partnered with OpenAI to offer it for 50% off through May 2.
https://x.com/cursor_ai/status/2047744579127185843
GPT-5.5 Pro achieves a small bump over GPT-5.4 Pro with 60% lower cost and token use in our frontier science eval, CritPt CritPt tests models on graduate-level physics research problems contributed by 60+ researchers from 30+ institutions globally. When CritPt was released in
https://x.com/ArtificialAnlys/status/2049926072595280030
LisanBench results for GPT-5.5 – it’s good. GPT-5.5 is now the strongest model without Thinking on both metrics! GPT-5.5-medium uses on average ~45.6% fewer tokens than GPT-5.4-medium while scoring 1.77x higher! (1.14x higher score on the difficulty-weighted metric) Running
https://x.com/scaling01/status/2047818395970904229
Opus 4.7 and GPT-5.5 scores on GSO are live! #1 Opus 4.7 @ 42.2% #2 Opus 4.6 @ 37.3% #3 GPT-5.5 @ 37.3%
https://x.com/scaling01/status/2048853227211251891
Summarize 📝 0.14.0 is out. GPT-5.5 Fast mode via `--fast`, Reddit thread extraction in the browser extension, local PDF `--extract`, and fixes for auto model config + Meta site compatibility.
https://x.com/steipete/status/2048275589224628677
The new GPT-5.5 is #1 on Terminal-Bench at 82.7. This beats Anthropic’s Mythos Preview scoring 82.0, which they have not released to the public due to cybersecurity and safety concerns. Available in Cline now!
https://x.com/cline/status/2047769312514257148
@NousResearch absolutely crushing the 0-day support! Deepseek-v4-pro is live in the Nous Portal 😍 If you want a real personal agent/assistant/quant/researcher/artist/coworker, Hermes Agent continues to deliver!
https://x.com/mr_r0b0t/status/2047673600900010044
🏆 vLLM powers the fastest inference on NVIDIA Blackwell Ultra on Artificial Analysis. On @digitalocean’s Serverless Inference, powered by vLLM on NVIDIA HGX B300: 🥇 AA #1 output speed for DeepSeek V3.2 (230 tok/s, 0.96s TTFT) and Qwen 3.5 397B 🔧 MiniMax-M2.5: 23% TPOT gain
https://x.com/vllm_project/status/2049503979898274163
📊 Day 0 performance is here: DeepSeek-V4-Pro running on NVIDIA Blackwell Ultra. Using @vllm_project’s Day 0 recipe, we’ve captured the initial performance Pareto for DeepSeek’s flagship 1M long-context model. This curve highlights the baseline for balancing AI factory
https://x.com/NVIDIAAI/status/2047823093578518758
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length. 🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world’s top closed-source models. 🔹 DeepSeek-V4-Flash: 284B total / 13B active params.
https://x.com/deepseek_ai/status/2047516922263285776?s=20
🚨 DeepSeek V4 Pro just dropped 75% OFF API pricing + permanent cache price cut to 1/10! 🔥 4/26 Update: Cache permanently 90% cheaper Offer ends May 5, 2026 Insights from Zhihu contributor 普杰 💡 Key Insights: • DeepSeek doesn’t do loss-leader promotions → ¥3 in / ¥6 out ≥
https://x.com/ZhihuFrontier/status/2049027925920637077
8x VLLM CUDA MOAT ALERT: InferenceX has added @deepseek_ai V4 Pro for @vllm_project for day 3 performance across B200, B300, H200, GB200 disagg. We are seeing that B300 is up to 8x faster than H200. The team is working on benchmarking vLLM 0.20 which has the new DeepGEMM MegaMoE
https://x.com/SemiAnalysis_/status/2048957715955765284
Also, deepseek v4 is available as well
https://x.com/Teknium/status/2047798102091067677
And now a new DeepSeek model, which appears to be fully open weights. Good benchmarks, but with open models that isn’t always as meaningful. Should be live soon to actually try.
https://x.com/emollick/status/2047516272062058890
Another reason I’m watching Delton closely is that the company works closely with Huawei. As DeepSeek’s comments suggest, Huawei’s 950 is expected to enter heavy mass production starting in the second half of this year, right?
https://x.com/jukan05/status/2047823601462812932
Anyone got DeepSeek-V4-Flash running on a Mac yet? 512GB or 256GB or 128GB or smaller?
https://x.com/simonw/status/2047844236142497850
Compressed Sparse Attention. A Faithful Implementation of CSA from the DeepSeek-V4 paper.
https://x.com/arjunkocher/status/2049066844925936041
DeepSeek cuts V4-Pro prices by 75%
https://thenextweb.com/news/deepseek-v4-pro-price-cut-75-percent
DeepSeek is back among the leading open weights models with the release of DeepSeek V4 Pro and V4 Flash, with V4 Pro second only to Kimi K2.6 on the Artificial Analysis Intelligence Index @deepseek_ai has released DeepSeek V4 Pro and V4 Flash. V4 is the first new architecture
https://x.com/ArtificialAnlys/status/2047735160544841953
DeepSeek removed its “Thinking with Visual Primitives” repo. Here’s a paper link if anyone needs to read it.
https://x.com/arjunkocher/status/2049875566678118898
DeepSeek said Pro pricing could fall sharply once Huawei Ascend 950 supernodes are deployed at scale in the second half of the year
https://x.com/scaling01/status/2047760776769720360
DeepSeek staff has deleted the repo and all mentions of the vision paper. What the hell happened? People who got Vision enabled on web: do you still have it?
https://x.com/teortaxesTex/status/2049880056420298995
DeepSeek themselves estimate the gap to be 3-6 months I think it’s on the higher end of that range
https://x.com/scaling01/status/2047626000091971811
DeepSeek trains vision capabilities into their v4 Flash model by having the model directly output bounding boxes and point coordinates of an image during reasoning. This is DeepSeek’s Computer Use Agent.
https://x.com/nrehiew_/status/2049840778491662623
DeepSeek v4 marks the next era of open-weight models and is one of the landmark papers for open-weight model training. Thread and notes below 🙂
https://x.com/nrehiew_/status/2047665987730993363
DeepSeek V4 just launched on Huawei hardware, and the numbers tell a story the headlines are hiding. • Huawei’s Ascend 910C delivers roughly 60% of the inference power of an Nvidia H100. • Production is capped at 750,000 units this year; Nvidia ships that many in a single
https://x.com/PalwinderCFA/status/2047614823102619974
DeepSeek V4 MLX Quants now on MLX community HF repo, Made possible by @LambdaAPI and @TheZachMueller ❤️ Without a GPU cluster it would take me a week to upload the quants… Model collection 👇🏽
https://x.com/Prince_Canuma/status/2047847095466385899
DeepSeek V4 Open Source + vLLM Support LIVE 🚀 | Technical Breakdown 🧠 Core Insight DeepSeek V4 is built to solve 1M-token long-context inference — the biggest pain point for LLMs today. ⚠️ 2 Key Long-Context Challenges • KV Cache Explosion: KV cache grows linearly with
https://x.com/ZhihuFrontier/status/2047664976215839021
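The KV-cache explosion the breakdown mentions is simple arithmetic: the cache holds a K and a V tensor per layer per token, so it grows linearly with context length and dominates memory at 1M tokens. A back-of-envelope sketch (the model shape below is made up for illustration, not DeepSeek’s actual config):

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_elem=2):
    """KV cache size: 2 tensors (K and V) per layer, per token.

    Linear in token count — which is why 1M-token contexts are
    dominated by KV-cache memory rather than weights or activations.
    """
    return 2 * tokens * layers * kv_heads * head_dim * bytes_per_elem

# Illustrative (made-up) config at 1M tokens:
# 60 layers, 8 KV heads, head_dim 128, fp16 (2 bytes).
size = kv_cache_bytes(1_000_000, 60, 8, 128, 2)
print(f"{size / 2**30:.1f} GiB")  # → 228.9 GiB for a single sequence
```

Numbers like these are why the V4 write-ups emphasize compressed attention: cutting per-token KV footprint is the only way a 1M-token cache fits in practical memory budgets.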
DeepSeek writing quality (at least in Chinese) is good because they’ve been obsessing about data for the entire history of the company (tbh “clean data” is an obvious instinct for algo traders too, but I think this is more about Wenfeng’s purism) and have such job listings
https://x.com/teortaxesTex/status/2047614729145745623
DeepSeek_V4.pdf · deepseek-ai/DeepSeek-V4-Pro at main
Click to access DeepSeek_V4.pdf
DeepSeek-V4 is a full-stack redesign of LLMs around long context + efficiency Here are some of the changes: – Hybrid attention: Compressed Sparse Attention (CSA) + Heavily Compressed Attention (HCA) for long-context efficiency – 1M-token context becomes ~3-10× cheaper in memory
https://x.com/TheTuringPost/status/2048566818118545887
DeepSeek-V4 uses our Hash routing approach developed back in 2021 — see screenshot of their tech report! (Looks like a great model, congrats!) Bonus note: our same blogpost (& paper) back in 2021 also introduced ‘looped transformers’, but we called that staircase & ladder (see
https://x.com/jaseweston/status/2047690308217926055
DeepSeekv4 Pro 1.6T is supported on InferenceX on Day 0! We already have H200 vLLM working, and are working on @vllm_project & @sgl_project MI355, B200, B300, GB200/300 disaggregated DeepSeekv4 day 0 performance benchmarking too, to track the progress of improvement. Thank you
https://x.com/SemiAnalysis_/status/2047726025748930687
Early DeepSeek v4 impressions not great.
https://x.com/mbusigin/status/2047707082007220393
Here’s DeepSeek v4 Pro. Added to the playable gallery as well.
https://x.com/emollick/status/2047527060713664754
I get the impression many Chinese hate Huawei irrationally and suspect it of a conspiracy to deprive DeepSeek of based American chips
https://x.com/teortaxesTex/status/2047631470664020211
I hear similarly it’s not unique to Mythos/5.5 ofc, frontier models have been dealing with >100T for a while, as far as I know. We see even the open source models get close to 50T. A 100T DeepSeek V4 is just V4 + 2 more epochs, 3e25 FLOPs. still below Llama 405B level
https://x.com/teortaxesTex/status/2049830477167526255
I hope the upgrade to DeepSeek v4 will make the bot comments on here more bearable.
https://x.com/emollick/status/2047519187287846937
I’m still confused by some of the decisions done in deepseek v4 Main confusion is why the huge focus on reducing KV cache size when with something like HiSparse u can offload most of ur kv cache (making ur decode compute bound) This also is compensated with a huge 128 heads and
https://x.com/Grad62304977/status/2048785005216723072
interesting that deepseek’s also joined the path of not allowing sampler control on their api. i wonder why and how long this has been there
https://x.com/stochasticchasm/status/2047717161070989499
Introducing DeepSeek V4 Pro, a long-context model with hybrid attention, three reasoning modes, and SOTA coding performance. AI natives can now use DeepSeek V4 Pro on Together AI and benefit from reliable inference for long-horizon coding and agentic workflows.
https://x.com/togethercompute/status/2047743446522224987
its so messed up that deepseek trained on deepseek reasoning traces. has chinese distillation gone too far?
https://x.com/kalomaze/status/2047762970931827125
Jensen was making a good point, but now it’s too late. DeepSeek is fully committed to ditching CUDA. The rest of the Chinese xiaoren ecosystem can be swayed by Hoppers; Wenfeng believes too much in long-termism. After V4, non-CUDA hardware is guaranteed to live and prosper.
https://x.com/teortaxesTex/status/2049185408785998217
Let’s dive deeper into the difference between DeepSeek V4 Pro & V4 Flash by @DeepSeek_AI. – Both support 1M token context and V4 Flash Thinking shifts the price Pareto frontier. V4 Pro ranks ~30 places higher than the V4 Flash variants, but costs 12x more at launch pricing.
https://x.com/arena/status/2047774037204742255
Let’s see DeepSeek are all nice folks and China’s national heroes, Xi is personally a man of integrity, and they’re not starting wars. American society firebombs Sam Altman, Ant is a weird sex cult, and elected US leader is a murderous monke. Why should compute decide this?
https://x.com/teortaxesTex/status/2047645676234846459
looks like the ~Opus 4.5 estimate for DeepSeek-V4 holds for now, at least on SimpleBench
https://x.com/scaling01/status/2047682465624445015
My first two TiKZ Sparks unicorns from DeepSeek v4. (Expert mode, from the DeepSeek site, which is supposed to be v4 Pro according to the release)
https://x.com/emollick/status/2047523193481547929
My quick paper summary: DeepSeek-V4-Pro with 1.6T parameters (49B activated) and DeepSeek-V4-Flash with 284B parameters (13B activated) Two new compressed attention mechanisms for long context manifold hyper connections Muon training 32T tokens FP4 Quantization-Aware
https://x.com/iscienceluvr/status/2047514399393579235?s=46
not even DeepSeek has any appetite for doing this again this is evolution tier architecture they’ll refactor it when they get some time
https://x.com/teortaxesTex/status/2047648219081974034
somewhere in france, still awake at sunrise, adding exclamations to their first read of the deepseek technical report, “one of the best i’ve ever read”
https://x.com/morqon/status/2047643246923325833
Surprisingly a lot of info about the data and process (which is unlike some other deepseek papers). On first read, it sounded like they only cared about specific tasks rather than a general multimodal model. On second thought, however, I realized these “visual primitives” and
https://x.com/nrehiew_/status/2049840802562740311
TEHRAN, April 29, 2026 — Less than a week after the release of @deepseek_ai DeepSeek v4 Pro, the cracked team at @vllm_project and @inferact has achieved considerable improvement on GB200 (Dynamo+vLLM). This is largely due to the release of vLLM 0.20.0, which comes with MegaMoE
https://x.com/SemiAnalysis_/status/2049578313111216271
Thank you @NVIDIAAI for highlighting vLLM’s day 0 @deepseek_ai support and enhancing the open source inference ecosystem!
https://x.com/vllm_project/status/2047843293447500069
The strongest open-source agentic model is live on Baseten! DeepSeek V4 is a preview of two powerful MoE models: V4-Pro (1.6T params) and V4-Flash (284B params) with 1M context and SOTA open-source performance. This represents a significant jump from V3.2 (which had a 128k
https://x.com/baseten/status/2047779549644243146
This is great – @deepseek_ai V4 supports prefill! 😀 Most other providers have been dropping support for this critically important capability, so wonderful to see at least one company stepping up.
https://x.com/jeremyphoward/status/2049098509530583199
Unless I’m doing it wrong, Kimi K2.6 in Hermes is like 7x slower than DeepSeek V4, not to mention V4-Flash lmao but it can sometimes fix bugs that not even Pro can resolve. it also has some harsh words for them:
https://x.com/teortaxesTex/status/2048820805258059837
vLLM support for DeepSeek V4 base models is on the way! The V4 release includes 4 models: base/instruct × flash/pro. Initial support covers the instruct versions. To extend support to the base models, we worked with @deepseek_ai to add an expert_dtype field in the config, making
https://x.com/vllm_project/status/2048769886483329525
vLLM v0.20.0 is here! 752 commits from 320 contributors (123 new). 🎉 Highlights: DeepSeek V4, Hunyuan v3 preview support, CUDA 13 / PyTorch 2.11 / Transformers v5 baseline, FA4 as default MLA prefill, TurboQuant 2-bit KV (4× capacity), vLLM IR foundation. Thread 👇
https://x.com/vllm_project/status/2048918629144805619
Vibe code without internet 🚀 I built a vibe coding app powered by Gemma 4, running fully on-device on Mac with MLX. Pick your model, then chat or build with it. Watch it build the Chrome Dino game offline using Gemma 4 27b. Open sourcing all of it below👇
https://x.com/ammaar/status/2049169134429073471
🚨BREAKING: Hugging Face just open-sourced an AI intern that reads ML papers, trains models, and ships the final model for you. It’s called ML Intern. And this is not another AI coding demo that prints a broken PyTorch script and disappears. You give it the goal. It
https://x.com/MillieMarconnni/status/2047639632859500691
Canonical and NVIDIA are collaborating to make NVIDIA Nemotron™ 3 Nano Omni easier to deploy on Ubuntu. With Canonical inference snaps, teams can go from setup to a working runtime in a single command – no complex integration required. Less time spent on infrastructure, more
https://x.com/Canonical/status/2049159988174602712
🤖 : Moonshot AI open-sources Kimi K2.6, a coding and long-horizon agent model that scales agent swarms to 300 concurrent sub-agents across 4,000 coordinated steps.
https://x.com/dl_weekly/status/2048764506105348129
I have said this before for Kimi-K2.6, but this is also valid for DS-V4 Chinese labs are already in or very close to take-off. Meaning they have models that are useful in the development of new models. (but of course their exponential is shifted 5+ months, so they are
https://x.com/scaling01/status/2047625331339661685
Kimi K2.6 is now #1 on OpenRouter’s weekly LLM Leaderboard 🏆 A huge thank you to every developer building with Kimi. We’ll keep our heads down and keep shipping.
https://x.com/Kimi_Moonshot/status/2048693682329776223
MathNet – a new interesting global multimodal benchmark from @MIT for mathematical reasoning and retrieval It’s a dataset of 30,676 Olympiad-level problems from 47 countries, 17 languages, and 143 competitions over 4 decades, with expert solutions. It defines 3 tasks: – problem
https://x.com/TheTuringPost/status/2049155956135841862
[2604.26752] GLM-5V-Turbo: Toward a Native Foundation Model for Multimodal Agents
https://arxiv.org/abs/2604.26752
@Teknium @NousResearch Feedback after running Hermes through Telegram for about 2 days: Experience is surprisingly good. Putting an agent on a VPS and talking to it through Telegram really does make it feel like I gave my AI coworker its own computer.
https://x.com/lizliz404/status/2049084890717806877
Hermes will now by default use native vision if the main agent model supports it, and you didn’t set a different vision auxiliary model! Just `hermes update` and it will take effect immediately. You can override this by setting any other model as your auxiliary vision model
https://x.com/Teknium/status/2048766822766547451
Nous Research shipped Hermes Agent v0.11.0, marking the framework’s largest update to date, fueled by over 700 pull requests from nearly 200 open-source contributors. The update introduces a completely rewritten, React-based terminal user interface (TUI v2) for better local
https://x.com/WesRoth/status/2047646749427216385
Want Hermes running in minutes? Follow this exact setup flow: 1. Install Hermes from GitHub 2. Launch it inside your terminal 3. Connect Obsidian as memory source 4. Sync OMI for automatic knowledge capture 5. Activate dashboard mission control 6. Create skills for repeatable
https://x.com/JulianGoldieSEO/status/2047699587788361844
.@OpenAI GPT-5.5 is now available on Databricks, with Codex coding workflows and model inference fully governed through Unity AI Gateway. With GPT-5.5 on Databricks, you can: – Power coding workflows with Codex or other coding agents – Build custom agents grounded in enterprise
https://x.com/databricks/status/2047848795862364468
⚙️ We made agent loops faster with WebSockets in the Responses API As Codex got faster, the bottleneck moved from inference to inefficient API calls WebSockets keep response state warm across tool calls, helping workflows run up to 40% faster end to end
https://x.com/OpenAIDevs/status/2049595890395152728
📣 What if every open issue had a Codex agent? That’s the idea behind Symphony, an open-source agent orchestrator for Codex that turns task trackers into always-on systems for agentic work, letting humans focus on review and direction.
https://x.com/OpenAIDevs/status/2048825010371039648
Add Codex seats with a $0 seat fee for a limited time. Through the end of June, eligible ChatGPT Business and Enterprise customers can add Codex-only seats, making it easier to give more developers access to Codex in their day-to-day workflows.
https://x.com/OpenAIDevs/status/2049505143218217048
An easy way to get a team engaged with AI is just to build the thing you are talking about in the meeting during the meeting using Codex or Claude Code. At worst, it fails in ways that can be constructive. At best, you built the thing and the meeting topic shifts forward a month
https://x.com/emollick/status/2048981050051715090
Auto-review is a new mode that lets Codex work longer with fewer approvals and safer execution. It helps Codex keep moving through tests, builds, and more, including during long tasks and automations, while a separate agent checks higher-risk steps in context before they run.
https://x.com/OpenAIDevs/status/2047436655863464011
Cline Kanban v0.1.64 update 🧵 1) Pick a different agent or model per task! Run Claude Code on one card, Codex on another, Cline with Kimi on a third, all in parallel on the same project.
https://x.com/cline/status/2048814649513275448
Early gpt5.5 feedback: – over defensive slop code gone – faster than gpt 5.4, even on xhigh – less verbose – intelligent on low/medium This model so far has written the best code I have read from any llm. It just gets what you want, previous models struggled with this a lot
https://x.com/almmaasoglu/status/2047745168141324559
Full Workshop: @OpenAI Codex masterclass The agent is no longer just one chat window. In this workshop, @reach_vb and @kagigz get into how coding systems start to change when you can delegate work across subagents, split tasks up, and manage more context than a single thread can
https://x.com/aiDotEngineer/status/2049527486124560491
GPT 5.5 (no thinking) would sometimes write in this “caveman” style, which I think is pure CoT leaking out in the response, then returning the “completed” token instead of starting the actual code block. Earlier gpt (no thinking) versions also had the problem of sometimes
https://x.com/htihle/status/2048741770125603304
GPT 5.5 is definitely a step up in the character game.
https://x.com/steipete/status/2047871519762567468
GPT 5.5: The System Card | Don’t Worry About the Vase
GPT-5.5 and GPT-5.5 Pro are now available in the API!
https://x.com/sama/status/2047787124846653895
GPT-5.5 feels insane so far. What I love most is that it is blazing fast, but still clearly smarter than GPT-5.4. In my own tests, it feels around 2x to 5x faster depending on the task. What’s your experience so far?
https://x.com/VraserX/status/2048749059855282346
GPT-5.5 is available in the Responses and Chat Completions APIs with a 1M context window. GPT-5.5-pro is also available in the Responses API for higher-accuracy work.
https://x.com/OpenAIDevs/status/2047742589982654915
GPT-5.5 is here! We hope it’s useful to you. I personally like it.
https://x.com/sama/status/2047378253313106112
GPT-5.5 is now available in Devin as an Agent Preview! GPT-5.5 has set a new bar for what’s possible with Devin. It runs longer and more autonomously than any GPT model we’ve tested, surfacing bugs no other model can catch, and investigating and fixing production issues
https://x.com/cognition/status/2047743153461936257
GPT-5.5 is now available in the API. The model brings higher intelligence and stronger token efficiency to complex work, helping tasks get done with fewer retries.
https://x.com/OpenAIDevs/status/2047742566410736090
GPT-5.5 is the first model in ChatGPT to which I’ve been able to just describe my desired response style, have it understand and remember it, and then actually respond in that exact way. I want it to write like a normal person, to default to narrative with some personality, to
https://x.com/_simonsmith/status/2049078623127081200
Here’s my view on GPT-5.5, which I have been testing for a couple of weeks. It conducted not-bad social science research on its own, developed a novel RPG & more. There is still jaggedness but GPT-5.5 Pro is (for today) the best model for hard problems.
https://x.com/emollick/status/2047407732886532595
I had early access to GPT-5.5. It is very good, especially the Pro version. Full writeup very shortly.
https://x.com/emollick/status/2047385074023485826
Let me get this straight Claude subscription: restrictions (or pay extra) on third party usage, pay extra for fast Codex: no restrictions, fast mode within subscription It’s hard not to use Codex. Just the fast mode being within the subscription is enough to make the gap
https://x.com/HamelHusain/status/2047763070022479882
Me: GPT 5.5, why is this Worker failing? GPT 5.5: *thinks* It’s because of this deep corner case in the Cap’n Proto RPC system described in this comment you wrote 6 years ago, which you thought was only a perf issue but actually affects correctness:
https://t.co/K37Zsi6roK Me:
https://x.com/KentonVarda/status/2047788670728495142
my friend was building a crazy game last night before we went out. it’s a turn-based game that’s a mix of guitar hero, dota, and rpg mechanics. he had sketches and a shitty godot codebase and thought getting it playable would take weeks. i showed him Codex, we cleaned up the
https://x.com/danizeres/status/2048461112384102747
🆕 @OpenAIDevs GPT-5.5 is now generally available and rolling out in GitHub Copilot. Our early testing shows ➡️ It delivers its strongest performance on complex agentic coding tasks ➡️ It resolves real-world coding challenges previous GPT models couldn’t Try it out in Copilot
https://x.com/github/status/2047747243617460482
OpenAI’s GPT-5.5 and GPT-5.5 Pro are live now on OpenRouter! GPT-5.5 is SOTA for long running work across code, data, and tools, with GPT-5.5 Pro for more complex reasoning and analysis.
https://x.com/OpenRouter/status/2047744317415141787
OpenAI’s GPT-5.5 is the second model to complete one of our multi-step cyber-attack simulations end-to-end 🧵
https://x.com/AISecurityInst/status/2049868227740565890
“post-AGI, no one is going to work and the economy is going to collapse” “i am switching to polyphasic sleep because GPT-5.5 in codex is so good that i can’t afford to be sleeping for such long stretches and miss out on working”
https://x.com/sama/status/2048426122854228141
Really impressed by how smooth switching most of my coding tasks to Codex (GPT-5.5) from Claude Code (Opus 4.7) has been. I thought it was going to be more difficult and that I would be “fighting” with the model a lot. If there was ever a good time to try Codex, it would be
https://x.com/omarsar0/status/2047768166126809512
Remember when people started writing OpenAI off after Claude hit number one during Anthropic’s fight with the Department of War? Well, @OpenAI is back on top
https://x.com/TheTuringPost/status/2049295388075569228
Riley’s recent tests and various posts about GPT-5.5 have revealed something about model progress. Models are already so good, you need to raise your ambitions or you won’t realize just how good they’ve become. If you don’t raise your ambitions, you’ll think they’ve stagnated.
https://x.com/_simonsmith/status/2048030837577257461
Super excited GPT-5.5 is rolling out to GitHub Copilot, M365 Copilot, Copilot Studio, and Foundry today. With deeper reasoning, stronger multistep execution, and better performance across long, complex tasks, GPT-5.5 helps you go from idea to execution faster with fewer
https://x.com/satyanadella/status/2047743651053556126
Update: GPT-5.5 and GPT-5.5 Pro are now available in the API.
https://x.com/OpenAI/status/2047743592278745425
We added WebSocket mode to the Responses API to cut repeated work across Codex agent loops Keep state warm Reuse context Avoid extra network hops Up to 40% faster agentic workflows ⚡️
https://x.com/reach_vb/status/2049608607591809303
OpenAI models, Codex, and Managed Agents come to AWS | OpenAI
https://openai.com/index/openai-on-aws/
acpx 0.6.0 is out. (control codex/claude via agents) Highlights: Claude system-prompt controls, session pruning, embeddable turn handles, `--no-terminal`, persistent-session fixes, WSL cwd translation, queue hardening, and clearer error hints.
https://x.com/steipete/status/2047978882100334612
GPT-5.5 and GPT-5.5 Pro are now available in Hermes Agent through the Nous Portal and OpenRouter providers! (alongside the direct openai oauth provider from yesterday)
https://x.com/Teknium/status/2047791512210293067
Built clawsweeper, which runs 50 codex in parallel around the clock, scans issues/prs deep and closes what is already implemented or what makes no sense. Closed around 4000 issues today, a few thousand are in the pipeline. (rate limits are rough)
https://x.com/steipete/status/2047982647264059734
I’m again rate limited on GitHub, but codex just opened the browser and clicked around GitHub as a workaround.
https://x.com/steipete/status/2049035044702843053
@supabase is now available in Codex. Connect your projects and let Codex work across your database, auth, storage, and edge functions.
https://x.com/coreyching/status/2049576335157416115
/goal also lands in Codex CLI 0.128.0. Our take on the Ralph loop: keep a goal alive across turns. Don’t stop until it’s achieved. Built by my co-worker and OpenAI mentor Eric Traut, aka the Pyright guy. One of the GOATs I get to work with daily.
https://x.com/fcoury/status/2049917871799636201?s=20
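For readers unfamiliar with the pattern, a “Ralph loop” in this sense is just an outer loop that keeps re-invoking the agent until a goal check passes or a turn budget runs out. An illustrative sketch of the control flow only — not Codex’s actual `/goal` implementation, and both function parameters are hypothetical stand-ins:

```python
def ralph_loop(goal_done, step, max_turns=10):
    """Keep a goal alive across turns: re-invoke the agent step
    until the goal check passes or we exhaust the turn budget.

    goal_done: () -> bool, checks whether the goal is achieved.
    step: (turn: int) -> None, one agent turn (plan/edit/test).
    """
    for turn in range(max_turns):
        if goal_done():
            return turn  # number of steps it took to reach the goal
        step(turn)
    raise RuntimeError("goal not achieved within turn budget")

# Toy usage: the "goal" is a counter reaching 3; each "turn" increments it.
state = {"n": 0}
turns = ralph_loop(
    goal_done=lambda: state["n"] >= 3,
    step=lambda t: state.update(n=state["n"] + 1),
)
print(turns)  # → 3
```

The key property is that the goal predicate, not the model, decides when to stop — the loop simply refuses to end a session until the stated objective holds.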
A new feature snuck into the Codex app’s latest update. You can now do /side (or use the … menu) to spawn a side chat! Useful when you’re deep in a thread and want to have a side question in the current context!
https://x.com/Dimillian/status/2049929842133520577?s=20
Also, a ton of new Codex features coming soon! Fun little bundle w/the new model.
https://x.com/sama/status/2047378431260664058
As Codex works, you can see what’s happening at a glance, including task progress, the files and tools it used, and what comes next.
https://x.com/OpenAI/status/2049928780588966270?s=20
Been so CPU-constrained on OpenClaw work. Switched local test runs to @useblacksmith and IT IS SO GOOD. Codex can literally spin up 32-vCPU instances and rip through our test suite.
https://x.com/steipete/status/2048630704972443918
big upgrade for codex today! try it for non-coding computer work.
https://x.com/sama/status/2049946120441520624
build your own agents with codex app-server
https://x.com/gdb/status/2049609076351381580
Codex can help you compare choices against your criteria and keep track of the tradeoffs.
https://x.com/OpenAI/status/2049583379709124865
Codex for everything: – Dynamic UI for the task at hand – 20% faster computer & browser use – Even better slides and sheets – Annotate in browser, artifacts, and code – Easier to get started – Cleaner design across the app – Performance improvements – (no clunky
https://x.com/ajambrosino/status/2049928915872075984
Codex goal feature seems cool. Looks like you can give Codex a goal and it’ll continue to work, plan, and test until it’s done? I’m just reading the commits here, but that’s what I think it is.
https://x.com/mweinbach/status/2049904712510521853
Codex is for everyone, for any task done with a computer
https://x.com/gdb/status/2049934863818494205
codex now runs on each commit we land and reviews it; if a booboo is found, a new codex spins up and (if still relevant) makes a PR for the fix. Then a review agent spins up; if an issue is found, another agent fixes it (up to 5 loops).
https://x.com/steipete/status/2049356949523730699
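The commit-review pipeline above amounts to a bounded review/fix cycle: review, and while findings remain (up to 5 loops), run a fix pass and review again. A minimal sketch, assuming hypothetical `review` and `fix` callables rather than the real agent plumbing:

```python
def review_fix_cycle(commit: str, review, fix, max_loops: int = 5) -> list:
    """Alternate review and fix passes until a review comes back clean.

    review(commit) returns a list of issues (empty means clean);
    fix(commit, issues) spawns a fix pass for the findings. Both hypothetical.
    """
    log = []
    for _loop in range(max_loops):
        issues = review(commit)
        log.append(issues)
        if not issues:            # clean review: stop early
            break
        fix(commit, issues)       # fix agent handles this round's findings
    return log                    # per-loop findings, for auditing
```

Bounding the loop (here at 5, matching the tweet) matters: two agents disagreeing about a fix would otherwise ping-pong forever.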
codex with the $20 plan is a really good deal
https://x.com/sama/status/2048913887614115857
Don’t just reset Codex rate limits for fun, it costs money. … but the vibes are good … I have reset Codex rate limits for ALL paid plans to celebrate a good week and allow everyone to build more with GPT-5.5. Enjoy
https://x.com/thsottiaux/status/2048997818673537399
During setup, Codex recommends useful plugins for your role and guides you through connecting apps like @SlackHQ, @GoogleWorkspace, @Microsoft365, and more.
https://x.com/OpenAI/status/2049928777480974606?s=20
feels like codex is having a chatgpt moment
https://x.com/sama/status/2049493609028923826
going to be honest, I was somewhat disappointed by the GPT-5.5 evals, but holy shit this thing rips in Codex. Extremely noticeable if you’re working on a complex & highly technical project.
https://x.com/willdepue/status/2047783399826292969
GPT-5.5 in Codex made a surprisingly solid tabletop RPG game master’s guide & player guide, which it “playtested.” It leans into the storytelling aspect, and still has some very LLM-y elements, but it is a novel setting. PDF:
https://t.co/XCc82eRUao More:
https://x.com/emollick/status/2048606646318801186
I’m now spinning up a Codex instance on every commit landing on main, looking for booboos (regressions, security issues). It’s been live for 10 minutes and has already found one of mine.
https://x.com/steipete/status/2049290741013262522
Integrated codex review into clawsweeper. I’m using a very similar system prompt, so this gets you the same as /review, and clawsweeper has automerge and loops until it stops finding new issues.
https://x.com/steipete/status/2049518771023360010
OpenClaw 2026.4.23 🦞 🧠 GPT-5.5 lands 🎨 Image gen/editing: Codex OAuth + OpenRouter 🧵 Forked-context subagents 💬 Telegram, Slack + WhatsApp polish 🛠️ Install/update fixes Sharper models, smoother claws.
https://x.com/openclaw/status/2047722880939511885
this has been one of the most exciting launch weeks in OpenAI’s history, with a goal of making agents more real, useful, and accessible for all our users. codex can now smartly do much more on your computer, remember more of your context, and run more ongoing work independently.
https://x.com/gdb/status/2047757455606903178
Useful Codex usage tips: GPT-5.4 fast mode uses 2x usage, while GPT-5.5 fast mode uses 2.5x usage; fast mode is documented as about 1.5x faster. Codex’s built-in image generation uses gpt-image-2 and counts against your Codex usage when signed in with a ChatGPT subscription.
https://x.com/Hangsiin/status/2048719057885818902
We added a device toolbar to the Codex in-app browser, so it’s easier to build and test responsive apps! Now you can have Codex test your app in different dimensions, so it can fix bugs & improve UI for every device. Just click the 3 dots on the right of the URL bar to use it.
https://x.com/JamesZmSun/status/2050050523794165816
We will ship again this week. Codex has achieved escape velocity and will keep improving rapidly.
https://x.com/thsottiaux/status/2048958572562710550
What app are you making this weekend with GPT 5.5 and Codex?
https://x.com/PaulSolt/status/2048012026736210137
With GPT-5.5, Codex now gets more of the job done across the browser, files, docs, and your computer. We’ve expanded browser use so Codex can interact with web apps, test flows, click through pages, capture screenshots, and iterate on what it sees until it completes the
https://x.com/OpenAIDevs/status/2047381283358355706
OpenAI Codex system prompt includes explicit directive to “never talk about goblins” – Ars Technica
https://arstechnica.com/ai/2026/04/openai-codex-system-prompt-includes-explicit-directive-to-never-talk-about-goblins/
CodexBar 🎚️ 0.23 is out: Mistral support, Claude Designs/Daily Routines usage, Cursor Extra usage, GPT-5.5 pricing, cleaner widgets/menus, and a bunch of reliability fixes.
https://x.com/steipete/status/2048252455817785357
At @perplexity_ai, GPT-5.5 in Codex helped build an internal tool in under an hour. In Perplexity Computer workflows, GPT-5.5 used 56% fewer tokens on the same complex tasks, creating faster feedback loops for users.
https://x.com/OpenAIDevs/status/2047772632150675593
Coding a robot with GPT-5.5! A 7-DOF robot arm w/ functional kinematics. [📍 bookmark, it’s open source] An open-source harness for generating 3D models with your favorite coding agent. Custom GUI, and STEP parts/assembly, 100% generated in Codex. Thanks for sharing,
https://x.com/IlirAliu_/status/2048672526402736581
1. We believe in iterative deployment; although GPT-5.5 is already a smart model, we expect rapid improvements. Iterative deployment is a big part of our safety strategy; we believe the world will be best equipped to win at the team sport of AI resilience this way. 2. We believe
https://x.com/sama/status/2047379615589777666
5.5 is amazing for cybersecurity. “We estimate a human expert would need around 20 hours to complete the full chain. GPT-5.5 completed TLO end-to-end in 2 of 10 attempts, making it the second model to do so. Mythos Preview, the first model to solve TLO, did so in 3 of 10
https://x.com/cryps1s/status/2049879762169167898
gpt-5.5 great for hard tasks like writing GPU kernels
https://x.com/gdb/status/2048777802586149331
I had a range of models “build me a procedurally generated 3D simulation showing the evolution of a harbor town from 3000 BCE to 3000 AD” in one prompt. You can play the full gallery here:
https://t.co/FEfKL7uKHV Or read my write up about GPT-5.5 here:
https://x.com/emollick/status/2047509151610224956
Now available for ChatGPT accounts: Advanced Account Security, a new opt-in setting for people at higher risk of digital attacks, with stronger protections including phishing-resistant sign-in and more secure account recovery.
https://x.com/OpenAI/status/2049902506881462613
We can’t just scale model sizes anymore. Holding other things constant, and with some degree of imprecision, model size has always been a useful proxy for model quality: Sonnet is smaller than Opus; GPT-5.4 mini is weaker than GPT-5.4; GPT-4.5’s big-model smell; etc. Sheer
https://x.com/nrehiew_/status/2047839351380537357
We’re introducing a Bio Bug Bounty for GPT‑5.5 and accepting applications In our ongoing work to strengthen our safeguards for advanced AI capabilities in biology, we’re inviting researchers with experience in AI red teaming, security, or biosecurity to try to find a universal
https://x.com/OpenAINewsroom/status/2047670970526175310
🛰️ discrawl 0.6.0 is out! Biggest new feature is that it can now *read* Discord DMs without any custom login tricks that might get you blocked. No writing since it’s not nice to send humans slop.
https://x.com/steipete/status/2047797210427859450
And people think tokens are expensive… this is @useblacksmith (they sponsor OpenClaw, 🫶🦞)
https://x.com/steipete/status/2049462793267458219
closing what’s fixed was the pre-clean. intent-based clustering is the second strike.
https://x.com/steipete/status/2047992836176425470
Discord is currently having an outage where larger servers are unavailable, so our Discord server is unavailable for support and community. You can keep up with status updates here:
https://x.com/openclaw/status/2049210291024507364
Finally have great solutions for PR/Issue management, remote test execution, massive CI infra for testing. Streamlines a lot of the work.
https://x.com/steipete/status/2048957477106938075
GitHub folks are amazing while we melt their servers.
https://x.com/steipete/status/2048185940267380815
I love these
https://x.com/steipete/status/2047189920104477009
I smell a leak.
https://x.com/steipete/status/2049290265026773172
It’s remarkable. Even with small models, Hermes just wrecks OpenClaw. But it makes sense! The team is underrated when it comes to having some of the first cracked prompt engineers in the game. They invented WorldSim, too
https://x.com/somewheresy/status/2049089485938315614
My favorite security advisories.
https://x.com/steipete/status/2047362803560841364
One change over the last six months is that in every big company I talk to, at least a few senior people absolutely get AI. They experiment (a lot of OpenClaw, surprisingly) and have an intuitive sense of the exponential curve. The next challenge is translating that to the firm.
https://x.com/emollick/status/2047355060330393648
OpenClaw 2026.4.22 🦞 🧠 Tencent Hy3 joins the model list 🖼️ Grok image + voice tools 🧰 Local TUI + /models add 📦 Auto-install plugins + diagnostics export Big release, tiny release notes… kidding.
https://x.com/openclaw/status/2047338834648555793
OpenClaw 2026.4.25 🦞 🔊 TTS got serious 🧩 Plugins start faster 📊 OTEL can see the weird stuff 🛠️ Browser + install/update fixes Less mystery, more machinery.
https://x.com/openclaw/status/2048745795776557337
OpenClaw 2026.4.26 🦞 🎙️ Google Live Talk 🦙 Better Ollama/local models 🧳 Bring over Claude + Hermes setups 🔐 One-command Matrix E2EE Big release. Local models eat well.
https://x.com/openclaw/status/2048950588948230568
OpenClaw 2026.4.27 🦞 🧠 DeepInfra provider 📎 better file attachments 🛡️ operator-managed proxy routing 🧭 stricter model selection + local model fixes 🔧 gateway, channel, and session reliability Ships more than it brags.
https://x.com/openclaw/status/2049635263685537920
Released wacrawl 0.1.0 🧾 A read-only CLI for archiving and searching local macOS WhatsApp Desktop data. It snapshots WhatsApp’s SQLite DBs into its own archive, then gives you chat/message listing + FTS search. No extra auth needed, it just works.
https://x.com/steipete/status/2048157841165295831
Released wacrawl 0.2.0. New: encrypted Git backup/restore for WhatsApp Desktop archives. `wacrawl backup push` writes age-encrypted shards to GitHub; `backup pull` decrypts, verifies, and restores locally.
https://x.com/steipete/status/2048660875007914176
Some of the CI issues we are having 🤣
https://x.com/steipete/status/2048882959596286342
The @github folks are amazing!
https://x.com/steipete/status/2047408665888432579
the crawl army so agents can read it all.
https://x.com/steipete/status/2048159526822535435
very this.
https://x.com/steipete/status/2048488307709866152
Wanted truly local storage for my tweets, so I built birdclaw. Imports your archive, backs it up on GitHub, and has jobs so you can import your X bookmarks daily (since they are not fully accessible via the API).
https://x.com/steipete/status/2048626844694421842
we been cookin’
https://x.com/steipete/status/2048127474341466502
Will experiment with
https://t.co/r5PSfQpTFz since I constantly run into GitHub rate limit issues. The team there does their best to help and even moved us to Enterprise; still, agents just HAMMER their API.
https://x.com/steipete/status/2049244352057094645
@Teknium WAY too early to call it yet, but so far @NousResearch Hermes is absolutely crushing @openclaw on proactively following instructions. Installed my first Hermes, and while the latest OpenClaw fails to follow a single task, Hermes crushed through the entire workflow.
https://x.com/SecretArjun/status/2049006382763110639
My honest comparison of OpenClaw vs Hermes: LEARNING… > OpenClaw uses static, human-authored skills. > Hermes refines skills with a self-improvement loop. 👑 Hermes → (1-0) MEMORY… > OpenClaw has basic memory. > Hermes has multi-level memory and cross-session
https://x.com/LoicBerthelot/status/2047690512199540959
MiMo-V2.5 Pro by @XiaomiMiMo is the #11 model (#3 among open) in Code Arena: Frontend WebDev and has shifted the Pareto frontier with $1 input / $3 output per MToken.
https://x.com/arena/status/2049582973926949116
SGLang and vLLM support for the MiMo-V2.5 series is here. 🙌 Huge thanks to SGLang project from @lmsysorg and @vllm_project for moving fast and helping developers get started with MiMo-V2.5 on day zero.
https://x.com/XiaomiMiMo/status/2048821520798302409
Xiaomi MiMo V2.5 eval card: Pro is 1T total / 42B active; Omni (video/image/audio) is 310B total / 15B active. Both have 1M context support. They train in FP8, 27T tokens for Pro and 48T for the smaller variant. Interleaved SWA with an aggressive 6:1 ratio and a 128-token window; still
https://x.com/eliebakouch/status/2048845602633433258
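The interleaved-SWA detail above (a 6:1 ratio with a 128-token window) describes a per-layer attention layout: most layers attend within a sliding window and only the occasional layer attends globally. A minimal sketch of such a layout, purely illustrative and not Xiaomi’s actual config:

```python
def attention_layout(num_layers: int, ratio: int = 6, window: int = 128) -> list:
    """Return a per-layer attention spec for an interleaved-SWA stack.

    With ratio=6, every group of 7 layers is 6 sliding-window layers
    followed by 1 full-attention layer. Entries are ("swa", window)
    or ("full", None).
    """
    layout = []
    for i in range(num_layers):
        if (i + 1) % (ratio + 1) == 0:   # every (ratio+1)-th layer is global
            layout.append(("full", None))
        else:
            layout.append(("swa", window))
    return layout
```

The trade-off is the usual one: windowed layers keep attention cost roughly linear in sequence length, while the sparse full-attention layers preserve long-range information flow across the 1M-token context.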
Xiaomi’s MiMo V2.5 Pro has landed at 54 in the Artificial Analysis Intelligence Index, tied with Moonshot’s Kimi K2.6 – the current top open weights model. MiMo V2.5 Pro’s weights are expected to be released soon, which would make MiMo V2.5 Pro the first equal open weights model
https://x.com/ArtificialAnlys/status/2047799218828665093?s=20
Mini 6-DOF Arm. 3D-printed planetary gearboxes & more… [📍 GitHub link below] A mini 6-axis arm driven by stepper motors, with custom 3D-printed split-ring planetary gearboxes and an inverted belt differential wrist with custom bearings, driven by low-cost stepper motors and
https://x.com/IlirAliu_/status/2049187672711631023
Holy moly! This is an agentic workflow for medicine! I was looking up a topic on PubMed and wondering how long it was going to take me to get through 4,000 articles. Usually I get fed up by the 5th or 6th page; around 100 citations burns a hole in my brain. However, I just needed
https://x.com/bobvarkey/status/2049120693649125687
Microsoft presents “TRELLIS.2”: an open-source, 4B-parameter image-to-3D model producing up to 1536³ PBR-textured assets. Built on native 3D VAEs with 16× spatial compression, delivering efficient, scalable, high-fidelity asset generation. Ngl, pretty cool!
https://x.com/kimmonismus/status/2049099376476459372
xAI has launched Grok 4.3, achieving 53 on the Artificial Analysis Intelligence Index with improved agentic performance, ~40% lower input price, and ~60% lower output price than Grok 4.20 The release of Grok 4.3 places @xAI just above Muse Spark and Claude Sonnet 4.6 on the
https://x.com/ArtificialAnlys/status/2049987001655714250
Grok Voice Think Fast 1.0 | xAI
https://x.ai/news/grok-voice-think-fast-1
Introducing Grok Voice Think Fast 1.0 A state-of-the-art voice model built for complex, multi-step workflows with snappy responses and high accuracy. It takes the top spot on the Tau Voice Bench and handles real-world messiness like noise, accents, and interruptions better than
https://x.com/xai/status/2047441173569216721
Scam Altman and Greg Stockman stole a charity. Full stop. Greg got tens of billions of stock for himself and Scam got dozens of OpenAI side deals with a piece of the action for himself, Y Combinator style. After this lawsuit, Scam will also be awarded tens of billions in stock
https://x.com/elonmusk/status/2048801964457140540?s=20