[Cartoon: a therapy circle of teddy bears wearing smiley masks; one bear holds a laptop displaying code hearts. Caption: “New release adds 20% more warm fuzzies.”]

“Nvidia just released Describe Anything 3B (DAM-3B) – a multimodal LLM for detailed localized image and video captioning ⚡ It integrates full-image/video context with fine-grained local details using a focal prompt and a localized vision backbone with gated cross-attention.” https://x.com/reach_vb/status/1914962078571356656

“Nvidia presents Eagle 2.5! – a family of frontier VLMs for long-context multimodal learning. Eagle 2.5-8B matches the results of GPT-4o and Qwen2.5-VL-72B on long-video understanding.” https://x.com/arankomatsuzaki/status/1914517474370052425

“Adobe announced DRAGON on Hugging Face: Distributional Rewards Optimize Diffusion Generative Models.” https://x.com/_akhaliq/status/1914602497148154226

“Spotify just announced ViSMaP on Hugging Face: Unsupervised Hour-long Video Summarisation by Meta-Prompting.” https://x.com/_akhaliq/status/1915703054701044209

“LiveCC just dropped on Hugging Face: Learning Video LLM with Streaming Speech Transcription at Scale – a video LLM capable of real-time commentary, trained with a novel video-ASR streaming method; SOTA on both streaming and offline benchmarks.” https://x.com/_akhaliq/status/1915094398364197101

MiniLLM (MiniLLM) https://huggingface.co/MiniLLM

OmDet-Turbo https://huggingface.co/docs/transformers/main/en/model_doc/omdet-turbo

“NEW: You can now use Dia 1.6B, a SoTA text-to-speech model, directly on Hugging Face via @FAL 🔥 You can get up to 25 generations for less than a dollar 🤗 Run it in 5 lines of code too: import requests / API_URL = "https://router.huggingface.co/fal-ai/fal-ai/dia-tts" / headers = { …” https://x.com/reach_vb/status/1915418386818834792
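The snippet in the tweet is cut off; a minimal completion might look like the sketch below. The payload shape (a `text` field) and the `HF_TOKEN` environment variable are assumptions, not confirmed by the tweet – check the fal-ai/dia-tts model card for the exact schema.

```python
import os
import requests

# Endpoint from the announcement (with the stray space in the URL removed).
API_URL = "https://router.huggingface.co/fal-ai/fal-ai/dia-tts"

def synthesize(text: str, token: str) -> bytes:
    """POST text to the HF router and return the raw audio bytes."""
    headers = {"Authorization": f"Bearer {token}"}
    # The {"text": ...} payload is an assumption based on typical TTS schemas.
    resp = requests.post(API_URL, headers=headers, json={"text": text})
    resp.raise_for_status()
    return resp.content

if __name__ == "__main__":
    audio = synthesize("Hello from Dia!", os.environ["HF_TOKEN"])
    with open("out.wav", "wb") as f:
        f.write(audio)
```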

“We shipped an alpha version of the new Surya OCR model. No hype, just facts: 90+ languages (focus on en, Romance langs, zh, ar, ja, ko); LaTeX and formatting; char/word/line bboxes; ~500M non-embed params; 10–20 pages/s.” https://x.com/VikParuchuri/status/1915492483955384659

“Perception Encoder models and datasets.” https://x.com/mervenoyann/status/1915723397272654194

sand-ai/MAGI-1 · Hugging Face https://huggingface.co/sand-ai/MAGI-1

“TextArena went live on Hugging Face. It’s an open-source collection of competitive text-based games for LLMs, spanning 57+ unique environments. It tests different agentic behaviors (negotiation, theory of mind, deception) via competitive play.” https://x.com/rowancheung/status/1914567435228795391

“Fuck it, starting today you can run inference across 30,000+ Flux and SDXL LoRAs on the Hugging Face Hub via Inference Providers (powered by @FAL ⚡). And it gets better: you can generate 40+ images for less than A DOLLAR! Go try it now on your favourite LoRA on HF 🤗” https://x.com/reach_vb/status/1915830938438717777
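A minimal sketch of what calling a Hub LoRA through Inference Providers looks like with `huggingface_hub`’s `InferenceClient`. It assumes a recent `huggingface_hub` with provider routing, an `HF_TOKEN` environment variable, and a placeholder LoRA repo id; the import is deferred so the sketch reads without the library installed.

```python
import os

def generate_image(lora_repo: str, prompt: str):
    """Route a text-to-image call to the fal provider for a Hub LoRA repo."""
    # Deferred import: provider routing assumes a recent huggingface_hub.
    from huggingface_hub import InferenceClient

    client = InferenceClient(provider="fal-ai", token=os.environ["HF_TOKEN"])
    return client.text_to_image(prompt, model=lora_repo)  # returns a PIL image

if __name__ == "__main__":
    # The repo id is a placeholder; swap in any Flux/SDXL LoRA from the Hub.
    img = generate_image("<your-favourite-lora-repo>", "a watercolor fox")
    img.save("fox.png")
```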

“Nvidia just dropped Describe Anything on Hugging Face: Detailed Localized Image and Video Captioning.” https://x.com/_akhaliq/status/1914917564137828622

“vLLM 🤝 🤗! You can now deploy any @huggingface language model with vLLM’s speed. This integration makes it possible to have one consistent implementation of the model in HF for both training and inference. 🧵” https://x.com/vllm_project/status/1912958639633277218
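In practice the integration keeps vLLM’s usual offline entry point: you point `LLM` at a Hugging Face repo id. A minimal sketch, using a small instruct model as a stand-in checkpoint (the imports are deferred so the sketch reads without vLLM and a GPU present):

```python
def generate_with_vllm(model_id: str, prompts: list[str]) -> list[str]:
    """Serve a Hugging Face checkpoint through vLLM's offline LLM API."""
    # Deferred imports: vLLM is an optional, GPU-heavy dependency.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_id)  # any causal-LM repo id from the Hub
    params = SamplingParams(temperature=0.7, max_tokens=64)
    outputs = llm.generate(prompts, params)
    return [o.outputs[0].text for o in outputs]

if __name__ == "__main__":
    # The model id is a placeholder; swap in any Hub language model.
    for text in generate_with_vllm("Qwen/Qwen2.5-0.5B-Instruct",
                                   ["The capital of France is"]):
        print(text)
```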

Chat UI Energy Score – a Hugging Face Space by jdelavande https://huggingface.co/spaces/jdelavande/chat-ui-energy

“So cool to see transformers becoming the source of truth for model definitions, and collaborating with wonderful partners like vLLM to have these models run everywhere, the fastest! As a model builder, it means that you integrate with Hugging Face and instantly get hundreds of …” https://x.com/ClementDelangue/status/1914432076956262495

“Meta released WebSSL DINO & ViT models on Hugging Face, 300M to 7B 🔥 Notes: visual SSL outperforms CLIP on vision-centric VQA and closes the gap on OCR & chart tasks when scaled properly; CLIP saturates at 3B parameters, while SSL shows log-linear improvements up to 7B+ …” https://x.com/reach_vb/status/1915453821251375552

“ByteDance just announced QuaDMix on Hugging Face: Quality-Diversity Balanced Data Selection for Efficient LLM Pretraining.” https://x.com/_akhaliq/status/1915656590130036887

Discover more from Ethan B. Holland
