Flux[dev]: Computer source code is arranged to create the form of a futuristic robot, sleek smooth humanoid design. Smooth, glossy black faceplate with no visible facial features, high-tech, minimalist appearance. The robot’s body is matte black or dark gray, with articulated joints and mechanical parts that resemble those of a human, including fingers. In the foreground, “Open Source” is written in glowing green system font.

Hugging Face

“Exclusive: Hugging Face just bought a machine learning platform called XetHub, started by former Apple employees, aimed at letting developers build large-scale AI models.”

XetHub is joining Hugging Face!

Meta/Llama

“Idefics3-Llama is out! 💥 It’s a multimodal model based on Llama 3.1 that accepts an arbitrary number of interleaved images with text, with a huge context window (10k tokens!) 😍 Link to demo and model in the next one 😏”

Call for Applications: Llama 3.1 Impact Grants

“It’s curious how Llama 405b’s performance drops by 5 percentage points when using standard simple-evals prompts instead of its native Llama 3.1 prompts. Other models show much less sensitivity to this prompt change and fall nicely along the 45-degree line.”

“📣 Today we’re opening a call for applications for Llama 3.1 Impact Grants! Until Nov 22, teams can submit proposals for using Llama to address social challenges across their communities for a chance to be awarded a $500K grant. Details + application ➡️

“New smol-vision tutorial dropped: QLoRA fine-tuning IDEFICS3-Llama 8B on VQAv2 🐶 Learn how to efficiently fine-tune the latest IDEFICS3-Llama on visual question answering in this notebook 📖 Link in the next one 🤗”
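The appeal of QLoRA here is memory: the 8B base model's weights are frozen in 4-bit precision while only small LoRA adapter matrices are trained in higher precision. A back-of-envelope sketch of the savings (the rank, hidden size, and layer counts below are illustrative assumptions, not the notebook's actual config):

```python
def model_weight_gb(num_params, bits_per_param):
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9

base_params = 8e9  # IDEFICS3-Llama 8B
fp16 = model_weight_gb(base_params, 16)  # full-precision fine-tuning baseline
nf4 = model_weight_gb(base_params, 4)    # QLoRA: frozen 4-bit base weights

# LoRA adds two rank-r matrices (A: d x r, B: r x d) per adapted weight matrix.
# Hypothetical config: rank-16 adapters on 4 projections x 32 layers, d = 4096.
rank, d, num_matrices = 16, 4096, 4 * 32
lora_params = num_matrices * 2 * d * rank
lora_gb = model_weight_gb(lora_params, 16)  # adapters stay in 16-bit

print(f"fp16 weights:  {fp16:.1f} GB")    # 16.0 GB
print(f"4-bit weights: {nf4:.1f} GB")     # 4.0 GB
print(f"LoRA adapters: {lora_gb:.3f} GB") # a rounding error by comparison
```

The base weights shrink 4x, and the trainable adapters are tens of MB, which is what makes an 8B fine-tune fit on a single consumer GPU.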

“@huggingface This is the direct successor of Meta-Llama-3-120B-Instruct, a self-merge of Llama 3 70B that produced great results in tasks like creative writing.”

“The methods from this paper were able to reliably jailbreak the most difficult target models with prompts that appear similar to human-written prompts. Achieves attack success rates > 93% for Llama-2-7B, Llama-3-8B, and Vicuna-7B, while maintaining model-measured perplexity <
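The perplexity constraint matters because a common jailbreak defense is to reject prompts whose perplexity is abnormally high (a telltale sign of adversarial token soup). As a minimal sketch of the metric itself: perplexity is the exponential of the average negative log-likelihood the model assigns to the prompt's tokens. The per-token log-probs below are hypothetical placeholders, not values from any of the models named above:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(mean negative log-likelihood) over a token sequence."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical per-token log-probs from a language model:
fluent = [-1.0, -0.5, -1.5, -1.0]      # plausible, human-like text
gibberish = [-8.0, -9.5, -7.0, -8.5]   # adversarial token soup

print(perplexity(fluent))     # ~2.7: low, passes a perplexity filter
print(perplexity(gibberish))  # orders of magnitude higher: flagged
```

Attacks that keep perplexity in the range of natural text, as the quoted paper claims to, slip past this kind of filter.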

Mistral

“Introducing @MistralAI agents! You can now build your agents based on Mistral models or fine-tuned models and use on Le Chat 🙌 More features coming soon!”

“Mistral Large 2 (2407) is now on @lmsysorg. It performs extremely well in the Coding, Hard Prompts, Math, and Longer Query categories, where it outperforms GPT4-Turbo and Claude 3 Opus. It is also doing very well in Instruction Following, where it ranks above Llama 3.1 405B.”

Mistral AI’s CEO on Microsoft and Europe’s AI Ecosystem | TIME

“.@MistralAI Mistral Large doing well on the @allen_ai ZebraLogic benchmark despite being much smaller than the other models 🙌”

Build, tweak, repeat | Mistral AI | Frontier AI in your hands

Qwen

“CONGRATS to @Alibaba_Qwen team on Qwen2-Math-72B outperforming GPT-4o, Claude-3.5-Sonnet, Gemini-1.5-Pro, Llama-3.1-405B on a series of math benchmarks 👏👏👏”

Introducing Qwen2-Math | Qwen

Other Open Source News

“🔥 Meet Yi-Large Turbo: the powerful, cost-effective upgrade to Yi-Large. Faster and more affordable at only $0.19 per 1M tokens for input and output. Ideal for heavy data tasks like complex inference and high-quality text generation. Check it out now: 
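Flat per-token pricing like the quoted $0.19 per 1M tokens makes cost estimation a one-liner. A quick sketch, assuming (as the announcement suggests) the same rate applies to both input and output tokens; the workload numbers are made up for illustration:

```python
PRICE_PER_M = 0.19  # USD per 1M tokens (Yi-Large Turbo, per the announcement)

def cost_usd(input_tokens, output_tokens, price_per_m=PRICE_PER_M):
    """Flat per-token pricing: same rate for input and output tokens."""
    return (input_tokens + output_tokens) / 1_000_000 * price_per_m

# e.g. summarizing 10k documents at ~2,000 input / ~300 output tokens each:
total = cost_usd(10_000 * 2_000, 10_000 * 300)
print(f"${total:.2f}")  # $4.37
```

At that rate, 23M tokens of work costs under five dollars, which is the "cost-effective upgrade" pitch in concrete terms.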

Introducing Palmyra-Med and Palmyra-Fin – Writer
https://writer.com/blog/palmyra-med-fin-models/
