Image created with gemini-3.1-flash-image-preview with claude-opus-4.7. Image prompt: Using the provided reference image, preserve every element exactly — the marigold-orange backdrop, the seated young woman with closed-eyes smile in her purple-and-white windbreaker, the tattooed singer in the red beanie and layered red vest, the lighting and framing — but replace only the black handheld microphone with a small plush toy llama held to his mouth in the same hand grip and position, its long fuzzy neck angled like a microphone shaft and its little face pressed near his lips, photographed with seamless realism and matching studio lighting. After generating the image, overlay the text “Llama” in the upper-left corner of the frame in large, bold, all-caps ITC Avant Garde Gothic Pro Medium (or a near-identical geometric sans-serif if unavailable), pure white (#FFFFFF), with no date, subtitle, drop shadow, or outline. The text should be substantial in scale — taking up a meaningful portion of the upper-left area — with comfortable margin from the top and left edges, set against the negative space of the orange backdrop so it does not overlap or obscure the singer, the seated woman, or the replaced object.

Ollama
https://ollama.com/
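Most entries in the list below name Ollama / llama.cpp as the best local path. Once a model has been pulled, Ollama serves a local REST API on port 11434. A minimal standard-library sketch, assuming the daemon is running; the model name passed in is whatever `ollama list` reports on your machine, not a name from this list:

```python
import json
import urllib.request

# Default endpoint of a locally running Ollama daemon.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    # stream=False asks for one complete JSON object instead of chunked lines.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """POST a prompt to the local Ollama server and return the response text."""
    payload = json.dumps(build_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Usage is just `generate("<model-from-ollama-list>", "Hello")`; llama.cpp's bundled `llama-server` exposes a similar (OpenAI-compatible) local HTTP API if you skip Ollama entirely.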

r/LocalLLaMA + r/LocalLLM + r/SillyTavernAI preferred models list – April 2026

| Model | Size/Class | Format | Hosted Provider | Best Local Path | Notes |
| --- | --- | --- | --- | --- | --- |
| Huihui Gemma 4 E2B Abliterated v2 | E2B | GGUF | No | Ollama / llama.cpp | Gemma 4 MoE with ~2B active params. Multimodal (image+text in, text out). Abliterated for reduced refusals. Lightweight enough to run fast, but MoE active-param sizing means quality punches above its weight class. |
| Huihui Gemma 4 E4B Abliterated | E4B | GGUF | No | Ollama / llama.cpp | Same Gemma 4 MoE family as E2B but with ~4B active params. Multimodal. Better quality ceiling than E2B at the cost of more compute per token. |
| SultrySilicon V2 | 7B | GGUF | No | Ollama / llama.cpp | Roleplay-focused 7B model. Smallest in the set. Good for quick creative/RP sanity checks, not for reasoning or instruction-following benchmarks. |
| Huihui-GLM-4.6V-Flash-Abliterated | 9B | GGUF | No | Ollama / llama.cpp | Based on Z.ai GLM-4.6V-Flash. Vision-language model (image+text). Abliterated. Bilingual Chinese/English. Fast-inference variant of the GLM-4.6V family. |
| Gemma-2-Ataraxy-9B | 9B | GGUF | No | Ollama / llama.cpp | Merge of Gemma-2-9B-SimPO and Gemma-2-Gutenberg-9B. Creative-writing and roleplay oriented. Scored well on EQ-Bench. Good balance of instruction-following and literary quality at 9B. |
| MythoMax-L2-13B | 13B | GGUF | No | Ollama / llama.cpp | By Gryphe. Llama 2 merge of MythoLogic-L2 and Huginn using experimental per-tensor gradient merging. One of the most downloaded RP/creative models ever (~59k GGUF downloads). Strong at both roleplay and storywriting. Alpaca format. The OG. |
| Dan's PersonalityEngine V1.3.0 | 24B | GGUF | No | Ollama / llama.cpp | Fine-tuned from Mistral Small 3.1 24B Base. Trained on a massive mix: roleplay, storywriting, tool use, math, reasoning, code, medical, legal, and survival topics. Multilingual (EN, AR, DE, FR, ES, HI, PT, JA, KO). A genuine generalist with personality. |
| SuperGemma4 26B Abliterated Multimodal | 26B multimodal | GGUF | No | custom multimodal stack | Based on Gemma 4 26B-A4B. Multimodal (image-text-to-text). Abliterated with low refusal rate. Optimized for Apple Silicon (MLX). Supports Korean + English. Tool-use and coding tags. |
| Gemma 3 27B Abliterated | 27B | GGUF | No | Ollama / llama.cpp | Abliterated version of Google's Gemma 3 27B instruct. Multimodal (image-text-to-text). Reduced refusal behavior while preserving instruction-following quality. |
| Huihui Gemma 4 31B Abliterated | 31B | GGUF | No | Ollama / llama.cpp | Abliterated Gemma 4 31B instruct. Multimodal (any-to-any pipeline tag). Dense 31B, not MoE. Strongest Gemma 4 dense abliterated option. |
| Gemma 4 31B Abliterated | 31B | GGUF + safetensors | No | Ollama / llama.cpp | Same base as above (Gemma 4 31B-it) but a different abliteration method, using mlabonne's harmful_behaviors + harmless_alpaca datasets. Both formats in one repo. |
| Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-Abliterated | 35B A3B | GGUF | No | Ollama / llama.cpp | Qwen 3.5 MoE (35B total, ~3B active). Distilled from Claude 4.6 Opus reasoning. Chain-of-thought and reasoning focused. Abliterated. Multimodal. Punches well above its active param count on reasoning tasks. |
| Midnight Rose 70B v2.0.3 | 70B | GGUF | No | Ollama / llama.cpp | By sophosympatheia. Complex multi-stage SLERP/DARE-TIES merge of WizardLM, Tulu-2-DPO, Dolphin, and earlier Midnight Rose versions. Uncensored. Designed for roleplay and storytelling. Scored surprisingly high on EQ-Bench even at low quants. ~6k context sweet spot. |
| Midnight Miqu 70B v1.5 | 70B | GGUF | No | Ollama / llama.cpp | Llama-family merge of Midnight-Miqu v1.0 and Tess-70B. Creative-writing and roleplay focused. 32k context. Known for strong prose quality and character consistency at 70B scale. |
| Midnight Rose 103B v2.0.3 | 103B | GGUF | No | heavy self-host | Same lineage as the 70B but scaled up. Importance-matrix GGUF by mradermacher. Firmly in the "need real hardware" category. |
| DeepSeek V3 | 671B A37B | safetensors | Yes: DeepInfra, Novita | Hosted preferred | Massive MoE. 671B total, 37B active. Strong on code, math, and instruction-following. Pre-trained on ~15T tokens. Use via OpenRouter, not locally. |
| DeepSeek V3.2 | 685B A37B | safetensors | No confirmed provider yet | Hosted preferred | Successor to V3. Same general architecture class. Not a local play. |
| Behemoth-123B-v1 | 123B | GGUF | No | heavy self-host | Mistral-family 123B. Creative/RP community model. Massive parameter count makes it impractical for casual local use, but prized for output quality in the r/LocalLLM community. |
| Monstral-123B | 123B | GGUF | No | heavy self-host | Mistral-family 123B. Text-generation and chat focused. Same weight class as Behemoth, different training mix and community lineage. |
| BlackSheep-Large | ~27B | GGUF | No | Ollama / llama.cpp | By TroyDoesAI. Canonical repo is gated. Q8_0 is ~29.5 GB, placing it in the 27B class. Community RP/creative model. |
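For the "can I actually run this" question the Size/Class column keeps raising, GGUF file size is roughly parameter count times bits-per-weight. A back-of-the-envelope sketch, assuming llama.cpp's Q8_0 layout (blocks of 32 int8 weights plus one fp16 scale, i.e. ~8.5 bits per weight); the function is illustrative, and real files run somewhat larger because of metadata and unquantized embedding/output tensors:

```python
def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough lower bound on GGUF file size in decimal GB.

    Ignores metadata, the tokenizer, and tensors kept at higher precision,
    so actual files are a bit bigger than this estimate.
    """
    return n_params * bits_per_weight / 8 / 1e9

# Q8_0 packs 32 int8 weights + one fp16 scale per block: 34 bytes / 32
# weights = 8.5 bits per weight.
print(f"27B @ Q8_0 >= {gguf_size_gb(27e9, 8.5):.1f} GB")
print(f" 9B @ ~4.5 bpw (mid-size K-quant) >= {gguf_size_gb(9e9, 4.5):.1f} GB")
```

The 27B Q8_0 estimate lands just under the ~29.5 GB quoted for BlackSheep-Large above, with the gap covered by overhead; the same arithmetic explains why the 70B+ rows fall into the "heavy self-host" column. Note that for MoE entries (E2B/E4B, A3B, A37B) the *total* parameter count drives memory while only the *active* count drives per-token compute.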


