Simon Willison's Weblog


That model was trained in part using their unreleased R1 "reasoning" model. Today they've released R1 itself, along with a whole family of new models derived from that base.

There's a lot of stuff in the new release.

DeepSeek-R1-Zero appears to be the base model. It's over 650GB in size and, like most of their other releases, is under a clean MIT license. DeepSeek warn that "DeepSeek-R1-Zero encounters challenges such as endless repetition, poor readability, and language mixing." ... so they also released:

DeepSeek-R1, which "incorporates cold-start data before RL" and "achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks". That one is also MIT licensed, and is a similar size.

I don't have the ability to run models larger than about 50GB (I have an M2 with 64GB of RAM), so neither of these two models is something I can easily play with myself. That's where the new distilled models come in.

To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen.

This is a fascinating flex! They have models based on Qwen 2.5 (14B, 32B, Math 1.5B and Math 7B) and Llama 3 (Llama-3.1 8B and Llama 3.3 70B Instruct).

Weirdly those Llama models have an MIT license attached, which I'm not sure is compatible with the underlying Llama license. Qwen models are Apache licensed so maybe MIT is OK?

(I also just noticed the MIT license files say "Copyright (c) 2023 DeepSeek" so they may need to pay a bit more attention to how they copied those in.)

Licensing aside, these distilled models are fascinating beasts.

Running DeepSeek-R1-Distill-Llama-8B-GGUF

Quantized versions are already starting to show up. So far I've tried just one of those: unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF released by Unsloth AI, and it's really fun to play with.

I'm running it using the combination of Ollama, LLM and the llm-ollama plugin.

First I fetched and ran the model using Ollama itself:
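A sketch of that command, assuming Unsloth's Q8_0 quantization (which lines up with the ~8.5GB download mentioned below). Ollama can pull GGUF files directly from Hugging Face using this hf.co/ syntax:

```bash
# Pull the Q8_0 GGUF straight from Hugging Face and start an interactive chat
ollama run hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q8_0
```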

This downloads around 8.5GB of model data and starts an interactive chat interface.

Once the model has been fetched, LLM can talk to it as well. I prefer using LLM for experiments because it logs everything to SQLite for later exploration.
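For that later exploration, a couple of commands built into the llm CLI itself (no plugin needed) will show where the SQLite database lives and replay recent entries:

```bash
# Print the location of the SQLite database that stores logged prompts and responses
llm logs path

# Show the most recent logged prompt and response
llm logs -n 1
```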

I installed the plugin and ran it like this:
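Something like the following, assuming the same Q8_0 model tag as the Ollama step above:

```bash
# Install the plugin that lets LLM talk to models served by Ollama
llm install llm-ollama

# Prompt the distilled model through LLM (the response gets logged to SQLite)
llm -m 'hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q8_0' \
  'a joke about a pelican and a walrus who run a tea room together'
```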

Each response starts with a `<think>...</think>` pseudo-XML tag containing the chain of thought used to help generate the response. For "a joke about a pelican and a walrus who run a tea room together" that started like this:

Okay, so I need to come up with a joke about a pelican and a walrus running a tea room together. Hmm, that's an interesting combination. Let me think about how these two characters might interact in a humorous situation.

First, let's consider their characteristics. Pelicans are known for their long beaks and webbed feet, often seen near the beach or water. Walruses have big teeth, ...
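If you'd rather see just the joke without the reasoning, one way is to filter the tag pair out of the output. A minimal sketch, assuming GNU sed (the -z flag lets the pattern match across newlines) and a response containing exactly one think block:

```bash
# Strip the <think>...</think> block so only the final answer remains
# (GNU sed only: -z treats the whole input as one NUL-delimited record)
llm -m 'hf.co/unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF:Q8_0' \
  'a joke about a pelican and a walrus who run a tea room together' \
  | sed -z 's/<think>.*<\/think>//'
```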