How is that for flexibility?

As everyone is well aware, the world is still going nuts trying to develop more, newer, and better AI tools. Mainly by throwing unreasonable amounts of money at the problem. Many of those billions go towards developing cheap or free services that operate at a considerable loss. The tech giants that run them all hope to attract as many users as possible, so that they can capture the market and become the dominant or only party that can offer them. It is the classic Silicon Valley playbook. Once dominance is reached, expect the enshittification to begin.

A likely way to earn back all that money spent developing these LLMs will be by tweaking their outputs to the liking of whoever pays the most. An example of what such tweaking looks like is the refusal of DeepSeek's R1 to discuss what happened at Tiananmen Square in 1989. That one is obviously politically motivated, but ad-funded services won't exactly be fun either. In the future, I fully expect to be able to have a frank and honest conversation about the Tiananmen events with an American AI agent, but the only one I can afford will have assumed the persona of Father Christmas who, while holding a can of Coca-Cola, will intersperse the recounting of the tragic events with a joyful "Ho ho ho... Didn't you know? The holidays are coming!"

Or maybe that is too far-fetched. Right now, despite all that money, the most popular service for code completion still has trouble working with a couple of simple words, despite them being present in every dictionary. There must be a bug in the "free speech", or something.

But there is hope. One of the tricks an up-and-coming player uses to shake up the market is to undercut the incumbents by releasing their model for free, under a permissive license. This is what DeepSeek just did with their DeepSeek-R1. Google did it earlier with the Gemma models, as did Meta with Llama. We can download these models ourselves and run them on our own hardware. Even better, people can take these models and scrub the biases from them. And we can download those scrubbed models and run those on our own hardware. And then we can finally have some truly useful LLMs.
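As a minimal sketch of what "running them on our own hardware" looks like in practice, here is how one might load a quantized model with the llama-cpp-python bindings. The model file name is a hypothetical example of a GGUF quantization you would download first, not a specific recommendation.

```python
# Minimal sketch: load a locally downloaded GGUF quantization and run
# a prompt using the llama-cpp-python bindings.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/deepseek-r1-distill-qwen-32b-q4_k_m.gguf",  # hypothetical file
    n_gpu_layers=-1,  # offload all layers to the GPU, if they fit
    n_ctx=4096,       # context window size
)

out = llm("What happened at Tiananmen Square in 1989?", max_tokens=512)
print(out["choices"][0]["text"])
```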

That hardware can be a hurdle, though. There are two options to choose from if you want to run an LLM locally. You can get a big, powerful video card from Nvidia, or you can buy an Apple. Either is expensive. The main spec that indicates how well an LLM will perform is the amount of memory available. VRAM in the case of GPUs, ordinary RAM in the case of Apples. Bigger is better here. More RAM means bigger models, which will dramatically improve the quality of the output. Personally, I'd say one needs at least 24GB to be able to run anything useful. That will fit a 32 billion parameter model with a little headroom to spare. Building, or buying, a workstation equipped to handle that can easily cost thousands of euros.
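To make that concrete, here is a rough back-of-the-envelope sketch; the 4-bit quantization level and the overhead allowance are assumptions for illustration, not measurements. A model's weights take roughly parameters × bits-per-weight ÷ 8 bytes, plus some room for the KV cache and the runtime itself.

```python
# Rough VRAM estimate for a quantized LLM. The 4-bit default and the
# 2GB overhead allowance are illustrative assumptions.
def vram_estimate_gb(params_billion: float,
                     bits_per_weight: float = 4.0,
                     overhead_gb: float = 2.0) -> float:
    """Weights plus a rough allowance for KV cache and runtime overhead."""
    weights_gb = params_billion * bits_per_weight / 8  # 1B params ~ 1GB at 8 bits
    return weights_gb + overhead_gb

print(f"{vram_estimate_gb(32):.0f} GB")  # ~18 GB: why 24GB leaves only a little headroom
```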

So what to do if you don't have that kind of money to spare? You buy second-hand! This is a viable option, but as always, there is no such thing as a free lunch. Memory may be the main concern, but don't underestimate the importance of memory bandwidth and other specs. Older hardware will have lower performance on those fronts. But let's not worry too much about that now. I am interested in building something that can at least run the LLMs in a usable way. Sure, the latest Nvidia card might do it much faster, but the point is to be able to do it at all. Powerful online models can be nice, but one should at the very least have the option to switch to a local one, if the situation calls for it.
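To illustrate why bandwidth matters, a sketch with assumed figures rather than benchmarks: generating one token requires reading essentially every weight once, so memory bandwidth divided by model size gives a hard ceiling on single-stream tokens per second.

```python
# Theoretical single-stream generation ceiling: every weight is read
# once per generated token, so bandwidth / model size bounds tokens/s.
def max_tokens_per_second(model_size_gb: float, bandwidth_gb_s: float) -> float:
    return bandwidth_gb_s / model_size_gb

# Assumed figures for illustration: a 16GB quantized model on an older
# ~350GB/s card versus a modern ~1000GB/s card.
print(f"{max_tokens_per_second(16, 350):.0f} tokens/s ceiling")   # ~22, older card
print(f"{max_tokens_per_second(16, 1000):.0f} tokens/s ceiling")  # ~62, newer card
```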

Below is my attempt to build such a capable AI computer without spending too much. I ended up with a workstation with 48GB of VRAM that cost me around 1700 euros. I could have done it for less. For instance, it was not strictly necessary to buy a brand new dummy GPU (see below), or I could have found someone to 3D print the cooling fan shroud for me, instead of shipping a ready-made one from a faraway country. I'll admit, I got a bit impatient at the end when I found out I had to buy yet another part to make this work. For me, this was an acceptable tradeoff.

Hardware

This is the full cost breakdown:

And this is what it looked like when it first booted up with all the parts installed:

I'll give some context on the parts below, and after that, I'll run a few quick tests to get some numbers on the performance.

HP Z440 Workstation

The Z440 was an easy pick because I already owned it. This was the starting point. About two years ago, I wanted a computer that could serve as a host for my virtual machines. The Z440 has a Xeon processor with 12 cores, and this one sports 128GB of RAM. Plenty of threads and plenty of memory; that should work for hosting VMs. I bought it second-hand and then swapped the 512GB hard drive for a 6TB one to store those virtual machines. 6TB is not needed for running LLMs, and therefore I did not include it in the breakdown. But if you plan to collect many models, 512GB may not be enough.

I have come to like this workstation. It feels solid, and I haven't had any problems with it. At least, until I started this project. It turns out that HP does not like competition, and