• 3 Posts
  • 17 Comments
Joined 5 years ago
Cake day: June 30th, 2020



  • It really depends on how you quantize the model, and the K/V cache as well. This is a useful calculator: https://smcleod.net/vram-estimator/ I can comfortably fit most 32b models quantized to 4-bit (usually Q4_K_M or IQ4_XS) in my 3090’s 24 GB of VRAM with a reasonable context size. If you need a much larger context window, for example to input large documents, then you’d need to go smaller on model size (14b, 27b etc.), get a multi-GPU setup, or get something with unified memory and a lot of RAM (like the Mac Minis others are mentioning).
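For a rough feel of where the memory goes, here is a minimal back-of-the-envelope sketch. It is not how the linked calculator works internally. The model shape (64 layers, 8 K/V heads, head dimension 128) and the figure of about 4.85 bits per weight for a 4-bit K-quant are illustrative assumptions, and the estimate ignores activation buffers and runtime overhead, so treat the result as a lower bound.

```python
GIB = 2 ** 30


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int,
                   context_len: int, bytes_per_elem: float = 2) -> float:
    """Approximate K/V cache size in bytes.

    The leading 2 accounts for storing both keys and values.
    bytes_per_elem is 2 for an fp16 cache, 1 for an 8-bit quantized cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


def estimate_gib(n_params: float, bits_per_weight: float, n_layers: int,
                 n_kv_heads: int, head_dim: int, context_len: int,
                 bytes_per_elem: float = 2) -> float:
    """Weights plus K/V cache, in GiB (no runtime overhead included)."""
    total = weight_bytes(n_params, bits_per_weight) + kv_cache_bytes(
        n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem)
    return total / GIB


if __name__ == "__main__":
    # Hypothetical 32b model shape; check the real config of your model.
    for ctx in (8192, 32768):
        gib = estimate_gib(32e9, 4.85, n_layers=64, n_kv_heads=8,
                           head_dim=128, context_len=ctx)
        print(f"context {ctx}: about {gib:.1f} GiB before overhead")
```

Under these assumptions the weights dominate at small context sizes, while the K/V cache grows linearly with context length, which is why long-document use pushes you toward a smaller model, a quantized cache, or more memory.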









  • I believe alternative methods of validating blocks (batches of transactions), such as proof-of-stake, would largely address the issue of computational expense compared with the vastly more computation- and energy-intensive proof-of-work that Bitcoin uses. There are other ways to increase efficiency and processing speed as well, such as ‘layer 2’ mechanisms that handle transactions off the main chain. I remember reading about these and their implementation when I was researching cryptocurrencies out of curiosity. I believe Ethereum and some others have largely implemented these. The decentralized applications aspect of Eth was super interesting to me as well. Basically, you can program software to run on the blockchain, which makes it nearly impossible for a centralized authority to shut down so long as the network is sufficiently decentralized. Some of the programmable money (so-called decentralized finance or ‘DeFi’) apps are pretty interesting too, in terms of enabling more people to use the more complex financial instruments that Wall Street firms have been using for years. Of course, a lot of that has turned into a Wild West of scams and ‘rug pulls’, not to mention massive targets for hackers who try to exploit vulnerabilities to steal millions, so buyer beware for sure.
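To illustrate why proof-of-work is so computation-heavy, here is a toy sketch. It is not Bitcoin's actual scheme (which hashes a structured block header and compares against a numeric target); the `mine` and `verify` functions and the "leading zero hex digits" difficulty rule are simplifications made up for this example. The point is only that finding a valid nonce takes brute-force guessing, while checking one takes a single hash.

```python
import hashlib


def block_hash(block_data: str, nonce: int) -> str:
    """SHA-256 of the block contents combined with a nonce, as hex."""
    return hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()


def mine(block_data: str, difficulty: int) -> tuple:
    """Try nonces one by one until the hash starts with `difficulty` zeros.

    Each extra hex digit of difficulty multiplies the expected number of
    attempts by 16, which is where the energy cost comes from.
    """
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Verification is cheap: a single hash and a prefix check."""
    return block_hash(block_data, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    nonce, digest = mine("alice pays bob 1 coin", difficulty=4)
    print(f"found nonce {nonce} giving hash {digest}")
    print("valid:", verify("alice pays bob 1 coin", nonce, 4))
```

Proof-of-stake drops this guessing race entirely: validators are chosen in proportion to the funds they lock up, so there is no equivalent of the `while True` loop burning hashes.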