Oh and I typically get 16-20 tok/s running a 32b model on Ollama using Open WebUI. Also I have experienced issues with 4-bit quantization for the K/V cache on some models myself so just FYI
It really depends on how you quantize the model and the K/V cache as well. This is a useful calculator: https://smcleod.net/vram-estimator/ I can comfortably fit most 32b models quantized to 4-bit (usually Q4_K_M or IQ4_XS) on my 3090’s 24 GB of VRAM with a reasonable context size. If you’re going to need a much larger context window to input large documents etc., then you’d need to go smaller with the model size (14b, 27b etc.) or get a multi-GPU setup, or something with unified memory and a lot of RAM (like the Mac Minis others are mentioning).
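For anyone curious where those numbers come from, here’s a rough back-of-envelope sketch of the math a calculator like that does. The formula and the layer/head numbers below are illustrative assumptions, not the exact values for any specific 32b model or what the linked estimator uses:

```python
def estimate_vram_gb(params_b, weight_bits, n_layers, n_kv_heads,
                     head_dim, ctx_len, kv_bits=16, overhead_gb=1.0):
    """Rough VRAM estimate: quantized weights + K/V cache + fixed overhead.

    params_b    -- parameter count in billions
    weight_bits -- effective bits per weight (4-bit quants land ~4.5
                   once scales/zeros are counted in)
    kv_bits     -- bits per K/V cache element (16 = fp16, 4 = q4 cache)
    """
    weights_bytes = params_b * 1e9 * weight_bits / 8
    # K/V cache: 2 tensors (K and V) per layer, one vector per KV head
    # per token in the context window
    kv_bytes = 2 * n_layers * n_kv_heads * head_dim * ctx_len * kv_bits / 8
    return (weights_bytes + kv_bytes) / 1024**3 + overhead_gb

# Hypothetical 32b model at ~4.5 bits/weight with a GQA-style KV cache
print(round(estimate_vram_gb(32, 4.5, n_layers=64, n_kv_heads=8,
                             head_dim=128, ctx_len=8192), 1))
```

You can see why quantizing the K/V cache matters: dropping `kv_bits` from 16 to 4 cuts the cache term by 4x, which is exactly what frees up room for longer contexts on a 24 GB card.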
Is it possible to use StreetComplete on iOS?
I think we can all agree that modifications to these models which remove censorship and propaganda on behalf of one particular country or party are valuable for the sake of accuracy and impartiality. But reading some of the example responses for the new model, I honestly find myself wondering if they haven’t gone a bit further than that, replacing some of the old non-responses and positive portrayals of China and the CPC with the highly critical perspective typical of western governments hostile to China (in particular the US). Even the name of the model certainly doesn’t make it sound like neutrality and accuracy is their primary aim here.
I used to daily drive Ubuntu some years ago for work/personal use but have been back on Win 10 primarily for the last 4-5 years. I was considering trying to go back given how much Windows sucks (despite some proprietary software only being available on it), but remembering the trouble I had with networking/printer drivers and troubleshooting those issues, and then seeing this article, is definitely making me reconsider…
Yeah I use Voyager pretty much exclusively on my iPhone so maybe I should request a feature like that there? Seems like it would be something that many people would appreciate. Not sure why I end up seeing posts with -10, -15 votes… Those are generally trash haha
Based on the differences in color for each handle it makes me wonder if the one for not washing your hands is a different material. Maybe an antimicrobial metal like a copper alloy.
I believe alternative methods of validating blocks (series of transactions), such as proof-of-stake, would largely address the issue of computational expense compared to the vastly more computation- and energy-intensive proof-of-work that Bitcoin uses. There are other methods of increasing efficiency and speed of processing as well, such as more efficient ‘layer 2’ mechanisms for processing blocks. I remember reading about these and their implementation when I was researching cryptocurrencies out of curiosity. I believe Ethereum and some others have largely implemented these. The decentralized applications aspect of Eth was super interesting to me as well. Basically, you can program software to run on the blockchain, which can make it nearly impossible for a centralized authority to shut down so long as the network is sufficiently decentralized. Some of the programmable money (so-called decentralized finance or ‘DeFi’) apps are pretty interesting as well in terms of enabling more people to utilize the more complex financial instruments that Wall Street firms have been using for years. Of course, a lot of that has turned into a Wild West of scams and ‘rug pulls’, not to mention massive targets for hackers who try to exploit vulnerabilities to steal millions, so buyer beware for sure.
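To make the computational-expense point concrete, here’s a toy sketch of why proof-of-work burns so much energy (this is a simplified illustration, not Bitcoin’s actual block format or difficulty scheme):

```python
import hashlib

def mine(payload: bytes, difficulty: int) -> int:
    """Toy proof-of-work: brute-force a nonce until SHA-256(payload + nonce)
    starts with `difficulty` zero hex digits. Since hashes are effectively
    random, each extra zero multiplies the expected number of attempts by 16.
    """
    nonce = 0
    while True:
        digest = hashlib.sha256(payload + nonce.to_bytes(8, "big")).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

# Even at a trivial difficulty this takes thousands of hash attempts;
# real networks target difficulties that take quintillions.
nonce = mine(b"block payload", difficulty=4)
```

Proof-of-stake sidesteps this entirely: validators are chosen in proportion to staked funds rather than by racing through hashes, so there’s no exponential lottery to grind through.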
Tend to start with top (day) for my subs and then switch to scaled once I get down to posts that are below 100 upvotes or so to see more posts from smaller communities that can’t make that ‘top’ cut.
Maybe 1-3 times a day. I find that the newest version of ChatGPT (4o) typically returns answers faster and with better quality than a search engine inquiry, especially for inquiries that require a bit more conceptualization or are more bespoke (e.g., give me recipes to use up these 3 ingredients), so it has replaced search engines for me in those cases.
Thanks for all the ongoing development work. If you all haven’t you should definitely consider an ongoing donation to support the project. I joined Lemmy a few years ago but ever since Reddit imploded and went full greed-mode in preparation for the IPO it has been my primary social link aggregator. Been on a monthly ongoing donation since then.
Is there a changelog available?
AMD only and not Nvidia? That’s what I was seeing based on a quick search. Unfortunately, I don’t have an AMD GPU.
This is impressive and interesting, but what about hardware ray tracing support? Proton has been very impressive but I thought that RT on DX12 was basically non-existent on Linux.
So my grandchildren will more than likely be Belters. Got it.
Here’s a non-AMP link to the article: https://www.theregister.com/2023/07/01/chiplet_market/
Looks like it now has Docling Content Extraction Support for RAG. Has anyone used Docling much?