Open Source Text-to-Speech and Speech-to-Text on Android?

andrew0@lemmy.dbzer0.com · 1 month ago

You’re right! Sorry for the typo. The older nomic-embed-text model is often used in examples, but granite-embedding is more recent and smaller for English-only text (30M parameters). If your use case is multilingual, there is also a bigger variant (278M parameters) that can handle English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese (Simplified). I would test them out a bit to see what works best for you.

Furthermore, if you’re not dependent on MariaDB for something else in your system, there are also some other vector databases I would recommend. Qdrant also works quite well, and you can integrate it pretty easily in something like LangChain. It really depends on how much you want to push your RAG workflow, but let me know if you have any other questions.
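
To make the retrieval step a bit more concrete, here is a minimal pure-Python sketch of what a vector database does when you query it: rank stored embeddings by cosine similarity to the query embedding. The document IDs and the 3-dimensional vectors are made up for illustration, and a real store like Qdrant uses approximate indexes rather than this brute-force loop.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query, docs, k=2):
    """Return the k (score, doc_id) pairs most similar to the query vector."""
    scored = [(cosine(query, vec), doc_id) for doc_id, vec in docs.items()]
    return sorted(scored, reverse=True)[:k]

# Hypothetical toy "embeddings"; real ones have hundreds of dimensions.
docs = {
    "joplin-notes": [0.9, 0.1, 0.0],
    "redox-kernel": [0.0, 0.2, 0.9],
    "ollama-setup": [0.7, 0.6, 0.1],
}
```

Calling `top_k([1.0, 0.0, 0.0], docs)` ranks "joplin-notes" first and "ollama-setup" second. In a RAG workflow, the text behind the top hits is what gets pasted into the LLM prompt.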

andrew0@lemmy.dbzer0.com · edit-2 1 month ago

Have a look at Ollama embeddings. Easy to set up and the models are much smaller than a typical LLM.
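
As a rough sketch of how little code this takes, the snippet below requests embeddings from a local Ollama server over its REST API using only the standard library. It assumes Ollama is listening on its default port (11434) and that an embedding model has already been pulled; the `granite-embedding:30m` tag and the helper names are my assumptions, so swap in whatever you actually use.

```python
import json
import urllib.request

# Assumed default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/embed"

def build_embed_request(texts, model="granite-embedding:30m"):
    """Build the POST request for Ollama's embed endpoint (model tag is an assumption)."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(texts, model="granite-embedding:30m"):
    """Send the request and return one embedding vector per input text."""
    with urllib.request.urlopen(build_embed_request(texts, model)) as resp:
        return json.load(resp)["embeddings"]
```

With the server running, `embed(["first note", "second note"])` should give back two lists of floats that you can store in whichever vector database you settle on.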

andrew0@lemmy.dbzer0.com · 2 months ago

For notes, I have moved to Joplin with the option to synchronize my data using a WebDAV server. It works really well, and it has both a mobile and desktop app. If you’re interested in developing your project, maybe you can have a look at the options this provides. For example, I really like the ability to separate notes between groups, assign tags, create drawings, and the possibility to use Markdown.

Good luck with your projects! To mirror @enemenemu’s suggestion, I would also look into collaborating with the people trying to push the EU Docs alternative. Not sure if that will work, but it’s worth a shot if you’re interested :D

andrew0@lemmy.dbzer0.com · edit-2 2 months ago

Thanks for the SherpaTTS suggestion. I really like the GLaDOS voice <3

I am not sure which phone you use, but are you able to set FUTO Voice as the default “Voice input” in the Android settings? I played around with a few apps, and they all show up there. However, FUTO is not an option :(

andrew0@lemmy.dbzer0.com · 2 months ago

Thanks for the suggestion! I gave this a try, but it seems that it won’t register any voice 🤔 However, it seems like it shows up in my settings, so it’s a good sign. I’ll try to get it to work :D

andrew0@lemmy.dbzer0.com · edit-2 2 months ago

Thanks! I was actually looking at this, but I gave up because I couldn’t really figure out how to get a multilingual model running through Obtainium. I’ll try again :D

andrew0@lemmy.dbzer0.com · edit-2 2 months ago

Open Source Text-to-Speech and Speech-to-Text on Android?

andrew0@lemmy.dbzer0.com · edit-2 4 months ago

Might be even cheaper if you wait a bit and build it yourself. Next gen GPUs are coming out, which will lead to some price cuts on the current gen.

However, like others here have mentioned, you’re paying extra for them to build it for you and for the warranty.

I don’t know if ro.pcpartpicker.com works well for Romania, but you can also give that a try and see what the individual components would net you on the local market.

Building the computer yourself along with your kid could also be a nice opportunity to teach him (and maybe yourself, if you’re not that knowledgeable) about the underlying components.

andrew0@lemmy.dbzer0.com · 8 months ago

Piracy. I’d buy albums if I had money, though. I’ll slowly phase into getting them once I get some more cash.

I can find most stuff I listen to, and I rarely grow my music library. I mostly listen to 20-30 albums, with some more mainstream music peppered in.

My music library currently sits at 90 gigabytes (mostly flacs), so quite small compared to others I’ve seen around here. Still, I have plenty of variation to keep me entertained :D

If you have Tidal, aren’t there some apps to rip the lossless audio from there? You could get most of the stuff that you need, and then cancel the subscription. If you feel bad, maybe order some merch from the band, haha.

andrew0@lemmy.dbzer0.com · edit-2 8 months ago

If I remember correctly, even though Fuchsia is used in production, it is mainly targeting mobile or IoT devices. Nevertheless, the underlying micro-kernel, Zircon, is written in C/C++, which differs from Redox. Now, I’m not saying that Redox solves everything by writing the kernel in Rust. It will require plenty of unsafe blocks to achieve what it needs, but it makes you aware beforehand that you should be careful about how you implement that bit of code. Having this clear marking could also make the kernel code review process more likely to catch issues.

Disregarding this, if I am not mistaken, Redox aims to be a drop-in replacement for Linux one day, both for desktop and server, while Fuchsia only wishes to be integrated in/replace Android. Linux is perfectly fine for most use cases, I am not suggesting otherwise! However, given how many vulnerabilities resulted from overflows and memory corruption, which could have been easier to identify if Rust (or any other memory-safe language) had been used, you’d think that there is an incentive to rely on it for kernel development. Linus himself made this decision as well when allowing Rust to be used in Linux kernel development (albeit perhaps a bit too early).

The Linux kernel is not flawed, and Redox is probably years away from being even near it. However, having memory-safety from the get-go as a requirement for developing the kernel could lead to fewer exploits, compared to what we have today with Linux. Just as you’ve said, most users are not aware of it/they don’t care, but the big players will care about keeping information safe on their servers. Just to conclude, Redox OS is not just Linux rewritten in Rust, and could potentially have many other benefits that are particularly juicy for data centers. Too bad it’s not production ready yet :D

andrew0@lemmy.dbzer0.com · 8 months ago

That’s unfortunate :( I think you can still run it in QEMU, if you’re interested.

andrew0@lemmy.dbzer0.com · 8 months ago

I see your point. However, integrating Rust properly in the Linux kernel is an uphill battle. Redox OS is not at all close to being stable, but it showcases that you can build a Rust kernel from scratch, and integrate it into an OS that meets some of the requirements of a modern one. Of course, considering it a toy project and glossing over its potential doesn’t help with adoption. They even mention in their description that, with the current donations, they can only support a community manager and a student developer. When you compare that to the amount of money and developers involved in the Linux kernel, it’s insignificant.

I was not suggesting that the Rust For Linux devs jump ship, but it could be beneficial for the investors behind the project to look at alternatives. Heck, the Linux kernel started as a toy project itself. I believe that a team focused solely on such a Rust-only kernel could spearhead needed changes to reach something stable, as opposed to investing time and money into fighting established C developers to integrate a memory-safe language in the kernel fully.

andrew0@lemmy.dbzer0.com · edit-2 8 months ago

Redox OS 0.9.0 - Redox - Your Next(Gen) OS

andrew0@lemmy.dbzer0.com · 1 year ago

Good luck! You can try the huggingface-chat repo, or ollama with this web-ui. Both should be decent, as they have instructions to set up a docker container.

I believe the Llama 3 models are out there in a torrent somewhere, but I didn’t dig to find it. For the 70B model, you’ll probably need around 64GB of RAM available, but the 8B one should run fine with just 8GB. It will be somewhat slow though, compared to the ChatGPT experience. The self-attention mechanism can be parallelized, which is why you will see much better results on a GPU. According to some others who tested it, if you offload some stuff to RAM, you could see ~10-12 tokens per second on an RTX 3090 for certain 70B models. But more capable ones will be at less than 1 token per second, all depending on the context window you use.

If you don’t have a GPU available, just give the Phi-3 model a try :D If you quantize it to 4 bits, it can apparently get 12 tokens per second on an iPhone haha. It should play nicely with pulling information from a search engine, or a vector database like Milvus, Qdrant or Chroma.
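
For a back-of-the-envelope feel for those RAM numbers, here is a small sketch that estimates the memory taken by the weights alone as parameters times bits per weight. It ignores the KV cache, activations and runtime overhead, so treat the output as a lower bound rather than a real requirement; the parameter counts are the published model sizes (3.8B for Phi-3 mini, 8B and 70B for Llama 3).

```python
def weight_memory_gb(params_billion, bits_per_weight):
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8

# Print the estimate for a few common precisions.
for name, params in [("Phi-3 mini", 3.8), ("Llama 3 8B", 8), ("Llama 3 70B", 70)]:
    for bits in (16, 8, 4):
        print(f"{name}: {bits}-bit ~ {weight_memory_gb(params, bits):.1f} GB")
```

At 4 bits, a 70B model needs roughly 35 GB just for the weights, which is why something like 64GB of RAM is a comfortable target once context is added on top, while a 4-bit Phi-3 mini comes in under 2 GB and can plausibly fit on a phone.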

andrew0@lemmy.dbzer0.com · 1 year ago

What db2 already said. Microsoft just released Phi-3 mini, which could, allegedly, run locally on newer smartphones.

If I understood correctly, the Rabbit thingy just captures your information locally and then forwards it to their server. So, if you want more power, you could probably do the same by submitting the same info to a bigger open source model than Phi-3, like Llama 3, hosted on your homelab. I believe you can set it up with huggingface/gradio, which sort of provides an API that you could use.

That way, you don’t need a shitty orange box, and can always get the latest open source models with a few lines of code. There are plenty of open source frameworks in the works at the moment, and I believe that we’re not far off from having multi-modal LLMs running on homelab-level hardware (if you don’t mind a bit of lag).
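
As a sketch of the "forward it to your own server" idea, the snippet below sends a prompt to a self-hosted model over HTTP. Note that I am swapping in Ollama's generate endpoint here instead of the huggingface/gradio route mentioned above, simply because it needs nothing beyond the standard library. The `homelab.local` hostname, the `llama3` model tag and the function names are placeholders I made up.

```python
import json
import urllib.request

def build_generate_request(prompt, host="http://homelab.local:11434", model="llama3"):
    """Build a non-streaming generate request (host and model are placeholders)."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def ask(prompt, **kwargs):
    """Send the prompt to the homelab server and return the generated text."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as resp:
        return json.load(resp)["response"]
```

Whatever your phone captures locally (transcribed speech, for example) would go in as the `prompt`, and swapping to a newer model is then just a matter of changing the `model` argument.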

andrew0@lemmy.dbzer0.com · edit-2 1 year ago

How will you move to WhatsApp if everyone else uses iMessage? Europe has the same issue, but reversed. Everyone uses WhatsApp and can’t jump to Signal/Telegram because they’re not as popular.

andrew0@lemmy.dbzer0.com · 1 year ago

I got NFS Most Wanted (2005) working in Wine, and was somewhat impressed by how easy it was at the time. The game worked quite well, and would only crash once in a while with some cryptic errors that I don’t remember. Made me hopeful for the future of Linux gaming :)

andrew0@lemmy.dbzer0.com · 1 year ago

Wow, some of the comments on that article saying Google should have made Android closed source are mind-boggling. They realize Google never would have had its current worldwide market share if it had done that, no?

But maybe if they did, we would have had more people working on true Linux phones 🤔 I’m a bit torn on this one haha.

andrew0@lemmy.dbzer0.com · 2 years ago

I’d probably fall into the Tech Paranoid if it weren’t for gaming. I’m honestly rooting for Steam to jumpstart the Linux gaming ecosystem even more.
