Meet Ellie: My Voice Assistant Project
Hey there! I've been working on Ellie for a while now. She's all about privacy, so no tracking or storing data here. Ellie might not be as feature-packed as the big names (yet!), but she's open-source and lets anyone whip up their own commands using Python. Still lots to do, but I'm super excited about where Ellie's headed! This forum will be used to track the development of Ellie!
14 Replies
Ellie is only just reaching a deployable state, as I'm still doing a bunch of work on how the plugin API should work
Keep in mind this project has only been in the works for a few weeks, and it all started when I created a part-of-speech (POS) tagger
https://discord.com/channels/728958932210679869/741973614617821267/1159872065529188372
But it slowly developed into a full project. I'm a huge hater of the big names currently representing voice assistants, Alexa to name just one
To make Ellie just as feature-packed, I'm putting all the free time I have into this project.
I have already begun to use my Raspberry Pi 4 along with a breadboard for working on the electronic side of things.
How Ellie Works
Ellie ensures your privacy by not storing or learning from your voice data. Built using FOSS technology, here's a breakdown of how Ellie transforms your voice commands into actions:
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
So as of right now I'm still working on the API for the plugins
The main idea is to use decorators to register options and commands
and even create persistent objects that can have a lifetime longer than the command's runtime
For example a timer, which is persistent and should always run in the background once made.
The hope is for a simple and modular plugin API, which is made possible by the COINS sorting algorithm
My current idea for something like opening an app goes as follows
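A minimal sketch of how decorator-based registration for an "open app" command might look. The registry `COMMANDS`, the handler name `open_app`, and the exact signatures of `register_command`, `expects` and `requires` are assumptions for illustration, not Ellie's actual API:

```python
from typing import Callable, Dict

# Hypothetical registry; Ellie's real plugin storage may look different.
COMMANDS: Dict[str, Callable] = {}


def register_command(name: str):
    """Register the decorated function as the handler for `name`."""
    def decorator(func):
        COMMANDS[name] = func
        return func
    return decorator


def expects(**types):
    """Declare the named values (and their types) the handler wants."""
    def decorator(func):
        func.expects = types
        return func
    return decorator


def requires(*names):
    """Mark some of the expected values as mandatory."""
    def decorator(func):
        func.requires = tuple(names)
        return func
    return decorator


# Decorators apply bottom-up, so the metadata is attached before registration.
@register_command("open")
@expects(app=str)
@requires("app")
def open_app(app):
    return f"Opening {app}"
```

The decorators only attach metadata and return the original function, so the handler stays a plain Python function that can still be called directly.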
The expects and requires syntax helps Ellie help you. It clears up confusion on both ends: you, the user, know you won't be getting random data, and it keeps things consistent
Although I will note it won't be required to use `expects` or `requires`
Using this system also allows Ellie to distinguish overlaps in commands & options
So for example you can have overlapping open commands if they're built like so
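One way two overlapping `open` commands could be told apart, as a sketch only: each handler declares what it expects, and a dispatcher picks the handler whose declared names match the values that were parsed. `HANDLERS`, `dispatch` and the handler names are hypothetical, and the decorator signatures are assumed:

```python
from typing import Callable, Dict, List

# Hypothetical registry: one command name can map to several handlers.
HANDLERS: Dict[str, List[Callable]] = {}


def register_command(name):
    def decorator(func):
        HANDLERS.setdefault(name, []).append(func)
        return func
    return decorator


def expects(**types):
    def decorator(func):
        func.expects = types
        return func
    return decorator


@register_command("open")
@expects(app=str)
def open_app(app):
    return f"Opening app {app}"


@register_command("open")
@expects(device=str)
def open_device(device):
    return f"Opening device {device}"


def dispatch(name, values):
    """Run the handler whose expected names match the supplied values."""
    supplied = set(values)
    for handler in HANDLERS.get(name, []):
        if set(getattr(handler, "expects", {})) == supplied:
            return handler(**values)
    raise LookupError(f"no '{name}' handler accepts {sorted(supplied)}")
```

With this, `dispatch("open", {"app": "Firefox"})` reaches `open_app` and `dispatch("open", {"device": "garage door"})` reaches `open_device`, even though both are registered under the same command name.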
If I didn't have this syntax, this distinction could not be made. I.e. what is `app`? What is `device`? Are they a string, a number, an input!? This is the confusion. In this case, if an `expects` is not passed, it doesn't give you the freedom of doing `def open(app, device)`; instead it will pass a CoinContext object, so your function would look like `def open(context)`
If a value is not labeled using `requires`, it will always be treated as optional, allowing every value to be None
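The calling convention described above could be sketched roughly like this. The `CoinContext` stand-in, its `values` field, and the `call` helper are all assumptions made for the example; the real CoinContext isn't shown in this thread:

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class CoinContext:
    """Stand-in for Ellie's CoinContext; the `values` field is a guess."""
    values: Dict[str, Any] = field(default_factory=dict)


def expects(**types):
    def decorator(func):
        func.expects = types
        return func
    return decorator


def requires(*names):
    def decorator(func):
        func.requires = tuple(names)
        return func
    return decorator


def call(handler, context):
    """Call a handler with named values, or with the raw context."""
    declared = getattr(handler, "expects", None)
    if declared is None:
        # No expects: the handler receives the whole context object.
        return handler(context)
    missing = [name for name in getattr(handler, "requires", ())
               if context.values.get(name) is None]
    if missing:
        raise ValueError(f"missing required values: {missing}")
    # Anything not supplied is passed as None, so it is effectively optional.
    kwargs = {name: context.values.get(name) for name in declared}
    return handler(**kwargs)


@expects(app=str, device=str)
@requires("app")
def open_named(app, device):
    return (app, device)


def open_raw(context):
    return context.values
```

Here `open_named` gets `device=None` when only an app was spoken, while `open_raw` has to dig through the context itself.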
Fully implemented everything mentioned above
I am now working on representing the COINS structure as a Coin class. Currently it's just a dict, so it can be a bit tedious to use sometimes
The hope is to be able to do stuff like
This is what CoinContext currently is
But bringing it into one Coin class and then working with that instead makes life a bit easier on my end. Not only does it improve code readability, I also don't need to manage a CoinContext while handling Coin parsing
It's just all one thing
Probably sounds like gibberish, I'm a little tired right now 😅
Currently the Coin class will just represent the CoinContext dict but as a class. I will work on adding some methods to it, like execution and such in a short while
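A rough sketch of what a class view over that dict could look like. The field names `command`, `options` and `inputs` are guesses based on the `register_*` decorators and are not confirmed by the actual COINS structure:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Coin:
    """Class view over a parsed-command dict. Field names are hypothetical."""
    command: Optional[str] = None
    options: Dict[str, Any] = field(default_factory=dict)
    inputs: Dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw):
        """Build a Coin from the dict form, tolerating missing keys."""
        return cls(
            command=raw.get("command"),
            options=dict(raw.get("options", {})),
            inputs=dict(raw.get("inputs", {})),
        )

    def to_dict(self):
        """Convert back to the plain dict form."""
        return {
            "command": self.command,
            "options": dict(self.options),
            "inputs": dict(self.inputs),
        }
```

Attribute access like `coin.inputs["app"]` then replaces nested dict lookups with defaults, and methods such as execution can be added to the class later.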
Haven't added much in the forum so here we go
Started the physical prototype using a Raspberry Pi 4 Model B (4 GB)
With some work Ellie could run on a 2 GB model
which would be nice, but what's also nice is having those 2 extra GB of RAM
The backend code for how plugins are stored and used was rewritten
and is now faster, along with the code being cleaner
So much cleaner, in fact, that I can send all the code for the decorators right now
`CoinBank` is the underlying class for dealing with all the Coins Ellie stores
As a result of changing the way Coins are worked with under the hood, the CoinContext class (the backend implementation of a Coin) also had to change
The power of this system is speed; previously, if you threw thousands of commands at Ellie, it would have big hiccups
Thanks to the `name` and `type` fields, the CoinBank structure is much simpler and a lot of iteration can be removed.
The only things slowing down the system now would be the wake word and STT systems
Which are only slow due to the start and stop of a SoundDevice
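To illustrate why `name` and `type` fields can remove iteration, here is a sketch of a store keyed by `(type, name)`, so a lookup is a single dict access no matter how many commands are loaded. This `CoinBank` and its `add`/`get` methods are assumptions, not the real class:

```python
from typing import Callable, Dict, Optional, Tuple


class CoinBank:
    """Hypothetical store keyed by (type, name) for direct lookup."""

    def __init__(self):
        self._coins: Dict[Tuple[str, str], Callable] = {}

    def add(self, type_, name, handler):
        """Store a handler; reject duplicates instead of overwriting."""
        key = (type_, name)
        if key in self._coins:
            raise KeyError(f"{type_} '{name}' is already registered")
        self._coins[key] = handler

    def get(self, type_, name) -> Optional[Callable]:
        """Return the handler for (type, name), or None if unknown."""
        return self._coins.get((type_, name))

    def __len__(self):
        return len(self._coins)
```

Compared with scanning a list of plugins for a match, the cost of `get` stays roughly constant even with thousands of registered commands.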
I'm trying to get around this problem to speed up Ellie's execution
You can now use `register_input`, which, similar to `register_command` and `register_option`, allows the user to easily do what they want
Although it's still a barely tested feature, it works
it's a bit complex in how it can be used, but here is what it simplifies
Without register_input
with register_input
Is it really required? No.
I may not even keep it at all, but it does make it just a bit more readable in my opinion
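For the volume case, a handler registered with `register_input` might look something like the sketch below. The `INPUTS` registry, the `(value, context)` signature and the clamping to 0-100 are all assumptions for illustration:

```python
from typing import Callable, Dict

# Hypothetical registry of input handlers, keyed by input name.
INPUTS: Dict[str, Callable] = {}


def register_input(name):
    """Register the decorated function as the handler for input `name`."""
    def decorator(func):
        INPUTS[name] = func
        return func
    return decorator


@register_input("volume")
def set_volume(value, context=None):
    # `context` stands in for the CoinContext that is still passed along.
    level = max(0, min(100, int(value)))
    return f"Volume set to {level}%"
```

The handler only deals with the one value it cares about, which is where the readability gain would come from.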
Currently `expects` and `requires` do not work with `register_input`
This can either be a perk or downside depending on what you're making
In my volume case it's cleaner to use `register_input`, as you don't need `expects` or `requires` at all. A CoinContext is still passed and can be captured if you need more info for your input
Did you stop this project or are you still making progress on it?
It's pretty much done, I deployed it into my classroom for testing
It was less than perfect, but it functioned
If you're curious I can send the source code
It uses only open source technologies
And is written in Python
God, Ellie has a trash project logo
Well anyway, getting back to updating Ellie as it's been having some trouble lately
So as such I plan on rebuilding it from the ground up
That doesn't just mean the physical portion. I will be rewriting a good chunk of Ellie's code entirely
I hope to switch off using a Raspberry Pi, if possible
I really doubt it will be, because I need at least 2 GB of RAM to run a base Whisper install
The hope would be to make it cheaper to create more Ellies
This time around I aim to make Ellie more than just something I deploy in one classroom, but something bigger. To make that happen I need to do a lot of work, and make it so Ellie has lotssss of data, where before Ellie was running off 50 lines of data
That was enough for its purpose, and having a pretty cherry-picked data set, although small, allowed the NER model to work really well
I will likely write something to augment commands, just to switch them up and give more data to Ellie
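One simple way to augment commands, as a sketch: write a command as a template with named slots and generate every combination of slot values, keeping the slot assignments so the annotations come along for free. The `augment` helper and the template format are hypothetical, not part of Ellie or Acoin:

```python
import itertools


def augment(template, slot_values):
    """Fill every combination of slot values into a command template.

    Returns (text, filled_slots) pairs so each generated sentence keeps
    the information about which value went into which slot.
    """
    names = sorted(slot_values)
    results = []
    for combo in itertools.product(*(slot_values[name] for name in names)):
        filled = dict(zip(names, combo))
        results.append((template.format(**filled), filled))
    return results
```

For example, `augment("{verb} the {thing}", {"verb": ["open", "launch"], "thing": ["browser", "door"]})` yields four sentences from one template.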
I have already rewritten most of COINann, which I have renamed to Acoin
It now actually has a proper readme
GitHub
GitHub - ZackeryRSmith/acoin: An annotation tool for the COINS NER ...
An annotation tool for the COINS NER model. Contribute to ZackeryRSmith/acoin development by creating an account on GitHub.
I have also added new features to Acoin, with even more planned. The site will likely always look bare bones though, as I doubt I'll add much more to the style of Acoin
I actually do have a pretty solid idea in mind if I ever do design and make it look all cute
But for now it's not an issue
And likely won't be until I'm done with Ellie entirely
I care less about data gathering right now though
I care more about rewriting the API and cleaning up all that code
Then I will worry about slamming lots of data and doing tons of training
Remember what I said about data earlier? I lied. I added about 50 more cherry-picked examples to annotate
Just doing this allows me to say complex things it's never seen before and still have them picked up correctly
For example:
None of the data it's trained on includes words like "turn", "on", "lights", "living room". Thanks to some truly cherry-picked and thought-out data it works very well
My data trains more on the structure of the sentence than on the actual context
Actually most of the data doesn't even make proper sentences, just so the model is more clear on the structure rather than the words
Ok so listening to Dream On by Aerosmith caused it to go mad
I for one, do not want to go back to school
So it can do this, but not this
Reworked the file structure
Rewrote the logger, it now logs to a file too
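A logger that writes to both the console and a file can be set up with the standard `logging` module roughly like this. The function name `build_logger`, the logger name, and the log path are placeholders, not Ellie's actual code:

```python
import logging


def build_logger(name, log_path):
    """Return a logger that writes to the console and to `log_path`."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    # Only attach handlers once, so repeated calls don't duplicate output.
    if not logger.handlers:
        fmt = logging.Formatter(
            "%(asctime)s [%(levelname)s] %(name)s: %(message)s"
        )
        console = logging.StreamHandler()
        console.setFormatter(fmt)
        file_handler = logging.FileHandler(log_path, encoding="utf-8")
        file_handler.setFormatter(fmt)
        logger.addHandler(console)
        logger.addHandler(file_handler)
    return logger
```

Usage would be something like `log = build_logger("ellie", "ellie.log")` followed by `log.info("wake word detected")`.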