Whole Home Voice Control
Problem
Typing is not always the best way to get information (e.g. what’s the weather?) or give commands (e.g. turn the kitchen lights off!) in the application.
Solution
The idea is to create a voice assistant that controls devices in the SeaPod and answers questions using information available in the Ocean Builders applications.
The voice assistant should be smart enough to recognize users and follow their commands while respecting each user's settings and permissions. For example, when a user asks to turn on the shower, the shower is turned on and adjusted automatically based on that user's preferences. If the user doesn't have permission to use that shower, the command is ignored.
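The shower example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `User` and `Device` classes, the `handle_command` function, the device name `master_shower`, and the `temperature_c` setting are all hypothetical names, not part of any existing Ocean Builders API.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """A recognized speaker with per-device permissions and preferences (hypothetical)."""
    name: str
    allowed_devices: set = field(default_factory=set)
    preferences: dict = field(default_factory=dict)  # device name -> settings


@dataclass
class Device:
    """Stand-in for a SeaPod device; a real device would sit behind an API."""
    name: str
    is_on: bool = False
    settings: dict = field(default_factory=dict)


def handle_command(user, device, action):
    """Apply `action` to `device` for `user`; return True if it was executed."""
    # Ignore the command when the user lacks permission for this device.
    if device.name not in user.allowed_devices:
        return False
    if action == "turn_on":
        device.is_on = True
        # Adjust the device to the user's stored preferences, if any.
        device.settings.update(user.preferences.get(device.name, {}))
    elif action == "turn_off":
        device.is_on = False
    else:
        return False  # unknown action
    return True


shower = Device("master_shower")
alice = User("alice", {"master_shower"}, {"master_shower": {"temperature_c": 38}})
guest = User("guest")  # no permissions at all

print(handle_command(alice, shower, "turn_on"), shower.settings)
print(handle_command(guest, shower, "turn_off"), shower.is_on)
```

Keeping the permission check as the first step means a command from an unauthorized user never reaches the device, which matches the "command will be ignored" behaviour described above.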
Prize
- Get credited as a Project Contributor to the Ocean Builders Project
- Turn this into your own entrepreneurial business venture and we will be your first customers and help bring you media attention and customers
- Get Entrepreneurial Business Coaching to start this as a business
And here are some potential benefits:
- Mass exposure with highly visible project
- Build reputation
- Recognized as an official collaborator and/or on GitHub
- Get noticed
- Product development experience
- Work on projects you are passionate about
- Get your project built and working in the real world
- Participate in interesting work
- Get grants (maybe partner with someone that can help with this or exposure to grant writers)
- Change the world
Industry
Current technological level
You’ve probably heard of voice assistants like Alexa, Siri, Google Assistant, and Cortana. These voice assistants are essentially built on speech recognition, natural language processing (NLP), and speech synthesis (see picture below).
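The three stages named above (recognition, NLP, synthesis) can be sketched as a small pipeline. This is a rough illustration, not how any of the commercial assistants work: the regular expression stands in for a real NLP model, and `parse_intent`, `run_pipeline`, and the two stub stages are hypothetical names chosen for this example.

```python
import re

# Hypothetical command phrases; a real assistant would use a trained NLP model.
COMMAND_PATTERN = re.compile(
    r"turn (?:the )?(?P<device>.+?) (?P<state>on|off)$"
    r"|turn (?P<state2>on|off) (?:the )?(?P<device2>.+)$"
)


def parse_intent(text):
    """NLP stage: map a transcript to an intent dict, or None if not understood."""
    cleaned = text.lower().strip(" .!?")
    match = COMMAND_PATTERN.match(cleaned)
    if match is None:
        return None
    device = match.group("device") or match.group("device2")
    state = match.group("state") or match.group("state2")
    return {"device": device, "state": state}


def run_pipeline(audio, recognize, synthesize):
    """Run recognition -> NLP -> synthesis with pluggable first and last stages."""
    transcript = recognize(audio)      # speech recognition stage
    intent = parse_intent(transcript)  # NLP stage
    if intent is None:
        reply = "Sorry, I did not understand that."
    else:
        reply = f"Turning {intent['state']} the {intent['device']}."
    return synthesize(reply)           # speech synthesis stage


# Stub stages so the sketch runs without a microphone or speakers.
def fake_recognize(audio):
    return audio.decode("utf-8")


def fake_synthesize(text):
    return text


print(run_pipeline(b"Turn the kitchen lights off!", fake_recognize, fake_synthesize))
```

Passing the recognition and synthesis stages in as functions keeps the middle of the pipeline independent of whichever engines are eventually chosen.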
There are also many open source projects like:
Many of these open source voice assistants are quite recent and will probably need some time to mature into more sophisticated solutions.
The problem with the majority of these platforms is that they do not run locally and are not private enough.
Some projects such as Mycroft offer solutions built around Google Home or Alexa. However, certain characteristics of these systems (no data protection and no adaptation to business vocabulary) limit them to a B2C market that is not (yet) concerned with data sensitivity and criticality issues.
There are also platforms like LinTO that address these challenges from the start, with the aim of being the engine behind professional products.
One of the biggest challenges might be to implement a voice authenticator. Here are some projects to check out to see if any of them could be a fit to integrate with our system:
Here's a website about open source projects. Some of the more interesting projects are listed below:
https://awesomeopensource.com/project/pyannote/pyannote-audio
https://codeocean.com/capsule/7271435/tree/v1
https://github.com/mravanelli/pytorch-kaldi
https://awesomeopensource.com/project/google/uis-rnn
https://alize.univ-avignon.fr/
ALIZE has a Java version as well.
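To make the voice authentication challenge more concrete, here is a minimal sketch of one common approach: compare a fixed-length voice embedding of the incoming utterance against embeddings enrolled for each user, using cosine similarity. It assumes that some model (not shown, and not tied to any specific project linked above) turns audio into such an embedding. The function names, the toy three-dimensional vectors, and the `0.75` threshold are illustrative assumptions only; a real threshold would have to be tuned on real data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(embedding, enrolled, threshold=0.75):
    """Return the best-matching enrolled user, or None if nobody reaches the threshold."""
    best_name, best_score = None, threshold
    for name, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy embeddings; real ones would come from a speaker-embedding model.
enrolled = {
    "alice": [0.9, 0.1, 0.2],
    "bob": [0.1, 0.8, 0.3],
}

print(identify_speaker([0.85, 0.15, 0.25], enrolled))  # close to alice's voiceprint
print(identify_speaker([0.0, 0.1, 1.0], enrolled))     # matches nobody well
```

Returning `None` for an unknown voice lets the assistant fall back to a guest profile or refuse the command instead of guessing a user.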
Information
Repository
<text>
License Requirement
Open Source: Can be used for private or commercial projects
Software: GNU General Public License v3 (GNU GPLv3)
Non-Software: Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)
Project Areas
- IoT Development (sensors, Arduino and Raspberry Pi)
- Software Development (Python)?
- <text>
Keywords: <text>
Project requirements
Stages and deadlines
Project plan should cover the following:
- stages / milestones of the project (not all stages are listed in the table above)
- activities or tasks in each phase
- task start and end dates
- interdependencies between tasks
Also:
- skills needed
- responsibilities of each team member (identify as many as you can).
Product’s general requirements
https://docs.google.com/spreadsheets/d/1u0Ca9NZvKY6ex5JPtpl8M-HoaM-K8VBF4W4NGoJWlSo/edit?usp=sharing
(Will remove URL before publishing)
Tips
Below are some examples of tools you can use to build a voice assistant:
gTTS (Google Text-to-Speech) is a speech synthesis library to convert text to speech.
SpeechRecognition is a library for performing speech recognition, with support for several engines and APIs, online and offline.
Sphinx (CMU Sphinx) is an offline recognition engine that the SpeechRecognition library can call.
Packt is a voice recognition library to identify the person who is speaking
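As a starting point, the SpeechRecognition and gTTS libraries listed above could be wrapped roughly as follows. This is a sketch, not a tested integration: the function names `transcribe_offline` and `speak_to_file` and the file names are made up for illustration, and it assumes the `SpeechRecognition`, `pocketsphinx`, and `gTTS` packages are installed. Note that gTTS sends the text to an online service, so it does not satisfy the local and private goal on its own.

```python
def transcribe_offline(wav_path):
    """Transcribe a WAV file offline with CMU Sphinx via SpeechRecognition."""
    # Imported lazily so this module loads even if the package is missing.
    import speech_recognition as sr  # pip install SpeechRecognition pocketsphinx

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    try:
        return recognizer.recognize_sphinx(audio)
    except sr.UnknownValueError:
        return None  # the speech could not be understood


def speak_to_file(text, mp3_path="reply.mp3", lang="en"):
    """Synthesize `text` to an MP3 file with gTTS (needs an internet connection)."""
    from gtts import gTTS  # pip install gTTS

    gTTS(text=text, lang=lang).save(mp3_path)
    return mp3_path


# Example usage (hypothetical file names), left commented out because it
# needs the packages above and a recorded command:
# transcript = transcribe_offline("command.wav")
# if transcript is not None:
#     speak_to_file(f"You said: {transcript}")
```

For live use, the same `Recognizer` can listen on a microphone through `sr.Microphone()` instead of reading a file, which additionally requires the PyAudio package.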
Voice Authentication
https://courses.csail.mit.edu/6.857/2016/files/31.pdf
Not open source:
https://docs.google.com/document/d/1V-cyxivxKFXwYVUO21oleuXcAoXzR8QTKkpI5ug6E0A/edit
Project video link:
https://www.dropbox.com/s/j44y0z574mt1ohh/VoiceControl.mp4?dl=0