
Whole Home Voice Control


Problem

Typing is not always the best way to get information (e.g. what’s the weather?) or give commands (e.g. turn the kitchen lights off!) in the application. 

 

Solution

 

The idea is to create a voice assistant that can control devices in the SeaPod and answer questions using the information available in the Ocean Builders applications.

 

The voice assistant should be smart enough to recognize users and follow their commands while respecting each user’s settings and permissions. For example, when a user asks to turn on the shower, the shower is turned on and adjusted automatically to that user’s preferences; if the user does not have permission to use that shower, the command is ignored.
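The shower example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `User` class, the `handle_command` function, and the `temperature_c` setting are hypothetical names, not part of any Ocean Builders application.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """A recognized speaker with per-device permissions and preferences."""
    name: str
    allowed_devices: set = field(default_factory=set)
    preferences: dict = field(default_factory=dict)  # device name -> settings


def handle_command(user, device, action):
    """Return the instruction to send to the device, or None if ignored."""
    if device not in user.allowed_devices:
        # No permission for this device: the command is ignored.
        return None
    instruction = {"device": device, "action": action}
    if action == "on":
        # Apply the speaker's saved settings when switching the device on.
        instruction["settings"] = dict(user.preferences.get(device, {}))
    return instruction


# Hypothetical users: alice may use the shower, bob may not.
alice = User("alice", {"shower"}, {"shower": {"temperature_c": 38}})
bob = User("bob")

print(handle_command(alice, "shower", "on"))
print(handle_command(bob, "shower", "on"))
```

Keeping the permission check ahead of the preference lookup means a user without access never triggers any device-specific logic.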



Prize

  • Turn this into your own business venture: we will be your first customer and help bring you media attention and more customers

  • Get entrepreneurial business coaching to help you start this as a business



And here are some potential benefits:

 

  • Mass exposure through a highly visible project

  • Build reputation

  • Recognized as an official collaborator and/or on GitHub

  • Get noticed

  • Product development experience

  • Work on projects you are passionate about

  • Get your project built and working in the real world

  • Participate in interesting work

  • Get grants (possibly by partnering with someone who can help, or through exposure to grant writers)

  • Change the world

 

Industry

Current technological level

 

You’ve probably heard of voice assistants like Alexa, Siri, Google Assistant, and Cortana. These assistants are essentially built on three components: speech recognition, natural language processing (NLP), and speech synthesis (see picture below).
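The three stages named above can be sketched as a small pipeline. This is a minimal illustration, not how any of the listed assistants is implemented: the regular-expression patterns stand in for a real NLP model, and `fake_recognize` and `fake_synthesize` are hypothetical stand-ins for real speech recognition and speech synthesis engines.

```python
import re

# Hypothetical phrase patterns; a real assistant would use a trained NLP model.
INTENT_PATTERNS = [
    (re.compile(r"turn (?:the )?(?P<device>[\w ]+?) (?P<state>on|off)$"), "set_power"),
    (re.compile(r"turn (?P<state>on|off) (?:the )?(?P<device>[\w ]+)$"), "set_power"),
    (re.compile(r"what(?:'s| is) the weather$"), "get_weather"),
]


def parse_intent(text):
    """Map a transcribed sentence to an intent dictionary."""
    cleaned = text.lower().strip(" ?!.")
    for pattern, intent in INTENT_PATTERNS:
        match = pattern.search(cleaned)
        if match:
            return {"intent": intent, **match.groupdict()}
    return {"intent": "unknown"}


def run_pipeline(audio, recognize, synthesize):
    """Run speech recognition, NLP, and speech synthesis in sequence."""
    text = recognize(audio)        # 1. speech recognition: audio -> text
    intent = parse_intent(text)    # 2. NLP: text -> intent
    if intent["intent"] == "set_power":
        reply = f"Turning {intent['state']} the {intent['device']}."
    elif intent["intent"] == "get_weather":
        reply = "Looking up the weather."
    else:
        reply = "Sorry, I did not understand that."
    return synthesize(reply)       # 3. speech synthesis: text -> audio


def fake_recognize(audio):
    """Stand-in recognizer: treats the 'audio' as already transcribed text."""
    return audio


def fake_synthesize(text):
    """Stand-in synthesizer: returns the text that would be spoken."""
    return text


print(run_pipeline("Turn the kitchen lights off!", fake_recognize, fake_synthesize))
```

Passing the recognizer and synthesizer in as functions keeps the middle stage testable without a microphone or speaker.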

 

There are also many open source projects like:

 

  1. Mycroft

  2. OpenAssistant

  3. Jasper

  4. LinTO

  5. Rhasspy

  6. Aimybox

  7. Leon

 

Many of these open source voice assistants appeared quite recently and will probably need some time to develop into more sophisticated solutions.

 

The problem with most of these platforms is that they do not run locally and are not private enough.



Some projects, such as Mycroft, offer solutions built around Google Home or Alexa. However, certain characteristics of these systems, namely the lack of data protection and of business vocabulary adaptation, limit them to a B2C market that is not (yet) concerned with data sensitivity and criticality issues.

 

There are also platforms, such as LinTO, that address these challenges from the start and are aimed at professional products.

 

One of the biggest challenges may be implementing a voice authenticator. The open source projects below, some of them found through the Awesome Open Source website, are worth checking to see whether any of them could be integrated with our system:

 

https://awesomeopensource.com/project/pyannote/pyannote-audio

 

https://codeocean.com/capsule/7271435/tree/v1

 

https://github.com/mravanelli/pytorch-kaldi

 

https://awesomeopensource.com/project/google/uis-rnn

 

https://alize.univ-avignon.fr/

ALIZE has a Java version as well.
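Speaker recognition toolkits commonly reduce a voice sample to a fixed-length vector (a speaker embedding), and a simple authenticator can then compare a new sample against enrolled users. The sketch below assumes such embeddings are already available; the vectors, the `identify_speaker` function, and the 0.75 threshold are hypothetical and would have to be replaced by values from a real model and real tuning.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify_speaker(embedding, enrolled, threshold=0.75):
    """Return the best-matching enrolled name, or None if no match is close enough."""
    best_name, best_score = None, threshold
    for name, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical 3-dimensional embeddings; real models produce far longer vectors.
enrolled = {
    "alice": [0.9, 0.1, 0.2],
    "bob": [0.1, 0.8, 0.5],
}

print(identify_speaker([0.85, 0.15, 0.25], enrolled))  # close to alice's voiceprint
print(identify_speaker([0.0, 0.1, -0.9], enrolled))    # matches nobody
```

Returning `None` for an unknown voice lets the assistant refuse commands instead of guessing who is speaking.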



Information

 

Repository

<text>

 

License Requirement

Open Source: Can be used for private or commercial projects

Software: GNU General Public License v3 (GNU GPLv3)

Non-Software: Creative Commons Attribution-ShareAlike 4.0 (CC BY-SA 4.0)

 

Project Areas
  • IoT Development (sensors, Arduino and Raspberry Pi)

  • Software Development (Python)?

  • <text>



Keywords: <text>



Project requirements

 

Stages and deadlines

 

Stage                                 | Deadline
Project Start                         | date
Team Formed                           | date
Market Research Summary (Report)      | date
Project Plan Complete                 | date
Preliminary Product Design Complete   | date
Prototype Development Complete        | date
Prototype Evaluation Complete         | date
Product Presentation                  | date
Project Completion                    | date

 

The project plan should cover the following:

 

  • stages / milestones of the project (the table above does not list every stage)

  • activities or tasks in each phase

  • task start and end dates

  • interdependencies between tasks

 

Also:

 

  • skills needed

  • responsibilities of each team member (identify as many as you can).

 

Product’s general requirements

https://docs.google.com/spreadsheets/d/1u0Ca9NZvKY6ex5JPtpl8M-HoaM-K8VBF4W4NGoJWlSo/edit?usp=sharing




Basic | Advanced | Function
      |          | Part I
      |          | Can it identify people via voice ID?
      |          | Can it easily understand people's accents?
      |          | Does it know users' preferences for using home appliances?
      |          | Can it adjust device settings based on the users' preferences?
      |          | Can it take commands only from people who have permission?
      |          | Can you set permissions for commands and information? e.g. select who can open doors.
      |          | Is the data sandboxed so that personal data is not sent to a public cloud for AI/ML?
      |          | Can you switch between online and offline queries?
      |          | Part II
      |          | Can I ask about all the information that is available in the Ocean Builders user app?
      |          | Can I ask about all the information that is available in the Ocean Builders admin panel app?
      |          | Can I give all the commands that are available in the Ocean Builders user app?
      |          | Can I give all the commands that are available in the Ocean Builders admin panel app?
      |          | Does it support <text> language?



Tips

 

Below are some examples of tools you can use to build a voice assistant:

gTTS (Google Text-to-Speech) is a Python library that converts text to speech through Google Translate's text-to-speech API, so it needs an internet connection.

SpeechRecognition is a Python library for performing speech recognition, with support for several engines and APIs, online and offline.

CMU Sphinx (PocketSphinx) is an offline recognition engine that the SpeechRecognition library can call.

 

Packt, a technical publisher, has material on speaker recognition, i.e. identifying the person who is speaking.
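Because the requirements ask for both privacy and switching between online and offline queries, the tools above could be combined so that audio stays local by default. The sketch below is one possible arrangement, not a prescribed design: `transcribe` and `build_speechrecognition_engines` are hypothetical helper names, while `Recognizer.recognize_sphinx` and `Recognizer.recognize_google` are methods of the SpeechRecognition library (the Sphinx one also requires the `pocketsphinx` package). The library is imported lazily so the routing logic can be tried without installing it.

```python
def transcribe(audio, offline_engine, online_engine=None, allow_online=False):
    """Transcribe audio locally unless online use is explicitly allowed."""
    if allow_online and online_engine is not None:
        try:
            return online_engine(audio)
        except Exception:
            # Network or service failure: fall back to the local engine.
            # With SpeechRecognition, sr.RequestError is the narrower
            # exception to catch here.
            pass
    return offline_engine(audio)


def build_speechrecognition_engines():
    """Return (offline, online) recognizers from the SpeechRecognition library.

    Assumes `pip install SpeechRecognition pocketsphinx` has been run.
    """
    import speech_recognition as sr  # lazy import: optional dependency

    recognizer = sr.Recognizer()
    return recognizer.recognize_sphinx, recognizer.recognize_google


# Demonstration with stand-in engines, so no microphone or network is needed.
def fake_offline(audio):
    return "offline: " + audio


def fake_online(audio):
    return "online: " + audio


print(transcribe("lights off", fake_offline))
print(transcribe("lights off", fake_offline, fake_online, allow_online=True))
```

With the real library, the audio passed in would be an `AudioData` object, for example one read from a WAV file with `sr.AudioFile` and `recognizer.record`.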

 

Voice Authentication 

https://courses.csail.mit.edu/6.857/2016/files/31.pdf

 

Not open source:

 

https://docs.google.com/document/d/1V-cyxivxKFXwYVUO21oleuXcAoXzR8QTKkpI5ug6E0A/edit

 

Project video link:

https://www.dropbox.com/s/j44y0z574mt1ohh/VoiceControl.mp4?dl=0