Custom Voice Activated Hotwords Trigger
Posted: 14 Jan 2018 17:06
Hi Martin,
I would like to request an always-on voice command trigger that uses a custom hotword. The trigger would fire when a specific keyword is recognized, similar to "OK Google", except that we can train our own custom keyword. It should work even when the screen is off. This is similar to AutoVoice continuous listening, except that it should work offline.
Currently, the Input Speech action has to be started by some other trigger (for example, the Shortcut trigger). To listen continuously, the flow must loop and run forever, and there is a short delay between loop iterations during which a command will be missed.
Recently an XDA developer (Humpie) created a plugin for Tasker that uses Snowboy, a hotword detection engine.
https://www.xda-developers.com/custom-v ... ds-tasker/
The plugin runs as a service and listens to the mic (thus locking the mic). It imports a pmdl file (a Snowboy pre-recorded hotword) and detects based on that file. We run the service and create the trigger in Automagic using the plugin event. When the plugin detects a keyword such as "computer", "jarvis" or "hey google", it sends the event to Automagic, which triggers the flow. Each trigger can be set to detect a specific keyword, so it is possible to use multiple hotwords with multiple triggers and branch the flow accordingly. I have proven the concept. Using the plugin, we can extend voice recognition to multi-language search or custom commands, such as "Halo Google" to search in German, "komputer" to search in Indonesian, or "wake up" to turn on the screen. We can train our own sound/keyword on the website (https://snowboy.kitt.ai/) and import it later for offline detection.
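To illustrate the branching idea, here is a minimal sketch in Python. It is not the plugin's or Automagic's actual code; the names `HANDLERS`, `on_hotword` and `log` are made up for this example, and the branches only record what they would do.

```python
from typing import Callable, Dict, List

# Records which branch ran; a real flow would perform the action instead.
log: List[str] = []

# Hypothetical mapping: each trained hotword gets its own branch.
HANDLERS: Dict[str, Callable[[], None]] = {
    "halo google": lambda: log.append("search in German"),
    "komputer": lambda: log.append("search in Indonesian"),
    "wake up": lambda: log.append("turn screen on"),
}


def on_hotword(hotword: str) -> bool:
    """Run the branch for a detected hotword; return False if none matches."""
    handler = HANDLERS.get(hotword.strip().lower())
    if handler is None:
        return False
    handler()
    return True
```

Calling `on_hotword("komputer")` runs only the Indonesian search branch, while an untrained word such as "jarvis" is ignored.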
Performance
Detection performance is much better than with Input Speech. The hotword plugin sends the event as soon as it recognizes the keyword, while Input Speech has to wait until we finish speaking. For example, say my keyword is "computer" and I say: "My computer is in the room". Input Speech won't react to the keyword on its own; it waits until I finish speaking before giving the result. With the hotword plugin, the event is already sent to Automagic right after I say "My computer", even though I haven't finished saying "is in the room". Thus it is possible to read out a paragraph of text containing multiple keywords and have the flows triggered (and branched accordingly) as we read. I have thought about several specific uses for this, ranging from showing off to the most evil one.
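The timing difference can be shown with a toy model in Python. This is only a sketch under a big assumption: real detection works on audio, not on a list of words, and `hotword_events` and `input_speech_result` are hypothetical names used for illustration.

```python
from typing import Iterable, Iterator, Set, Tuple


def hotword_events(words: Iterable[str], hotwords: Set[str]) -> Iterator[Tuple[int, str]]:
    """Yield (position, hotword) as soon as each hotword is heard."""
    for position, word in enumerate(words):
        token = word.lower().strip(".,!?")
        if token in hotwords:
            yield position, token


def input_speech_result(words: Iterable[str]) -> str:
    """Model of Input Speech: the result exists only after the whole utterance."""
    return " ".join(words)


utterance = "My computer is in the room".split()
events = list(hotword_events(utterance, {"computer"}))
```

In this model the hotword event is produced at the second word, with four words still to come, while `input_speech_result` needs the whole sentence before it returns anything.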
Problem
The main problem with the plugin is that it locks the mic. After the event is triggered, unless we use the plugin action to stop hotword detection, we can't use any other mic-related action. And as you know, when we use hotword detection, we will most likely want to give further input through Input Speech. So this is similar to the concept in the "Let's develop an assistant" thread, except that the trigger is hotword detection instead of a shortcut.
Using the plugin, the flow looks like this:
Trigger Hotword >> Stop Hotword detection >> Input Speech >> process and act on the recognized command >> at the end of the flow, Start Hotword detection.
So we have to add Stop Hotword at the beginning of the flow and Start Hotword again at the end of the flow. If we don't, Input Speech will fail with an error.
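The stop/start wrapping can be sketched in Python as below. `HotwordPlugin` is a stand-in I made up for the plugin's start and stop actions, not its real API. The `try`/`finally` is my own addition to make sure detection is started again even if Input Speech fails.

```python
from typing import Callable, List


class HotwordPlugin:
    """Stand-in for the plugin's start/stop actions (hypothetical API)."""

    def __init__(self) -> None:
        self.listening = True
        self.calls: List[str] = []

    def stop(self) -> None:
        self.listening = False
        self.calls.append("stop")

    def start(self) -> None:
        self.listening = True
        self.calls.append("start")


def run_flow(
    plugin: HotwordPlugin,
    input_speech: Callable[[], str],
    process: Callable[[str], None],
) -> None:
    """Flow body: free the mic, take the command, then resume detection."""
    plugin.stop()  # release the mic so Input Speech can use it
    try:
        command = input_speech()
        process(command)
    finally:
        plugin.start()  # always resume detection, even on error
```

Without the `finally` part, one failed speech input would leave hotword detection stopped until it is restarted by hand.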
If it were implemented inside Automagic, the flow would probably be just:
Trigger Hotword >> Input Speech >> process and act on the recognized command
No more stop/start, since Automagic already holds the mic resource from the beginning. Automagic can pause the trigger under the hood, so no additional action is needed. Maybe it only needs to switch which thread holds the mic, pausing the trigger and giving mic access to Input Speech. Of course the flow should be set to AEP - Skip, to avoid any mis-detection after the flow is triggered (while it is waiting for further speech input). Only after the flow has finished does mic access return to the trigger, allowing it to detect hotwords again.
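A rough Python sketch of that idea follows. I don't know how Automagic handles this internally, so everything here is an assumption: `HotwordTrigger` is a made-up class, and a non-blocking lock is used to imitate AEP - Skip (a detection that arrives while the flow is running is simply dropped).

```python
import threading
from typing import Callable


class HotwordTrigger:
    """Hypothetical built-in trigger that owns the mic and lends it to the flow."""

    def __init__(self, flow: Callable[[str], None]) -> None:
        self._flow = flow
        self._flow_running = threading.Lock()
        self.detecting = True  # True while the trigger holds the mic

    def on_detected(self, hotword: str) -> bool:
        # Non-blocking acquire imitates "AEP - Skip": if the flow is
        # already running, this detection is skipped instead of queued.
        if not self._flow_running.acquire(blocking=False):
            return False
        self.detecting = False  # pause detection, mic goes to Input Speech
        try:
            self._flow(hotword)
        finally:
            self.detecting = True  # flow finished, mic returns to the trigger
            self._flow_running.release()
        return True
```

The flow itself contains no stop or start step; pausing and resuming detection happens inside the trigger.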
I don't know whether the trigger can stay active while the flow is running (if we don't use AEP - Skip, or if we create an additional flow with another hotword trigger). But if it can stay active while another hotword-triggered flow is running, there should be an option (checkbox) to disable all hotword triggers (disable detection entirely) during certain media actions (Speech Output, Sound, etc.). Some parts of my flow use Speech Output to speak something, and I don't want that voice output to mis-trigger the hotword as well. The checkbox to disable it should be in the hotword trigger.
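The checkbox could behave roughly like this Python sketch. All names (`HotwordDetector`, `mute_during_media`, `media_action`) are invented for illustration and say nothing about how Automagic would really implement it.

```python
from contextlib import contextmanager
from typing import Iterator, List


class HotwordDetector:
    """Hypothetical detector with the proposed 'disable during media' checkbox."""

    def __init__(self, mute_during_media: bool = True) -> None:
        self.mute_during_media = mute_during_media  # the proposed checkbox
        self._media_active = 0  # number of media actions currently running
        self.fired: List[str] = []

    @contextmanager
    def media_action(self) -> Iterator[None]:
        """Wrap Speech Output, Sound, etc. so detection knows media is playing."""
        self._media_active += 1
        try:
            yield
        finally:
            self._media_active -= 1

    def on_audio_match(self, hotword: str) -> bool:
        """Fire the trigger unless it is muted by a running media action."""
        if self.mute_during_media and self._media_active > 0:
            return False
        self.fired.append(hotword)
        return True
```

With the checkbox on, a hotword heard while Speech Output is playing is ignored; with it off, the detector fires as usual.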
Interested
I am usually not a fan of voice commands. But after getting to know the powerful "always listening" feature, I have considered using it on certain occasions, especially when battery is not a concern (the phone is being charged). I am also working on a flow to show off Automagic to my friend. Currently I am still using the hotword plugin as part of the flow. But I wish Automagic could implement it built-in, as it would be more reliable, use fewer resources and integrate into the flow directly (no more mic locking issue). If I may suggest some names for the trigger: Voice Hotword Detected, Snowboy Hotword, Hotword Detected, Voice Keyword Detected, Always Listening, Hotword Continuous Detection.
BTW, using only the toolkit from Snowboy doesn't require any license at all (as commented by Humpie), but you should give them credit. It may be worth emailing them to confirm that it is OK to include it in Automagic.
Regards,
Desmanto