Any examples of multiple keywords and an uploaded .wav file #117
Comments
As you are talking about uploading a wav file, I assume you want to run pocketsphinx on the server side, in which case it does not make much sense to run pocketsphinx.js. You'd rather run a natively compiled version, possibly wrapped in a JavaScript interface if you want to talk to node.js. As for having multiple keywords or phrases, it only works by providing a file that contains one keyword/phrase per line. The argument to point to that file is
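To illustrate the "one keyword/phrase per line" file described above, here is a minimal sketch that builds the text of such a file from phrases collected at runtime. The function name `buildKeywordList` and the default threshold are hypothetical; the `/1e-20/` suffix follows the per-phrase threshold syntax of PocketSphinx keyword lists, and the right value would need tuning per phrase. How the resulting text is made available to pocketsphinx.js as a file is not shown here.

```javascript
// Hypothetical helper: turn user-supplied phrases into keyword-list text,
// one phrase per line, each followed by a detection threshold.
// The default threshold is an assumption and should be tuned per phrase.
function buildKeywordList(phrases, threshold = "1e-20") {
  const lines = phrases
    .map((p) => p.trim().replace(/\s+/g, " ")) // normalize whitespace
    .filter((p) => p.length > 0)               // drop empty entries
    .map((p) => `${p} /${threshold}/`);
  return lines.length > 0 ? lines.join("\n") + "\n" : "";
}

// Usage: phrases typed into a form field, split on commas.
const kwsText = buildKeywordList("hello world, stop".split(","));
```

Every word used in a phrase also has to be present in the pronunciation dictionary the recognizer was initialized with.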
Actually, I really do want to enable a user to process a wav file entirely via front-end JavaScript. I already have a working server-side version of pocketsphinx (Python and command line). The advantage of running it entirely in front-end JavaScript is that the audio data would not have to move from where it already is, on someone else's computer. That would let me avoid security review meetings and a lot of back-and-forth email about permissions: if the data doesn't move, there are fewer hoops to jump through. Additionally, I wouldn't have to maintain server code and a server environment, or run the processing for other people on my own machine. I could just publish an internal GitHub Pages site and leave it there for end users as a user-supplied keywords + video -> keyword tagging -> data visualization service.
I have looked at the README and got a small multiple-keyword tagging example sort of working based on the instructions there, but performance is poor relative to the server-side code. "Two" is being found very often even though it isn't in the dict or keywords file. Additionally, it seems that a hard refresh or a restart of the browser (Chrome) doesn't always pick up certain code changes. I'm not sure whether that is an Emscripten-related issue; I'm new to Emscripten. I was hoping someone had an example along the lines of what I want (multiple-keyword tagging with keywords supplied at runtime via the UI) to speed up debugging. In any case, thanks for your work on this project.
If you do not need to upload the recording to a server, then pocketsphinx.js is for sure a good solution. There is no reason why performance, in terms of recognition rate, should be different in the browser compared to a natively compiled version. Of course, this assumes you are using the same acoustic model for both tests, and the same init parameters (pocketsphinx.js displays them in the JavaScript console at init time), so you might want to check that. If you have an example of inconsistent performance, you can send it along. I don't think setting multiple keywords at runtime would work, as the only way to set multiple keywords or phrases is via a file, not via the API. But there might be a way to dynamically create something that looks like a file to pocketsphinx.js. Otherwise, you might be able to get something working well using grammars, which can be dynamically set via the API. JavaScript generated by emscripten should be cached by the browser the same way as other JavaScript files. For wasm files, I don't know. At least a hard refresh should work. Feel free to share your code, we could then integrate it directly into pocketsphinx.js.
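As a rough sketch of the grammar route mentioned above, the function below builds a single-state grammar that loops over a set of runtime-supplied words. It assumes the grammar object shape shown in the pocketsphinx.js README (`numStates`, `start`, `end`, and a `transitions` array of `{from, to, word}`); the function name `buildKeywordLoopGrammar` is hypothetical, and passing the object to the recognizer is left out.

```javascript
// Hypothetical helper: build a one-state grammar in which every supplied
// word is a self-loop, so any sequence of those words is accepted.
// Assumes the {numStates, start, end, transitions} shape from the README.
function buildKeywordLoopGrammar(words) {
  const unique = [...new Set(
    words.map((w) => w.trim()).filter((w) => w.length > 0)
  )];
  return {
    numStates: 1,
    start: 0,
    end: 0,
    transitions: unique.map((word) => ({ from: 0, to: 0, word })),
  };
}

// Usage: words entered by the user at runtime.
const grammar = buildKeywordLoopGrammar(["ONE", "TWO", "ONE"]);
```

Unlike keyword spotting, a grammar like this has no notion of "none of the above", so speech outside the word list will tend to be mapped onto the closest listed word.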
This is a small demo that has both grammar and keyword options, based on the demos in the webapp folder. Getting a multiple-keyword spotting demo to work on GitHub Pages required moving some files from the master branch to the gh-pages branch that weren't previously there. I'm noting it because it caused me some confusion initially about why things didn't work. The demo does keyword spotting for multiple keywords based on the words in the dict.txt and kws.dict files. Currently, it uses static versions of each file, but the plan is to have users create those files via a preceding webpage, save them to the local computer, and load them as a pre-step to running the main pocketsphinx.js webpage. I'll look into the init parameters next. I might have to put this project on the shelf soon, but I hope to get back to it. To help users convert their video files into audio prepped for pocketsphinx.js entirely in the browser, I'll be using this tool, which is an Emscripten conversion of the command-line tool FFmpeg.
I would like to use this for keyword tagging on a wav file that the user would upload. I would like to do multiple keywords and have the keywords be supplied by the user at JavaScript runtime.
I already have a working python version of pocketsphinx that does this. I want to implement the same functionality in JavaScript.
I've gone through the issues, and although several people had similar questions, none provided examples of working code. @miguelmota provided a working version that was close to what I want in terms of multiple keywords, but his keywords are defined before converting to JavaScript. There was also some discussion of the keyword issue in 2016 suggesting that adding keywords at runtime on the JavaScript end might not be possible.
Any examples of these features (1. working on a loaded wav file 2. Keywords provided by the user at JavaScript runtime) would be appreciated.
Thanks to everyone who has worked on this project!
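For the first requested feature (working on a loaded wav file), a minimal sketch of reading 16-bit PCM samples out of a WAV file in plain JavaScript is shown below. The function name `parseWav16` is hypothetical. In a browser, the `ArrayBuffer` could come from `file.arrayBuffer()` on a `File` picked through an `<input type="file">`; handing the samples to the recognizer is not shown. PocketSphinx acoustic models typically expect 16 kHz mono audio, so the returned `sampleRate` and `channels` are worth checking before recognition.

```javascript
// Hypothetical helper: extract 16-bit PCM samples from a RIFF/WAVE file
// held in an ArrayBuffer. Only uncompressed 16-bit PCM is handled.
function parseWav16(buffer) {
  const view = new DataView(buffer);
  const tag = (off) => String.fromCharCode(
    view.getUint8(off), view.getUint8(off + 1),
    view.getUint8(off + 2), view.getUint8(off + 3));
  if (buffer.byteLength < 12 || tag(0) !== "RIFF" || tag(8) !== "WAVE") {
    throw new Error("Not a RIFF/WAVE file");
  }
  let fmt = null;
  let offset = 12; // first chunk starts after the 12-byte RIFF header
  while (offset + 8 <= view.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true); // little-endian
    const body = offset + 8;
    if (id === "fmt ") {
      fmt = {
        format: view.getUint16(body, true),        // 1 means PCM
        channels: view.getUint16(body + 2, true),
        sampleRate: view.getUint32(body + 4, true),
        bitsPerSample: view.getUint16(body + 14, true),
      };
    } else if (id === "data") {
      if (!fmt) throw new Error("data chunk before fmt chunk");
      if (fmt.format !== 1 || fmt.bitsPerSample !== 16) {
        throw new Error("Only 16-bit PCM is supported");
      }
      // Tolerate files whose declared size exceeds the actual length.
      const available = Math.min(size, view.byteLength - body);
      const samples = new Int16Array(Math.floor(available / 2));
      for (let i = 0; i < samples.length; i++) {
        samples[i] = view.getInt16(body + 2 * i, true);
      }
      return { sampleRate: fmt.sampleRate, channels: fmt.channels, samples };
    }
    offset = body + size + (size % 2); // chunks are padded to even length
  }
  throw new Error("No data chunk found");
}
```

Reading through `DataView` with explicit little-endian flags avoids both alignment problems and any dependence on the host's byte order.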