
PIONEERS Exploring Speech to Text

Discussion in 'Developers' Corner' started by wardmundy, Jan 12, 2012.

  1. ghurty Senior Member

    This googletts sounds better than Allison!

    I have been fooling around with it. However, when I pass it a number, it reads out the individual digits instead of reading it as a whole number (one thousand five hundred and forty-five).

    Any suggestions?

    Thanks
  2. wardmundy Nerd Uno

  3. lzaf Guru

    For numbers up to 9 digits, the engine reads the value as a whole number (e.g. 284956286 is read as "two hundred eighty-four million nine hundred fifty-six thousand two hundred eighty-six").
    For more than 9 digits, it reads each digit individually
    (e.g. 1284956286 is read as "one two eight four nine five ..." etc.).

    This cannot be tuned, as far as I know.
  4. wardmundy Nerd Uno

    Try this approach. Add a colon or space between digits for normal cadence, or add a colon and a comma for a pause after a digit is spoken. In a future version of googletts.agi, perhaps a syntax could be added to handle this automagically, e.g. "[843-123-4567]" or "[8431234567]" would actually send Google the string as shown below. This gets a little more complex with international dialing obviously. If you're grabbing a CallerID number and passing it to this AGI script, then you obviously want to pass the CallerID number in the way it was received (which is typically all digits with no punctuation).

    Code:
    exten => 444,n,agi(googletts.agi,"8 4 3:,1:2:3:,4:5:6:7",en)
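A minimal sketch of the "[8431234567]" idea from the post above, assuming a North American 3-3-4 grouping: it strips punctuation, puts a colon between digits, and a colon plus comma between groups. The function name `spell_digits` and its `groups` parameter are made up for illustration and are not part of googletts.agi.

```python
import re


def spell_digits(number, groups=(3, 3, 4)):
    """Format a phone number so each digit is spoken separately.

    Digits inside a group are joined with ":" (normal cadence) and
    groups are joined with ":," (a pause after the group).
    """
    digits = re.sub(r"\D", "", number)  # drop dashes, spaces, brackets
    chunks = []
    pos = 0
    for size in groups:
        chunk = digits[pos:pos + size]
        if chunk:
            chunks.append(":".join(chunk))
        pos += size
    if pos < len(digits):
        # Any leftover digits (e.g. longer international numbers)
        # go into one final group.
        chunks.append(":".join(digits[pos:]))
    return ":,".join(chunks)


if __name__ == "__main__":
    print(spell_digits("843-123-4567"))
```

The result could then be passed as the text argument in the dialplan line above. International numbers would need a different `groups` tuple.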
  5. sukasem Guru

    Hi,
    Is there any way that the asterisk-speech-recog script could take both keyed-in digits and voice input?

    And maybe some magic words that make the script process right away, like when you say Yes, No, or Stop...

    Cheers,
  6. lzaf Guru

    Speech recognition does not happen in real time. The voice data is first recorded and then sent to Google for processing. This makes voice control of the application practically impossible.
  7. lgaetz Pundit

    In the back of my mind I have been thinking that S2T should be useful for the rotary phone enthusiast crowd. The devices are still usable, but it is getting harder and harder to find TDM/ATA devices that will accept pulse dialing. The only way I can think of to end the recording without a pound key is a silence timeout. Is that feasible? Is silence in a phone audio stream difficult to define or detect?
  8. lzaf Guru

    That's actually a good idea, and yes, it is possible. I've just tweaked the script to add silence detection. Now, after 3 seconds of silence, the recording stops and the script proceeds to send the voice data to Google and get back the results. Keep in mind that silence detection is not always perfect. It might not work very well on some old analog or low-quality phones that add static noise, or when there is a lot of background noise.
    The latest code can be found here. I'm not sure if the 3-second timeout is practical; I'm always open to suggestions.
    Have fun testing it :biggrin5:
  9. lgaetz Pundit

    Perhaps a user selectable number of seconds with default of zero to disable it.
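To illustrate the idea only (this is not how the AGI script itself detects silence), here is a rough sketch that scans signed PCM samples for a run of quiet samples lasting `timeout_s` seconds, with a timeout of zero disabling the check as suggested above. The function name, the 8000 Hz rate, and the amplitude threshold of 500 are all assumptions picked for the example.

```python
def find_silence_cutoff(samples, sample_rate=8000, threshold=500, timeout_s=3):
    """Return the sample index where recording should stop, or None.

    A sample counts as "quiet" when its absolute amplitude is below
    `threshold`. Recording stops once `timeout_s` seconds of
    consecutive quiet samples have been seen. A timeout of 0 (or
    less) disables silence detection entirely.
    """
    if timeout_s <= 0:
        return None
    needed = int(sample_rate * timeout_s)
    quiet_run = 0
    for index, sample in enumerate(samples):
        if abs(sample) < threshold:
            quiet_run += 1
            if quiet_run >= needed:
                return index + 1
        else:
            quiet_run = 0  # any loud sample restarts the count
    return None


if __name__ == "__main__":
    # One second of speech-level audio followed by three seconds of silence.
    audio = [1000] * 8000 + [0] * 24000
    print(find_silence_cutoff(audio))
```

A line with constant static above the threshold never produces a cutoff in this sketch, which matches the caveat about noisy analog phones.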
  10. wardmundy Nerd Uno

    3 seconds actually works pretty well. I've cleaned out all the previous calls so you can try the demo link for yourself: 1-405-FOR-WOLF. Everything can be triggered by doing nothing after the prompts. Here's the actual dialplan code for those that are curious:


    Code:
    ; Wolfram Alpha Dialplan Interface for PIAF2 servers
    exten => 4748,1,Answer()
    exten => 4748,2,Wait(1)
    exten => 4748,3,Set(calledbefore=${DB_EXISTS(blacklist/${CALLERID(num)})})
    exten => 4748,4,Noop(${CALLERID(num)})
    exten => 4748,5,Noop(${calledbefore})
    exten => 4748,6,GotoIf($["foo${calledbefore}" = "foo1"]?11:51)
    exten => 4748,7,Goto(90)
    exten => 4748,10,Set(removed=${DB_DELETE(blacklist/${CALLERID(num)})})
    exten => 4748,11,Flite("Hi. Thanks for calling. We're very sorry. In order to give everyone an opportunity to try this service, we've had to limit calls to one call per person: You still can beat the system. Just call back from a different phone number. Have a great day. Good bye.")
    exten => 4748,12,Goto(91)
    exten => 4748,50,Set(DB(blacklist/${CALLERID(num)})=${CALLERID(num)})
    exten => 4748,51,swift("Seriously,, After the beep, Say your question, then Press the pound key, or remain quiet.")
    exten => 4748,52(record),agi(speech-recog.agi,en-US)
    exten => 4748,53,Noop(= Script returned: ${status} , ${id} , ${confidence} , ${utterance} =)
    exten => 4748,54,swift("${utterance}")
    exten => 4748,55,Background(vm-star-cancel)
    exten => 4748,56,Background(continue-english-press)
    exten => 4748,57,Background(digits/1)
    exten => 4748,58,Read(PROCEED,beep,1,,1,3)                                        
    exten => 4748,59,GotoIf($["foo${PROCEED}" = "foo1"]?70)
    exten => 4748,60,GotoIf($["foo${PROCEED}" = "foo"]?70:90)
    exten => 4748,70,Set(DB(blacklist/${CALLERID(num)})=${CALLERID(num)})
    exten => 4748,71,Set(FILE(/tmp/query.txt)=${utterance})
    exten => 4748,72,Background(one-moment-please)
    exten => 4748,73,System(/var/lib/asterisk/agi-bin/4747)
    exten => 4748,74,Set(foo=${FILE(/tmp/results.txt)})
    exten => 4748,75,swift("${foo}")
    exten => 4748,76,Goto(90)
    exten => 4748,90,swift("Have a nice day! Good bye.")
    exten => 4748,91,hangup
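For readers who don't read dialplan, the control flow above can be sketched roughly as follows. As written, priority 6 sends a repeat caller to the apology at 11 and everyone else to the prompt at 51, and the caller's number is stored at priority 70 once the query is confirmed by pressing 1 or by letting the prompt time out. In this sketch a plain Python set stands in for the Asterisk database, and the function name and return labels are invented for illustration.

```python
def handle_call(caller_id, keypress, blacklist):
    """Decide what happens to a call, mirroring the dialplan's branches.

    caller_id -- the caller's number
    keypress  -- digit entered at the confirmation prompt, or "" on timeout
    blacklist -- set of numbers that have already used the service
    """
    if caller_id in blacklist:
        return "limit-message"      # priorities 11-12, then hang up
    if keypress in ("1", ""):
        blacklist.add(caller_id)    # priority 70
        return "run-query"          # priorities 71-76
    return "goodbye"                # priority 90


if __name__ == "__main__":
    seen = set()
    print(handle_call("8431234567", "", seen))   # first call, prompt timed out
    print(handle_call("8431234567", "1", seen))  # second call from same number
```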
    
  11. lgaetz Pundit

  12. Aaron D. Vail New Member

    I posted in this thread about 6 months ago, and unfortunately the post disappeared. Even worse, so did the response. I've changed things up a bit since then: 6 months ago I was running PIAF Purple, and now I am on PIAF Green. I hoped that my post (and the reply that fixed it) was still here, because in moving to Green I lost my VM script that combined this thread with an MP3 script. The plus side is that I can get the test below to work now, as it wouldn't on my abused install of Purple.
    Code:
    flac --best --sample-rate=8000 msg0000.wav -o msg0000.flac
    speech-recog-cli.pl msg0000.flac | head -2 | tail -1 | cut -f 2 -d ":"
    My original post stated that I don't know Perl at all, and while I can sort of debug it, I still get lost in the script below. What I would like to do is somehow execute the above commands in the script below and have the results parsed into the temp email file. The end result would be an email with the transcription and an attached MP3 instead of a WAV file. Now the MP3 code....
    Code:
    #!/usr/bin/perl
    open(VOICEMAIL,"|/usr/sbin/sendmail -t");
    open(LAMEDEC,"|/usr/bin/dos2unix|/usr/bin/base64 -di|/usr/local/bin/lame --quiet --preset voice - /var/spool/asterisk/tmp/vmout.$$.mp3");
    open(VM,">/var/spool/asterisk/tmp/vmout.debug.txt");
    my $inaudio = 0;
    loop: while(<>){
      if(/^\.$/){
        last loop;
      }
      if(/^Content-Type: audio\/x-wav/i){
        $inaudio = 1;
      }
      if($inaudio){
        while(s/^(Content-.*)wav(.*)$/$1mp3$2/gi){}
        if(/^\n$/){
          iloop: while(<>){
            print LAMEDEC $_;
            if(/^\n$/){
              last iloop;
            }
          }
          close(LAMEDEC);
          print VOICEMAIL "\n";
          print VM "\n";
          open(B64,"/usr/bin/base64 /var/spool/asterisk/tmp/vmout.$$.mp3|");
          while(<B64>){
            print VOICEMAIL $_;
          print VM $_;
          }
          close(B64);
          print VOICEMAIL "\n";
          print VM "\n";
          $inaudio = 0;
        }
      }
      print VOICEMAIL $_;
      print VM $_;
    }
    print VOICEMAIL "\.";
    print VM "\.";
    close(VOICEMAIL);
    close(VM);
     
    #CLEAN UP THE TEMP FILES CREATED
    #This has to be done in a separate cron type job
    #because unlinking at the end of this script is too fast,
    #the message has not even gotten piped to send mail yet
     
    
    So any help, or a possible restore of the "missing" or "removed" posts, would be GREATLY appreciated. Until then I'll just get MP3s (less space on my phone when I receive them).

    Aaron
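As a rough illustration of what the `head -2 | tail -1 | cut -f 2 -d ":"` pipeline in the post above selects, here is a small sketch that takes the text printed by a command and returns the second colon-separated field of its second line. The function name and the sample text in the example are made up; the real output format of speech-recog-cli.pl is not shown in this thread.

```python
def second_line_field(output, delimiter=":", field=2):
    """Mimic `head -2 | tail -1 | cut -f FIELD -d DELIMITER`.

    Takes the second line of `output` (or the only line if there is
    just one) and returns the requested 1-based field. Like cut, a
    line without the delimiter is returned unchanged.
    """
    lines = output.splitlines()
    if not lines:
        return ""
    line = lines[:2][-1]            # head -2 | tail -1
    parts = line.split(delimiter)
    if len(parts) == 1:
        return line                 # cut passes such lines through
    if field > len(parts):
        return ""
    return parts[field - 1]


if __name__ == "__main__":
    # Hypothetical command output, for illustration only.
    sample = "first : 0\nsecond : hello world\nthird : 0.9\n"
    print(second_line_field(sample).strip())
```

The returned field keeps any leading space, just as cut does, so calling `.strip()` on it is usually wanted before writing it into a message body.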
  13. wardmundy Nerd Uno

  14. lgaetz Pundit

    There are options for recovering lost content. If you can structure a Google search such that the result lists the missing post, you may still be able to get it from Google's search cache. There is also the possibility it may be readable in the monster PDF file (link someone?).
    Last edited by lgaetz, May 29, 2013
  15. wardmundy Nerd Uno

  16. Aaron D. Vail New Member

    Sorry to hear about the crash. What is really sad is that if I had done the project a month ago, I would still have it :D I will search the PDF and see what I can find. I will post my findings so that they may be available to others (assuming I find something), or let you know if I need someone to help me again (if I don't).
  17. Cam__ Member

  18. Cam__ Member

    Unfortunately that monster file doesn't help much, because all paragraphs are truncated after the first line, and even code blocks with long lines get truncated. Also, a couple of times I have searched that file for something I knew should be in there yet the search turned up empty. :(
  19. Aaron D. Vail New Member

  20. Aaron D. Vail New Member

    My Original Post ....

    Code:
        #!/usr/bin/perl
        open(VOICEMAIL,"|/usr/sbin/sendmail -t");
        open(LAMEDEC,"|/usr/bin/dos2unix|/usr/bin/base64 -di|/usr/bin/lame --quiet --preset voice - /var/spool/asterisk/tmp/vmout.$$.mp3");
        open(VM,">/var/spool/asterisk/tmp/vmout.debug.txt");
        my $inaudio = 0;
        loop: while(<>){
          if(/^\.$/){
            last loop;
          }
          if(/^Content-Type: audio\/x-wav/i){
            $inaudio = 1;
          }
          if($inaudio){
            while(s/^(Content-.*)wav(.*)$/$1mp3$2/gi){}
            if(/^\n$/){
              iloop: while(<>){
                print LAMEDEC $_;
                if(/^\n$/){
                  last iloop;
                }
              }
              close(LAMEDEC);
              print VOICEMAIL "\n";
              print VM "\n";
              open(B64,"/usr/bin/base64 /var/spool/asterisk/tmp/vmout.$$.mp3|");
              while(<B64>){
                print VOICEMAIL $_;
                print VM $_;
              }
              close(B64);
              print VOICEMAIL "\n";
              print VM "\n";
              $inaudio = 0;
            }
          }
          print VOICEMAIL $_;
          print VM $_;
        }
        print VOICEMAIL "\.";
        print VM "\.";
        close(VOICEMAIL);
        close(VM);
       
        #CLEAN UP THE TEMP FILES CREATED
        #This has to be done in a separate cron type job
        #because unlinking at the end of this script is too fast,
        #the message has not even gotten piped to send mail yet
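For anyone who, like the poster, finds the Perl hard to follow: the `while(s/^(Content-.*)wav(.*)$/$1mp3$2/gi){}` line repeatedly rewrites "wav" to "mp3" in a Content- header until nothing is left to replace. A small sketch of that one step is shown here; the function name is made up and the sample header in the example is only a plausible illustration, not taken from a real message.

```python
import re

# Same pattern as the Perl substitution, case-insensitive.
_CONTENT_WAV = re.compile(r"^(Content-.*)wav(.*)$", re.IGNORECASE)


def rewrite_wav_headers(line):
    """Replace every "wav" in a Content-* header line with "mp3".

    Each pass replaces the last remaining "wav" (the first group is
    greedy), so the loop runs until the line stops changing. Lines
    that do not start with "Content-" are returned untouched.
    """
    while True:
        rewritten = _CONTENT_WAV.sub(r"\g<1>mp3\g<2>", line, count=1)
        if rewritten == line:
            return line
        line = rewritten


if __name__ == "__main__":
    header = 'Content-Type: audio/x-wav; name="msg0000.wav"\n'
    print(rewrite_wav_headers(header), end="")
```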
