How to stream audio to the speech API while recording? #98

@rena1234

Hello guys, I'm trying to reduce the latency in my project, so I have been trying to send chunks of audio while it is still being recorded. What I have tried so far is to combine code that sends a previously recorded audio file in chunks:

from wit import Wit

def RecognizeSpeech(AUDIO_FILENAME, CHUNK_SIZE):
    client = Wit('MYTOKENHERE')

    def wavIterator():
        # yield the recorded file in CHUNK_SIZE-byte pieces; the file is closed when done
        with open(AUDIO_FILENAME, 'rb') as wav:
            chunk = wav.read(CHUNK_SIZE)
            while chunk:
                yield chunk
                chunk = wav.read(CHUNK_SIZE)

    resp = client.speech(wavIterator(), None,
                         {'Content-Type': 'audio/wav', 'Transfer-encoding': 'chunked'})
    return resp

with the code from this tutorial: https://indianpythonista.wordpress.com/2017/04/10/speech-recognition-using-wit-ai/
and ended up with this Frankenstein:

import pyaudio
from wit import Wit

def recReturnWavIterator(RECORD_SECONDS, CHUNK_SIZE, client):

    #--------- SETTING PARAMS FOR OUR AUDIO FILE ------------#
    FORMAT = pyaudio.paInt16    # 16-bit signed samples
    CHANNELS = 2                # no. of audio channels
    RATE = 44100                # frames per second
    CHUNK = CHUNK_SIZE          # frames per buffer read
    #--------------------------------------------------------#

    # creating PyAudio object
    audio = pyaudio.PyAudio()

    # open a new stream for microphone
    # It creates a PortAudio Stream Wrapper class object
    stream = audio.open(format=FORMAT, channels=CHANNELS,
                        rate=RATE, input=True,
                        frames_per_buffer=CHUNK)
    print("Listening")
    try:
        for i in range(int(RATE / CHUNK * RECORD_SECONDS) - 1):
            # read audio stream from microphone (raw PCM bytes, no WAV header)
            data = stream.read(CHUNK)
            yield data
    finally:
        # release the microphone even if the consumer stops early
        stream.stop_stream()
        stream.close()
        audio.terminate()
    print("Finished recording")

def RecognizeSpeech(CHUNK_SIZE):
    client = Wit('MYTOKENHERE')
    resp = client.speech(recReturnWavIterator(5, CHUNK_SIZE, client), None,
                         {'Content-Type': 'audio/wav', 'Transfer-encoding': 'chunked'})
    print('Yay, got Wit.ai response: ' + str(resp))

which always returns an empty text:
Yay, got Wit.ai response: {'_text': None, 'entities': {}, 'msg_id': '7ed74ba3-698a-41a9-8158-8dd3857c3808'}

Is it possible to do something like this? If so, how?
PS: Sorry, I am not very experienced with programming.
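
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the file-based version sends a real WAV file (header included), while the microphone generator yields bare PCM frames that are still labelled 'audio/wav'. A minimal sketch of prepending a WAV header to a stream of PCM chunks is below. The helper names `wav_header` and `with_wav_header` are hypothetical, the defaults mirror the recording parameters above (2 channels, 16-bit, 44100 Hz), and using 0xFFFFFFFF as a placeholder for the unknown total length is a common streaming convention that the service may or may not accept.

```python
import struct

UNKNOWN_SIZE = 0xFFFFFFFF  # placeholder when the total length is not known up front


def wav_header(channels, sample_width, rate, data_size=None):
    """Build a 44-byte canonical PCM WAV header.

    sample_width is in bytes (2 for 16-bit audio). If data_size is None,
    both size fields are set to the UNKNOWN_SIZE placeholder.
    """
    if data_size is None:
        riff_size = data_bytes = UNKNOWN_SIZE
    else:
        riff_size, data_bytes = 36 + data_size, data_size
    block_align = channels * sample_width  # bytes per frame
    return struct.pack(
        '<4sI4s4sIHHIIHH4sI',
        b'RIFF', riff_size, b'WAVE',
        b'fmt ', 16, 1, channels, rate,   # 16-byte fmt chunk, format 1 = PCM
        rate * block_align, block_align, sample_width * 8,
        b'data', data_bytes,
    )


def with_wav_header(pcm_chunks, channels=2, sample_width=2, rate=44100):
    """Yield a WAV header first, then every PCM chunk unchanged."""
    yield wav_header(channels, sample_width, rate)
    for chunk in pcm_chunks:
        yield chunk
```

The microphone generator could then be wrapped before it is passed on, for example `client.speech(with_wav_header(recReturnWavIterator(5, CHUNK_SIZE, client)), None, {...})`, keeping the same call shape as above. It may also be worth checking the Wit.ai HTTP API documentation for a raw-audio content type and for any restrictions on channel count or sample rate, since a stereo 44100 Hz stream might not be supported.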
