Tags: python, google-cloud-platform, google-cloud-storage, google-cloud-datalab, librosa

How to read an audio file from a Google Cloud Storage bucket and play it with IPython.display (ipd) in a Datalab notebook


I want to play a sound file in a Datalab notebook. The file is stored in a Google Cloud Storage bucket. How can I read it and play it?


Solution

  • import io
    import IPython.display as ipd
    import soundfile as sf
    from google.cloud import storage
    
    BUCKET = 'some-bucket'
    
    # Create a Cloud Storage client.
    gcs = storage.Client()
    
    # Get the bucket that the file will be read from.
    bucket = gcs.get_bucket(BUCKET)
    
    # specify a filename
    file_name = 'some_dir/some_audio.wav'
    
    # download the blob's contents; despite its name, download_as_string()
    # returns bytes, not a str
    blob = bucket.blob(file_name)
    file_as_string = blob.download_as_string()
    
    # wrap the bytes in a file-like object and decode them into audio samples
    # (floats) plus the audio sample rate
    data, sample_rate = sf.read(io.BytesIO(file_as_string))
    
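    # sf.read returns a (frames, channels) array for multichannel audio and a
    # 1-D array for mono, so only index column zero (assumed to be the left
    # channel) when a channel axis exists
    left_channel = data[:, 0] if data.ndim > 1 else data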
    
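    # show an audio player in the notebook; this must be the last expression
    # in the cell (or be passed to ipd.display) for the player to appear
    ipd.Audio(left_channel, rate=sample_rate)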
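The key step in the answer above is wrapping the downloaded bytes in `io.BytesIO` so they can be read like a file. As a rough sketch of that pattern without any Cloud Storage access, the snippet below builds a small WAV file in memory and reads its header back with the standard-library `wave` module. This is a swapped-in technique for sanity-checking bytes, not what the answer uses: it only handles PCM WAV, whereas `soundfile` supports more formats. The helper names `make_test_wav` and `describe_wav` are hypothetical, and the in-memory bytes stand in for whatever the blob download returns.

```python
import io
import math
import struct
import wave


def make_test_wav(sample_rate=8000, n_frames=800, channels=2):
    """Build a small 16-bit PCM WAV in memory (stand-in for downloaded bytes)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        w.setframerate(sample_rate)
        for i in range(n_frames):
            # 440 Hz sine at half amplitude, same value in every channel
            value = int(32767 * 0.5 * math.sin(2 * math.pi * 440 * i / sample_rate))
            w.writeframes(struct.pack("<h", value) * channels)
    # wave does not close a file object it did not open, so buf is still usable
    return buf.getvalue()


def describe_wav(wav_bytes):
    """Return (sample_rate, channels, frames) from in-memory WAV bytes."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return w.getframerate(), w.getnchannels(), w.getnframes()


wav_bytes = make_test_wav()
info = describe_wav(wav_bytes)
print(info)  # (8000, 2, 800)
```

A check like this can show whether a file is mono or stereo before indexing a channel, which is the case where `data[:, 0]` would otherwise fail.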