I am using Python with the opensmile library. My goal is to generate *.arff files for use in the Weka 3 ML tool. My problem is that it is unclear to me how to save the extracted features to an *.arff file.
For example:
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,
    feature_level=opensmile.FeatureLevel.Functionals,
)
y = smile.process_file('audio.wav')
# TODO: save y as ARFF
It should be possible, since there are questions about the generated files (e.g. here). However, I can't find anything specific about how to do it.
Instead of generating ARFF directly, you could generate a CSV file in a format that Weka can load:
import csv
import pandas as pd
import opensmile

# configure feature generation
smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,
    feature_level=opensmile.FeatureLevel.Functionals,
)

# the audio files to generate features from
audio_files = [
    '000000.wav',
    '000001.wav',
    '000002.wav',
]

# generate features (one DataFrame per audio file)
ys = []
for audio_file in audio_files:
    y = smile.process_file(audio_file)
    ys.append(y)

# combine features and save as CSV
data = pd.concat(ys)
data.to_csv('audio.csv', quotechar='\'', quoting=csv.QUOTE_NONNUMERIC)
As a second (and optional) step, convert the CSV file to ARFF using Weka's CSVLoader class from the command line:
java -cp weka.jar weka.core.converters.CSVLoader audio.csv > audio.arff
NB: You will need to adjust the paths to the audio files, weka.jar, and the CSV and ARFF files to fit your environment, of course.
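If you would rather skip the Java step, an ARFF file containing only numeric attributes is simple enough to write by hand. The following is a minimal sketch, not part of opensmile or Weka: `quote_arff_name`, `format_value` and `to_arff` are hypothetical helpers, and the feature names and values in the example are made up. It assumes every feature is numeric and writes non-finite values as `?`, the ARFF marker for missing data.

```python
import math


def quote_arff_name(name):
    """Single-quote a name, escaping backslashes and single quotes."""
    escaped = str(name).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def format_value(value):
    """Format one numeric cell; None, NaN and infinities become '?'."""
    if value is None:
        return "?"
    value = float(value)
    return repr(value) if math.isfinite(value) else "?"


def to_arff(relation, names, rows):
    """Build ARFF text for purely numeric data (hypothetical helper)."""
    lines = [f"@relation {quote_arff_name(relation)}", ""]
    lines += [f"@attribute {quote_arff_name(n)} numeric" for n in names]
    lines += ["", "@data"]
    for row in rows:
        if len(row) != len(names):
            raise ValueError("row length does not match attribute count")
        lines.append(",".join(format_value(v) for v in row))
    return "\n".join(lines) + "\n"


# made-up feature names and values standing in for opensmile output
example = to_arff(
    "audio",
    ["F0_sma_amean", "mfcc_sma[1]_amean"],
    [[1.5, -0.25], [2.0, float("nan")]],
)
print(example)
```

With the `data` DataFrame from above, something like `to_arff('audio', list(data.columns), data.to_numpy().tolist())` should produce the text to write to audio.arff. Note that this drops the DataFrame index (file, start, end), so add the file name as an extra attribute if you need it in Weka.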