I am implementing famous Iris classification problem in python for 1st time. I have a data file namely iris.data. I have to import this file in my python project. I try my hand in Google Colab.
Sample data
Attributes are:
1.sepal length in cm 2. sepal width in cm 3. petal length in cm 4. petal width in cm 5. class:
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
I worte
import torch
import numpy as np
import matplotlib.pyplot as plt
FILE_PATH = "E:\iris dataset"
MAIN_FILE_NAME = "iris.dat"
data = np.loadtxt(FILE_PATH+MAIN_FILE_NAME, delimiter=",")
But it did not work and through errors.
But it worked when I wrote the code in Linux. But currently I am using windows 10 and it did not work.
Thank you for help in advance.
When constructing the file name for np.loadtxt
, there is a \
missing, as FILE_PATH+MAIN_FILE_NAME = 'E:\iris_datasetiris.dat
. To avoid having to add \
manually between FILE_PATH
and MAIN_FILE_NAME
, you could use os.path.join
, which does this for you.
import os
import numpy as np
FILE_PATH = 'E:\iris dataset'
MAIN_FILE_NAME = 'iris.dat'
data = np.loadtxt(os.path.join(FILE_PATH, MAIN_FILE_NAME), delimiter=',') # not actually working due to last column of file
On the other hand, I am not sure why it did work with Linux, because numpy is not able to convert the string "Iris-setosa" into a number, which np.loadtxt tries to do. If you are only interested in the numeric values, you could use the usecols
keyword of np.loadtxt
data = np.loadtxt(os.path.join(FILE_PATH, MAIN_FILE_NAME), delimiter=',', usecols=(0, 1, 2, 3))