This is the code:
import mxnet
from mxnet import io, gluon, autograd
from mxnet.gluon import nn
from mxnet.gluon.data import ArrayDataset
ctx = mxnet.gpu() if mxnet.test_utils.list_gpus() else mxnet.cpu()
iter = io.CSVIter(data_csv="data/housing.csv", batch_size=100, data_shape=(10, ))
loss = gluon.loss.L2Loss()
net = nn.Sequential()
net.add(nn.Dense(1))
net.initialize(mxnet.init.Normal(sigma=0.01), ctx=ctx)
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.001})
for (i, iter_data) in enumerate(iter):
    data = iter_data.data[0]
    label_data = data[:, 8]
    train_data = data[:, 3]
    with autograd.record():
        l = loss(net(train_data), label_data)
    l.backward()
    trainer.step(100)
    print(l.mean().asnumpy())
The data is US housing prices; the rows look like this:
-122.23,37.88,41.0,880.0,129.0,322.0,126.0,8.3252,452600.0,NEAR BAY
-122.22,37.86,21.0,7099.0,1106.0,2401.0,1138.0,8.3014,358500.0,NEAR BAY
-122.24,37.85,52.0,1467.0,190.0,496.0,177.0,7.2574,352100.0,NEAR BAY
-122.25,37.85,52.0,1274.0,235.0,558.0,219.0,5.6431,341300.0,NEAR BAY
-122.25,37.85,52.0,1627.0,280.0,565.0,259.0,3.8462,342200.0,NEAR BAY
-122.25,37.85,52.0,919.0,213.0,413.0,193.0,4.0368,269700.0,NEAR BAY
-122.25,37.84,52.0,2535.0,489.0,1094.0,514.0,3.6591,299200.0,NEAR BAY
-122.25,37.84,52.0,3104.0,687.0,1157.0,647.0,3.12,241400.0,NEAR BAY
-122.26,37.84,42.0,2555.0,665.0,1206.0,595.0,2.0804,226700.0,NEAR BAY
-122.25,37.84,52.0,3549.0,707.0,1551.0,714.0,3.6912,261100.0,NEAR BAY
-122.26,37.85,52.0,2202.0,434.0,910.0,402.0,3.2031,281500.0,NEAR BAY
-122.26,37.85,52.0,3503.0,752.0,1504.0,734.0,3.2705,241800.0,NEAR BAY
The data is from https://raw.githubusercontent.com/ageron/handson-ml/master/datasets/housing/housing.tgz
The result is what confuses me:
[1.4657609e+10]
[2.184351e+17]
[7.357278e+24]
[1.0737887e+32]
[nan]
[nan]
...
So what's wrong with my code?
====================UPDATE==================================================
I used a z-score to normalize the feature array, but it didn't help (forgive my laziness in using numpy's functions to calculate the z-score):
import mxnet
import numpy as np
from mxnet import io, gluon, autograd, nd
from mxnet.gluon import nn
from mxnet.gluon.data import ArrayDataset
ctx = mxnet.gpu() if mxnet.test_utils.list_gpus() else mxnet.cpu()
BATCH_SIZE = 100
iter = io.CSVIter(data_csv="data/housing.csv", batch_size=BATCH_SIZE, data_shape=(10, ))
loss = gluon.loss.L2Loss()
net = nn.Sequential()
net.add(nn.Dense(1))
net.initialize(mxnet.init.Normal(sigma=0.01), ctx=ctx)
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.001})
for (i, iter_data) in enumerate(iter):
    data = iter_data.data[0]
    label_data = data[:, 8]
    train_data = data[:, 3]
    train_data_np = train_data.asnumpy()
    stand = np.std(train_data_np)
    mean = np.mean(train_data_np)
    b = (train_data_np - mean) / stand
    train_data = nd.array(b)
    with autograd.record():
        l = loss(net(train_data), label_data)
    l.backward()
    trainer.step(BATCH_SIZE)
    print(l.mean().asnumpy())
There might be multiple reasons why your code behaves like that: the model is very simple, it uses only a single feature, and the data is not normalized. I suggest you look at the house price prediction example in the MXNet repository: https://github.com/apache/incubator-mxnet/tree/master/example/gluon/house_prices
There is a detailed explanation of the code in the following chapter of the D2L online book: http://d2l.ai/chapter_multilayer-perceptrons/kaggle-house-price.html
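In particular, with L2Loss the scale of the labels matters: the median house values are on the order of 10^5, so the squared-error gradients overflow to NaN even after you standardize the single feature. Below is a minimal sketch of the kind of changes that usually stabilize this setup. Standardizing all eight numeric feature columns, scaling the labels, and iterating with a DataLoader are my assumptions here (the column indices and the 100000.0 scaling factor are guesses based on the CSV layout you posted), not code taken from the linked example:

import numpy as np
import mxnet
from mxnet import gluon, autograd, nd

ctx = mxnet.gpu() if mxnet.test_utils.list_gpus() else mxnet.cpu()
BATCH_SIZE = 100

# Assumption: columns 0-7 are numeric features, column 8 is the median house
# value, and the trailing text column ("NEAR BAY", ...) is skipped entirely.
raw = np.genfromtxt("data/housing.csv", delimiter=",", usecols=range(9))
raw = raw[~np.isnan(raw).any(axis=1)]  # drop rows with missing values (and a header row, if any)

features = raw[:, :8]
labels = raw[:, 8:9]

# Standardize the features over the whole dataset (not per batch) and rescale
# the labels: values around 450000 make the L2 gradients blow up into NaN.
features = (features - features.mean(axis=0)) / features.std(axis=0)
labels = labels / 100000.0

dataset = gluon.data.ArrayDataset(nd.array(features), nd.array(labels))
loader = gluon.data.DataLoader(dataset, batch_size=BATCH_SIZE, shuffle=True)

net = gluon.nn.Sequential()
net.add(gluon.nn.Dense(1))
net.initialize(mxnet.init.Normal(sigma=0.01), ctx=ctx)
loss_fn = gluon.loss.L2Loss()
trainer = gluon.Trainer(net.collect_params(), 'sgd', {'learning_rate': 0.001})

for epoch in range(10):
    for X, y in loader:
        X, y = X.as_in_context(ctx), y.as_in_context(ctx)
        with autograd.record():
            l = loss_fn(net(X), y)
        l.backward()
        trainer.step(X.shape[0])
    print(epoch, l.mean().asnumpy())

If you prefer to keep the CSVIter approach, the same idea applies: compute the mean and standard deviation of each column once over the whole dataset, and rescale both the features and the labels before computing the loss.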