I have a database containing tweets. I have manually classified each tweet as 'negative', 'neutral' or 'positive', and I am now trying to find out how well my computer can classify them using a Naive Bayes classifier.
To test the accuracy of the classification (the number of tweets the computer classified the same way I did manually, divided by the total number of tweets tested), a script has been written.
However, I have a problem with this PHP script. When I run it, it gives the error 'Division by zero in C:\wamp\and-so-on'. This is probably because the counter is not being updated. The number of 'right classes' does not seem to be updated either. Both values are essential, because the formula for accuracy is 'right classes' divided by 'counter'.
My question is: what do you think the problem is when looking at the script? And how could I potentially fix it?
The script for testing:
$test_array = array();
$counter = 0;
$timer1 = microtime(true);
$right_classes = 0;
foreach ($test_set as $test_item) {
    $tweet_id = $test_item['tweet_id'];
    $class_id_shouldbe = $test_item['class_id'];
    $tweet = Tweets::loadOne($tweet_id);
    // # Preprocess if not done already
    // $steps->processTweet($tweet_id, $tweet);
    // $tweet = Tweets::loadOne($tweet_id);
    if ((int) $tweet['classified'] > 0 || !$tweet['valid']) continue;
    if (strlen($tweet['processed_text']) == 0) {
        $steps->processTweet($tweet_id, $tweet);
        $tweet = Tweets::loadOne($tweet_id);
        if (strlen($tweet['processed_text']) == 0) {
            echo "Could not process tweet '$tweet_id'. <br>";
            continue;
        }
    }
    $class_id = $classifier->classify($tweet['processed_text']);
    # Add tweets in database
    // Tweets::addClassId($tweet_id, $class_id_shouldbe);
    $test_array[$tweet_id] = array(
        'what_human_said' => $class_id_shouldbe,
        'what_classifier_said' => $class_id,
    );
    if ($class_id_shouldbe == $class_id) $right_classes++;
    $counter++;
    if ($counter > 936) break;
    echo "$tweet_id,$class_id_shouldbe,$class_id<br>";
}
$timer2 = microtime(true);
echo '<br><br>done in '.round($timer2-$timer1, 3).' sec<br>';
echo ($right_classes/$counter)*100 .' %';
exit();
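First of all, fix the error itself, and then work out why $counter is zero. To avoid the error, check $counter before dividing:
if($counter!=0) echo ($right_classes/$counter)*100 .' %'; else echo '0 %';
Then, looking at your code, you use continue to skip to the next item in the foreach loop, so it is not guaranteed that the $counter++ line is ever reached. If every item in $test_set hits one of the continue statements (for example because $tweet['classified'] is greater than 0 or $tweet['valid'] is false for each tweet), neither $counter nor $right_classes is incremented, $counter stays 0, and you get the Division by zero error.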
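To find out which condition is skipping your tweets, you could tally the skip reasons instead of silently calling continue. Below is a minimal sketch of that idea, not a drop-in replacement: skip_reason() and accuracy_percent() are hypothetical helpers I made up for illustration, and the $sample rows are invented data standing in for what Tweets::loadOne() returns. It also simplifies your logic, since your script retries processTweet() before giving up on an empty processed_text.

```php
<?php
// Hypothetical helper: returns accuracy as a percentage, or null when
// nothing was counted (so the caller never divides by zero).
function accuracy_percent(int $right_classes, int $counter): ?float
{
    if ($counter === 0) {
        return null; // accuracy is undefined when no tweet was tested
    }
    return ($right_classes / $counter) * 100;
}

// Hypothetical helper: mirrors the skip conditions of the test loop and
// names the first one that applies, or returns null if the tweet is usable.
function skip_reason(array $tweet): ?string
{
    if ((int) $tweet['classified'] > 0) return 'already classified';
    if (!$tweet['valid']) return 'not valid';
    if (strlen($tweet['processed_text']) == 0) return 'empty processed_text';
    return null;
}

// Invented rows for illustration only; real rows come from the database.
$sample = array(
    array('classified' => 1, 'valid' => 1, 'processed_text' => 'good day'),
    array('classified' => 0, 'valid' => 0, 'processed_text' => 'bad day'),
    array('classified' => 0, 'valid' => 1, 'processed_text' => ''),
);

$skipped = array();   // skip reason => number of tweets skipped for it
$counter = 0;
$right_classes = 0;   // stays 0 here because no tweet gets classified

foreach ($sample as $tweet) {
    $reason = skip_reason($tweet);
    if ($reason !== null) {
        $skipped[$reason] = ($skipped[$reason] ?? 0) + 1;
        continue;
    }
    $counter++; // only reached for tweets that pass every check
}

print_r($skipped);
$accuracy = accuracy_percent($right_classes, $counter);
echo $accuracy === null ? "no tweets were tested\n" : $accuracy . " %\n";
```

If the tally shows that all tweets fall under one reason, that condition is the one to look at first.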
Hope it helps!