python pandas numpy logistic-regression

Comparing an ANN and logistic regression on 200 datasets


I am trying to compare the performance of an ANN and logistic regression on 200 different datasets. Each dataset is named Dataseti, where i is a number from 15 to 214, so I run a loop:

for i in range(15,215):

and train both the ANN and the logistic regression on each dataset, then have them classify the data. What I want to catch is this error raised by the logistic regression:

"ValueError: This solver needs samples of at least 2 classes in the data, but the data contains only one class: 0.0"

When the error is caught, I want that dataset to be skipped and the loop to proceed with the next one (i+1).

Is this possible? I am quite new to programming and have no clear idea how to handle this exception. I already thought about doing it with an if/else construct:

if dataset[:,-1].max() == 1:
....
else: 

But I do not know what to put in the else branch. It would be great if anyone could help me with this issue. Thanks!


Solution

  • Use try/except. Here's some pseudocode for your specific case:

    for i in range(15,215):
        dataset = datasets[i]
    
        # first, try to evaluate your desired code
        try: 
            ANN(dataset)
            logistic(dataset)
    
        # if a ValueError occurs, catch it, report on it, and continue
        except ValueError as e: 
            print("Error on dataset {i}: {err}".format(i=i, err=e))
    

    And here's a working example with toy data:

    data = [1, 2, "foo", 3]
    
    for i in range(0,4):
    
        try:
            print(int(data[i]))
    
        except ValueError as e:
            print("Error on item {i}: {err}".format(i=i, err=e))
    

    Output:

    1
    2
    Error on item 2: invalid literal for int() with base 10: 'foo'
    3
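
The if/else idea from the question can also work as an alternative to try/except: check how many distinct labels a dataset has before training, and use `continue` to jump to the next one. The following is only a minimal sketch, assuming the class label sits in the last column of every row. The `datasets` dictionary and the `processed` and `skipped` lists are hypothetical stand-ins for illustration, not part of the original code.

```python
# Hypothetical stand-ins: each "dataset" is a list of rows whose last
# column holds the class label (0.0 or 1.0).
datasets = {
    15: [[0.2, 0.0], [0.7, 1.0], [0.9, 1.0]],  # two classes: usable
    16: [[0.1, 0.0], [0.4, 0.0]],              # one class: would raise the error
    17: [[0.3, 1.0], [0.5, 0.0]],              # two classes: usable
}

processed, skipped = [], []

for i in range(15, 18):
    dataset = datasets[i]

    # Collect the distinct labels found in the last column.
    labels = {row[-1] for row in dataset}

    if len(labels) < 2:
        # Fewer than two classes: skip this dataset and move on to i+1.
        skipped.append(i)
        continue

    # Training the ANN and the logistic regression would go here.
    processed.append(i)

print("processed:", processed)
print("skipped:", skipped)
```

If the datasets are NumPy arrays, as `dataset[:,-1]` in the question suggests, `np.unique(dataset[:,-1])` could probably replace the set comprehension. Counting distinct labels is also a bit safer than testing `max() == 1`, since a dataset containing only the class 1 would pass that test and still trigger the error.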