
Getting no output using Apriori Algorithm


My Dataframe:

number  assignment_group  short_description  Issue Labels
Req123  Support           TP issue           Battery Failure

My code:

Converting the data frame into lists

observations = []
for i in range(len(df1)):
    observations.append([str(df1.values[i,j]) for j in range(0,10)])

Fitting the data to the algorithm

from apyori import apriori
associations = apriori(observations, min_length = 2, min_support = 0.2, min_confidence = 0.2, min_lift = 3)

Converting the associations to lists

associations = list(associations)
print(associations)

When I run this, I get no output.


Solution

  • I do not know what your df1.values actually contains. However, with

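    # Stand-in data: a plain list of two-character strings, not a DataFrame.
    df1 = [
        'Aa', 'Aa', 'Aa', 'Aa', 'Aa',
        'Bb', 'Cc', 'Dd', 'Ee', 'Ff',
    ]

    # Each string is split into its two characters, so every
    # observation is a two-item transaction such as ['A', 'a'].
    observations = []
    for i in range(len(df1)):
        observations.append([str(df1[i][j]) for j in range(0, 2)])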
    

    the following code works.

    from apyori import apriori
    associations = apriori(
        observations,
        min_length = 2,
        min_support = 0.2,
        min_confidence = 0.2,
        min_lift = 2
    )
    
    associations = list(associations)
    print(associations)
    

    The output is:

    [
        RelationRecord(
            items=frozenset({'A', 'a'}),
            support=0.5,
            ordered_statistics=[
                OrderedStatistic(
                    items_base=frozenset({'A'}),
                    items_add=frozenset({'a'}),
                    confidence=1.0,
                    lift=2.0
                ),
                OrderedStatistic(
                    items_base=frozenset({'a'}),
                    items_add=frozenset({'A'}),
                    confidence=1.0,
                    lift=2.0
                )
            ]
        )
    ]
    
    

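    The only change is min_lift, lowered from 3 to 2. With min_lift = 3 the output was empty, because the only rule that passes the support threshold in this data has a lift of 2.0.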
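
To see where a lift of 2.0 comes from, the three statistics can be recomputed by hand for the sample data. This is only a sketch in plain Python, not a call into apyori; the `support` helper and the variable names are made up for illustration.

```python
# Toy transactions matching the sample data above: each two-character
# string becomes a two-item transaction such as {'A', 'a'}.
rows = ['Aa'] * 5 + ['Bb', 'Cc', 'Dd', 'Ee', 'Ff']
transactions = [set(row) for row in rows]


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


support_pair = support({'A', 'a'})          # 5 of 10 transactions
confidence = support_pair / support({'A'})  # share of 'A' rows that also hold 'a'
lift = confidence / support({'a'})          # confidence relative to how common 'a' is

print(support_pair, confidence, lift)       # 0.5 1.0 2.0
```

Since the rule's lift is exactly 2.0, a threshold of min_lift = 3 filters it out, while min_lift = 2 keeps it.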

    The Apriori algorithm finds frequent item sets and derives association rules between the items in them. How frequent the item sets must be, and how long they are, is controlled by hyperparameters. So try different hyperparameter values and see what you get.
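
One way to get a feel for the thresholds is to sweep them and count how many rules survive. The sketch below does this for rules between item pairs on the same sample data. It is a plain-Python stand-in rather than apyori itself, and `pair_rules` is a hypothetical helper that only mimics the min_support, min_confidence and min_lift filters.

```python
from itertools import combinations

rows = ['Aa'] * 5 + ['Bb', 'Cc', 'Dd', 'Ee', 'Ff']
transactions = [frozenset(row) for row in rows]
n = len(transactions)


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / n


def pair_rules(min_support, min_confidence, min_lift):
    """Return (base, add, support, confidence, lift) for pair rules passing all thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for x, y in combinations(items, 2):
        pair_support = support(frozenset({x, y}))
        if pair_support < min_support:
            continue  # the pair is not frequent enough
        for base, add in ((x, y), (y, x)):
            confidence = pair_support / support(frozenset({base}))
            lift = confidence / support(frozenset({add}))
            if confidence >= min_confidence and lift >= min_lift:
                rules.append((base, add, pair_support, confidence, lift))
    return rules


# Sweep min_lift while keeping the other thresholds fixed.
for min_lift in (1, 2, 3):
    found = pair_rules(min_support=0.2, min_confidence=0.2, min_lift=min_lift)
    print(min_lift, len(found))
```

With min_support = 0.2, the two rules between 'A' and 'a' survive at min_lift 1 and 2, and nothing survives at 3. Lowering min_support to 0.1 admits the rarer pairs such as 'B' and 'b', whose lift is 10, so rules appear again even at min_lift = 3.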