python artificial-intelligence data-mining training-data

Split dictionary data into 2 parts and store them in variables


I have cosine similarity metric dictionary data stored in the variable 'similarity'. How can I split this data into portions of 70 and 30 percent? I want to split the data into two parts and store each part in its own variable, preferably in a 7:3 ratio.

The reason I am asking is that I have an accuracy algorithm that gives the accuracy of that data, but I used the same data for training and testing, as you can see in the code. So I obviously get 100% accuracy, because my training and testing data are the same. I want to split the data so that 70% is used for training and 30% for testing.

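import numpy as np
from numpy import dot
from numpy.linalg import norm

print(similarity)

# Both arrays are built from the same data, so train and test are identical
train_r = np.array(similarity)
test_r = np.array(similarity)

# Take column 10 from each array
train_c = train_r[:, 10]
test_c = test_r[:, 10]

a = train_c
b = test_c

# Cosine similarity between the two columns, as a percentage
cos_sim = (dot(a, b) / (norm(a) * norm(b))) * 100
print(cos_sim)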

I would be really grateful for an answer. Thanks so much.


Solution

  • This should do it:

    split_rate = 0.7
    split_idx = int(len(similarity) * split_rate)  # index where the first 70% ends
    data = np.array(similarity)
    train_r = data[:split_idx]  # first 70% of the rows
    test_r = data[split_idx:]   # remaining 30% of the rows
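
The slice above takes the first 70% of the rows in their stored order. If the rows happen to be ordered in some way, shuffling before splitting may give a more representative split. Here is a minimal sketch, assuming `similarity` converts to a 2-D array (as the `[:, 10]` indexing in the question suggests). The `split_rows` helper, its `seed` parameter, and the stand-in data are hypothetical and not part of the original post:

```python
import numpy as np


def split_rows(data, split_rate=0.7, seed=0):
    """Shuffle the rows of `data`, then split them into two parts.

    `split_rows` and `seed` are hypothetical names used for illustration.
    """
    rows = np.asarray(data)
    rng = np.random.default_rng(seed)      # seeded so the split is repeatable
    order = rng.permutation(len(rows))     # shuffled row indices
    split_idx = int(len(rows) * split_rate)
    return rows[order[:split_idx]], rows[order[split_idx:]]


# Hypothetical stand-in for `similarity`: 10 rows, 11 columns.
similarity = np.arange(110, dtype=float).reshape(10, 11)

train_r, test_r = split_rows(similarity)
print(train_r.shape, test_r.shape)
```

Note that after any 70/30 split the two parts have different numbers of rows, so a column taken from `train_r` and the same column taken from `test_r` can no longer be passed directly to the `dot(a, b)` expression from the question.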