python matplotlib seaborn visualization n-gram

How do I visualize two columns/lists of trigrams to see if the same word combinations occur in both columns/lists?

I have two lists of trigrams (20 word combinations each), e.g.

l1 = ('hello', 'its', 'me'), ('I', 'need', 'help') ...

l2 = ('I', 'need', 'help'), ('What', 'is', 'this') ...

Now I want to visualize these two lists in one diagram (maybe a pairplot) to see if there are similarities (all 3 words must be the same).

Thank you in advance

Solution

The answer given by Larry the Llama seems to have missed the "see if there are similarities" part, as that solution uses set(), which removes any duplicates.
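To make the point about set() concrete, here is a minimal sketch. The sample lists are hypothetical and contain a deliberately repeated trigram; they are not the asker's data.

```python
# Hypothetical sample data; the repeated trigram in l1 is deliberate
l1 = [("I", "need", "help"), ("I", "need", "help"), ("hello", "its", "me")]
l2 = [("I", "need", "help"), ("What", "is", "this")]

# The intersection does find the trigrams that are present in both lists ...
shared = set(l1) & set(l2)
print(shared)

# ... but converting to a set collapses the two copies in l1 into one,
# so the information about how often a trigram occurs is lost
print(len(l1), len(set(l1)))
```

So a set is enough if you only need to know which trigrams are shared, but not if the number of occurrences matters.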

If you want a full iteration that finds identical trigrams and counts them:

merged = l1 + l2

results_counter = {}

# Iterate over all the trigrams
for index, trigram in enumerate(merged):
    # Skip trigrams that were already counted at an earlier index
    if str(trigram) in results_counter:
        continue

    # Iterate over this trigram and all the trigrams which lie after it in the list
    for second_index in range(index, len(merged)):
        all_same = True

        # Compare word by word with the trigram at second_index
        for word_index, word in enumerate(trigram):
            if merged[second_index][word_index] != word:
                all_same = False
                break

        if all_same:
            # Add the key with a count of 0 if it is missing, then add one
            results_counter.setdefault(str(trigram), 0)
            results_counter[str(trigram)] += 1

# Print the count for each trigram
for key in results_counter.keys():
    print(key, results_counter[key])
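The same counting can probably be done more compactly with collections.Counter from the standard library. This is only a sketch on hypothetical sample data; it keeps the two lists separate so you can see how often a trigram occurs in each one, which the merged count above does not show.

```python
from collections import Counter

# Hypothetical sample data, not the asker's real lists
l1 = [("I", "need", "help"), ("I", "need", "help"), ("hello", "its", "me")]
l2 = [("I", "need", "help"), ("What", "is", "this")]

# Tuples are hashable, so they can be counted directly
counts_l1 = Counter(l1)
counts_l2 = Counter(l2)

# Keep only the trigrams that occur in both lists, with their count in each
in_both = {
    trigram: (counts_l1[trigram], counts_l2[trigram])
    for trigram in counts_l1
    if trigram in counts_l2
}

for trigram, (n1, n2) in in_both.items():
    print(trigram, n1, n2)
```

Tuple equality already compares all three words, so no word-by-word loop is needed here.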

Edit after clarification:

import seaborn as sns
import pandas as pd

d1 = [("hello", "its", "me"), ("dont", "its", "me")]
d2 = [("hello", "its", "me"), ("Hello", "I", "dont")]

word_to_number = {} 
number_to_word = {} # if you want to show the sentence again
def one_hot(l):
    """
    This function encodes each word as an integer (strictly a label
    encoding rather than a true one-hot encoding) and returns the
    encoded list while also adding the keys to the converter
    dictionaries for reverse conversion.
    """
    one_hot_encoded = []
    for trigram in l:
        encoded_trigram = []
        for word in trigram:
            # Add encoding of the word
            encoded_word = word_to_number.setdefault(word, len(word_to_number))
            number_to_word[encoded_word] = word
            # Add the encoded word to the encoded trigram
            encoded_trigram.append(encoded_word)

        # Add the encoded trigram to the list that is returned
        one_hot_encoded.append(encoded_trigram)

    return one_hot_encoded

d1 = one_hot(d1)
d2 = one_hot(d2)

data = {}
for ind, trigram in enumerate(d1 + d2):
    # Each encoded trigram becomes one column (t0, t1, ...) to be compared
    data["t" + str(ind)] = trigram

frame = pd.DataFrame.from_dict(data)
print(frame)

plot = sns.pairplot(frame)
# Use the same padded limits on every axis so the panels are comparable
plot.set(ylim=(frame.min().min() - 1, frame.max().max() + 1))
plot.set(xlim=(frame.min().min() - 1, frame.max().max() + 1))

import matplotlib.pyplot as plt
plt.show()

This will give you a pairplot of your trigrams, although it is not very intuitive: two trigrams are identical only when every point in the panel comparing their columns lies exactly on the diagonal (y = x). You can use this, but make sure you don't have too many different words, as that will skew the axes and make the results very hard to read.
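If the pairplot turns out to be too hard to read, one possible alternative is to reduce the exact-match comparison to a 0/1 matrix with one row per trigram of the first list and one column per trigram of the second. The sketch below reuses the sample trigrams from the edit above; the name match_matrix is made up for illustration.

```python
# Sample trigrams from the answer above (before encoding)
d1 = [("hello", "its", "me"), ("dont", "its", "me")]
d2 = [("hello", "its", "me"), ("Hello", "I", "dont")]

# match_matrix[i][j] is 1 when the i-th trigram of d1 equals the
# j-th trigram of d2 (all three words identical, case-sensitive), else 0
match_matrix = [[int(a == b) for b in d2] for a in d1]

for trigram, row in zip(d1, match_matrix):
    print(trigram, row)
```

A matrix like this could then be passed to seaborn's heatmap function, with the trigrams as tick labels, so that every match shows up as a single highlighted cell. That plotting step is a suggestion and is not shown or tested here.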