I have multiple CSV files with two columns in each of these CSV files:
What is the best way to remove all duplicates of a link and description, so that only one instance of each link and description is left? Ideally I could import all of the CSV files at once, since the same link may appear in multiple CSV files. If there is a duplicate, the link and description would be EXACTLY the same. Thanks!
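This can be done with pd.concat followed by drop_duplicates: concatenate the DataFrames into one, then drop the rows that are exact duplicates. By default, drop_duplicates keeps the first occurrence of each row.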
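import pandas as pd
# Read each CSV file into its own DataFrame
df1 = pd.read_csv('path/to/file1.csv')
df2 = pd.read_csv('path/to/file2.csv')
# Stack the DataFrames, drop exact duplicate rows, and renumber the index
df = pd.concat([df1, df2]).drop_duplicates().reset_index(drop=True)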
Please refer to the Stack Overflow answer here to understand more.
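Since the question asks about importing all of the CSV files at once, the same idea can be extended with a glob pattern. This is only a minimal sketch: the helper name load_and_dedupe is hypothetical, and it assumes every file has the same two columns with a header row.

```python
import glob

import pandas as pd


def load_and_dedupe(pattern):
    """Read every CSV matching `pattern` and drop exact duplicate rows.

    `load_and_dedupe` is a hypothetical helper name, not part of pandas.
    Assumes all files share the same columns and have a header row.
    """
    # Sort the paths so the result does not depend on filesystem order
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no CSV files match {pattern!r}")

    # One DataFrame per file, stacked into a single DataFrame
    frames = [pd.read_csv(path) for path in paths]
    combined = pd.concat(frames, ignore_index=True)

    # Keep the first occurrence of each identical row and renumber the index
    return combined.drop_duplicates().reset_index(drop=True)
```

It could then be called as df = load_and_dedupe('path/to/*.csv'). If only some columns should decide what counts as a duplicate, drop_duplicates also accepts a subset argument with a list of column names.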