I have the following dataset and I need to remove all of the links from it. The csv looks like this:
Does anyone know how I can quickly and easily do this?
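You can use a regular expression in Python, like this:
import re
cleaned = [re.sub(r"http\S+", "", x) for x in rows]
where rows is a list of the strings from your CSV data. Note that re.sub returns a new string instead of changing x in place, so the result has to be stored. The pattern also leaves out a trailing \s so that a link at the very end of a string is removed too.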
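If the data is still sitting in a CSV file, the same substitution can be applied field by field while copying the file. This is only a minimal sketch: the names strip_links and strip_links_from_csv and the sample rows are made up for illustration, and it assumes every link starts with http:// or https://.

```python
import csv
import io
import re

# Matches http:// or https:// followed by any run of non-whitespace characters.
URL_PATTERN = re.compile(r"https?://\S+")

def strip_links(text):
    """Remove URLs from one string and collapse the leftover whitespace."""
    without_links = URL_PATTERN.sub("", text)
    return " ".join(without_links.split())

def strip_links_from_csv(infile, outfile):
    """Copy CSV rows from infile to outfile with links removed from every field."""
    reader = csv.reader(infile)
    writer = csv.writer(outfile)
    for row in reader:
        writer.writerow([strip_links(field) for field in row])

# Hypothetical sample data standing in for the real file.
sample = io.StringIO('1,"check this out http://example.com/a now"\n2,"no link here"\n')
result = io.StringIO()
strip_links_from_csv(sample, result)
print(result.getvalue())
```

For a real file, the two StringIO objects would be replaced by files opened with open(path, newline=""), which is what the csv module documentation recommends. Be aware that punctuation glued to the end of a link (for example a trailing comma) is removed along with it.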
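This is the code I use to preprocess Twitter data:
import re
all_text = re.sub(r"http\S+", "", all_text)  # links
all_text = re.sub(r"#\S+", "", all_text)  # hashtags
all_text = re.sub(r"@\S+", "", all_text)  # mentions
# Runs last so that "@", "#" and "://" are still there for the patterns above.
# Assumes the original "W+" was meant to be "\W+"; a space is used as the
# replacement so that neighbouring words are not glued together.
all_text = re.sub(r"\W+", " ", all_text)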
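One thing to watch with any pattern that ends in \s: it only matches when a whitespace character actually follows the link, so a link at the very end of the text is left in place. A small sketch of the difference, using a made-up tweet:

```python
import re

# Hypothetical tweet whose link is the last thing in the text.
tweet = "Great read http://example.com/post"

# Requires whitespace after the link, so nothing is removed here.
kept = re.sub(r"http\S*\s", "", tweet)

# No trailing \s, so the link is removed even at the end of the text.
removed = re.sub(r"http\S+", "", tweet)

print(repr(kept))
print(repr(removed))
```

The second form leaves the space that preceded the link, so a final strip() may be useful if trailing whitespace matters.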