python regex special-characters

Find the entries in an object column of a DataFrame that contain only special characters


I have a dataset with multiple columns. In one column that holds text entries (feedback), I want to tag all the entries that contain only special characters.

I know how to remove special characters from the whole column, but I am not able to tag the rows that contain only special characters.

import string
import re


def checkString(data, Feedback):
    for let in data.Feedback.str.lower():
        if let in string.ascii_lowercase:
            data["special_flag"] = "Valid"
        else:
            data["special_flag"] = "Not_Valid"

    data1 = data['Feedback'].apply(checkString(data, Feedback), axis=1)


def spec(data, x):
    if not re.match(r'^[_\W]+$', data.x):
        data["special"] = 'valid'
    else:
        data["special"] = 'invalid'


    data1 = data['Feedback'].apply(spec(data, Feedback), axis=1)

When I run these functions, I get the error "name 'Feedback' is not defined".


Solution

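  • The error means that no variable named Feedback exists in the current scope: in checkString(data, Feedback) and spec(data, Feedback), Feedback is written as a bare name, not as the string 'Feedback'.

    You can avoid the helper functions entirely and tag the rows with a regex match on the whole column: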

    # True where the entry consists only of special characters; missing values count as no match
    mask = df['Feedback'].str.contains(r'^[_\W]+$', na=False)
    df['New'] = 'valid'              # Default: valid if not matched
    df.loc[mask, 'New'] = 'Invalid'  # Invalid if matched
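
To see which strings the pattern ^[_\W]+$ actually flags, here is a minimal sketch using only the standard re module. The helper name tag_feedback and the sample strings are made up for illustration, and treating missing (non-string) values as valid is an assumption, not something the question specifies.

```python
import re

# Same pattern as in the answer: one or more characters that are either
# "_" or a non-word character (\W), and nothing else in the string.
ONLY_SPECIAL = re.compile(r'^[_\W]+$')


def tag_feedback(text):
    """Return 'Invalid' if text consists only of special characters, else 'valid'."""
    # Assumption: non-string values (e.g. missing entries) are tagged as valid.
    if isinstance(text, str) and ONLY_SPECIAL.search(text):
        return "Invalid"
    return "valid"


# Hypothetical sample feedback entries
samples = ["Great service!", "!!!", "___", "???  ...", "ok", "", None]
tags = [tag_feedback(s) for s in samples]
print(tags)
```

Note that whitespace counts as a non-word character, so an entry made up only of spaces and punctuation is tagged Invalid, while an empty string is not matched because the pattern requires at least one character. If a per-row function is preferred over the boolean mask, the same helper could be passed to the column, e.g. df['Feedback'].apply(tag_feedback).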