python csv mapreduce blank-line

Removing blanks in Python


I am new to Python, so I am trying to keep this as simple as possible. I am working with a CSV file that contains data I have to MapReduce. My mapper output contains blank values, which prevents me from reducing it. This happens because the CSV file has blank fields in it. I need advice on how to remove the blanks in my mapper so they do not reach my reducer.

Example of my result.
BLUE 1
GY  1
WT  1
    1
WH  1
    1
BLACK   1
    1
GN  1
BLK 1
BLACK   1
RED 1

My code

#!/usr/bin/python

from operator import itemgetter
import sys

sys_stdin = open("Parking_Violations.csv", "r")

for line in sys_stdin:
    line = line.split(",")
    vehiclecolor = line[33]  # Column in the CSV file that holds the data I need.

    try:
        issuecolor = str(vehiclecolor)
        print("%s\t%s" % (issuecolor, 1))

    except ValueError:
        continue

Solution

  • You can use the built-in str.strip method:

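    #!/usr/bin/python
    
    from operator import itemgetter
    import sys
    
    sys_stdin = open("Parking_Violations.csv", "r")
    
    for line in sys_stdin:
        # Split the row into fields first; indexing the raw line would
        # return a single character, not a column.
        fields = line.split(",")
        vehiclecolor = fields[33].strip()
    
        # An empty string is falsy, so blank colors are skipped.
        if vehiclecolor:
            issuecolor = str(vehiclecolor)
            print("%s\t%s" % (issuecolor, 1))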
    

    This takes the field at index 33 of each row (the column used in your code) and removes leading and trailing whitespace from it with .strip(). It assumes every row actually has at least 34 fields; for a shorter row the indexing raises an IndexError.

    The if statement then checks whether vehiclecolor contains any characters, and the value is printed only if it does.

    In Python, an empty string evaluates as false in a boolean context.
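
If some rows may be short, or some fields may contain quoted commas, a variant built on the standard csv module could look like the sketch below. The function name map_colors, its column parameter, and the in-memory sample data are made up for illustration and are not part of the original script; the sample keeps the color at index 1 rather than 33 only to stay short.

```python
import csv
import io


def map_colors(lines, column=33):
    """Yield (color, 1) pairs, skipping rows whose color field is blank or missing."""
    for row in csv.reader(lines):
        if len(row) <= column:
            continue  # short or empty row: the column does not exist
        color = row[column].strip()
        if color:  # an empty string is falsy, so blank fields are dropped
            yield color, 1


# Hypothetical sample: a blank color on the second row, a short fourth row.
sample = io.StringIO("a,BLUE\nb,  \nc,GY\nd\n")
for color, count in map_colors(sample, column=1):
    print("%s\t%s" % (color, count))
```

For the real data you would pass the opened Parking_Violations.csv file (or sys.stdin, if the mapper is fed through a pipe) and leave column at 33. Skipping short rows silently is a design choice; counting or logging them may be preferable if they signal corrupt input.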