
Parsing data from a file


I have been provided with a file containing data on recorded sightings of species, which is laid out in the format:

"Species", "\t", "Latitude", "\t", "Longitude"

I need to define a function that loads the data from the file into a list and splits every line into three components: species name, latitude and longitude.

This is what I have, but it is not working:

def LineToList(FileName):
    FileIn = open(FileName, "r")
    DataList = []
    for Line in FileIn:
        Line = Line.rstrip()
        DataList.append(Line)
        EntryList = []
        for Entry in Line:
            Entry = Line.split("\t")
            EntryList.append(Entry)
    FileIn.close()
    return DataList

LineToList("Mammal.txt")
print(DataList[1])

I need the data on each line to be separated so that I can use it later to work out which sightings of the species fall within a certain distance of a given location.

Sample Data:

Myotis nattereri    54.07663633 -1.006446707
Myotis nattereri    54.25637837 -1.002130504
Myotis nattereri    54.25637837 -1.002130504

I am trying to print one line of the data set to test whether it is splitting correctly, but nothing is showing in the shell.

Update:

This is the code I am working with now:

def LineToList(FileName):
    FileIn = open(FileName, "r")
    DataList = []
    for Line in FileIn:
        Line = Line.rstrip()
        DataList.append(Line)
        EntryList = []
        for Entry in Line:
            Entry = Line.split("\t")
            EntryList.append(Entry)
            return EntryList
    FileIn.close()
    return DataList

def CalculateDistance(Lat1, Lon1, Lat2, Lon2):

    Lat1 = float(Lat1)
    Lon1 = float(Lon1)
    Lat2 = float(Lat2)
    Lon2 = float(Lon2)

    nDLat = (Lat1 - Lat2) * 0.017453293
    nDLon = (Lon1 - Lon2) * 0.017453293

    Lat1 = Lat1 * 0.017453293
    Lat2 = Lat2 * 0.017453293

    nA = (math.sin(nDLat/2) ** 2) + math.cos(Lat1) * math.cos(Lat2) * (math.sin(nDLon/2) ** 2 )
    nC = 2 * math.atan2(math.sqrt(nA),math.sqrt( 1 - nA ))
    nD = 6372.797 * nC

    return nD

DataList = LineToList("Mammal.txt")                
for Line in DataList:
    LocationCount = 0
    CalculateDistance(Entry[1], Entry[2], 54.988056, -1.619444)
    if CalculateDistance <= 10:
        LocationCount += 1
    print("Number Recordings within Location Range:", LocationCount)

When I run the program, it comes up with an error:

CalculateDistance(Entry[1], Entry[2], 54.988056, -1.619444) NameError: name 'Entry' is not defined

Solution

  • Your DataList variable is local to the LineToList function, so it does not exist at file scope; assign the function's return value to a variable there:

    DataList = LineToList("Mammal.txt")
    print(DataList[1])
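
  • The updated code has a few further problems. `for Entry in Line` iterates over the characters of the line, and the `return EntryList` inside that loop leaves the function after the first character of the first line. `Entry` only exists inside LineToList, which is why the loop at file scope raises the NameError. `CalculateDistance <= 10` compares the function itself rather than its result, and `LocationCount = 0` is reset on every pass through the loop. A minimal sketch of one way to fix these, assuming every non-blank line holds exactly three tab-separated fields and there is no header row; `CountWithinRange` is a hypothetical helper name, not something from the question:

```python
import math


def LineToList(FileName):
    # Build one [species, latitude, longitude] list per line of the file.
    DataList = []
    with open(FileName, "r") as FileIn:
        for Line in FileIn:
            Line = Line.rstrip("\n")
            if not Line.strip():
                continue  # skip blank lines
            # Split once per line; the fields stay strings here.
            DataList.append(Line.split("\t"))
    return DataList


def CalculateDistance(Lat1, Lon1, Lat2, Lon2):
    # Haversine distance in km, using the same Earth radius as the question.
    Lat1 = math.radians(float(Lat1))
    Lon1 = math.radians(float(Lon1))
    Lat2 = math.radians(float(Lat2))
    Lon2 = math.radians(float(Lon2))

    nA = (math.sin((Lat1 - Lat2) / 2) ** 2
          + math.cos(Lat1) * math.cos(Lat2) * math.sin((Lon1 - Lon2) / 2) ** 2)
    nC = 2 * math.atan2(math.sqrt(nA), math.sqrt(1 - nA))
    return 6372.797 * nC


def CountWithinRange(DataList, Lat, Lon, MaxKm):
    # Hypothetical helper: count entries no farther than MaxKm from (Lat, Lon).
    LocationCount = 0  # initialise once, before the loop
    for Entry in DataList:
        # Keep the returned distance and compare that, not the function.
        Distance = CalculateDistance(Entry[1], Entry[2], Lat, Lon)
        if Distance <= MaxKm:
            LocationCount += 1
    return LocationCount
```

    With these definitions, `CountWithinRange(LineToList("Mammal.txt"), 54.988056, -1.619444, 10)` returns the number of recordings within 10 km of the location used in the question, which you can then print once after the loop has finished.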