python jupyter-notebook gps data-analysis analysis

Data analysis using GPS coordinates


I have this kind of data, which contains a timestamp, longitude, latitude and tripId.

Can I find the waiting time at an intersection from this data alone, or do I need something else? And what information can I get from this kind of data?

"timestamp","tripId","longitude","latitude"
"2021-07-05 10:35:04","1866491","8.167035","53.160473"
"2021-07-05 10:35:03","1866491","8.167023","53.160469"
"2021-07-05 10:35:02","1866491","8.167007","53.160459"
"2021-07-05 10:35:01","1866491","8.166987","53.160455"
"2021-07-05 10:35:00","1866491","8.166956","53.160448"
"2021-07-05 10:34:20","1866491","8.167286","53.15919"
"2021-07-05 10:34:19","1866491","8.167328","53.15918"
"2021-07-05 10:34:18","1866491","8.16735","53.159165"
"2021-07-05 10:34:17","1866491","8.167371","53.159148"
"2021-07-05 10:34:16","1866491","8.167388","53.159124"
"2021-07-05 10:34:15","1866491","8.167399","53.159105"
"2021-06-30 20:25:30","1862861","8.211288","53.150848"
"2021-06-30 20:25:29","1862861","8.211264","53.150851"
"2021-06-30 20:25:28","1862861","8.211269","53.150842"
"2021-06-30 20:25:27","1862861","8.211273","53.150836"
"2021-06-30 20:25:26","1862861","8.211279","53.150836"
"2021-06-30 20:25:25","1862861","8.211259","53.150848"
"2021-06-30 20:25:24","1862861","8.211263","53.15085"
"2021-06-30 20:25:21","1862861","8.211455","53.150782"
"2021-06-30 20:25:20","1862861","8.211453","53.150786"
"2021-06-30 20:25:19","1862861","8.211449","53.150792"


Solution

  • Answer to this question:

    Which information can I get from this kind of data?

    You have a timestamp, a tripId and a coordinate (longitude and latitude).

    So you know that the person or vehicle was at longitude 8.166987, latitude 53.160455 at 2021-07-05 10:35:01 during trip 1866491. From the first and last timestamp of a trip you can also calculate the trip's duration.
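
    As a minimal sketch of the duration calculation (assuming pandas is available; the inlined rows are a subset of the question's sample data and the variable names are only illustrative), the duration of each trip can be taken as the difference between its last and first timestamp:

```python
import io

import pandas as pd

# A few rows from the question's sample data, inlined so the sketch runs on its own.
CSV = '''"timestamp","tripId","longitude","latitude"
"2021-07-05 10:35:04","1866491","8.167035","53.160473"
"2021-07-05 10:35:00","1866491","8.166956","53.160448"
"2021-07-05 10:34:20","1866491","8.167286","53.15919"
"2021-07-05 10:34:15","1866491","8.167399","53.159105"
"2021-06-30 20:25:30","1862861","8.211288","53.150848"
"2021-06-30 20:25:19","1862861","8.211449","53.150792"
'''

# Parse the timestamps as datetimes and keep tripId as text (it is an identifier, not a quantity).
df = pd.read_csv(io.StringIO(CSV), parse_dates=["timestamp"], dtype={"tripId": str})

# Duration per trip = last timestamp minus first timestamp.
span = df.groupby("tripId")["timestamp"].agg(["min", "max"])
duration = span["max"] - span["min"]

print(duration)
```

    Note that this is the elapsed time between the first and the last fix, so recording gaps (such as the one between 10:34:20 and 10:35:00 in trip 1866491) are included in it.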

    You can also create a line for each trip by connecting its coordinates in timestamp order, and then find the locations where trips intersect each other.

    From the line features, you can calculate their length (the trip's distance). Combined with the trip's duration, this gives you the average walking or driving speed.
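
    The length and average speed can be sketched without any GIS library. The example below is one possible approach, not the only one: it assumes a spherical Earth (haversine formula with a mean radius of 6,371 km) and uses the five consecutive fixes of trip 1866491 from the sample data. The helper `haversine_m` and the variable names are made up for this illustration.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two longitude/latitude points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


# Consecutive fixes of trip 1866491 from the sample data, already in time order:
# (timestamp, longitude, latitude)
fixes = [
    ("2021-07-05 10:35:00", 8.166956, 53.160448),
    ("2021-07-05 10:35:01", 8.166987, 53.160455),
    ("2021-07-05 10:35:02", 8.167007, 53.160459),
    ("2021-07-05 10:35:03", 8.167023, 53.160469),
    ("2021-07-05 10:35:04", 8.167035, 53.160473),
]

# Trip length = sum of the distances between consecutive fixes.
length_m = sum(
    haversine_m(a[1], a[2], b[1], b[2]) for a, b in zip(fixes, fixes[1:])
)

# Trip duration = last timestamp minus first timestamp, in seconds.
fmt = "%Y-%m-%d %H:%M:%S"
start = datetime.strptime(fixes[0][0], fmt)
end = datetime.strptime(fixes[-1][0], fmt)
duration_s = (end - start).total_seconds()

# Average speed, converted from m/s to km/h.
speed_kmh = length_m / duration_s * 3.6

print(f"{length_m:.1f} m in {duration_s:.0f} s, about {speed_kmh:.1f} km/h")
```

    If you work with GeoDataFrame lines instead, keep in mind that geometries in EPSG:4326 are in degrees, so their `.length` is not in metres; reprojecting to a metric CRS first (for example with `to_crs`) is the usual way around that.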

    (Your example data contains no trips that intersect each other, though. I am not sure what you mean by waiting time.)
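
    If "waiting time" means the time a trip spends (nearly) standing still, one possible reading can be sketched as follows: add up the time between consecutive fixes whose implied speed is below a cut-off. This is only an assumption about what is meant. The cut-off `SPEED_THRESHOLD_MS`, the gap limit `MAX_GAP_S` and the helper names are hypothetical and would need tuning, because GPS jitter of a metre or so easily looks like slow movement. Deciding whether such a stop happens at a road intersection would additionally require the intersection locations (for example from a road network), which are not part of this data.

```python
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres (spherical approximation)
SPEED_THRESHOLD_MS = 0.5    # hypothetical cut-off: slower than this counts as waiting
MAX_GAP_S = 5               # hypothetical: skip pairs of fixes separated by a recording gap
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two longitude/latitude points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def waiting_seconds(fixes):
    """Sum the time between consecutive fixes whose implied speed is below the cut-off."""
    total = 0.0
    for (t1, lon1, lat1), (t2, lon2, lat2) in zip(fixes, fixes[1:]):
        dt = (datetime.strptime(t2, TIME_FORMAT) - datetime.strptime(t1, TIME_FORMAT)).total_seconds()
        if dt <= 0 or dt > MAX_GAP_S:
            continue  # out-of-order fixes or a gap in the recording: nothing can be said
        if haversine_m(lon1, lat1, lon2, lat2) / dt < SPEED_THRESHOLD_MS:
            total += dt
    return total


# Fixes of trip 1862861 from the sample data, sorted by time: (timestamp, longitude, latitude)
fixes = [
    ("2021-06-30 20:25:24", 8.211263, 53.15085),
    ("2021-06-30 20:25:25", 8.211259, 53.150848),
    ("2021-06-30 20:25:26", 8.211279, 53.150836),
    ("2021-06-30 20:25:27", 8.211273, 53.150836),
    ("2021-06-30 20:25:28", 8.211269, 53.150842),
    ("2021-06-30 20:25:29", 8.211264, 53.150851),
    ("2021-06-30 20:25:30", 8.211288, 53.150848),
]

print(waiting_seconds(fixes))
```

    All of these fixes lie within a couple of metres of each other, yet only some of the one-second steps fall below the cut-off. That shows how sensitive this kind of estimate is to GPS noise and to the chosen threshold.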


    Below is an example script that converts the points to lines and finds the points where the trips cross each other's paths:

    from itertools import combinations
    
    import pandas as pd
    import geopandas as gpd
    
    from shapely.geometry import LineString
    
    # Read CSV File
    df = pd.read_csv("trips.csv")
    
    # Create Points
    points = gpd.GeoDataFrame(
        df,
        geometry=gpd.points_from_xy(df.longitude, df.latitude),
        crs="EPSG:4326"
    )
    points = points.drop(columns=["latitude", "longitude"])
    
    # Make sure Points are ordered (important)
    points = points.sort_values(["tripId", "timestamp"])
    
    # Create Lines
    tolist = lambda x: LineString(x.tolist())
    lines = points.groupby(["tripId"], as_index=False)["geometry"].apply(tolist)
    lines = gpd.GeoDataFrame(lines, geometry="geometry", crs="EPSG:4326")
    

    Find points where trips are crossing paths with each other:

    # Get Intersection Points
    rows = []
    for index_a, index_b in combinations(lines.index, 2):

        trip_a = lines.loc[index_a]
        trip_b = lines.loc[index_b]
        point = trip_a["geometry"].intersection(trip_b["geometry"])

        if not point.is_empty:  # lines that do not cross yield an empty geometry

            rows.append({"tripA": trip_a["tripId"], "tripB": trip_b["tripId"], "geometry": point})

    # DataFrame.append was removed in pandas 2.0, so build the frame once from the collected rows
    intersection_points = gpd.GeoDataFrame(
        rows, columns=["tripA", "tripB", "geometry"], geometry="geometry", crs="EPSG:4326"
    )
    

    Write intersection_points to a CSV file:

    intersection_points["longitude"] = intersection_points.geometry.x
    intersection_points["latitude"] = intersection_points.geometry.y
    
    columns = ["tripA", "tripB", "latitude", "longitude"]
    intersection_points.to_csv("intersections.csv", columns=columns)
    

    Write the created lines and the intersection points to shapefiles:

    # Write Shape Files
    lines.to_file("trips.shp")
    intersection_points.to_file("intersections.shp")
    

    Example data with two trips intersecting each other (based on the question's example data):

    "timestamp","tripId","longitude","latitude"
    "2021-07-05 10:35:04","1866491","8.167035","53.160473"
    "2021-07-05 10:35:03","1866491","8.167023","53.160469"
    "2021-07-05 10:35:02","1866491","8.167007","53.160459"
    "2021-07-05 10:35:01","1866491","8.166987","53.160455"
    "2021-07-05 10:35:00","1866491","8.166956","53.160448"
    "2021-07-05 10:34:20","1866491","8.167286","53.15919"
    "2021-07-05 10:34:19","1866491","8.167328","53.15918"
    "2021-07-05 10:34:18","1866491","8.16735","53.159165"
    "2021-07-05 10:34:17","1866491","8.167371","53.159148"
    "2021-07-05 10:34:16","1866491","8.167388","53.159124"
    "2021-07-05 10:34:15","1866491","8.167399","53.159105"
    "2021-06-30 20:25:30","1862861","8.211288","53.150848"
    "2021-06-30 20:25:29","1862861","8.211264","53.150851"
    "2021-06-30 20:25:28","1862861","8.211269","53.150842"
    "2021-06-30 20:25:27","1862861","8.211273","53.150836"
    "2021-06-30 20:25:26","1862861","8.211279","53.150836"
    "2021-06-30 20:25:25","1862861","8.211259","53.150848"
    "2021-06-30 20:25:24","1862861","8.211263","53.15085"
    "2021-06-30 20:25:21","1862861","8.211455","53.150782"
    "2021-06-30 20:25:19","1862861","8.211449","53.150792"
    "2021-06-30 20:25:20","1862861","8.211453","53.150786"
    "2021-06-30 20:25:18","1862861","8.166607","53.159654"