Taxi dataset - lat lon colocation

LeoAiE on 11 Nov 2022
Edited: LeoAiE on 12 Nov 2022
Hi everyone,
I’m practicing data science with the NY taxi data (https://www.kaggle.com/competitions/nyc-taxi-trip-duration/data). I want to check whether any taxis were located within a short distance of each other.
The task I’m trying to accomplish is:
  • Measure the distance from each taxi to all other taxis
  • Check whether two taxis were within 20 m of each other (just an example threshold)
  • Check the date column associated with those lat/lon matches
  • If the dates match, the two taxis visited the same general area on the same date
  • Return the unique IDs of the two taxis, the date, the distance, and the lat/lon of both
Here is my progress so far; I would really appreciate your help.
taxi = readtable('yellow_tripdata_2015-01.csv'); % reading the data
taxi.uniqueID = (1:height(taxi))'; % row number as ID: guaranteed unique, unlike randi
% measuring the distance from pick up to drop off, in meters (deg2km returns km)
taxi.distance_travelled_in_M = deg2km(distance(taxi.pickup_latitude, taxi.pickup_longitude, taxi.dropoff_latitude, taxi.dropoff_longitude)) * 1000;
% splitting date and time into two different columns
taxi.date = dateshift(datetime(taxi.tpep_pickup_datetime), 'start', 'day'); % datetime at midnight
taxi.time = timeofday(datetime(taxi.tpep_pickup_datetime)); % duration since midnight
% In this loop I take each lat lon point and run it against the rest of the
% data - probably not an efficient way - open to suggestions
% then convert the distance to meters
% check if the distance is within, for example, 20 meters
% and the date is the same - this step is not implemented yet, need help
% return the unique IDs of both taxis, the date of that event, the location
% (lat, lon), and the distance between them
pos_results = table(); % accumulates one row per nearby pair
for idx = 1:height(taxi)
    my_dist = distance(taxi.pickup_latitude(idx), taxi.pickup_longitude(idx), taxi.pickup_latitude, taxi.pickup_longitude); % distances in degrees
    my_dist = deg2km(my_dist) * 1000; % distance in meters
    hit = find(my_dist < 20); % test each row; "if my_dist < 20" is true only when ALL rows match
    hit = hit(hit > idx); % skip the self-match and pairs already reported
    new_rows = taxi(hit, ["uniqueID", "date", "pickup_latitude", "pickup_longitude"]);
    new_rows.distance_m = my_dist(hit);
    new_rows.matched_with = repmat(taxi.uniqueID(idx), numel(hit), 1); % ID of the row being compared
    pos_results = [pos_results; new_rows];
end
  2 Comments
Walter Roberson on 12 Nov 2022
my_dist = distance(taxi.pickup_latitude(idx), taxi.pickup_longitude(idx), taxi.pickup_latitude, taxi.pickup_longitude); % distances in degrees
Those are only meaningful as degrees if distance() is the Mapping Toolbox function, which returns great circle arc length. A plain Euclidean calculation on raw latitude and longitude values would not be: it treats a curved surface as flat, so the result is only meaningful near the equator. New York City is at 40.7128N, and cosd() of that is about 0.76, so a degree of longitude there covers only roughly 3/4 of the distance of a degree of latitude.
Have you considered calculating the great circle distance?
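For reference, great circle distance needs nothing beyond base MATLAB. The sketch below uses the haversine formula and assumes a spherical Earth with a mean radius of 6371 km; the name haversine_m is made up for this example and is not a toolbox function.

```matlab
% Hypothetical helper (not a toolbox function): haversine great circle distance.
% Inputs are in degrees (scalars or same-size arrays); output is in meters.
% Assumes a spherical Earth with mean radius 6371000 m.
haversine_m = @(lat1, lon1, lat2, lon2) 2 * 6371000 * asin(sqrt( ...
    sind((lat2 - lat1) / 2).^2 + ...
    cosd(lat1) .* cosd(lat2) .* sind((lon2 - lon1) / 2).^2));

% One degree of latitude versus one degree of longitude near New York City
d_lat = haversine_m(40.7128, -74.0060, 41.7128, -74.0060);
d_lon = haversine_m(40.7128, -74.0060, 40.7128, -73.0060);
fprintf('1 deg latitude:  %.1f km\n', d_lat / 1000);
fprintf('1 deg longitude: %.1f km\n', d_lon / 1000);
fprintf('ratio: %.3f\n', d_lon / d_lat);
```

The printed ratio should land close to cosd(40.7128), which is where the "roughly 3/4" figure comes from.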
LeoAiE on 12 Nov 2022
Thank you for your comments! I have not. I will look into how to calculate great circle distance!

Answers (1)

Walter Roberson on 12 Nov 2022
Edited: Walter Roberson on 12 Nov 2022
Instead of looping, use rangesearch, which is a nearest-neighbor search like knnsearch but bounded by distance rather than by neighbor count.
Either supply a custom distance function that computes great circle distance, or multiply the longitudes by cosd() of the latitude so that both coordinates are on the same scale, and then use Euclidean distance.
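A rough sketch of the rangesearch route, assuming the Statistics and Machine Learning Toolbox is available. The coordinates are made-up sample points, not rows from the CSV, and the projection assumes a mean Earth radius of 6371 km, with longitude multiplied by cosd() of a reference latitude so that both axes end up in meters.

```matlab
% Small made-up sample near Manhattan (hypothetical data, not from the CSV)
lat = [40.750000; 40.750100; 40.760000; 40.750050];
lon = [-73.990000; -73.990050; -73.980000; -73.990020];

% Project to local meters: scale longitude by cosd of a reference latitude
R = 6371000;                        % assumed mean Earth radius in meters
lat0 = mean(lat);                   % reference latitude for the scaling
xy = [deg2rad(lon) .* cosd(lat0) * R, deg2rad(lat) * R];

% rangesearch requires the Statistics and Machine Learning Toolbox
radius_m = 20;
[idx, dist] = rangesearch(xy, xy, radius_m);

% Flatten the cell output into unique (i, j) pairs with i < j
pairs = zeros(0, 3);
for i = 1:numel(idx)
    keep = idx{i} > i;              % drops the self-match and mirrored pairs
    n = nnz(keep);
    pairs = [pairs; repmat(i, n, 1), idx{i}(keep)', dist{i}(keep)']; %#ok<AGROW>
end
disp(pairs)   % columns: row i, row j, distance in meters
```

Over a city-sized area the flat approximation should be accurate to well under a meter at a 20 m radius, and the search avoids comparing every row against every other row.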
  3 Comments
Walter Roberson on 12 Nov 2022
https://www.mathworks.com/help/map/ref/distance.html if you have the Mapping Toolbox
LeoAiE on 12 Nov 2022
Edited: LeoAiE on 12 Nov 2022
Yes, I do have it, and that's why I used the distance function in my original post, but I guess I have to specify something like this:
s = referenceSphere('Earth')
distance(taxi.pickup_latitude(1), taxi.pickup_longitude(1), taxi.pickup_latitude(2), taxi.pickup_longitude(2),s)
The remaining issue is how to check the date column associated with those lat/lon matches. If the dates match, meaning the two taxis visited the same general area on the same date, I want to return the unique IDs of the two taxis, the date, the distance, and the lat/lon of both.
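One possible way to handle the date check, as a sketch only. It assumes a list of candidate pairs (row i, row j, distance in meters) is already available from a radius search; the small table, its column names, and the pair distances below are made up for illustration.

```matlab
% Made-up stand-in for the taxi table (hypothetical values and column names)
taxi = table( ...
    [101; 102; 103; 104], ...
    datetime(2015, 1, [5; 5; 6; 5], [8; 9; 8; 23], [0; 30; 0; 15], 0), ...
    [40.750000; 40.750100; 40.750050; 40.760000], ...
    [-73.990000; -73.990050; -73.990020; -73.980000], ...
    'VariableNames', {'uniqueID', 'pickup_time', 'lat', 'lon'});

% Candidate pairs: row i, row j, distance in meters (illustrative numbers)
pairs = [1 2 11.9; 1 3 5.8; 2 3 6.1];

% Keep only pairs whose pickups fall on the same calendar day
pickup_day = dateshift(taxi.pickup_time, 'start', 'day');
r1 = pairs(:, 1);
r2 = pairs(:, 2);
same_day = pickup_day(r1) == pickup_day(r2);
r1 = r1(same_day);
r2 = r2(same_day);

% One row per match: both IDs, the shared date, the distance, both locations
result = table(taxi.uniqueID(r1), taxi.uniqueID(r2), pickup_day(r1), ...
    pairs(same_day, 3), taxi.lat(r1), taxi.lon(r1), taxi.lat(r2), taxi.lon(r2), ...
    'VariableNames', {'id_1', 'id_2', 'date', 'distance_m', ...
                      'lat_1', 'lon_1', 'lat_2', 'lon_2'});
disp(result)
```

Comparing datetime values truncated to the day avoids string comparison on formatted dates. If "same date" should instead mean "within a few hours", the equality test could be swapped for a tolerance on the difference of the two pickup times.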

Release

R2022b
