pgrouting Dijksta Travelling Salesman Problem (TSP) with OpenStreetmap

By dose | November 17, 2013
Under: Uncategorized

I recently had to educate myself about Route Optimization due to an upcoming project. One of the interesting features was to solve the so-called “Travelling Salesman Problem“: To find the correct ordering of places to visit when driving on a road in order to minimize the number of km to drive (so cost=length of route in this case).

I wanted to do the Optimazation on freely available data, therefore I choose OpenStreetMap as a data source. I then tried to find out which program is suited best to do such kind of optimization under Linux. I first stumbled upon OsmSharp and took the pain of all the Mono compilation woes, but soon learned, that it does not suit well for the imposed task, because it has to load all data before doing optimization and so it doesn’t work well in a timely manner, if you for example want to do an optimization for a lot of points within Austria.

So I learned about pgrouting which seems to be a PostgreSQL database designed specially for Geospatial processing. There is a simple guide on how to import OSM data to pgrouting.  libgaul isn’t needed anymore, as the TSP extention dependency on it was removed in version 2.0.

I found out, that there are simple TSP-Solutions in pgrouting using the pgr_tsp function, however, they only do euclidean distance checking and that wasn’t what I wanted (even tough the help for the function mentions that euclidean distance should be sufficient in most cases – for me, it wasn’t good enough). I wanted to find the correct order of points using the given road network. In order to do this, you have to take all your points you want to visit and then calculate the distance between all of them forming a distance matrix. As the way between all points is the same in both directions, you end up calculating n²/2-n ways (where n is the number of points to visit). Then you have to do a TSP-calculation over that matrix and as a result, you get the correct order.
To calculate the fastest route between 2 points, you can use the pgr_kdijkstraCost SQL function of pgrouting (as the name implies, it’s using the Dijkstra algorithm).

I am an absolute newbie on PostgreSQL, therefore my solution may not be the cleanest, but here is how I solved this problem:

1) Install my pgrouting Dijkstra functions, which basically offer you 2 functions:

pgr_tspDijkstra    - Does a TSP optimization using Dijkstra distances
pgr_tspDijkstraLen - Does the same, but implies that cost = length of route

2) My functions are designed so that you have to create a table containing all the points that you want to visit on your route. So create this table:

create table my_route (id serial, lat double precision, lon double precision,
  node integer);

3) Now you most probably have the points you want to visit as Longitude/Latitude points, so just insert your points into that table:

insert into my_route(lat,lon) values (48.30609, 14.28642);
...

4) Now you have to calculate the correct nodes for all the points you entered, because pgrouting is only accepting node-IDs. This can be done with the folowing statement:

update my_route g set node=(
  SELECT source FROM (
    SELECT source, distance(the_geom,
      GeometryFromText('POINT('||g.lon||' '||g.lat||')', 4326)) AS dist
    FROM ways
    WHERE the_geom && setsrid(
     ('BOX3D('||g.lon-0.1||' '||g.lat-0.1||','||g.lon+0.1||' '||g.lat+0.1||')'
     )::box3d, 4326)
    order by dist LIMIT 1
  ) as foo
);

5) Now that you have all the nodes in your table, it’s time to do the real calculation. If length is the only cost of a route for you, simply use the pgr_tspDijkstraLen function. You have to find out the node ID of the starting node from your my_route table and pass it to the function, so that it knows where to start. The syntax for the function is:

function pgr_tspDijkstraLen(thetbl text, start_id integer,
   end_id integer default (-1))
thetbl   - Name of your table that contains all the nodes to visit
start_id - Node ID of the starting point
end_id   - Node ID of the last point to visit. If not given, it's a round trip
           ending at the starting point.

Therefore it’s as simple as i.e.

select * from pgr_tspDijkstraLen('my_route', 189064);

As a result, you get a table with the folowing columns:

 seq | id1 |  id2   |       cost
-----+-----+--------+------------------
   0 |   2 |  64186 | 11.1983268959869
...
seq  - row sequence number in the resulting table.
id1  - internal index into the distance matrix
id2  - id of the node
cost - cost to traverse from the current node to the next node.

So theoretically, you can use the following statement to get your lon/lat points ordered:

select id,lon,lat from pgr_tspDijkstraLen('my_route', 189064
) dj, my_route rt where dj.id2=rt.node;

Now there may be the case that you have your own slightly trickier definition of the cost, not just the length of the route. In this case, you can use the pgr_tspDijkstra function.
Its syntax is:

function pgr_tspDijkstra(thetbl text, sql text, start_id integer, 
   end_id integer default (-1))

thetbl   - Name of your table that contains all the nodes to visit
sql      - SQL statement that returns a pgr_costResult to do TSP optimization on
start_id - Node ID of the starting point
end_id   - Node ID of the last point to visit. If not given, it's a round trip
           ending at the starting point.

So basically, it’s just the same as the pgr_tspDijkstraLen function, but hsa the additional parameter sql, which lets you define your own costresult to do TSP optimization on. For an example on how to use this, just have a look at what pgr_tspDijkstraLen returns:

     return query SELECT * FROM pgr_tspDijkstra(thetbl,
          'SELECT gid AS id, source::integer, target::integer,
           length::double precision AS cost FROM ways',
           start_id, end_id);

I hope that this function is useful to anybody and feedback is appreciated.

10 comments | Add One

Comments

  1. Ivan - 12/10/2013 at 16:57

    Nice example. Thanks for providing your solution to all of us. I played a little bit with your code and stumbled upon something interesting, which I would like to share with you.

    Step 4 can be reduced drastically if you use the KKN-Method newly implemented in PostGIS 2.0.
    Your code could look like this

    update my_route g set node=(
    SELECT source FROM ways
    ORDER BY the_geom st_GeometryFromText(‘POINT(‘||g.lon||’ ‘|| g.lat||’)’, 4326)
    LIMIT 1
    );

    I checked this on my very very slow machine and gained a couple of seconds with this approach.

    I thought it might be interesting for you.

    Bye
    Ivan

  2. Ivan - 12/10/2013 at 16:59

    Sorry little mistake:

    update my_route g set node=(
    SELECT source FROM ways
    ORDER BY the_geom st_GeometryFromText(‘POINT(‘||g.lon||’ ‘|| g.lat||’)’, 4326)
    LIMIT 1
    );

  3. Ivan - 12/10/2013 at 17:00

    OK
    your comment system sucks 🙂

    Between the_geom and st_GeometryFromText should be “” (without the “”)

  4. dose - 12/12/2013 at 12:59

    Hi,

    Sorry for the wordpress bugs which seem to limit some input characters?
    Maybe you could just spell the missing characters out?
    I guess there may be problems with some type of brackets?I’d be curious how to improve the query 🙂

  5. Ivan - 12/15/2013 at 13:55

    Hi,

    ok let’s try it this way:

    update my_route g set node=(
    SELECT source FROM ways
    ORDER BY the_geom <-> st_GeometryFromText(’POINT(’||g.lon||’ ‘|| g.lat||’)’, 4326)
    LIMIT 1
    );

  6. Ivan - 12/15/2013 at 14:15

    Second improvement:

    Use bounding boxes for your queries. Therefore you have to include “the_geom” into the my_route table followed by an ANALYZE on the “the_geom” column.

    Then use
    bbox := st_expand(st_estimated_extent(‘my_route’, ‘the_geom’), 0.1); –0.1 can be varied of course

    Finally include this bounding box (bbox) in your subsequent pgr_tspdijkstralen method exactly the way you already did in step 4).

    This boiled my queries down from something around 13 ~ 15 seconds to 0.5-1.5 seconds.

    Hope you might find it helpful

    Cheers

  7. Carsten - 02/20/2014 at 10:18

    Hi,

    thanks for this post! This is really the solution for my TSP. But my DB is complaining about one of your functions.

    I got my routing table from osm2po. The table has the following structure:

    1;”id”;”integer”
    2;”osm_id”;”bigint”
    3;”osm_name”;”character varying”
    4;”osm_meta”;”character varying”
    5;”osm_source_id”;”bigint”
    6;”osm_target_id”;”bigint”;
    7;”clazz”;”integer”
    8;”flags”;”integer”
    9;”source”;”integer”
    10;”target”;”integer”
    11;”length”;”double precision”
    12;”kmh”;”integer”
    13;”cost”;”double precision”
    14;”reverse_cost”;”double precision”
    15;”x1″;”double precision”
    16;”y1″;”double precision”
    17;”x2″;”double precision”
    18;”y2″;”double precision”
    19;”geom_way”;”geometry(LineString,4326)”

    – I install your functions via SQL window in pgAdmin3 (copy&paste then excecute)
    – I create and populate the my_route table
    – When it comes to step 5 and I try

    select id,lon,lat from pgr_tspDijkstraLen(‘my_route’, 1260
    ) dj, my_route rt where dj.id2=rt.node;

    my DB warns

    ERROR: cannot concatenate incompatible arrays
    DETAIL: Arrays with differing element dimensions are not compatible for concatenation.
    CONTEXT: PL/pgSQL function “pgr_makedijkstramatrix” line 25 at assignment PL/pgSQL function “pgr_tspdijkstra” line 8 at RETURN QUERY
    PL/pgSQL function “pgr_tspdijkstralen” line 3 at RETURN QUERY

    ********** Fehler **********

    ERROR: cannot concatenate incompatible arrays SQL Status:2202E
    Detail:Arrays with differing element dimensions are not compatible for concatenation.
    Kontext:PL/pgSQL function “pgr_makedijkstramatrix” line 25 at assignment PL/pgSQL function “pgr_tspdijkstra” line 8 at RETURN QUERY
    PL/pgSQL function “pgr_tspdijkstralen” line 3 at RETURN QUERY

    Do you have any idea what went wrong? I really really need your solution get working. I would appreciate a lot.

    Thanks

    Carsten

  8. Stephen Woodbridge - 02/20/2014 at 17:11

    I recently added a new function in the develop branch called pgr_vidsToDMatrix() in the src/kdijkstra/ directory that takes an array of vertex ids and returns a distance matrix. It makes N calls to pgr_kdijkstra() in C with some optimizations so it is much faster. This will be part of the next release of pgRouting.

    BTW, nice write up!

  9. Daniel - 07/18/2014 at 04:02

    This is great, thanks for posting!

    I’m getting this error trying to run your pgr_tspDijkstraLen function.

    ERROR: Error: matrix[num][num] in its definition.
    CONTEXT: PL/pgSQL function pgr_tspdijkstra(text,text,integer,integer) line 8 at RETURN QUERY

    Any ideas?
    PL/pgSQL function pgr_tspdijkstralen(text,integer,integer) line 3 at RETURN QUERY

    ********** Error **********

    ERROR: Error: matrix[num][num] in its definition.
    SQL state: XX000
    Context: PL/pgSQL function pgr_tspdijkstra(text,text,integer,integer) line 8 at RETURN QUERY
    PL/pgSQL function pgr_tspdijkstralen(text,integer,integer) line 3 at RETURN QUERY

  10. Daniel - 07/18/2014 at 04:24

    If it helps diagnose, Im seeing a lot of NULL values in the dmatrix.

Trackbacks

Leave a Comment

Name:

E-Mail :

Subscribe :
Website :

Comments :