Dijkstra's algorithm
29 September 2007, 04:33 PM
Learning path finding is probably one of the most exciting experience for a programmer. I remember my share of frustation with the little maze game I wrote years ago, that was the bad guys can't find ways to my character (yes I put bad creatures in the maze just for fun), wandering aimlessly like plagues. Years later, still dazed by the idea of boundless information called the internet, I stumbled upon a tutorial for A* (A-star) path finding algorithm. It was the first path finding algorithm that inspired me, too bad the maze game has long been buried with Pascal. The poor maze dwellers never get a chance for brain tweaking. To be honest, at that time learning A* was a horrible yet exciting challenge, especially with no knowledge of dijkstra it took quite some time to run the first successful prototype. Since A* is a dijkstra with heuristic, it is important to learn dijkstra's algorithm first before trying A*. At least that is how I feel based on previous experience.

Edit: This is quite a long post so I decided that seeing how the algorithm works by mean of applet will give a nice head start, especially if you are new with the algorithm. Check the applet first, see the algorithm in action, and come back to read the rest of this article. By the way, it is possible to understand the algorithm by just looking at how it works!
See dijkstra's algorithm java applet interactive demo

Dijkstra's algorithm is basically a greedy path finding algorithm, where the goal is to find the cheapest route from a given starting point to any other point. Consider the following figure:

There are eight points or nodes, with each having alphabetical signs from a to h. The straight lines indicates path between nodes, and the numbers represent the cost between any two connected nodes. The cost meaning can be anything. It can be a distance, traffic jam factor, terrain cost, road risk, etcetera, or a combination of all these. But for our puspose, lets just say the numbers are cost.

Suppose we want to find the best way from node g to node f. Can you find it? ...
Ok that was fast. The obvious answer will be g->b->a->f with a total movement cost of 3 units. Although the answer is so obvious to us, it isn't so once we try to translate that obviousness into program code. That is what we want to tackle in this article, using Dijkstra's algorithm of course.

Before anyone can hope to understand the algorithm, it is a must to take note of a few terms. The open list, closed list, and movement cost. Movement cost must never be mistaken as cost. The movement cost represents a total cost sum from a starting point to any other node. For example the movement cost for path g->b->a->f is 3 units. In the other hand cost is the cost between TWO neighboring nodes, which means for example between g and b the cost is 1 unit. During initialization, both open list and closed list is empty. The open list, closed list, and movement cost are also characteristic of Dijkstra's algorithm.

Let us solve the previous best path question using Dijkstra's algorithm. To find the best path from g to f, first declare both open list and closed list as empty array. Set movement cost and parent node for all nodes (yes all of them a to h) to 0 and null respectively. Again don't confuse a movement cost as cost between nodes. Once this initialization is done, put the starting node g into open list, so the open list becomes, open = { { g, 0, null } }. What?? Now, now... as a decent programmer you shouldn't fuss over something like, open = { { g, 0, null } }. It simply means open is an array with one element in it, that one element is a node with name g, has a movement cost of 0, and has a parent of null. Less cloudy now? Great, the algorithm can start kicking now.

Step one, pick one node from open list with the LOWEST movement cost. Pretty obvious, at this time only one element exists - node g. So we don't have much option do we? Select the node and proceed to next step.

Step two, find all immediate neighbors for the node we picked previously. Looking back at the figure up there, g has three neighbors, b, h, and d. Since all three is not (yet) members of open list, set parent node for all node b, h, and d to g, and calculate movement cost for node b, h, and d to 0+1, 0+0.5, and 0+2 respectively. Why set parent node? Why calculate movement cost? Why 0+1, 0+0.5, 0+2?? Keeping parent node record is important for back tracking, especially when tracing from target node f back to starting node g, but of course this is after the search is done. Keeping total movement cost is also important, it tells the movement cost of any visited node from starting node g. 0+1 means movement cost of g + cost needed to travel from g to b. 0+0.5 means movement cost of g + cost needed to travel from g to h. 0+2 means, well you get the idea. Now remove g from open list and transfer it to closed list. Put all its neighbors, b, h, and d into the open list. Don't bother about g anymore. The open list should now contain, open = { { b, 1, g }, { h, 0.5, g }, { d, 2, g } }, and closed list, closed = { { g, 0, null } }. One more thing to remember is b, h, and d's parent node is g, while g itself has no parent. The figure below graphically describes the search progress so far. Arrow indicates child/parent relationship between nodes, a child always points to its parent. Red number inside a bracket is total movement cost. Black number is cost between two nodes.

Now to continue the search, repeat step one by picking a node with the lowest movement cost from open list. Do you agree with selecting node h? Sure it has the lowest movement cost of (0.5), compared to b with (1), or d with (2). With a picked node of h repeat step two by finding all of its immediate neighbors, node g and d. Node g is in the closed list, so just ignore it. The other neighbor node d is already a member of open list, meaning d has a parent (node g) and a current movement cost (2). Because of this we can't just change the parent and movement cost like we did previously. What to do now is to check which path is shorter. The current movement cost for d is (2), this is due to d's parent is g, and the travel cost from g to d is (0)+2 = 2. The current movement cost for h however is (0.5), and to travel from h to d only an additional cost of 0.5 is needed, making the path g->h->d to cover only 1 unit of movement cost. Clearly path with less movement cost is to be favored. To reflect this, reassign d's parent to h and recalculate its movement cost to (0.5)+0.5 = 1. Node h can now be removed from open list and closed. At this point open list should be, open = { { b, 1, g}, { d, 1, h } }, and closed list is, closed = { { g, 0, null }, { h, 0.5, g } }. Notice node d in open list has changed, movement cost is now 1 and parent is node h. Following figure shows the search progress so far.

From here on it is a matter of repeating step one and step two until open list becomes empty, or the target node f is put into the closed list. If any one of these conditions is met, the search stop. If target node f is in the closed list, there is a best path solution, and it can be retrieved by tracing back from f to g using the parent information. Reverse the trace and that's the best path.

Easier said than done huh? Alright, it seems we have to explore the search till completion anyway. Revisiting step one, pick a node with lowest movement cost from open list. This time we have a tie, both node b and d has equal movement cost of (1). In this case, pick b because it comes first. From step two, find all of b's neighbors, and we get two nodes, a and c. Both a and c are not members of open list so set their parent to b, and calculate their movement cost with (1)+1 and (1)+2 respectively. (1)+1 is due to (b)+a, and (1)+2 is (b)+c, movement cost wise. Put both node a and c to open list and close node b by removing it from open list and tranfering to closed list. At this time open list should be, open = { { d, 1, h }, { a, 2, b }, { c, 3, b } }, and closed list, closed = { { g, 0, null }, { h, 0.5, g }, { b, 1, g } }. Check figure below for the latest search progress.

Continue the search again by repeating step one and two. This time, node with lowest movement cost in open list is d, and its neighbors are g, h, c, and e. That's a handful of nodes. Lets check them one by one. Node g and h are already closed so ignore them. Node c however is already a member of open list. We need to check which is a better path. Currently node c is a child of b, and the movement cost of c from b is (3). However, since movement cost for node d is (1), it takes only 2 movement cost to get to c from d. Clearly moving to c from d is much better than from b. Because of this reassign c's parent node to d, and recalculate its movement cost to (2). Now we check the last neighbor e. Node e is not in open list, so just assign its parent to d, calculate its movement cost to (1)+1, and put it into the open list. Close node d. Confirm that open list is now, open = { { a, 2, b }, { c, 2, d }, { e, 2, d } }, and closed = { { g, 0, null } , { h, 0.5, g }, { b, 1, g }, { d, 1, h } }. See following figure for the latest search progress.

Pick node with lowest movement cost from open list, at this time all a, c, and e ties with movement cost of (2). Just pick node a because it comes first in the list. Find all of its neighbors to get node b, e, and f. Node b is closed ignore it. Node e is in the open list and currently has a movement cost of (2). To get to e from a, total movement cost of 4 is needed, twice the current movement cost of e, so just leave node e be. Node f is not in open list, so assign node a as its parent, calculate its movement cost to (2)+1, and put it into the open list. Close node a. Right now open list should be, open = { { c, 2, d }, { e, 2, d }, { f, 3, a } }, and closed = { { g, 0 , null } , { h, 0.5 , g }, { b, 1, g }, { d, 1, h }, { a, 2, b } }. Search progress is pictured below,

Pick node c from open list and we find that all of its neighbors, node b and d, are already closed. Since there is nothing else to do, close node c. Now open list should be, open = { { e, 2, d }, { f, 3, a } }, and closed = { { g, 0 , null } , { h, 0.5 , g }, { b, 1, g }, { d, 1, h }, { a, 2, b }, { c, 2, d } }. Next pick node e from open list and we find that it has three neighbors, node a, d, and f. Node a and d are closed just ignore them. Node f however is already a member of open list and currently has a movement cost of (3). Moving from e to f takes (2)+2 = 4, larger than (3) so don't do any modification to node f. Close node e. Open list will be, open = { { f, 3, a } }, and closed = { { g, 0, null }, { h, 0.5, g }, { b, 1, g }, { d, 1, h }, { a, 2, b }, { c, 2, d }, { e, 2, d } }. Below is the latest search progress,

Only one node remaining in our open list. So pick it and do the routine neighbor check. It has node a and e as neighbors however both node a and e are already closed. Finally we close node f by putting it into the closed list. Open list then becomes, open = { }, and closed list becomes, closed = { { g, 0, null }, { h, 0.5, g }, { b, 1, g }, { d, 1, h }, { a, 2, b }, { c, 2, d }, { e, 2, d }, { f, 3, a } }. That's it. The search is done, open list is now empty and target node f is in the closed list. In practice meeting any one of the condition should end the search. However in our case, both conditions are met. Now how do we know of the shortest path from g to f? Easy, take node f and trace backward using the parent information. You should get f->a->b->g. The total movement cost from g to f? Even easier, take node f and you get (3) as the total movement cost. That is how Dijkstra's algorithm find the best path between two nodes. In fact not just between two nodes, the algorithm actualy found the shortest path to ALL other nodes (yes not just to f) from node g. The end result of Dijkstra search can be described by the figure below. Black arrow lines represent child/parent relationship between nodes. A child node always points to its parent, arrow wise. The gray lines are paths that will never be used.

Looking at the figure, if one pick ANY node other than g, the path from that node to node g can be traced by simply following the arrow, and one can be sure it is the shortest possible path to g. The catch is, this is only true if the search is stopped once open list becomes empty.

That concludes the tiring walkthrough.

Last edited on 29 March 2009

~Xventure, 16 July 2009, 04:56 AM
Hmm...very clear tutorial :) thx alot
~unique, 3 November 2009, 02:32 PM
good explanation of algo !!
can u implement it using prority queue
~John, 4 November 2009, 06:54 PM
Thats an excellent suggestion unique. Maintaining the 'open list' with priority queue (like a simple sorted list) makes it more efficient. However using it practically means implementing a sorting algorithm of some sort - which I want to avoid explaining in this already long winded article ;)
~Pico, 13 December 2009, 05:21 PM
A very good and detailed walkthrough.

Thanks for this.
~good guy, 10 August 2010, 01:27 AM
shit this is too long.
~goes, 5 April 2011, 10:41 AM
Thanks man, it was perfect
~djg, 2 May 2011, 07:27 PM
Very helpful explanation of Dijkstra\'s algorithm.

Thanks very much!!
~dave, 8 May 2011, 01:04 AM
~Mukesh, 22 May 2011, 09:07 AM
Best tutorial to understand Dijkstra\'s algo.Great work.
~Mike, 1 March 2012, 11:32 PM
Simply the best tutorial. Some people want to mix into too much theory, getting in the way of grasping the basic principle. Others start with a good walkthrough to get the hang of it, but get lazy or cop out, rather than completing the conceptualization. You sir, did neither, and for it you are a true champion.
~Ciera, 2 September 2012, 12:28 PM
Thank you so much for this. I had a totally hard time understanding what was written in the book. Thank God, I found this very detailed and very understandable walkthrough. What a big help! Thanks!
~thunk, 11 October 2012, 11:02 AM
man, well done! nice work!
~Harsh, 14 December 2012, 12:14 PM
Very well explained. Luck that I could search the shortest path to here... :)