A practical overview of A* (with code!)

Edit:

There have been some issues with copy-pasting my code into WordPress. I guess copy-pasting straight from Eclipse into WordPress mangled the ‘<‘ characters in many places. I have put part of the code on Pastebin.com.
The license of the code here is: CC-BY

Preface

By popular request I have decided to write a post about the A* algorithm. Some people say it’s easy to implement, but I think it’s non-trivial, especially for new programmers.

I have to put up a disclaimer: I am by no means an educator, and I don’t necessarily know how to teach people all that well. I might also make some mistakes, and I think it’s essential to note this, because when you teach or write a tutorial on an algorithm, people might take your words as absolute truth without questioning you, especially when they are not familiar with the subject matter.
Always ask yourself whether I am making a mistake and whether it makes sense to you. I will also try to revise and fix this post as questions and remarks about it arise.

Background

What is A*? A* is a search algorithm. Its most common use is to find the shortest path from a starting point to a goal. Finding this path is not always straightforward.
If you are on an open field, then pointing at your target and walking in a straight line is the shortest path, so you can basically solve that with a local rule. If you are inside a maze, standing at the starting point with an exit somewhere, you need to search for the exit. Aiming at the target and going in a straight line will not work, because walls and dead ends will block your progress.

But even then, you can have a local algorithm that will find the exit. One such algorithm is to always walk along the wall on your right. If the maze has no loops, this algorithm will find the exit, but it might make a lot of unnecessary turns and walking before reaching the goal.
A* is not only meant to find the exit, but also meant to do so more efficiently.

However, even A* sometimes can’t perform well. A maze is a good example: in a maze, the only possible path to the exit might also be the longest one, which with the most common heuristics will make A* explore a lot of wrong paths and dead ends.
In an environment that is not exactly a maze, but also not an open field, A* can be very efficient. Something like an open field with obstacles or a few short walls.

I also mentioned heuristics. Before we start looking at some code, I will explain what heuristics are. A heuristic is a set of rules or decisions based on a priori knowledge of the problem. You specify a certain heuristic assuming you know something about the nature of the problem you are about to solve.
Let’s get back to the maze. Say you are put in a maze, but you have no knowledge of how the maze is built. In this case you don’t know which algorithm will be the most efficient at solving it. You can pick an algorithm, but you will only get lucky if it solves the maze in a short time.
Now let’s say someone told you that the path to the exit in this maze is shorter than 100 meters (a meter being the metric unit of length). Now you can save yourself some time and walking, because you know that if you have walked more than 100 m along a certain path, it is the wrong path.
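Just to make that concrete, such a priori knowledge can be turned into a trivial pruning rule. This is only an illustrative sketch, not code from the game, and the names are made up:

// Hypothetical pruning rule: if we know in advance that the exit is reachable
// in under 100 meters, any partial path longer than that cannot be the right one.
bool shouldAbandonPath(double walkedSoFar, double knownUpperBound = 100.0)
{
	return walkedSoFar > knownUpperBound;
}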

Code, code, code…

The Wikipedia page on A* has pretty good pseudo code, and it can help you understand the algorithm. However, it uses some naive data structure choices that make A* very inefficient in practice. These inefficiencies come from the complexity of the data structures themselves and not from the A* algorithm.

I have decided to put here code from my game Sumerian Blood. This will also show you what code from a real game looks like, and it will hopefully not have as many bugs, since it was used and tested in a real game.
As before, I have to add a disclaimer that this code might not be bug-free, and might not be the best educational code. If you find bugs and errors, please point them out to me.
(The code was put on Pastebin due to WordPress’s mangling of source code.)

http://pastebin.com/RDCXrbD0

A bit scary, isn’t it?
I suggest going over the code with the Wikipedia pseudo code as a reference; it will make things clearer.
I will not go completely in depth into this code, but I will explain a few key points.

The algorithm basically keeps a front of nodes, or cells, that it advances with. It will go over many nodes and cells, and without a heuristic it will just spread this front the way a drop of ink spreads in a glass of water.
For each node or cell, the algorithm saves the actual distance, or more precisely the weighted distance, it took to reach it. If the algorithm reaches a cell that hasn’t been visited before, it simply marks it as a new front cell and saves the weighted distance for that cell.
If the algorithm reaches a cell that was already visited, it compares the weighted distances and overwrites the stored one if the new weighted distance is smaller.
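To make this more concrete, here is a minimal, self-contained sketch of that front expansion on a plain unit-cost grid, without any heuristic. The grid representation and the names here are my own illustration, not the code from Sumerian Blood:

#include <queue>
#include <utility>
#include <vector>
#include <limits>

// Minimal sketch: spread a front over a grid (0 = free, 1 = blocked), storing
// for every cell the best distance found so far and overwriting it when a
// shorter way to that cell is discovered.
void expandFront(const std::vector<std::vector<int> > &grid, int startX, int startY)
{
	if (grid.empty() || grid[0].empty())
		return;
	const double INF = std::numeric_limits<double>::infinity();
	std::vector<std::vector<double> > dist(grid.size(),
		std::vector<double>(grid[0].size(), INF));
	std::queue<std::pair<int, int> > front;	// the advancing front of cells

	dist[startY][startX] = 0.0;
	front.push(std::make_pair(startX, startY));

	const int dx[4] = {1, -1, 0, 0};
	const int dy[4] = {0, 0, 1, -1};
	while (!front.empty())
	{
		std::pair<int, int> cell = front.front();
		front.pop();
		for (int k = 0; k < 4; ++k)
		{
			int nx = cell.first + dx[k], ny = cell.second + dy[k];
			if (ny < 0 || ny >= (int)grid.size() || nx < 0 || nx >= (int)grid[0].size())
				continue;
			if (grid[ny][nx] != 0)	// blocked cell
				continue;
			double candidate = dist[cell.second][cell.first] + 1.0;
			if (candidate < dist[ny][nx])	// unvisited (INF) or a shorter way found
			{
				dist[ny][nx] = candidate;	// overwrite with the smaller distance
				front.push(std::make_pair(nx, ny));	// the cell (re)joins the front
			}
		}
	}
}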

Where does the heuristic come into play? The heuristic calculates the estimated distance, or weighted distance, it will take to reach the destination. This is stored inside neighbour.s.f in the code.
However, you will notice that apart from being stored in the neighbour node, this value is not referred to anywhere else in the search function, and it doesn’t affect the actual calculated weighted distance of each cell.

This is not entirely true: if you look at ACell, you will notice it has a “<” operator, and that operator compares the heuristic estimation score.
This comes into play when choosing the next node of the front to advance with. So instead of the front spreading like a drop of ink in a glass of water, it might now advance like a snake, or in some other shape. The heuristic gives priority to certain nodes in the front, so the search goes there first.
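Roughly speaking, the ordering works like the sketch below. This is only an assumption of what the comparison does; the real ACell in the Pastebin code stores its scores in a nested member (neighbour.s.f) and may define the operator differently:

// Illustrative cell whose ordering is driven by the estimation score 'f'
// (distance travelled so far plus the heuristic estimate to the goal).
struct Cell
{
	int x, y;
	double f;

	// std::priority_queue pops its "largest" element first. One common trick
	// to make it pop the cell with the *smallest* f is to invert the
	// comparison, as done here.
	bool operator< (const Cell &other) const
	{
		return f > other.f;
	}
};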

Some implementation details. You’ll notice that I am using a priority_queue instead of a linked list like in the Wikipedia pseudo code. This is because on each iteration we need to find the highest-priority node in the front according to the heuristic, and a priority_queue is the data structure best suited for this. I also use 2D arrays or maps instead of linked lists in some other parts. Again, those are more suitable data structures for the purpose.
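Continuing the illustrative Cell sketch from above (again, not the game’s actual AStar code), picking the most promising front node each iteration then looks roughly like this:

#include <queue>

// With the inverted operator< above, the top of the priority_queue is always
// the cell with the lowest estimated score, i.e. the one to expand next.
void frontUsageExample()
{
	std::priority_queue<Cell> front;

	Cell a = {0, 0, 12.5};
	Cell b = {3, 1, 4.0};
	Cell c = {1, 2, 7.25};
	front.push(a);
	front.push(b);
	front.push(c);

	while (!front.empty())
	{
		Cell best = front.top();	// lowest f, thanks to the inverted comparison
		front.pop();
		// ... expand 'best' here and push its neighbours with their new scores ...
	}
}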

Heuristics, again

Now that we know how the AStar algorithm works, I want to show you some possible heuristics.

class ChaseHeuristic: public AStar::Heuristic {
	public:
		ChaseHeuristic()
		{
			i = 0;              // iterations performed so far
			n = 8000;           // iteration limit before giving up
			e = 1.75;           // overestimation factor applied to the estimate
			D = 1;              // cost of a straight step
			D2 = sqrt(2.0)*D;   // cost of a diagonal step
			Distance = 0;       // start-to-goal flight distance, set in Start()
		}
		double Calculate (ACell w1, ACell w2, double Factor)
		{
			double d = min(fabs((double)w1.x-(double)w2.x), fabs((double)w1.y-(double)w2.y));
			double s = (fabs((double)w1.x-(double)w2.x) + fabs((double)w1.y-(double)w2.y));
			return e*(D2 * d + D * (s - 2*d));
		}
		bool IsStop (ACell w, double Factor)
		{
			double d = min(fabs((double)w.x-(double)Goal.x), fabs((double)w.y-(double)Goal.y));
			double s = (fabs((double)w.x-(double)Goal.x) + fabs((double)w.y-(double)Goal.y));
			if ((D2 * d + D * (s - 2*d)) < Distance*0.5)
				return true;
			return i>=n;
		}

		bool IsCancel ()
		{
			return false;
		}

		void Step()
		{
			i++;
		}
		void Start(ACell First, ACell Goal)
		{
			this->Goal = Goal;
			double d = min(fabs((double)First.x-(double)Goal.x), fabs((double)First.y-(double)Goal.y));
			double s = (fabs((double)First.x-(double)Goal.x) + fabs((double)First.y-(double)Goal.y));
			Distance = D2 * d + D * (s - 2*d);
			i = 0;
		}
		ACell Goal;
		double Distance;
		unsigned int n, i;
		double e;
		double D, D2;
};

This is the chase heuristic in Sumerian Blood. Again, I could have written simplified educational code, but I preferred to take code that is actually used in a real game.
Notice that this class implements the Heuristic class defined inside the AStar class, so an instance of it can be provided to the AStar.
This heuristic searches for the shortest path from the start position to the destination position. It is called chase because in the game it is used for the AI character to chase the player, who may be moving.

In the AStar algorithm the goal does not change its position; in the game we simply recalculate the path each time the player (the goal) changes its position.
First, take a look at the Calculate function. It computes the heuristic estimation; in short, it calculates the flight distance to the goal. Notice that this distance is multiplied by a factor ‘e’ (1.75).
I will explain this ‘e’ shortly.
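For reference, the formula inside Calculate is the standard “octile” distance on a grid that allows diagonal movement. Written on its own (with names of my choosing), it is:

#include <algorithm>
#include <cmath>

// Octile distance: the length of the shortest path on an empty grid where a
// straight step costs D and a diagonal step costs D2. You take as many
// diagonal steps as the smaller axis difference allows, then straight steps
// for whatever difference remains.
double octileDistance(double dx, double dy, double D, double D2)
{
	dx = std::fabs(dx);
	dy = std::fabs(dy);
	double d = std::min(dx, dy);        // number of diagonal steps
	double s = dx + dy;                 // total axis difference
	return D2 * d + D * (s - 2.0 * d);  // diagonals plus remaining straight steps
}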

IsStop compares the current flight distance to the goal with the flight distance from the starting point to the goal. If the current distance is smaller than half of the start-to-goal flight distance, it stops the search.
This is a performance optimization. In Sumerian Blood there are no actual walls, only a few obstacles, so covering half the distance to the goal probably means we are heading in the right direction.
Notice that I recalculate this path almost every frame, so even when we compute a path in the current frame, it is only used for the few small steps the AI makes before the path is recalculated again. That is why we don’t actually need the full path to the goal.
Another thing is that IsStop also checks the number of iterations already made and sees if it is over 8000 (should have been 9000 😉). This too is a performance consideration: if the AI can’t find the goal fast enough, it simply cuts the calculation short.
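To give a feel for how this fits into the game loop, here is a rough, hypothetical sketch. The Search call, its signature, and the path container are assumptions made up for illustration; the actual interface of the AStar class in the Pastebin code may differ:

#include <list>

// Hypothetical per-frame usage: recompute a (possibly partial) path toward the
// player's current cell, then follow only the first step before the next update.
void updateChaser(AStar &astar, ChaseHeuristic &heuristic,
                  ACell aiCell, ACell playerCell)
{
	std::list<ACell> path;                              // illustrative path container
	astar.Search(aiCell, playerCell, heuristic, path);  // assumed call signature

	if (!path.empty())
	{
		ACell nextStep = path.front();
		// ... move the AI one cell toward nextStep; the rest of the path is
		// discarded, since it will be recomputed on the next frame anyway ...
	}
}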

Now back to the ‘e’. The ‘e’ is a factor that multiplies the flight distance estimation. I use it to overestimate the weighted distance to the goal.
The reason is that I want to find the goal faster, so I give nodes that are closer to the goal in flight distance an even higher priority. This causes the front to spread much more in the direction of the goal.
This makes the search faster, but because I am overestimating, I might not find the optimal solution. For Sumerian Blood that’s OK, because there aren’t any walls in the battlefield, and this inaccuracy is negligible.

So far we have talked about reaching the goal, but with specific heuristics A* can be used to search for other kinds of paths: for instance, a fleeing path, or a path that keeps a certain distance from the goal.

class FleeHeuristic: public AStar::Heuristic {
	public:
		FleeHeuristic()
		{
			i = 0;              // iterations performed so far
			n = 8000;           // iteration limit before giving up
			e = 1.75;           // factor applied to the (negated) flight distance
			D = 1;              // cost of a straight step
			D2 = sqrt(2.0)*D;   // cost of a diagonal step
			w = 0;              // width of the battle arena
			h = 0;              // height of the battle arena
			l = 0;              // stop distance used by IsStop, computed in Start()
		}

		bool IsGoal() {return false;}

		double Calculate (ACell w1, ACell w2, double Factor)
		{
			double d = min(fabs((double)w1.x-(double)w2.x), fabs((double)w1.y-(double)w2.y));
			double s = (fabs((double)w1.x-(double)w2.x) + fabs((double)w1.y-(double)w2.y));
			return -e*(D2 * d + D * (s - 2*d));
		}
		bool IsStop (ACell w, double Factor)
		{
			Vector v;
			v.x = (double)Goal.x-(double)w.x;
			v.y = (double)Goal.y-(double)w.y;
			if (v.Length()>=l)
				return true;
			return i>=n;
		}

		bool IsCancel ()
		{
			return false;
		}

		void Step()
		{
			i++;
		}
		void Start(ACell First, ACell Goal)
		{
//			v.x = max((double)Goal.x, (double)w-(double)Goal.x);
//			v.y = max((double)Goal.y, (double)h-(double)Goal.y);
			// Compute the distance from the goal (the player) to each of the
			// four corners of the w x h arena rectangle.
			Length.clear();
			Vector v;
			v.x = (double)Goal.x;
			v.y = (double)Goal.y;
			Length.push_front(v.Length());        // corner (0, 0)
			v.x = (double)w-(double)Goal.x;
			v.y = (double)Goal.y;
			Length.push_front(v.Length());        // corner (w, 0)
			v.x = (double)w-(double)Goal.x;
			v.y = (double)h-(double)Goal.y;
			Length.push_front(v.Length());        // corner (w, h)
			v.x = (double)Goal.x;
			v.y = (double)h-(double)Goal.y;
			Length.push_front(v.Length());        // corner (0, h)
			// Drop the nearest corner and keep the distance to the second
			// nearest one as the stop distance.
			Length.sort();
			Length.pop_front();
			l = Length.front();
			this->Goal = Goal;
			i = 0;
		}
		std::list < double > Length;
		double l;
		ACell Goal;
		unsigned int w, h;
		unsigned int n, i;
		double e;
		double D, D2;
};

In this heuristic, we want the AI to flee from the player while also avoiding the obstacles. The estimation calculation is very similar to the one in the chase heuristic, only now we return the flight distance with a negative sign.
This causes the AStar to give priority to the nodes in the front that are actually moving away from the goal.
However, if we don’t give the AStar a proper stopping condition, it will do an exhaustive search, because it has no goal to reach.
The stop condition I use is whether the AStar has reached a cell that is farther from the goal than the second-closest corner of the “screen”. The battle arena in Sumerian Blood is a bounded rectangle, so this makes sense and works fine.

Conclusion

A* is not a trivial algorithm to understand, in my humble opinion. I hope that with this practical approach you will be able to implement your own A*.
I don’t have much to add, except that comments and corrections are welcome. I hope this becomes a useful resource.

Ofer.
