Grave of Creativity. A tip on how a 3D game coder can get by on a shoestring budget.

I am working on my latest game called “Dragons High”.

The game is a 3D dragon flight combat simulation. It’s meant for mobile platforms and will also have a desktop build.

I am a programmer and I can do a little bit of art but my art is not really that good.

I have recently learned that to make games faster, and to make them look better, I should buy stock art and models.

The reason is that a good, dedicated 3D artist can be hard to find and expensive, while doing the art yourself is cheap but takes a lot of time and often gives poor results.

However, you can’t always find what you want on the few royalty-free art asset websites.

I was working on the GUI of Dragons High and I was in need of a next button.

I modeled a button in Lightwave and rendered it into a 2D sprite.

This was the result:

My Next Button

This is how it looked inside the game:

GUI Screenshot

At first I thought “Who cares, it’s good enough” but then I realized how bad it looks and how a simple button like this can make your game look unprofessional.

I then began looking for some assets to buy but I couldn’t find anything that was an arrow button or a sign that looked suitable for the game.

Then I found this:

Grave Demonstration

It’s not really a sign or an arrow, and it has writing on it. But what if I rotated it by 90 degrees and rendered it from the back?

After a few rendering tweaks to make it look more suitable for the game (less cartoony), I got this:

Next Wood

This looks like an arrow and is way better looking than what I did myself.

So this is how you can get along on a budget and not give up on quality.

Just try to make the most of limited resources by editing existing assets into things with a different purpose than what was originally intended.

For the sake of completeness, here is the GUI with the new next button:

New GUI Screenshot

GLSL (OpenGL Shading Language) compilation bug with for loop on Adreno 205, Android.

One of the biggest advantages of the OpenGL API specification is that OpenGL is language agnostic.

That means it can be used from almost any programming language, which makes it a very portable library.

However, there is a serious issue with OpenGL. Its shading language (GLSL) has no standard compiled form. You can’t rely on binary files of compiled shaders to work on different devices.

Not only that, but compiling the same GLSL source code at run time on different devices might produce different results or even silent bugs (depending on the driver implementation).

My game Shotgun Practice was running perfectly on my device (Galaxy Note N7000) but didn’t work on my friend’s device (HTC Desire Z).

On my friend’s ‘HTC Desire Z’ Android device with the ‘Adreno 205’ GPU it had graphics artifacts.

After quite some tests I found that a specific shader was the culprit. That shader was the vertex shader of skinned objects.

It took me a lot of tests because the driver for HTC Desire Z didn’t report any error or warning upon compiling and validating the skinning shader.

Eventually it boiled down to the part of code that transforms the vertices with the relevant bones.

Doesn’t work on HTC Desire Z

for(int i = 0; i < 4; ++i)
{
	mat4 m = BoneTransform[Index[i]];
	posOut += (w[i]*m*vec4(position, 1.0)).xyz;
	normalOut += (w[i]*m*vec4(normal, 0.0)).xyz;
}

Works on HTC Desire Z

mat4 m = BoneTransform[Index[0]];
posOut += (w[0]*m*vec4(position, 1.0)).xyz;
normalOut += (w[0]*m*vec4(normal, 0.0)).xyz;
m = BoneTransform[Index[1]];
posOut += (w[1]*m*vec4(position, 1.0)).xyz;
normalOut += (w[1]*m*vec4(normal, 0.0)).xyz;
m = BoneTransform[Index[2]];
posOut += (w[2]*m*vec4(position, 1.0)).xyz;
normalOut += (w[2]*m*vec4(normal, 0.0)).xyz;
m = BoneTransform[Index[3]];
posOut += (w[3]*m*vec4(position, 1.0)).xyz;
normalOut += (w[3]*m*vec4(normal, 0.0)).xyz;

As you can see the code that doesn’t work has a ‘for loop’ and in the code that works I manually unrolled the ‘for loop’.

I also tested whether the problem was declaring ‘mat4 m’ inside the ‘for loop’ block, or whether the hard-coded number of iterations was causing a faulty loop unrolling.

Neither attempt worked. I don’t know exactly what the driver issue is, but I was told you should use ‘for loops’ very cautiously in GLSL meant for mobile devices.
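
Unrolling by hand is easy to get wrong when the bone count changes, so one option is to generate the unrolled statements with a small script before building the shader. The following is only a sketch in Python (chosen because it runs off the device). The function name unroll_skinning is made up for illustration; the GLSL identifiers are the ones from the snippets above.

```python
def unroll_skinning(bone_count=4):
    """Return GLSL statements equivalent to the bone loop, fully unrolled."""
    lines = []
    for i in range(bone_count):
        # Declare m once, then reuse it, matching the hand-unrolled version.
        decl = "mat4 m" if i == 0 else "m"
        lines.append(f"{decl} = BoneTransform[Index[{i}]];")
        lines.append(f"posOut += (w[{i}]*m*vec4(position, 1.0)).xyz;")
        lines.append(f"normalOut += (w[{i}]*m*vec4(normal, 0.0)).xyz;")
    return "\n".join(lines)


print(unroll_skinning())
```

The output can be pasted into the vertex shader source, or spliced in at load time before the source is handed to the driver.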


Beware of ‘for loops’, and branching in general, in GLSL meant for mobile devices.

But even worse, some drivers (hopefully only on old devices) might not warn you that the shader isn’t going to work on the device even though it passed all the validation.

Announcing “Heroes Of Honesty”

For several months I have been developing a new game called “Heroes Of Honesty”.

It is a 3D Android RPG with a possible build for the PC.

The game plays like a classic RPG with tactical battles. The name of the game is meaningful to the story.

Once the game is complete, I will be releasing a series of articles about how I improved the game’s performance on Android.

I am not going to reveal too many details about the game before the date of release.

If you are curious about the game you can subscribe to the newsletter list, follow me on Twitter or subscribe to the RSS feeds.

The game’s website is

A Work In Progress screenshot that will give you a glimpse of the game:

Work In Progress

Tessellation Simplified

I have implemented tessellation + displacement map for my game Shoe String Shooter.
To decide how much to tessellate a triangle I calculated its screen space area. The more screen space area it took, the more I tessellated it.

This would theoretically make the triangles uniformly sized across the screen space. It made sense to me, but it has some major issues.

The displacement occurs along the triangle’s normal. If the triangle is facing the camera, its normal is facing the camera as well.
There is very little visible effect when the geometry is displaced towards the viewer. A flat triangle with a normal mapped texture would be (almost) just as good.

Another thing is that triangles which are at 90 degrees to the camera, seen edge-on, almost disappear in screen space and have a very small screen space area.
However, when displacing those triangles the geometry is very much visible, since the normal is parallel to the screen plane.
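
The two cases can be checked with a few numbers. This is a rough sketch that assumes an orthographic camera looking down the z axis (so the screen is the xy plane); the helper names are made up for illustration.

```python
def screen_area(tri):
    """Area of a triangle after orthographic projection onto the xy plane."""
    (ax, ay, _), (bx, by, _), (cx, cy, _) = tri
    return abs((bx - ax) * (cy - ay) - (cx - ax) * (by - ay)) / 2.0


def screen_displacement(normal, amount):
    """On-screen length of a displacement of `amount` along a unit `normal`."""
    nx, ny, _ = normal
    return amount * (nx * nx + ny * ny) ** 0.5


facing = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]   # normal along z, toward the viewer
edge_on = [(0, 0, 0), (1, 0, 0), (0, 0, 1)]  # normal along y, parallel to the screen

print(screen_area(facing), screen_displacement((0, 0, 1), 0.2))   # 0.5 0.0
print(screen_area(edge_on), screen_displacement((0, 1, 0), 0.2))  # 0.0 0.2
```

The facing triangle gets all the screen area but no visible displacement, while the edge-on triangle gets no area and all of the visible displacement, which is the opposite of what an area-based metric rewards.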

The simpler approach

The new approach I have taken is simpler and gives better results. It also requires only 3 control points instead of 6.

The first step is to tessellate every edge of the triangle according to how long the maximum displacement vector (along the normal) is in screen space.
This metric makes the facing and farther triangles tessellate less, and the conspicuous and closer triangles tessellate more.

This is not enough. Some triangles are very big in world space and we don’t modulate the tessellation with the triangle’s area any more. So large triangles will appear more coarse.

The solution is to modulate the displacement vector tessellation with the world space edge length. This achieves spatial uniformity in world space.
Tessellating according to edge length has some advantages compared to tessellating according to triangle area.
First, we only need 3 control points instead of 6.
Second, we tessellate more along the longer edges and less along the shorter edges. An area calculation cannot tell a golden triangle from a very narrow but long triangle of the same area.

The last step is to bound the tessellation amount since we don’t want unnecessary triangles on the geometry that is up close. We don’t set a constant tessellation bound, but instead set a bound modulated with the world space edge length.


PatchTess ScreenSpaceTessellator(float3 w[3], float4 p[3], float4 q[3])
{
	PatchTess pt;
	float Res = 768.0;
	float Cell1 = 16.0;
	float Cell2 = 8.0;
	float MaxTes = 10.0;

	unsigned int i=0;
	for (i=0; i<3; i++)
		pt.EdgeTess[i] = 1;
	pt.InsideTess = 1;
	float Tess[3] = {0, 0, 0};
	if (IsScreenCull (p[0], p[1], p[2]))
		return pt;
	for (i=0; i<3; i++)
	{
		// World space length of the edge opposite vertex i.
		float3 a1 = (w[(i+1)%3]-w[(i+2)%3]);
		Tess[i] = length(a1)/Cell1;
//		Tess[i] = max(Tess[i], 1);
		// Screen space displacement vectors at the two ends of the edge.
		float2 a2 = (q[(i+1)%3].xy-p[(i+1)%3].xy)*Res*0.5;
		float2 b2 = (q[(i+2)%3].xy-p[(i+2)%3].xy)*Res*0.5;
		Tess[i] *= 0.5*(length(a2)+length(b2))/Cell2;
		// Bound the amount, modulated with the world space edge length.
		Tess[i] = min(max(Tess[i], 1), MaxTes*length(a1)/Cell1);
	}
	for (i=0; i<3; i++)
		pt.EdgeTess[i] = Tess[i];
	pt.InsideTess = (Tess[0]+Tess[1]+Tess[2])/3.0;
	return pt;
}
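
To try different values for the constants without recompiling the shader, the per-edge math can be mirrored on the CPU. The following is a rough Python port of the loop above, not the shader itself: it skips the screen culling test, and it assumes w holds world space positions while p and q hold the projected positions before and after the maximum displacement.

```python
import math

RES, CELL1, CELL2, MAX_TES = 768.0, 16.0, 8.0, 10.0  # constants from the shader


def edge_tess(w, p, q):
    """Per-edge tessellation factors; edge i is the one opposite vertex i."""
    factors = []
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3
        world_len = math.dist(w[j], w[k])
        # Screen space length of the displacement at both ends of the edge.
        disp_j = math.dist(q[j][:2], p[j][:2]) * RES * 0.5
        disp_k = math.dist(q[k][:2], p[k][:2]) * RES * 0.5
        t = (world_len / CELL1) * 0.5 * (disp_j + disp_k) / CELL2
        # Lower bound of 1, upper bound scaled by the world space edge length.
        factors.append(min(max(t, 1.0), MAX_TES * world_len / CELL1))
    inside = sum(factors) / 3.0
    return factors, inside


w = [(0, 0, 0), (32, 0, 0), (0, 32, 0)]
p = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5)]
q = [(0.05, 0.0), (0.55, 0.0), (0.05, 0.5)]  # displaced sideways on screen
print(edge_tess(w, p, q))
```

In this made-up example the hypotenuse gets a larger factor than the two shorter edges, which is the behaviour described above for edge-length-based tessellation.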

Bottlenecks and instancing.


While testing the client/server network aspect of my game Shoe String Shooter, my tester was experiencing a slowdown in the frame rate. He was getting 30 FPS (Frames Per Second) instead of 60 FPS.

The cause was not the network but the GPU. Yet he has quite a powerful GPU, and moreover he was getting 60 FPS when he tested the server.

So why was the client slower?
I was drawing text to the screen with one draw call for each character in the string. The client simply draws more text to the screen, which in turn made it slower.
Of all the beautiful tessellated graphics, it was the text that caused a really big performance hit.

“Waiting connection…” being drawn many times.

In order to draw the text more efficiently I could either use Direct2D or use instancing.
Instancing allows you to draw the same vertex buffer several times with a single draw call.

Instancing in DirectX11

DirectX11 has support for instancing. As mentioned, you can load a single vertex buffer offline and draw it multiple times in a single draw call.

In order to draw the text we will have a vertex buffer with a single quad, and we will draw a string of text by drawing as many instances of the quad as there are characters in the string.

Instancing allows you to use additional vertex buffers with per-instance vertex data. This is data that is added to the vertex input of the vertex shader.
We won’t add any per-instance data. This lets us draw as many or as few instances as we want, regardless of the vertex buffer size.

But how will we differentiate between instances if they all use the same vertex data? Well, HLSL provides a system-value semantic, SV_InstanceID, that gives us the instance ID.
We then use it to index constant buffer arrays that we set on the vertex shader before issuing the draw call.

struct VertexIn
{
float3 PosL : POSITION;
float2 Tex : TEXCOORD;
uint InstanceID : SV_InstanceID; // A system-value semantic that is set to the instance ID.
};

For the sake of completeness I am including the vertex and pixel shader code. Notice that color2 is the velocity map vector. I am setting it to 0 because I don’t want the text to be blurred by the motion blur.

VertexOut VS(VertexIn vin)
{
VertexOut vout;
// Shift the quad sideways by this instance's x offset.
vout.PosH = mul(float4(vin.PosL+float3(gXPos[vin.InstanceID], 0, 0), 1.0f), gWorldViewProj);
vout.Tex = vin.Tex;
// Atlas slice of the character drawn by this instance.
vout.Index = gChar[vin.InstanceID];
return vout;
}
struct PS_OUTPUT {
float4 color : SV_Target0;
float4 color2 : SV_Target1;
};
PS_OUTPUT PS(VertexOut pin)
{
PS_OUTPUT pout;
float4 texColor = float4(1, 1, 1, 1);
texColor = gAtlas.Sample( samAnisotropic, float3 (pin.Tex, pin.Index) );
clip(texColor.a - 0.1f);
pout.color = texColor*gColor;
pout.color2 = 0; // No velocity, so the text is not motion blurred.
return pout;
}
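
On the CPU side, the string has to be turned into the two arrays the vertex shader indexes with the instance ID. This is only a sketch of that packing step in Python, under loud assumptions: the array size, the fixed character advance and the mapping from character code to atlas slice are all made up for illustration. Only the names gChar and gXPos come from the shader above.

```python
MAX_INSTANCES = 64  # hypothetical size of the constant arrays
ADVANCE = 1.0       # hypothetical fixed character advance, in quad widths
FIRST_CHAR = 32     # hypothetical: atlas slice 0 holds the space character


def pack_text(text):
    """Build the per-instance arrays read by the vertex shader (gChar, gXPos)."""
    text = text[:MAX_INSTANCES]
    g_char = [float(ord(c) - FIRST_CHAR) for c in text]
    g_xpos = [i * ADVANCE for i in range(len(text))]
    # The last value is the instance count to pass to the single draw call.
    return g_char, g_xpos, len(text)


g_char, g_xpos, count = pack_text("Waiting connection...")
print(count)  # 21
```

The whole string then costs one constant buffer update and one instanced draw call instead of one draw call per character.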

Motion Blur of Skinned Meshes, Particles and a Bug Fix.

In the previous post about motion blur I presented my results even though I was not completely happy with them.

For the motion blur I was calculating screen space vectors in an offscreen texture. These velocity values are then used by the compute shader post processing to calculate how much to push pixels around to make pixels smear in places with high velocity.

It turns out I forgot to divide the screen space x and y coordinates by the w component, which stores the depth value used for the perspective projection. This resulted in a very strong blur for far objects and a very faint blur for closer objects.
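
The effect of the missing divide is easy to see with numbers. The following is a minimal sketch in Python, not the actual shader code; the function names and the sample clip space positions are made up for illustration.

```python
def ndc_xy(clip):
    """Perspective divide: clip space (x, y, z, w) to normalized device x, y."""
    x, y, _, w = clip
    return (x / w, y / w)


def screen_velocity(clip_now, clip_prev):
    """Screen space motion between two frames, in normalized device units."""
    nx, ny = ndc_xy(clip_now)
    px, py = ndc_xy(clip_prev)
    return (nx - px, ny - py)


# The same one-unit sideways move, seen near (w = 2) and far (w = 20).
near = screen_velocity((1.0, 0.0, 1.0, 2.0), (0.0, 0.0, 1.0, 2.0))
far = screen_velocity((1.0, 0.0, 19.0, 20.0), (0.0, 0.0, 19.0, 20.0))
print(near, far)  # (0.5, 0.0) (0.05, 0.0)
```

Without the divide both velocities would come out as (1.0, 0.0), so the far object would be blurred as if it moved across the screen just as fast as the near one.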

Another thing I wanted to add is motion blur for skinned meshes. To calculate the velocity of each pixel of each object I was calculating each vertex’s position and what its position would have been in the previous frame. I was considering world movement and camera movement.
However, for skinned meshes I didn’t include how the mesh was deformed in the previous frame, so a skinned mesh had only extrinsic blur and not intrinsic blur.

Similar to calculating the extrinsic velocity, in order to calculate the bone-affected velocity of the vertices I had to calculate their skinned position in the previous frame. To do that I simply sent a copy of the previous frame’s bone offsets. This might prove to be a performance issue, since now I send twice the amount of bone offsets, but for now it’s good enough.
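
The idea can be sketched on the CPU: skin the same vertex with this frame’s bones and with last frame’s bones, and take the difference. This is only an illustration in Python with made-up helper names and translation-only bones; in the real shader both positions would still go through the world, view and projection transforms.

```python
def transform(m, v):
    """Multiply a 4x4 row-major matrix by a point (x, y, z), assuming w = 1."""
    x, y, z = v
    return tuple(m[r][0] * x + m[r][1] * y + m[r][2] * z + m[r][3] for r in range(3))


def skin(position, indices, weights, bones):
    """Linear blend skinning: weighted sum of the point transformed by each bone."""
    out = [0.0, 0.0, 0.0]
    for idx, wgt in zip(indices, weights):
        t = transform(bones[idx], position)
        out = [o + wgt * c for o, c in zip(out, t)]
    return tuple(out)


def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


bones_prev = [translation(0, 0, 0), translation(0, 0, 0)]  # last frame's bones
bones_now = [translation(0, 0, 0), translation(2, 0, 0)]   # bone 1 moved this frame

vertex, indices, weights = (1.0, 0.0, 0.0), (0, 1), (0.5, 0.5)
now = skin(vertex, indices, weights, bones_now)
prev = skin(vertex, indices, weights, bones_prev)
velocity = tuple(n - p for n, p in zip(now, prev))
print(velocity)  # (1.0, 0.0, 0.0)
```

The vertex is half bound to the bone that moved by two units, so it picks up a velocity of one unit even though the object as a whole did not move.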

Finally, I also did motion blur for particles. I simply rendered the velocity vectors of the particles with blending into the velocity offscreen texture. It works well enough, but I didn’t put more effort into it than that. I might do something else later.

I think the next stage would be to combine motion blur with Depth Of Field. Hopefully it’s just a matter of pipelining the two filters.

Skinned motion blur.