Memory Usage and Normal Map Texture Formats in OpenGL ES2.


I am working on a 3D mobile racing game using OpenGL ES2.

For the cars in the game I am using normal maps to give them more detail without adding geometry.

32 Bit RGBA uncompressed normal mapped car.

Normal maps are images that contain normal vectors per texel of the texture.

The normal vector is usually packed inside the 24 bit RGB values of the texture and is converted into a vector inside the fragment shader.

vec3 texNormal = texture2D(normalSampler, texOut).xyz;
texNormal = (texNormal * 2.0) - 1.0;

Here is an example of a test normal map my artist created for the car (it doesn’t have a lot of detail, it’s just a test):

Test Normal Map

Memory Usage

When you load a texture into OpenGL it doesn’t matter how much space it takes on disk; what matters is how much space it takes in memory.

For instance, an image could be saved as a PNG and take only 700KB on disk, but when loaded into OpenGL it would take 4MB because it becomes a 1024×1024 32 bit RGBA uncompressed texture.

In my racing game I would have 4 different cars, each one with its own normal map. If each normal map is a 1024×1024 RGBA uncompressed image, the normal maps alone would take 16MB of memory!
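
The arithmetic behind these numbers can be sketched in a few lines. This is only an illustration: TextureBytes is a hypothetical helper name, and it counts a single mip level with no padding.

```cpp
#include <cstddef>

// Bytes needed for one uncompressed mip level (no mipmap chain, no padding).
// TextureBytes is a hypothetical helper, not part of OpenGL.
constexpr std::size_t TextureBytes(std::size_t width, std::size_t height,
                                   std::size_t bitsPerPixel)
{
    return width * height * bitsPerPixel / 8;
}

// One 1024x1024 RGBA8888 normal map: 4 MB.
constexpr std::size_t OneNormalMap = TextureBytes(1024, 1024, 32);

// Four cars, one normal map each: 16 MB.
constexpr std::size_t AllNormalMaps = 4 * OneNormalMap;
```

A full mipmap chain would add roughly another third on top of these figures.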

We would like to reduce the memory usage of these textures.

Compressed Texture

OpenGL ES2 supports loading compressed textures using glCompressedTexImage2D.

In the case of iOS the native supported compressed texture format is PVRTC.

The advantage of these compressed formats is that they stay compressed in OpenGL memory and are only decoded on the fly when sampled, so the memory usage is only the size of the compressed texture.

A 4 bit PVRTC texture always takes 1/8 of the size of the uncompressed 32 bit RGBA bitmap.
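
As a rough check of the 1/8 figure, the size of one PVRTC mip level can be computed with the formula given in the IMG_texture_compression_pvrtc extension text. PvrtcBytes is a hypothetical helper name; the result is also the imageSize you would pass to glCompressedTexImage2D for that level.

```cpp
#include <algorithm>
#include <cstddef>

// Size in bytes of one PVRTC mip level. bitsPerPixel is 4 or 2.
// The minimum block dimensions come from the extension text.
inline std::size_t PvrtcBytes(std::size_t width, std::size_t height,
                              std::size_t bitsPerPixel)
{
    const std::size_t minWidth = (bitsPerPixel == 4) ? 8 : 16;
    const std::size_t w = std::max(width, minWidth);
    const std::size_t h = std::max<std::size_t>(height, 8);
    return (w * h * bitsPerPixel + 7) / 8;
}
```

For a 1024×1024 texture at 4 bits per pixel this gives 512KB, against 4MB for the 32 bit RGBA version.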

While PVRTC is great for diffuse or color textures, with normal maps the compression artifacts make it seem like there are lumps on the surface.

PVRTC compressed normal mapped car.

We might be missing the point of making things look better when we have such artifacts on the surface of our car.

16 Bit 565 RGB

Our original texture is 32 bit RGBA per pixel, which means we have 4 bytes per pixel of the image.

There are uncompressed formats that use only 2 bytes per pixel.

For instance GL_UNSIGNED_SHORT_5_6_5.

We only need RGB since we don’t use the Alpha channel in our texture.

What will happen if we store the xyz components of the normal inside 16 bit RGB instead of 24 bit RGB?

16 bit RGB normal mapped car.

Again, with the reduced accuracy of the normals the end result is not very good.
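
To get a feel for how much accuracy is lost, here is a small CPU-side sketch that quantizes a normal to a given number of bits per channel and measures the angle between the original and the stored normal. Quantize and AngularErrorDegrees are hypothetical helpers written for this illustration; they are not part of the game code.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

// Quantize a value in [-1, 1] to 'bits' bits and decode it again.
inline double Quantize(double v, int bits)
{
    const double maxCode = double((1 << bits) - 1);
    const double code = std::round((v * 0.5 + 0.5) * maxCode);
    return (code / maxCode) * 2.0 - 1.0;
}

// Angle in degrees between n and the same normal stored with the given
// number of bits for the r, g and b channels.
inline double AngularErrorDegrees(Vec3 n, int rBits, int gBits, int bBits)
{
    const Vec3 q = { Quantize(n.x, rBits), Quantize(n.y, gBits), Quantize(n.z, bBits) };
    const double lenN = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    const double lenQ = std::sqrt(q.x * q.x + q.y * q.y + q.z * q.z);
    double c = (n.x * q.x + n.y * q.y + n.z * q.z) / (lenN * lenQ);
    c = std::fmin(1.0, std::fmax(-1.0, c)); // guard acos against rounding
    return std::acos(c) * 180.0 / 3.14159265358979323846;
}
```

Calling AngularErrorDegrees(n, 5, 6, 5) and AngularErrorDegrees(n, 8, 8, 8) for the same normal lets you compare the 16 bit and 24 bit layouts directly.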

16 Bit Two Components Packing

Our normals are normalized to a length of 1. We can actually derive the z component of our normal from our x and y components using the Pythagorean theorem.

We only need 16 bits to store the x and y components.

However, we do not have a pixel format of two 8 bit components.

We do have this format: GL_UNSIGNED_SHORT_4_4_4_4.

We are going to pack the two 8 bit RG components into the four 4 bit RGBA components of our new format.


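	// Tmp holds the original 32 bit RGBA pixels, Data will hold the packed 16 bit texels.
	stbi_uc * Tmp = Data;
	Data = (unsigned char *)new unsigned short[w*h];
	unsigned int n = w*h;
	for (unsigned int i=0; i<n; i++)
	{
		unsigned char r = Tmp[4*i];     // x component of the normal
		unsigned char g = Tmp[4*i+1];   // y component of the normal
		// r ends up in the R and G nibbles, g in the B and A nibbles.
		unsigned short v = (unsigned short)r;
		v = v<<8;
		v+=(unsigned short)g;
		((unsigned short*)Data)[i] = v;
	}
	// Release the original image the same way it was allocated
	// (stbi_image_free(Tmp) if it came straight from stbi_load).
	delete [] Tmp;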

After packing the texture into 16 bits we need to change the code in our fragment shader to unpack the texel into a normal vector.

vec4 compressedNormal = texture2D(normalSampler, texOut);
vec3 texNormal;
texNormal.r = 2.0*(compressedNormal.r*240.0+compressedNormal.g*15.0)/255.0-1.0;
texNormal.g = 2.0*(compressedNormal.b*240.0+compressedNormal.a*15.0)/255.0-1.0;
texNormal.b = sqrt(1.0-texNormal.r*texNormal.r-texNormal.g*texNormal.g);
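
The packing and the shader math can be checked together on the CPU. The sketch below mirrors both: PackNormalXY and UnpackNormal are hypothetical names, and the clamp before the square root is my own addition to guard against slightly negative values caused by rounding.

```cpp
#include <cmath>
#include <cstdint>

struct Normal3 { float x, y, z; };

// Same bit layout as the packing loop: r fills the R and G nibbles of the
// 4_4_4_4 texel, g fills the B and A nibbles.
inline std::uint16_t PackNormalXY(std::uint8_t r, std::uint8_t g)
{
    return static_cast<std::uint16_t>((r << 8) | g);
}

// Mirrors the fragment shader: each 4 bit channel arrives as nibble / 15.0.
inline Normal3 UnpackNormal(std::uint16_t texel)
{
    const float cr = float((texel >> 12) & 0xF) / 15.0f;
    const float cg = float((texel >> 8) & 0xF) / 15.0f;
    const float cb = float((texel >> 4) & 0xF) / 15.0f;
    const float ca = float(texel & 0xF) / 15.0f;
    Normal3 n;
    n.x = 2.0f * (cr * 240.0f + cg * 15.0f) / 255.0f - 1.0f;
    n.y = 2.0f * (cb * 240.0f + ca * 15.0f) / 255.0f - 1.0f;
    // Assumption: clamp so rounding can never feed sqrt a negative number.
    n.z = std::sqrt(std::fmax(0.0f, 1.0f - n.x * n.x - n.y * n.y));
    return n;
}
```

Multiplying the high nibble by 240 and the low nibble by 15 rebuilds the original 8 bit value, which is why x and y come back with their full precision.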

In this case we halve the memory usage of the original 32 bit texture and we don’t compromise any quality, since x and y keep their full 8 bits.

We might be paying in GPU processing power since now our fragment shader is more complex.

However, I didn’t notice any difference in performance on my iPad New.

We also assume our z component is always positive so we cannot represent normals that go inward.

However, in most cases we do not need normals that go into the triangle or the surface.

16 bit packed normal mapped car.

Talk To Your Artist

Until now we were able to reduce the memory usage by a factor of 2 without sacrificing the quality.

However, this might not be enough.

There is another way to reduce memory usage and that is having the artist optimize the texture mapping.

In our case the bottom part of the car is rarely seen or is only partially seen.

We don’t need a lot of details in the bottom part of the car so the artist can allocate a much smaller area of the texture for the bottom part.

He might also separate the car into two surfaces (which means two passes) and use a different texture for each surface, with the bottom part getting a much lower resolution texture.

In our game I have told the artist to reduce the texture size from 1024×1024 to 512×512 while taking into consideration that we don’t need a lot of details in the bottom part.

With both the 16 bit texture packing and the reduced texture resolution, we reduce the memory usage by a factor of 8: from 4MB to 512KB per normal map.

Projecting A Dynamic Decal onto a 3D Mesh.


For my new mobile racing game I have implemented decals for a few reasons.

Decals are useful to add extra details to a mesh when it is impossible to include all the details with one big UV mapped texture.

They are also useful for adding extra details dynamically without having to change the texture of the mesh they are being added to.

In this article I am going to focus on how to calculate the decal’s geometry so it can be used inside your code in real time.

How to Derive a Decal From a Mesh

The decal geometry is derived from the geometry of the mesh we are adding the decal onto.

What we want to do is to project a 2D rectangle(or any convex polygon) onto the mesh’s surface and cut out the geometry of the projected rectangle.

Think of it like using a cookie cutter to cut a shape out of dough, only our dough does not have to be flat.

A mesh in our case is made out of triangles.

What we want to do is cut all the triangles using our 2D rectangle(or “cookie cutter”) and then add the resulting triangles to our vertex buffer so we could send it for rendering.

We could go over all the triangles in our mesh and cut them one by one.

However, most of the triangles will be cut out completely so it will be a wasteful process.

Instead we need to find an optimized method to select only the relevant triangles. For instance this method: Using a Spatial Partitioning Grid to Improve Performance.

Making a Convex “Cookie Cutter” using Planes.

In order to cut out the geometry from the mesh’s surface using our convex shape we use 3D planes to cut out each triangle.

This is the reason why we can only cut a convex shape using this method.

Each edge of our “cookie cutter” shape is represented by a plane whose normal is the cross product of a vector along the edge and a vector that represents the direction of the projection.

So the edge and the vector that represents the direction of the projection are both contained by the plane.
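
A minimal sketch of building such a plane, under my own assumptions: the plane is stored as normal · p = d, and Vec3, Plane, EdgePlane and SignedDistance are hypothetical names, not the math::Plane class used in the source code further down. The edge must not be parallel to the projection direction, otherwise the cross product is zero.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };

inline Vec3 Sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline double Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 Cross(Vec3 a, Vec3 b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Plane stored as normal . p = d.
struct Plane { Vec3 normal; double d; };

// Plane that contains both the edge (edgeStart, edgeEnd) and the projection
// direction. The edge must not be parallel to projectionDir.
inline Plane EdgePlane(Vec3 edgeStart, Vec3 edgeEnd, Vec3 projectionDir)
{
    Vec3 n = Cross(projectionDir, Sub(edgeEnd, edgeStart));
    const double len = std::sqrt(Dot(n, n));
    n = { n.x / len, n.y / len, n.z / len };
    return { n, Dot(n, edgeStart) };
}

// Positive on one side of the plane, negative on the other, zero on it.
inline double SignedDistance(const Plane& p, Vec3 point)
{
    return Dot(p.normal, point) - p.d;
}
```

Which side counts as "inside" depends on the winding order of the cookie cutter's edges, so the order of the corners has to be consistent.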

When we cut a triangle using a plane we keep the part of the triangle that is on one side of the plane and we throw away the part of the triangle that is on the other side.

It is possible that the triangle is completely thrown away or kept whole.

There are two scenarios in the case where the triangle is cut into a smaller piece.

Either the piece that we keep is the “tip” of the triangle (and hence is a single triangle), or it is the “base” of the triangle, in which case it is a quad and is made out of two triangles.

When I say “base” and “tip” it could be relative to any of the 3 vertices of the triangle.

A possible algorithm would be:

1) Find all relevant triangles.
2) If the relevant triangles list is empty go to (12)
  3) Add the top relevant triangle to cutList
  4) If cutList is empty go to (10)
    5) Get the next plane in the convex shape and cut each triangle in cutList into 0 to 2 triangles.
    6) Replace cutList with the cut out triangles (if any).
    7) If we didn't reach the end of the plane list go to (4)
  8) Add the cutList triangles to the finalGeometry list (if any).
  9) Clear cutList
  10) Pop the top triangle of the relevant triangles list.
  11) Go to (2)
12) Finish (the finalGeometry list contains all the triangles of the decal).

UV Mapping of The Decal

There is another issue we didn’t even talk about.

The decal needs its own UV coordinates. It cannot use the UV mapping of the mesh, since that mapping covers the entire mesh and we want to map an entire texture onto our decal.

I will explain briefly how to calculate the UV coordinates for a quad based decal.

The UV coordinates correspond to the projection.

Each one of the 4 vertices of our quad is mapped to one of the 4 corners of the texture space.

However, when we cut the mesh’s triangles using planes we might get vertices that are anywhere in between and inside the projection of our quad.

In order to calculate the UV coordinates we need to project the new decal vertices back onto the plane of our quad (or back onto the “cookie cutter”).

Our projection is orthogonal and in our case it is in the direction of (0, -1, 0), so in order to project vertices back onto the quad we simply drop the y component.

If our quad was a rectangle we could have calculated the UVs from the distance of the projected vertex to the edges of the rectangle.

However, we want to solve this for a generic quad.

In order to do this we solve the following set of equations:

p = (v1*(1-t1)+v2*t1)*(1-t2)+(v3*(1-t1)+v4*t1)*t2;

This is a set of two equations where t1 and t2 are our variables(what we are trying to find), p is a 2D vector which is the point we projected back to the quad and v1, v2, v3 and v4 are 4 2D points of the quad itself.

To solve this we express t2 in terms of t1 using one equation and then substitute it into the other equation.

Then we will get a second order equation in t1.

However, we are supposed to get only one solution, right?

Well, without getting into the details, we just need to get the most positive solution or just reduce the equation into a first order equation (where the factor that multiplies t1*t1 is nearly zero).

(Edit: We need to get the solution that is inside or closest to (0..1).

We do this by using the solution that is closest to 0.5.)

To decide which equation to solve, check every expression you divide by to make sure it is not zero, and check that the expression under the square root is not negative.

When such an expression is zero (in the case of a division) or negative (in the case of the square root) you will know which equation you need to solve.

For each vertex in the decal you get its UV coordinates by solving the set of equations for the projection of that vertex onto the quad (our p).

t1 and t2 are our UV coordinates.
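
For readers who want the details, here is one way to arrive at the second order equation. The shorthand vectors d, e, f, g are my own notation; the coordinates are x and z because the y component was dropped, and the resulting a, b and c match the coefficients computed in the source code further down.

```latex
% Shorthand (all 2D vectors in the xz plane):
%   d = p - v_1,\quad e = v_2 - v_1,\quad f = v_3 - v_1,\quad g = v_1 - v_2 - v_3 + v_4
% 2D cross product: u \times w = u_x w_z - u_z w_x
\begin{align*}
p &= v_1 + t_1 e + t_2 f + t_1 t_2\, g \\
d - t_1 e &= t_2\,(f + t_1 g) \\
(d - t_1 e) \times (f + t_1 g) &= 0 \\
d \times f + t_1\,(d \times g - e \times f) - t_1^2\,(e \times g) &= 0
\end{align*}
% Hence a t_1^2 + b t_1 + c = 0 with
%   a = -(e \times g),\quad b = d \times g - e \times f,\quad c = d \times f
% and, once t_1 is known,
%   t_2 = \frac{d_x - t_1 e_x}{f_x + t_1 g_x}
% (use the z components instead when the denominator is close to zero).
```

When the quad is a parallelogram, g is zero, so a vanishes and the equation becomes first order.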

Rendering The Decal

We render the decal in a separate pass from the mesh itself.

We would use similar shaders to the mesh but we would also use a blend mode to blend the decal into the mesh.

For instance we can use normal blending, but otherwise rendering the decal is very similar to rendering the mesh.

There is another difference though. Since the decal is derived from the mesh geometry, it will cause Z Fighting in the Z Buffer.

In order to have our decal render over the mesh we need to offset the decal’s depth.

In GLES2 we have a function called glPolygonOffset that does that (it takes effect while GL_POLYGON_OFFSET_FILL is enabled).

For the sake of completeness here is the relevant source code:


    class GenerateQuadMeshShader: public Graphics3D::GeometryShader {
    public:
        GenerateQuadMeshShader(std::vector<math::Triangle> & Triangles, Graphics2D::Position StartLeft, Graphics2D::Position StartRight, Graphics2D::Position EndLeft, Graphics2D::Position EndRight):Triangles(Triangles), StartLeft(StartLeft), StartRight(StartRight), EndLeft(EndLeft), EndRight(EndRight)
        {
            // Flatten the quad corners onto the XZ plane. this-> is needed
            // because the parameters shadow the members.
            this->StartLeft.y = 0;
            this->StartRight.y = 0;
            this->EndLeft.y = 0;
            this->EndRight.y = 0;
        }

        void Init()
        {
        }

        Graphics3D::GeometryVertex GenerateVertex(unsigned int i, unsigned int n)
        {
            Graphics3D::GeometryVertex v;
            float3 Vertex = Triangles[i/3].Vertex(i%3);
            Graphics2D::Position v1(Vertex.x, Vertex.y, Vertex.z);
            v.pos.x = v1.x;
            v.pos.y = v1.y;
            v.pos.z = v1.z;
            // Project the vertex back onto the quad by dropping y.
            v1.y = 0;
//            double t1;
//            double t2 = (v1.x-(StartLeft.x*(1.0-t1)+StartRight.x*t1))/((EndLeft.x*(1.0-t1)+EndRight.x*t1)-(StartLeft.x*(1.0-t1)+StartRight.x*t1));
            // Coefficients of a*t1*t1 + b*t1 + c = 0.
            double a = (StartLeft.x-StartRight.x)*(-EndLeft.z+EndRight.z+StartLeft.z-StartRight.z)-(StartLeft.z-StartRight.z)*(-EndLeft.x+EndRight.x+StartLeft.x-StartRight.x);
            double b = (v1.x-StartLeft.x)*(-EndLeft.z+EndRight.z+StartLeft.z-StartRight.z)-(v1.z-StartLeft.z)*(-EndLeft.x+EndRight.x+StartLeft.x-StartRight.x)+(StartLeft.x-StartRight.x)*(EndLeft.z-StartLeft.z)-(StartLeft.z-StartRight.z)*(EndLeft.x-StartLeft.x);
            double c = (v1.x-StartLeft.x)*(EndLeft.z-StartLeft.z)-(v1.z-StartLeft.z)*(EndLeft.x-StartLeft.x);
            double t1 = 0;
            if (fabs(a)<0.000001)
            {
                // First order equation.
                t1 = -c/b;
            }
            else
            {
                double delta = b*b-4*a*c;
                t1 = -b/(2.0*a);
                if (delta>0.)
                {
                    t1 = (-b+sqrt(delta))/(2.0*a);
                    // Fix! Keep the solution that is closest to 0.5.
                    double Solution2 = (-b-sqrt(delta))/(2.0*a);
                    if (fabs(Solution2-0.5)<fabs(t1-0.5))
                        t1 = Solution2;
                    //if (t1<0.0)
                    //    t1 = (-b-sqrt(delta))/(2.0*a);
                }
            }
            double x1 = ((EndLeft.x*(1.0-t1)+EndRight.x*t1)-(StartLeft.x*(1.0-t1)+StartRight.x*t1));
            double t2 = 0;
            if (fabs(x1)<0.0001)
            {
                // The x denominator is nearly zero, solve for t2 using z instead.
                double z1 = ((EndLeft.z*(1.0-t1)+EndRight.z*t1)-(StartLeft.z*(1.0-t1)+StartRight.z*t1));
                t2 = (v1.z-(StartLeft.z*(1.0-t1)+StartRight.z*t1))/z1;
            }
            else
            {
                t2 = (v1.x-(StartLeft.x*(1.0-t1)+StartRight.x*t1))/x1;
            }
            v.u = t1;
            v.v = t2;

            return v;
        }

        unsigned int VertexAmount(unsigned int n)
        {
            return (unsigned int)Triangles.size()*3;
        }

        unsigned int SurfaceAmount()
        {
            return 1;
        }

        std::vector<math::Triangle> & Triangles;
        Graphics2D::Position StartLeft, StartRight, EndLeft, EndRight;
    };

    class DecalMesh {
    public:
        DecalMesh (Graphics3D::Factory & f, Graphics3D::Mesh m1):Map(m1), f(f)
        {
            std::vector <Graphics2D::Position> & p1 = m1->GetPosition(0);
            std::vector <unsigned int> & i1 = m1->GetIndex(0);
            // One triangle per three indices of the source mesh.
            Triangles.resize(i1.size()/3);
            for (unsigned int i=0; i<Triangles.size(); i++)
            {
                unsigned int j = i*3;
                Triangles[i] = math::Triangle(float3(p1[i1[j]].x, p1[i1[j]].y, p1[i1[j]].z), float3(p1[i1[j+1]].x, p1[i1[j+1]].y, p1[i1[j+1]].z), float3(p1[i1[j+2]].x, p1[i1[j+2]].y, p1[i1[j+2]].z));
            }

            DecalMat = f.LoadMaterial(0.0);
            DecalMat->SetResource (f.LoadResourceEx ("Resource/DecalSS.png", true));
            DecalMat->SetSpecular (f.LoadResourceEx ("Resource/White.IC", true));
            DecalMat->SetSpecular(Graphics3D::Float4D(1., 1., 0.8, 8.0));
            DecalMat->SetAmbient(Graphics3D::Float4D(1, 1, 1., 0));
            DecalMat->SetDiffuse(Graphics3D::Float4D(0, 0, 0, 1));
        }

        void AddStripe (Graphics2D::Position StartLeft, Graphics2D::Position StartRight, Graphics2D::Position EndLeft, Graphics2D::Position EndRight)
        {
            // Bounding box of the quad, used to fetch only the nearby triangles.
            Graphics2D::Position Min = StartLeft;
            Graphics2D::Position Max = StartLeft;
            Min = PosMin(Min, StartRight);
            Max = PosMax(Max, StartRight);
            Min = PosMin(Min, EndLeft);
            Max = PosMax(Max, EndLeft);
            Min = PosMin(Min, EndRight);
            Max = PosMax(Max, EndRight);
            std::vector<unsigned int> List1 = Map.GetPositionRangeTriangles(Min, Max);
            std::vector<math::Triangle> FinalList;
            for (unsigned int i=0; i<List1.size(); i++)
            {
                std::list<math::Triangle> TriangleList;

                // The four corners of the quad, in order around its outline.
                std::vector<Graphics2D::Position> p1(4);
                p1[0] = StartLeft;
                p1[1] = EndLeft;
                p1[2] = EndRight;
                p1[3] = StartRight;
                std::vector<math::Plane> Planes(p1.size());

                for (unsigned int k=0; k<p1.size(); k++)
                    p1[k].y = 0;
                // One cutting plane per edge of the quad.
                for (unsigned int k=0; k<p1.size(); k++)
                {
                    float3 a3(p1[k].x, p1[k].y, p1[k].z);
                    Graphics2D::Position q = Graphics2D::Position(0, 1, 0).Cross((p1[(k+1)%p1.size()]-p1[k]).Normalize()).Normalize();
                    Planes[k] = math::Plane(a3, float3(q.x, q.y, q.z));
                }
                for (unsigned int j=0; j<Planes.size() && TriangleList.size()>0; j++)
                {
                    std::list<math::Triangle> NextList;
                    math::Plane Plane1 = Planes[j];
                    std::list<math::Triangle>::iterator q;
                    for (q=TriangleList.begin(); q!=TriangleList.end(); q++)
                    {
                        math::Triangle t1 = *q;
                        math::Triangle ResultT1, ResultT2;
                        float3 Factor1, Factor2;
                        unsigned int Cycle1 = 0;
                        unsigned int Type = Plane1.Clip2(t1, ResultT1, ResultT2, Factor1, Factor2, Cycle1);
                        if (Type==0)
                        {
                            // ...
                        }
                        if (Type==3)
                        {
                            // ...
                        }
                        else if (Type==1)
                        {
                            // ...
                        } else
                        {
                            // ...
                        }
                    }
                    TriangleList = NextList;
                }
                while (!TriangleList.empty())
                {
                    // ...
                }
            }
            GenerateQuadMeshShader g(FinalList, StartLeft, StartRight, EndLeft, EndRight);
            Graphics3D::Mesh Result = f.GenerateMesh(g);
            CurrentMesh = Result;
        }

        Graphics3D::Mesh CurrentMesh;
        Graphics3D::Material DecalMat;

        Graphics2D::Position PosMin(Graphics2D::Position p1, Graphics2D::Position p2)
        {
            Graphics2D::Position Result = p1;
            Result.x = std::min(Result.x, p2.x);
            Result.y = std::min(Result.y, p2.y);
            Result.z = std::min(Result.z, p2.z);

            return Result;
        }

        Graphics2D::Position PosMax(Graphics2D::Position p1, Graphics2D::Position p2)
        {
            Graphics2D::Position Result = p1;
            Result.x = std::max(Result.x, p2.x);
            Result.y = std::max(Result.y, p2.y);
            Result.z = std::max(Result.z, p2.z);

            return Result;
        }

        SurfaceTriangleMap Map;
        std::vector <math::Triangle> Triangles;
        Graphics3D::Factory & f;
    };

iOS app works on Xcode but crashes when running from the device.

It is possible that your iOS app runs fine from Xcode in both Release and Debug, but crashes when launched directly on your device.

First of all, you should know you can view the crash logs of the device from Xcode.

In Xcode, select Window->Devices, select your device and then select “View Device Logs”.

It’s pretty straightforward.

The reason your app might crash when launched on the device but not from Xcode is that you are spending too much time doing CPU work on the splash screen.

iOS will kill your app if it takes too long to load.

The crash report for this would have an exception code of 8badf00d (this is in hexadecimal) and it will also say “failed to scene-create in time”.

This might be a very simple issue, but it is often overlooked, especially by people who always run their app from Xcode and are surprised when the app crashes in the Apple review because they never ran it directly from the device.

When running from Xcode the behavior is different and your app will not be killed for spending too much time on the splash screen. This is the source of the confusion.

Things to Consider Before Backing a Game on Kickstarter.

(I will try to make this article more informative than a rant).

I think any indie game developer who wasn’t able to make a living from his games has had those frustrating moments where he saw a game on Kickstarter get a lot of money for almost no effort.

It seems people don’t realize that a lot of Kickstarter projects use simple tricks to get people excited about the project, while the end result (if ever finished) might pale in comparison.

So without further ranting I will tell you about little things that hint at how much quality the presented content actually has.

First of all the video.

Trailers and videos can easily impress people. Sometimes it is justified and sometimes it is not.

For instance, there is a program called After Effects which is (mostly) a video post processing program.

This means that it adds effects to the trailer after all the video content is ready.

A simple example is beautiful titles or subtitles.

You can also add all sorts of overlay animations. You can overlay a video of small red ash coming from a flame to make it seem like the game has awesome particles.

Flaming ash is quite a common effect in videos. When you see those things you need to understand that the developer probably didn’t render it himself; it’s just a video overlay.

You can take it a step further.

There are websites with pre-rendered CG to which you can upload your logo or text, and they will render it beautifully and seamlessly inside a high quality CG video you can put in your trailer.

Again, this is content that wasn’t created by the developer.

I am not saying that developers shouldn’t use those things.

However, as a potential backer you should notice how much content was actually created by the developer so you can tell how likely it is that this developer will deliver a product that is similar to what was presented.


Besides the visual effects, there is also the actual gameplay and what gameplay the developer suggests the game might have.

A lot of in-game footage is scripted.

Watch Dogs is a good example of scripted gameplay. The gameplay we saw in the trailers at first seemed amazing, but it was too amazing.

If the game presents a chain of events that is more reminiscent of a Hollywood movie than of actual gameplay, it’s probably because it was scripted and produced like a movie rather than being actual in-game scenarios.

There are also limitations of the medium. There is a limit to how much control you can have over an in-game character in a chase scene (like in Watch Dogs) with your keyboard and mouse.

You can’t control every little detail of the character with your mouse and keyboard in real time.

What about the AI?

Does the game imply that characters will have something interesting to say about everything you do?

That means that the developer will either have to create tons of dialogue or tackle generating sentences with an AI, which as far as I can tell was never done in a game before (or not very well outside of a game).

That being said, the developer might actually have amazing tech that does amazing things that no other game has done before.

But if you consider how much content was actually created for the Kickstarter by the developer and what the quality of that content is, it could hint at whether the developer is likely to have amazing tech for amazing gameplay.

Well, there is a lot more to be said on this subject, but I think those are some of the easy tricks used to impress people with a Kickstarter project, and as a consumer you should keep them in the back of your head while viewing another Kickstarter project.

You should also consider how much content and how much of the tech and game itself the developer has at the moment of making the Kickstarter.

If the developer has very little content and game ready, he is not in a very good position to look far ahead and suggest which features and what scope the game will actually have.

Disclaimer: I am not an expert on the subject, I am just pointing out a few things I consider when watching a Kickstarter game project.

And maybe you just like to back a project that looks cool on Kickstarter and you don’t care how it was created.

Anyway, these are my 2 cents.

Mipmaps and GL_REPEAT Artifacts in OpenGL ES 2.

I am working on a new racing game and I encountered some odd artifacts in rendering the textures of the track.

I verified that the artifacts are not related to Z Buffer fighting and I couldn’t tell what caused them.

The track has a repeating texture, which means it uses one texture and repeats it along the track segments.

Track Start Artifacts

Guard Rail Artifact

In the first image notice the patch of asphalt just before the car. It has artifacts while the patch further away does not.

In the second image look at the guard rails.

This phenomenon happened with either a raw mipmapped texture or a compressed mipmapped texture (PVR on iOS).

However, it only happened on textures that used the GL_REPEAT flag and not on other mipmapped textures.

So what was it?

It turned out that there was an issue with calculating the level of the mipmap.

Notice in the guard rail image that the artifact happens in a specific band of depth, which is the transition from one mipmap level to a lower level.

Look at the UV mapping of the track on the bottom right window:

Long UV mapping

The UV mapping for this track stretches way beyond the [0..1] range of the U texture coordinate.

If the texture repeats over the range [0..1]x[0..1], then the mapping on this object reaches a U coordinate of about 50.

This is done to make use of the repeating texture but it also messes up the internal mipmap calculation of OpenGL ES 2.

With large U coordinates I am guessing there is a floating point accuracy issue with the mipmap calculation. I don’t exactly know how the mipmap calculations are done but I assumed this is the cause of the artifacts.

Notice that in the first artifacts screenshots the car is at the beginning of the track, which means the first patch you see on the screenshot is actually the last patch on the track model as the track is cyclic.

The artifacts get worse the further you go on the UV mapping as the values are bigger and cause more floating point accuracy issues.
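
The general effect is easy to see on the CPU. The sketch below only shows how the gap between neighbouring 32 bit floats grows with the magnitude of the value; it says nothing about how a particular GPU computes the mipmap level, and a GPU that interpolates texture coordinates at a lower precision would have larger gaps still. SpacingAt is a hypothetical helper name.

```cpp
#include <cmath>
#include <limits>

// Distance from 'value' to the next representable 32 bit float above it.
inline float SpacingAt(float value)
{
    return std::nextafter(value, std::numeric_limits<float>::infinity()) - value;
}
```

SpacingAt(50.0f) is 64 times larger than SpacingAt(0.5f), so the fractional part of a U coordinate near 50 is stored 64 times more coarsely than one near 0.5.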

Since the texture is repeating it doesn’t matter if the UV mapping is repeating the same UV area.

I made a more compact UV mapping version of the same model:

Compact UV mapping

Using this version of the mesh made all the artifacts disappear!

Start Fixed

Guard Rail Fixed

In conclusion:

If you verified that you don’t have Z Buffer issues, if you see depth related artifacts in a specific band of a mipmapped texture, and if the artifacts do not appear at small UV coordinates but become more severe the further the UV coordinates are from [0..1]x[0..1], then there is a good chance you have a mipmap artifact related to the UV mapping.

Solving this issue might be as simple as remapping the UVs to have values closer to [0..1].
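
I did the remapping in the modeling tool, but the same idea can be sketched in code. The sketch assumes the UVs are stored per triangle corner (not shared between triangles), because two triangles that share a vertex may need different shifts. UV and WrapTriangleUVs are hypothetical names.

```cpp
#include <algorithm>
#include <cmath>

struct UV { float u, v; };

// Shifts the three UVs of one triangle by a whole number of tiles so that the
// smallest u and v land in [0, 1). With GL_REPEAT the sampled texels stay the
// same, only the magnitude of the coordinates changes.
inline void WrapTriangleUVs(UV (&uv)[3])
{
    const float shiftU = std::floor(std::min({ uv[0].u, uv[1].u, uv[2].u }));
    const float shiftV = std::floor(std::min({ uv[0].v, uv[1].v, uv[2].v }));
    for (UV& corner : uv) {
        corner.u -= shiftU;
        corner.v -= shiftV;
    }
}
```

A triangle mapped to U values between 49.25 and 50 ends up between 0.25 and 1, which samples the same part of the repeating texture.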

There is more to say about why this happens and what exactly the floating point inaccuracies in the mipmap calculation done by OpenGL are, but this is beyond the scope of this article.

I hope you find this article useful.


glDepthRangef and Depth Clipping on OpenGL ES 2.

I am working on a new racing game for mobile devices.

In this game I have a plane for the ground which stretches to the horizon and a spherical sky panorama.

As you may know, you can’t really render to infinity in OpenGL, and even if you could, the rasterization and resolution limitations would still make the horizon look jaggy and aliased.


In order to make the ground plane blend with the sky sphere I needed to make the ground pixels near the far edge of the render frustum blend with the background.

However, most of the ground does not need to blend with the background, only a thin stripe near the far end of the view.

We can render the ground in two phases: one with regular shading where there is no fading and one that starts where the fading begins.

The fading is dependent on depth so we need to split the rendering based on depth.

Notice: Splitting the render based on depth can be useful performance wise or can be useful to simplify shaders. In my case there wasn’t a big difference in performance in rendering the ground in two pieces (with and without blending).

So what we need is Depth Clipping.

Our frustum box already does clipping. It clips whatever we project into it that is outside the box of [-1..1]x[-1..1]x[-1..1]

How would we clip based on depth that is smaller than 1?

When rendering the scene in the racing game we are using a perspective projection matrix which is created with the following parameters:

Field of View angle(on Y axis), Aspect ratio, Near plane and Far plane.

For instance we can have a perspective camera with a field of view of 45 degrees, an aspect ratio of 4:3, a near plane of 1 and a far plane of 500.

Let’s say we want to render an object in the scene but clip it at a depth of 450 instead of 500.

We can create a separate projection matrix with the exact same parameters as in the example but with 450 in the far parameter instead of 500.

This will render the object clipped to 450.

However, notice that we said the frustum only clips at -1 and 1 on the depth axis.

With the original projection matrix we projected a depth of 500 into 1. With the new matrix everything is projected the same (the x and y into [-1..1]x[-1..1]) but a depth of 450 is now projected into 1.

While the object is rendered to the screen correctly with the 450 depth matrix, its depth values are inconsistent with the depth values of the original matrix (with the far plane set to 500).

This will cause the Z Buffer to behave incorrectly.


In order to fix this we change the range of the depth values rendered into the ZBuffer from [-1..1] into [a..b] where we choose the new a, b.

We do this by using the function glDepthRangef.

glDepthRangef accepts values between 0 and 1, while our frustum box depth values are between -1 and 1.

This means that with the default range of glDepthRangef (0, 1), a frustum depth of -1 is written as 0 and a frustum depth of 1 is written as 1.

How do we choose the range for glDepthRangef so the depth values with the 450 depth projection will match all the other objects with the original projection?

We use the original matrix to calculate where 450 is projected into the frustum. Like so:


vector4 p = OriginalProjection.MulPosition (vector3(0, 0, 450));
float newFar = p.z/p.w;
newFar = 0.5*(newFar+1.0);
glDepthRangef (0, newFar);




Another thing to remember:

glDepthRangef changes the depth values written to the Z Buffer.

This means the vertex shader still needs to output depth values in [-1..1].

In addition, a depth value you pass to the fragment shader yourself through a varying is still between -1 and 1 and not in the range glDepthRangef set. (gl_FragCoord.z, on the other hand, is a window space depth and does follow the range set by glDepthRangef.)

Using A Spatial Partitioning Grid to Improve Performance.

For my new racing game I am using a surface made out of triangles on which the car drives.

In order to place the car on the track I need to find the triangle the car is above.

A simple way of finding this triangle is to test against every triangle in the track and find the one the car is directly above.

However, this is very wasteful CPU wise because most of the triangles are not even near the car.

In order to speed up the calculation I keep a 2D array, or grid, that stores for every cell the triangles that intersect it.

This way I can test only against the triangles that are in the cell the car is also intersecting.

Notice that the same triangle might be indexed in more than one cell. It is indexed in every cell it intersects.

This guarantees that when I test a point against the triangles of a cell, I won’t miss any triangle that intersects the cell.

There is a small issue when you test a line that spans several cells, because then you might test against the same triangle more than once.

The solution is to return only one instance of each triangle index in the cells (by using an unordered set data structure, for instance).
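
One way to do that is sketched below, assuming the grid is stored as grid[z][x] with a list of triangle indices per cell. CollectUnique is a hypothetical helper name, and the std::unordered_set is only used to remember which indices were already emitted.

```cpp
#include <cassert>
#include <list>
#include <unordered_set>
#include <vector>

typedef std::vector<std::vector<std::list<unsigned int> > > Grid;

// Gathers the triangle indices of all cells in [startX..endX] x [startZ..endZ],
// returning each index once, in the order it was first encountered.
std::vector<unsigned int> CollectUnique(const Grid& grid,
                                        unsigned int startX, unsigned int startZ,
                                        unsigned int endX, unsigned int endZ)
{
    std::unordered_set<unsigned int> seen;
    std::vector<unsigned int> result;
    for (unsigned int z = startZ; z <= endZ; z++)
        for (unsigned int x = startX; x <= endX; x++)
            for (unsigned int index : grid[z][x])
                if (seen.insert(index).second) // true only on first insertion
                    result.push_back(index);
    return result;
}
```

Keeping the result in a vector preserves a stable order for the caller, while the set lookup keeps the duplicate check at roughly constant time per index.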

Here is a video of the game in which I use the grid to give context:

And here is the code of the class I use.

(Note that I also use MathGeoLib by clb, but only for the triangle data structure, and that I assume the grid lies on the XZ plane.)


Code on Gist.

#pragma once

#include "stdafx.h"
#include "resource.h"
#include "Graphics2D.h"
#include "Graphics3D.h"
#include "Geometry/Triangle.h"
#include "Math/float3.h"
#include <algorithm>
#include <list>
#include <vector>

namespace Logic {
	// Indexes the track triangles in a uniform grid on the XZ plane.
	class SurfaceTriangleMap {
		public:
			// GridSize (cells per axis) is an assumed parameter; pick a value that suits the track.
			// TrackMesh is dereferenced with ->, so Graphics3D::Mesh is presumed to be a pointer-like handle.
			SurfaceTriangleMap(Graphics3D::Mesh TrackMesh, unsigned int GridSize = 32)
			{
				this->TrackMesh = TrackMesh;
				std::vector<Graphics2D::Position> & Positions = TrackMesh->GetPosition(0);
				std::vector<unsigned int> & Indices = TrackMesh->GetIndex(0);
				// Allocate GridSize x GridSize empty cells.
				TrackGrid.resize (GridSize);
				for (unsigned int i=0; i<TrackGrid.size(); i++)
					TrackGrid[i].resize (GridSize);
				TrackGeometry.resize (Indices.size()/3);
				// Bounding box of the whole track.
				Min = Positions[0];
				Max = Positions[0];
				for (unsigned int i=0; i<Positions.size(); i++)
				{
					Min.x = std::min(Min.x, Positions[i].x);
					Min.y = std::min(Min.y, Positions[i].y);
					Min.z = std::min(Min.z, Positions[i].z);
					Max.x = std::max(Max.x, Positions[i].x);
					Max.y = std::max(Max.y, Positions[i].y);
					Max.z = std::max(Max.z, Positions[i].z);
				}
				for (unsigned int i=0; i<TrackGeometry.size(); i++)
				{
					// Bounding box of triangle i on the XZ plane.
					Graphics2D::Position LocalMin, LocalMax;
					LocalMin = Positions[Indices[i*3]];
					LocalMax = Positions[Indices[i*3]];
					for (unsigned int k=1; k<3; k++)
					{
						LocalMin.x = std::min(LocalMin.x, Positions[Indices[i*3+k]].x);
						LocalMin.z = std::min(LocalMin.z, Positions[Indices[i*3+k]].z);
						LocalMax.x = std::max(LocalMax.x, Positions[Indices[i*3+k]].x);
						LocalMax.z = std::max(LocalMax.z, Positions[Indices[i*3+k]].z);
					}
					TrackGeometry[i].a = float3(Positions[Indices[i*3]].x, Positions[Indices[i*3]].y, Positions[Indices[i*3]].z);
					TrackGeometry[i].b = float3(Positions[Indices[i*3+1]].x, Positions[Indices[i*3+1]].y, Positions[Indices[i*3+1]].z);
					TrackGeometry[i].c = float3(Positions[Indices[i*3+2]].x, Positions[Indices[i*3+2]].y, Positions[Indices[i*3+2]].z);
					unsigned int StartX = ToCell (LocalMin.x, Min.x, Max.x, TrackGrid[0].size());
					unsigned int StartZ = ToCell (LocalMin.z, Min.z, Max.z, TrackGrid.size());
					unsigned int EndX = ToCell (LocalMax.x, Min.x, Max.x, TrackGrid[0].size());
					unsigned int EndZ = ToCell (LocalMax.z, Min.z, Max.z, TrackGrid.size());
					// Index the triangle in every cell its bounding box overlaps.
					// This is a conservative superset of the cells it actually intersects.
					for (unsigned int z1=StartZ; z1<=EndZ; z1++)
						for (unsigned int x1=StartX; x1<=EndX; x1++)
							TrackGrid[z1][x1].push_back (i);
				}
			}

			math::Triangle & GetTriangle (unsigned int i)
			{
				return TrackGeometry[i];
			}

			unsigned int GetTrianglesAmount()
			{
				return (unsigned int)TrackGeometry.size();
			}

			// Indices of the triangles indexed in the cell that contains Pos.
			std::vector<unsigned int> GetPositionTriangles (Graphics2D::Position Pos)
			{
				unsigned int BaseX = ToCell (Pos.x, Min.x, Max.x, TrackGrid[0].size());
				unsigned int BaseZ = ToCell (Pos.z, Min.z, Max.z, TrackGrid.size());

				std::vector<unsigned int> IndexList;

				std::list<unsigned int>::iterator q;
				for (q = TrackGrid[BaseZ][BaseX].begin(); q != TrackGrid[BaseZ][BaseX].end(); q++)
					IndexList.push_back (*q);
				return IndexList;
			}

			// Indices of the triangles indexed in the block of cells spanned by PosStart and PosEnd.
			// A triangle indexed in several of those cells is appended once per cell.
			std::vector<unsigned int> GetPositionRangeTriangles (Graphics2D::Position PosStart, Graphics2D::Position PosEnd)
			{
				unsigned int StartX = ToCell (PosStart.x, Min.x, Max.x, TrackGrid[0].size());
				unsigned int StartZ = ToCell (PosStart.z, Min.z, Max.z, TrackGrid.size());
				unsigned int EndX = ToCell (PosEnd.x, Min.x, Max.x, TrackGrid[0].size());
				unsigned int EndZ = ToCell (PosEnd.z, Min.z, Max.z, TrackGrid.size());

				// Make sure Start <= End on both axes.
				if (EndX<StartX)
				{
					unsigned int KeepX = StartX;
					StartX = EndX;
					EndX = KeepX;
				}
				if (EndZ<StartZ)
				{
					unsigned int KeepZ = StartZ;
					StartZ = EndZ;
					EndZ = KeepZ;
				}

				std::vector<unsigned int> IndexList;

				for (unsigned int CountZ = StartZ; CountZ<=EndZ; CountZ++)
					for (unsigned int CountX = StartX; CountX<=EndX; CountX++)
					{
						std::list<unsigned int>::iterator q;
						for (q = TrackGrid[CountZ][CountX].begin(); q != TrackGrid[CountZ][CountX].end(); q++)
							IndexList.push_back (*q);
					}
				return IndexList;
			}

		private:
			// Maps a world coordinate to a cell index along one axis, clamped to [0, Cells-1].
			// Clamping is done in double so mismatched integer types never reach std::min.
			static unsigned int ToCell (double Value, double AxisMin, double AxisMax, size_t Cells)
			{
				double Scaled = std::max((double)Cells*(Value-AxisMin)/(AxisMax-AxisMin), 0.0);
				return (unsigned int)std::min(Scaled, (double)(Cells-1));
			}

			std::vector<math::Triangle> TrackGeometry;
			std::vector<std::vector<std::list<unsigned int> > > TrackGrid;
			Graphics3D::Mesh TrackMesh;
			Graphics2D::Position Min, Max;
	};
}