Memory Usage and Normal Map Texture Formats in OpenGL ES2.


I am working on a 3D mobile racing game using OpenGL ES2.

For the cars in the game I am using normal maps to give them more details while using the same amount of geometry.

32 Bit RGBA uncompressed normal mapped car.

32 Bit RGBA uncompressed normal mapped car.

Normal maps are images that contain normal vectors per texel of the texture.

The normal vector is usually packed inside the 24 bit RGB values of the texture and is converted into a vector inside the fragment shader.

vec3 texNormal =texture2D(normalSampler, texOut).xyz;
texNormal = (texNormal*2.0)-1.0;

Here is an example of a test normal map my artist created for the car(it doesn’t have a lot of details it’s just a test):

Test Normal Map

Test Normal Map

Memory Usage

When you load a texture into OpenGL it doesn’t matter how much space it takes on the disk, what matters is how much space it takes in memory.

For instance an image could be saved as a PNG and take only 700KB on disk but when loaded to OpenGL it would take 4MB because it’s a 1024×1024 32 bit RGBA uncompressed format.

In my racing game I would have 4 different cars, each one with it’s own normal map. If each normal map is a 1024×1024 RGBA uncompressed image, it would take 16MB of memory with just the normal maps!

We would like to reduce the memory usage of these textures.

Compressed Texture

OpenGL ES2 supports loading compressed texture using glCompressedTexImage2D.

In the case of iOS the native supported compressed texture format is PVRTC.

The advantage of these compressed formats is that they are stored compressed in the OpenGL memory and they are only decoded in real time. So the memory usage is only of the compressed texture size.

A 4 bit PVRTC always give you a compression of 1/8 of the uncompressed 32 bit RGBA bitmap.

While PVRTC is great for diffuse or color textures, with normal maps the compression of the texture make it seem like there are lumps on the surface.

PVRTC compressed normal mapped car.

PVRTC compressed normal mapped car.

We might be missing the point of making things look better when we have such artifacts on the surface of our car.

16 Bit 565 RGB

Our original texture is 32 bit RGBA per pixels. Which means we have 4 bytes per pixel of the image.

There are uncompressed formats that use only 2 bytes per pixel.

For instance GL_UNSIGNED_SHORT_5_6_5.

We only need RGB since we don’t use the Alpha channel in our texture.

What will happen if we store the xyz components of the normal inside 16 bit RGB instead of 24 bit RGB?

16 bit RGB normal mapped car.

16 bit RGB normal mapped car.

Again, with the reduced accuracy of the normals the end result is not very good.

16 Bit Two Components Packing

Our normals are normalized to the length of 1. We can actually derive the z component of our normal from our x and y components using Pitagoras theorem.

We only need 16 bits to store the x and y components.

However, we do not have a pixel format of two 8 bit components.

We do have this format: GL_UNSIGNED_SHORT_4_4_4_4.

We are going to pack the two 8 bit RG components into the four 4 bit RGBA components of our new format.


	stbi_uc * Tmp = Data;
	Data = (unsigned char *)new unsigned short[w*h];
	unsigned int n = w*h;
	for (unsigned int i=0; i<n; i++)
		unsigned char r = Tmp[4*i];
		unsigned char g = Tmp[4*i+1];
		unsigned short v = (unsigned short)r;
		v = v<<8;
		v+=(unsigned short)g;
		((unsigned short*)Data)[i] = v;
	delete [] Tmp;

After packing the texture into 16 bits we need to change the code in our fragment shader to unpack the texel into a normal vector.

vec4 compressedNormal =texture2D(normalSampler, texOut);
vec3 texNormal;
texNormal.r = 2.0*(compressedNormal.r*240.0+compressedNormal.g*15.0)/255.0-1.0;
texNormal.g = 2.0*(compressedNormal.b*240.0+compressedNormal.a*15.0)/255.0-1.0;
texNormal.b = sqrt(1.0-texNormal.r*texNormal.r-texNormal.g*texNormal.g);

In this case we reduce the memory usage by half of the original 32 bit texture and we don’t compromise any quality.

We might be paying in GPU processing power since now our fragment shader is more complex.

However, I didn’t notice any difference in performance on my iPad New.

We also assume our z component is always positive so we cannot represent normals that go inward.

However, in most cases we do not need normals that go into the triangle or the surface.

16 bit packed normal mapped car.

16 bit packed normal mapped car.

Talk To Your Artist

Until now we were able to reduce the memory usage by a factor of 2 without sacrificing the quality.

However, this might not be enough.

There is another way to reduce memory usage and that is having the artist optimize the texture mapping.

In our case the bottom part of the car is rarely seen or is only partially seen.

We don’t need a lot of details in the bottom part of the car so the artist can allocate a much smaller area of the texture for the bottom part.

He might also separate the car into two surfaces(which means two passes) and to have different textures for each surface while the bottom part will have a much lower resolution texture.

In our game I have told the artist to reduce the texture size from 1024×1024 to 512×512 while taking into consideration that we don’t need a lot of details in the bottom part.

With both the 16 bit texture packing and the reduction of the resolution of the texture, we will get a factor of 8 in reducing the memory usage.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s