std::deque 4K Bytes high memory cost on iOS, C++

Preface

My iOS build of my game Dragons High was using about 150MB of live memory.

I used the Allocations instrument to find out that I had more than 10,000 allocations of the size of 4K Bytes.

All those allocations were of the same class, std::deque<unsigned int>

In Dragons High there is a terrain of the world. The terrain has collision geometry data.

Basically I test the dragon or other moving characters against the triangles of the terrain geometry so they won’t go through it.

In order to not go over all the triangles I have a grid of lists(std::deque) with indices to the triangles, so if the character is inside a certain cell in the grid I just test against the triangles in that cell and not against all the triangles in the mesh.

std::deque Cost

It turns out that for every item in std::vector<std::deque<unsigned int>>, std::deque was allocating 4K Bytes even if it had less than 20 items.

For my grid I allocated 128×128 cells, so as a result I had more than 50MB usage just for this std::vector.

For every std::deque<unsigned int> in the std::vector in my iOS game I had 4K allocated even if there were very few elements in the std::deque.

My solution was to use my own custom Data Structure which used dynamically allocated unsigned int * pointer.

This way I was able to shave about 50MB of memory usage.

EDIT:

A better alternative to making your own custom class with a pointer is to use std::vector.

std::vector takes care of allocations for you, so it’s safer and it does not have a noticeable memory footprint(for a small amount of items).

Racing Game Custom Physics Simulation (Part 2)

Preface

In part 1 we have shown how to make a car(truck) glide on a track made out of 3D triangles.

We were able to steer the car and it was “glued” to the track to make it seem like there are physics involved.

In this part we are going to have an actual physics simulation(to some degree).

The Models

The model of the track remains the same as in part 1(a set of 3D triangles).

The car model is made of 5 points: 4 for the wheels and 1 for the bottom center. Just like in part 1.

However, we will also maintain variables to keep the orientation of the car and the gravity velocity of each of the 4 wheels(in addition to the car’s gravity velocity).

Steering

Since we now maintain the orientation of the car from the previous frame we can’t just assume the steering vector is calculated aligned to the xz plane.

To calculate the new steering vector we rotate the Look and Right vectors by a delta of an angle on the Look x Right plane:

				Look = LastRight*sin(DeltaAngle)+ LastLook*cos(DeltaAngle);
				Right = LastRight*cos(DeltaAngle)- LastLook*sin(DeltaAngle);

The delta of the angle is the angular speed multiplied by the frame’s time delta.

The forward velocity is calculated the same as in part 1 but notice the Look vector might not be on the xz plane this time.

Gravity

This time we are also going to take gravity into account.

Our gravity is the following vector (0, -9.8, 0).

It is a vector pointing downwards with an acceleration of 9.8 meters per square second.

We compose the car’s velocity from two components. The steering velocity from above and the Gravity velocity.

The reason we separate the two is because it would be easier to set the gravity velocity to 0 whenever a point in the model hits the ground or track.

We also need to maintain a separate gravity velocity for each of the 5 points in the car model to make the physics simulation of the car work as if we had an object with volume.

The Simulation

After we calculated the Look and Right orientation vectors and the steering velocity vector we need to apply gravity.

For each of the 5 gravity velocity vectors we add the Gravity vector multiplied by the frame’s time delta.

				CurrentGravityVelocity=CurrentGravityVelocity+Gravity*t;
				for (unsigned int i=0; i<WheelsGravityVelocity.size(); i++)
					WheelsGravityVelocity[i]=WheelsGravityVelocity[i]+Gravity*t;

The next step is to calculate the current position of the wheels and the position of the car center.

For the car center position, we add the current steering velocity vector and the current gravity velocity vector multiplied by the frame’s time delta.

Like so:

				Pos = Pos+(Look*CurrentFlatSpeed+CurrentGravityVelocity)*t;

For each of the 4 wheels we calculate the wheel’s position relative to the car’s center using the wheels base dimensions and the Look and Right orientation vectors.

We then add the car’s center position to each of the 4 wheels positions.

The last step is to add the steering velocity and the gravity velocity multiplied by the time delta. The steering velocity is the same as it was for the car’s center but notice the gravity velocity might be unique for each wheel point(We saved them in their own variables).

				for (unsigned int i=0; i<Wheels.size(); i++)
					Wheels[i] = Right*WheelsDelta[i].x+Look*WheelsDelta[i].z+Pos+(Look*CurrentFlatSpeed+WheelsGravityVelocity[i])*t;

In a similar fashion to part 1 we now check which triangles each one of the 5 points of the model intersect with.

The difference is that instead of setting the points to the intersection height of the respective triangles, we only set them to the intersection height in case their own height is lower.

In addition we set to zero the Gravity velocity of each point only if  it’s height was adjusted by a triangle.

Now we have 5 points of the car displaced into new heights. The car’s center point will be used for the new position, but like in part 1, we need to calculate the new orientation from the 4 wheel points.

The current car model simulates a rigid body. However while adjusting the wheels’ heights the car’s wheel base is now deformed.

We will restore the original form of the wheel base by treating the 4 wheels as if they have springs among themselves(a total of 6 springs).

This will make the 4 wheel points simulate the wheels base as a if it was a rigid body.

In order to restore the original form of the wheels base we go over all the 6 springs and adjust them to be closer to their original length.

We iterate over this process for ten times and at the end we would get something closer to the original form.

				for (unsigned int k=0; k<10; k++)
				{
					for (unsigned int i=0; i<Wheels.size(); i++)
						for (unsigned int j=i+1; j<Wheels.size(); j++)
						{
							Graphics2D::Position v = Wheels[i]-Wheels[j];
							Graphics2D::Position center = (Wheels[i]+Wheels[j])*0.5;
							double l = (WheelsDelta[i]-WheelsDelta[j]).Length();
							double radius = (0.1*l+0.9*v.Length())/2.0;
							Wheels[i] = (Wheels[i]-center).Normalize()*radius+center;
							Wheels[j] = (Wheels[j]-center).Normalize()*radius+center;
						}
				}

Now that we have the wheel points placed on the wheels base frame we can calculate the new Look and Right orientation vectors in a similar fashion we did in part 1.

				Look = (Wheels[0]-Wheels[3]).Normalize();
				Right = (Wheels[1]-Wheels[0]).Normalize();

Conclusion

We now have a more physically based simulation that also support falling off from edges.

The result of this simulation can be seen in this video:

For the sake of completion I am adding the entire code for the update function.

 

			void Update (double t)
			{
				double WheelFactor = 0;
				if (Input.GetLeft())
					WheelFactor = -1;
				else if (Input.GetRight())
					WheelFactor = 1;
				if (Input.GetThrust())
					CurrentFlatSpeed += MaxFlatSpeed*t/AccelLatency;
				CurrentFlatSpeed = std::max(std::min(CurrentFlatSpeed, MaxFlatSpeed), 0.0);

				double DeltaAngle = WheelFactor*t;
				Graphics2D::Position Look = LastRight*sin(DeltaAngle)+ LastLook*cos(DeltaAngle);
				Graphics2D::Position Right = LastRight*cos(DeltaAngle)- LastLook*sin(DeltaAngle);
				CurrentGravityVelocity=CurrentGravityVelocity+Gravity*t;
				for (unsigned int i=0; i<WheelsGravityVelocity.size(); i++)
					WheelsGravityVelocity[i]=WheelsGravityVelocity[i]+Gravity*t;
				CurrentVelocity = Look*CurrentFlatSpeed+CurrentGravityVelocity;
				CarParms->SetLook (Look, Graphics2D::Position(0, 1, 0));
				std::vector<Graphics2D::Position> Wheels;
				Wheels.resize(WheelsDelta.size());
				for (unsigned int i=0; i<Wheels.size(); i++)
					Wheels[i] = Right*WheelsDelta[i].x+Look*WheelsDelta[i].z+Pos+(Look*CurrentFlatSpeed+WheelsGravityVelocity[i])*t;

				Pos = Pos+CurrentVelocity*t;
				if (Pos.y<0.0)
					CurrentGravityVelocity.y = std::max(0., CurrentGravityVelocity.y);
				Pos.y = std::max(0.0, Pos.y);

				unsigned int StartX = std::min((unsigned int)(std::max((Pos.x-Min.x)/(Max.x-Min.x), 0.0)), TrackGrid[0].size()-1);
				unsigned int StartZ = std::min((unsigned int)(std::max((Pos.z-Min.z)/(Max.z-Min.z), 0.0)), TrackGrid.size()-1);

				std::list<unsigned int>::iterator q;
				for (q = TrackGrid[StartZ][StartX].begin(); q != TrackGrid[StartZ][StartX].end(); q++)
				{
					const math::Ray r(float3(Pos.x, 100.0, Pos.z), float3(0, -1, 0));
					float d = 0;
					math::float3 Point;
					if (TrackGeometry[*q].Intersects(r, &d, &Point))
					{
						if (r.pos.y-d>=Pos.y)
						{
							CurrentGravityVelocity.y = std::max(0., CurrentGravityVelocity.y);
							Pos.y = r.pos.y-d;
						}
					}
				}
				CarParms->SetPosition (Pos);
//				std::vector<bool> IsWheelContact;
//				IsWheelContact.resize(4, false);
				for (unsigned int i=0; i<Wheels.size(); i++)
				{
					Graphics2D::Position p = Wheels[i];
					unsigned int StartX = std::min((unsigned int)(std::max((p.x-Min.x)/(Max.x-Min.x), 0.0)), TrackGrid[0].size()-1);
					unsigned int StartZ = std::min((unsigned int)(std::max((p.z-Min.z)/(Max.z-Min.z), 0.0)), TrackGrid.size()-1);

					std::list<unsigned int>::iterator q;
					for (q = TrackGrid[StartZ][StartX].begin(); q != TrackGrid[StartZ][StartX].end(); q++)
					{
						const math::Ray r(float3(p.x, 100.0, p.z), float3(0, -1, 0));
						float d = 0;
						math::float3 Point;
						if (TrackGeometry[*q].Intersects(r, &d, &Point))
						{
							if (r.pos.y-d>=Wheels[i].y)
							{
								WheelsGravityVelocity[i].y = std::max(0., WheelsGravityVelocity[i].y);
//								IsWheelContact[i] = true;
								Wheels[i].y = r.pos.y-d;
							}
						}
					}
					if (FloorHeight>=Wheels[i].y)
					{
						WheelsGravityVelocity[i].y = std::max(0., WheelsGravityVelocity[i].y);
//						IsWheelContact[i] = true;
						Wheels[i].y = FloorHeight;
					}
				}
				for (unsigned int k=0; k<10; k++)
				{
					for (unsigned int i=0; i<Wheels.size(); i++)
						for (unsigned int j=i+1; j<Wheels.size(); j++)
						{
							Graphics2D::Position v = Wheels[i]-Wheels[j];
							Graphics2D::Position center = (Wheels[i]+Wheels[j])*0.5;
							double l = (WheelsDelta[i]-WheelsDelta[j]).Length();
							double radius = (0.1*l+0.9*v.Length())/2.0;
							Wheels[i] = (Wheels[i]-center).Normalize()*radius+center;
							Wheels[j] = (Wheels[j]-center).Normalize()*radius+center;
						}
				}
				Look = (Wheels[0]-Wheels[3]).Normalize();
				Right = (Wheels[1]-Wheels[0]).Normalize();
				LastLook = Look;
				LastRight = Right;
				Graphics2D::Position Up = Look.Cross(Right);
				CarParms->SetLook(Look, Up);
			}

Simple yet important tip for passing std::string by reference.

Using C++ I quickly learned that I should pass large std::vector by reference.

Passing a std::vector by reference (using &) as a parameter for a function will save you from the operation of copying the std::vector into a local copy.

For my next game I was getting bad performance parsing jsons.

It turns out that nearly the entire CPU time spent on parsing the json was due to passing std::string by value in internal functions.

Passing std::string by reference will save you from a copy operation of that string.

Location 0? location -1? glGetUniformLocation, C++ and bugs. (Android\iOS GLES 2)

In OpenGLES 2 glGetUniformLocation receives the program id and a string as parameters. It then attempts to return a location int that can be used to set uniform GLSL shader variables.

If the variable is found it will return a 0 or positive value. If it fails to find the uniform variable it will return -1.

In C++ we should initialize the location ints in the ctr. If we don’t initialize the locations we might have garbage values when in Release mode.

Using the locations with garbage values might overwrite uniform variables with values we did not intend them to have.

So what we should initialize the locations with? One might think that 0 is a good value to initialize but it is not.

Remember! 0 is a valid shader uniform variable location. If we set all the locations to 0 we might overwrite the uniform variable at location 0.

We should initialize the location ints with -1.

We should do this because -1  is the value that is returned in case the uniform variable was not found and setting a value at location -1 will be ignored.

Calling a multiple parameters Java method with JNI (Loading ETC1 for Android)

Calling a Java method from C++ with JNI is kind of difficult since you don’t have static type compilation errors.

I was calling a Java method from my Android OpenGL graphics module in order to load an ETC1 texture with this command:

jmethodID mid = env-&gt;GetMethodID(cls, "LoadETC1", "(Ljava/lang/String; [I [I)I");

The Java method I was trying to call was the following:

		public int LoadETC1(String s, int[] w1, int[] h1)
		{
			if (!ETC1Util.isETC1Supported())
				return 0;
			try {
				InputStream s2 = assetManager.open(s);
				ETC1Util.ETC1Texture t = ETC1Util.createTexture(s2);
				int textureName = -1;
				int[] texture = new int[1];
				GLES20.glGenTextures(1, texture, 0);
				textureName = texture[0];

		        GLES20.glBindTexture ( GLES20.GL_TEXTURE_2D, textureName );

//				GLUtils.texImage2D(GLES20.GL_TEXTURE_2D, 0, bitmap, 0);

		        int w = t.getWidth();
		        int h = t.getHeight();
		        int l = t.getData().capacity();
		        GLES20.glCompressedTexImage2D(GLES20.GL_TEXTURE_2D, 0, ETC1.ETC1_RGB8_OES, t.getWidth(), t.getHeight(), 0, t.getData().capacity(), t.getData());
//		        GLES20.glTexImage2D ( GLES20.GL_TEXTURE_2D, 0, GLES20.GL_RGBA, w, h, 0,
//		                              GLES20.GL_RGBA, GLES20.GL_UNSIGNED_BYTE, ib );

		        GLES20.glTexParameteri ( GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MIN_FILTER, GLES20.GL_LINEAR );
		        GLES20.glTexParameteri ( GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_MAG_FILTER, GLES20.GL_LINEAR );
		        GLES20.glTexParameteri ( GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_WRAP_S, GLES20.GL_REPEAT );
		        GLES20.glTexParameteri ( GLES20.GL_TEXTURE_2D, GLES20.GL_TEXTURE_WRAP_T, GLES20.GL_REPEAT );
		        TextureNode n = new TextureNode();
		        w1[0] = t.getWidth();
		        h1[0] = t.getHeight();
		        return texture[0];
			} catch (IOException e) {
				// TODO Auto-generated catch block
				e.printStackTrace();
			}
			return 0;
		}

The issue was that  GetMethodID was returning 0. In LogCat it mentioned that my method signature is bogus “(Ljava/lang/String; [I [I)I“.

The issue actually was that I had spaces between the 3 parameters of the method. There should be no spaces and there should be ‘;’ only after a non primitive Java class like “Ljava/lang/String;”.

The following method gave a valid jmethodID:

jmethodID mid = env-&gt;GetMethodID(cls, "LoadETC1", "(Ljava/lang/String;[I[I)I");

resize performance for std::vector in C++

I am working on an iOS port of my Android game Concussion Boxing.

Inside my rendering function I have a certain loop that use the same struct for each mesh that is drawn. This struct has std::vector variables of various kinds in it.

Whenever I want to use this struct to pass parameters to a function I would resize its std::vector variables.

Allocating memory is a CPU demanding operation that is why resize can be used to save on these allocation operations.

A good idea would be to declare the struct outside of the loop so we can reuse the memory that was allocated in the previous iteration of the loop.

However, there is a case in which resize won’t work so well. If you have a std::vector<std::vector<float>> then resize will have a performance hit.

I don’t know what is going inside resize in this case but I found that simpley having a pointer to float or a struct of a constant amount of floats(in an array) will be a lot more efficient rather than using a vector of a vector.

For reference, I improved my game’s frame CPU time from 27 ms to 22 ms by using less vectors of vectors.

std::vector resize bug

There is a certain bug that I keep making again and again.

In C++ there is a standard template library data structure called vector.

It is a data structure that gives you random access to it’s content, much like an array.

A vector can be of variable length and there are several ways to initialize and change it’s size. One of the methods to change it’s size is std::vector::resize.

The advantage of resize is that it can be used after the ctr. But there is a subtle point that needs to be considered.

When you resize std::vector it doesn’t necesseraly call the dtr of it’s content.

The most common issue of this for me is when I want to load new data to std::vector not realizing the old data is still there. Imagine we have std::vector<std::list<unsigned int> >.

After resizing the lists in this case might not be empty.

In order to make sure your vector is empty you should call std::vector::clear first.