Depth Of Field

In photography, DOF (Depth of Field) is the range of distances in which the image blur is smaller than one pixel; outside the DOF, the image gets blurry.

As you might have already seen, I have a Blur Compute Shader implemented. So implementing DOF should be a simple matter of blurring each pixel according to its depth, right? Not really.
DOF turned out to be more complex than I initially thought.

The first issue I encountered is that when blurring pixels that lie in front of the DOF, my blur wasn't wide enough. I didn't want to increase the kernel size, so I simply sampled in steps of 2 pixels instead of 1.
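A minimal sketch of the idea, in Python rather than a Compute Shader, with a hypothetical 1D image and a tiny 3-tap kernel: the same number of taps reaches twice as far when the sampling step is 2.

```python
def blur_1d(pixels, x, kernel, step):
    """Apply a 1D blur kernel around index x, sampling every `step` pixels."""
    half = len(kernel) // 2
    total = 0.0
    for i, w in enumerate(kernel):
        # Clamp sample positions to the image bounds.
        src = min(max(x + (i - half) * step, 0), len(pixels) - 1)
        total += w * pixels[src]
    return total

pixels = [0.0] * 8 + [1.0] * 8   # a hard edge at index 8
kernel = [0.25, 0.5, 0.25]       # tiny 3-tap kernel

# At index 6 the 1-step blur never reaches the edge; the 2-step blur does.
narrow = blur_1d(pixels, 6, kernel, step=1)   # samples indices 5, 6, 7
wide   = blur_1d(pixels, 6, kernel, step=2)   # samples indices 4, 6, 8
```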

That was simple enough, but then I realized something else: even though I was sampling wider and performing a blur, there were still hard edges in areas with depth discontinuities.
What do I mean? Imagine looking past the corner of a wall. The wall near you should be blurred, and the wall far away should be sharp.
However, even though the nearer wall is blurred, there is still a hard edge, because the blur happens on a per-pixel basis.
What we want is for the closer wall to smear and blur over the far wall. To do that, I had to consider a neighborhood of depth values when deciding on the blur factor, instead of just the depth of the current pixel.
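As a hedged sketch (in Python, with illustrative names and a simple linear blur model; the real version runs in the Compute Shader): the blur factor of a pixel takes the maximum over nearby pixels that are in front of it, so foreground blur bleeds over the background.

```python
def blur_factor(depth, focus_depth, focus_range):
    """0 = sharp, 1 = fully blurred (a simple linear model)."""
    return min(abs(depth - focus_depth) / focus_range, 1.0)

def neighborhood_blur_factor(depths, x, y, radius, focus_depth, focus_range):
    # Start from the pixel's own factor, then let any *nearer* neighbor
    # raise it, so a blurry foreground smears over a sharp background.
    factor = blur_factor(depths[y][x], focus_depth, focus_range)
    for ny in range(max(0, y - radius), min(len(depths), y + radius + 1)):
        for nx in range(max(0, x - radius), min(len(depths[0]), x + radius + 1)):
            if depths[ny][nx] < depths[y][x]:  # neighbor is in front
                factor = max(factor,
                             blur_factor(depths[ny][nx], focus_depth, focus_range))
    return factor

# One row of depths: a near wall (depth 2) next to a far wall (depth 10),
# with the camera focused on the far wall.
depths = [[2.0, 10.0, 10.0]]
edge_pixel = neighborhood_blur_factor(depths, 1, 0, 1, 10.0, 4.0)  # beside the near wall
far_pixel  = neighborhood_blur_factor(depths, 2, 0, 1, 10.0, 4.0)  # away from it
```

The far-wall pixel right next to the near wall now inherits the foreground's full blur factor, while the pixel one step further stays sharp.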

Hard Edge

Soft Edge

The final step was to make the DOF adjust according to what the player is looking at. For that, I sampled the 4×4 center pixels of the depth buffer in a separate Compute Shader and wrote the result into a 1×1 texture, which the DOF Compute Shader then reads as a shader resource.
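A sketch of that auto-focus pass in Python, assuming the reduction is a plain average of the 4×4 center block (names are illustrative; the real version writes the value into a 1×1 texture):

```python
def focus_depth(depth_buffer):
    """Average the 4x4 block of depths around the screen center."""
    h, w = len(depth_buffer), len(depth_buffer[0])
    cy, cx = h // 2 - 2, w // 2 - 2   # top-left corner of the center 4x4 block
    samples = [depth_buffer[cy + j][cx + i] for j in range(4) for i in range(4)]
    return sum(samples) / 16.0

# A flat 8x8 depth buffer at depth 5.0 should yield a focus depth of 5.0.
buffer_8x8 = [[5.0] * 8 for _ in range(8)]
result = focus_depth(buffer_8x8)
```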

Geometry shader based particles

I always remembered that when doing particles you had to involve the CPU to update the vertex buffer for newly created and destroyed particles. Well, things have changed since then, and nowadays you can create and destroy particles entirely on the GPU.
The shader pipeline has a stage called “Stream Output” that allows the geometry shader to write into GPU memory buffers, such as a vertex buffer. Instead of updating the vertex buffer with the CPU every frame, we can simply update it with a Geometry shader. Pretty neat, right?
In order to write into a vertex buffer this way, we need to create the buffer with the D3D11_BIND_STREAM_OUTPUT bind flag.
We write the Geometry shader much like a standard one, only this time it is the final programmable stage in the draw call. For this to work, we bind the stream-output view of the vertex buffer using SOSetTargets.
This is a quick, dry review of how to do it, but the important thing is to know that Stream Output exists and that it can be used for particles.

We now have our “GPU only” particle system. It still looks more or less the same as the particles we had in DX9, only now the CPU does not intervene.

Notice how the particles are abruptly cut off when they intersect the wall or floor. We can fix this by making the particles softer: we compare the current pixel against the depth buffer and fade the particle where the depth of other objects is close to the depth of the particle at that pixel.
There is an issue, though: how can we read from the depth texture while it is part of our render target? In DX9 we could not read from the depth buffer, and we would have to create a copy of the depth map by redrawing the whole scene into an off-screen texture.
However, in DX11 we can simply create a shader resource view for the depth map. We still cannot bind the depth map both as the Z buffer of the render target and as a shader resource in the same draw.
So we bind no depth buffer to the render target; since we still want to hide pixels that are behind solid objects, we clip them in the pixel shader instead.
‘clip’ is an HLSL intrinsic function that discards pixels inside the pixel shader. A clipped pixel stops executing the pixel shader and is not written to the back buffer.
So now we can use the depth buffer as a shader resource for both clipping and fading the particles near objects.
In order to fade the particles we need a linear Z value, but the values in the depth map are not linear. We need to convert the depth map values into linear Z using the projection matrix (in practice, its near and far planes).
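This is not the original pixel shader, but the underlying math can be sketched in Python, assuming a standard left-handed D3D perspective projection and a hypothetical `fade_distance` tuning parameter in view-space units:

```python
def linearize_depth(d, near, far):
    """Invert the D3D projection's depth mapping:
    d = far/(far-near) - far*near/((far-near)*z)  =>  solve for z."""
    return (far * near) / (far - d * (far - near))

def soft_particle_alpha(particle_z, scene_z, fade_distance):
    """Fade the particle as it approaches the geometry behind it."""
    if scene_z <= particle_z:
        return 0.0  # behind solid geometry: would be clip()'d in the shader
    return min((scene_z - particle_z) / fade_distance, 1.0)
```

Depth 0 maps back to the near plane and depth 1 to the far plane, and a particle half a `fade_distance` in front of a wall draws at half opacity.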
I will not go into the details here, but I will paste my pixel shader code.

Soft particles

There is still a lot that can be done, but this is a good start.
With more than one pass we could probably do a lot more, but we will have to see how it affects performance.

Screen Space Tessellation

Tessellation shaders subdivide 3D geometry inside the shader pipeline. This allows for dynamic LOD (Level of Detail) and also produces a lot more geometry from a given vertex buffer, which in turn saves bandwidth and processing, and even allows the LOD to change within the same model.
The tessellation pipeline consists of the Hull shader (which has two parts: a per-control-point function and a patch-constant function), the fixed-function tessellation stage, and the Domain shader.
When sending geometry with tessellation enabled, we don’t send normal vertices but rather control points. The control points are processed in the vertex shader like normal vertices and are then passed to the Hull shader. The Hull shader decides what kind of geometry, and how many subdivisions, the tessellation stage should produce. After the tessellation stage produces the new vertices, the Domain shader processes all the new geometry much like a vertex shader processes normal vertices, which means the Domain shader usually projects the vertices into screen space for rasterization.

A naive approach would be to set a constant subdivision factor per model: each primitive in the model subdivides into an identical number of triangles.
The issue is that some primitives in the original model may be bigger and some smaller; it makes no sense to subdivide a big and a small primitive the same way. In addition, some primitives may be far from the camera and cover only a few pixels in screen space. This leads us to subdivide the mesh according to how many screen-space pixels the triangle or primitive covers.

3 Control points, constant tessellation

6 Control points, screen space tessellation

This might seem simple enough; however, there is an issue. What happens when we subdivide two adjacent triangles, one into 4 triangles and the other into 9? At first it doesn’t seem like a problem, but once we displace the new vertices with a displacement map (a texture), we may get a gap between what were originally two adjacent triangles.

Gaps due to unequal edge tessellation.

We need to make sure that adjacent primitives in the original model subdivide exactly the same way along their shared edge. This guarantees that for every new vertex on one primitive there is a spatially identical vertex on the neighbouring primitive, so that when the displacement map moves the vertices, the two new vertices move to the same place.
In other words, whenever we decide how to subdivide an edge of a primitive, we must rely only on data that is available to the primitives on both sides of that edge, and the tessellation factor of the edge must come out identical for both primitives that share it.

With constant tessellation this is easy: we just provide the same constant value for all edges.
When we want to tessellate according to screen-space area, for each edge we calculate the average of the areas of the triangles on both sides of that edge.
We use six control points to achieve this: three for the triangle itself, and another three for the three triangles that share the edges of the current triangle. With those 6 control points we can calculate the screen-space areas of 4 triangles, which are used to calculate the tessellation factors of the 3 edges.
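A sketch of the idea in Python (the area-to-factor mapping and the 64-pixels-per-division constant are hypothetical tuning choices; the point is that both triangles sharing an edge compute the identical factor):

```python
def tri_area(a, b, c):
    """Area of a triangle from 2D (screen-space) points, via the cross product."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0

def edge_factor(area_a, area_b, pixels_per_division=64.0):
    """Tessellation factor for one edge, from the areas of the two triangles
    that share it. Symmetric in its arguments, so both sides agree."""
    avg = (area_a + area_b) / 2.0
    return max(1.0, (avg / pixels_per_division) ** 0.5)

def edge_factors(p0, p1, p2, n01, n12, n20):
    """Per-edge factors for triangle (p0, p1, p2), given the three extra
    control points n01, n12, n20 of the neighbouring triangles."""
    center = tri_area(p0, p1, p2)
    return (edge_factor(center, tri_area(p0, p1, n01)),
            edge_factor(center, tri_area(p1, p2, n12)),
            edge_factor(center, tri_area(p2, p0, n20)))
```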

Here is a nice video showing screen space tessellation in real-time:

Warning! fxc compilation bugs (DirectX)

I have been working on DX11 and tessellation lately. It turns out my tessellation shaders break when I let fxc optimize them. The “solution” is to pass the /Od flag, which tells fxc not to perform any optimizations.
This will have to do for now, but you can see how deadly this can be: you might work on a problem for a long time and fail to solve it because of unsafe optimizations, and then suddenly you switch to a debug build and it works. Scary stuff.
I must note that I don’t get any warnings from fxc.

I tried looking at the assembly code, and the biggest difference I noticed is that the “Release” code has its loops unrolled while the “Debug” code has real loops. I suspect most of the optimization is the loop unrolling anyway.

Here is the unoptimized shader results compared to the optimized shader results:

Defracture tool, a displacement map fixer.

I have been working lately on displacing the geometry of a 3D model in real time, after it has been tessellated.
Displacement mapping means modifying the positions of the geometry’s vertices according to a height map texture. The issue is that most game models don’t have enough geometry to support the fine details of the texture. You could send a lot of geometry to the shader, but this is wasteful. Instead, I subdivide the mesh inside the shader pipeline (tessellate).

I bought a nice 3D model of a stone golem that came with normal maps but no displacement maps (height maps), so I decided to generate the height map using CrazyBump.
That worked pretty well, but there was an issue: after displacement, the model had gaps in certain parts of its surface.

The reason was that the 3D model was stitched at certain places, like the shoulders and the head. Artists do that so they can unwrap the model’s surface into 2D space, which is required for mapping the texture.
The gaps appeared in those stitched areas because they contain duplicate vertices at the same position but with different texture-space coordinates.
CrazyBump only uses the image and has no knowledge of the geometry, so it can’t tell that different places in the texture belong to the same 3D position on the model.

What I did to fix this was go over all the edges of triangles that are shared in the 3D model but are separate in texture space, and make the height-map pixels along both copies of each edge share the same values. I had to make this edge fix more than 1 pixel wide, and that wasn’t a simple challenge.
This image shows where I fixed the displacement map. In practice I didn’t paint black onto the image, but rather the average of the values along the two edges.
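The core of the fix can be sketched in Python, assuming we already know the matching pixel runs for the two texture-space copies of a shared edge (the multi-pixel-wide version is more involved):

```python
def fix_seam(height_map, edge_a, edge_b):
    """edge_a / edge_b: equal-length lists of (x, y) pixel coordinates along
    the two texture-space copies of the same 3D edge. Writes the average of
    each matched pair back into both, so displacement agrees on both sides."""
    for (ax, ay), (bx, by) in zip(edge_a, edge_b):
        avg = (height_map[ay][ax] + height_map[by][bx]) / 2.0
        height_map[ay][ax] = avg
        height_map[by][bx] = avg

# Two seam pixels that should represent the same 3D point but disagree.
height_map = [[0.0, 1.0]]
fix_seam(height_map, [(0, 0)], [(1, 0)])
```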

Denoting the edges

Here are the results: