
Sun Shafts & SSAO

Added screen-space sun shafts, based on this article from GPU Gems 3.





Also added basic SSAO (screen-space ambient occlusion). Right now it only uses the depth buffer. For performance I will switch it over to a downsampled depth buffer at some point in the future. There are also alternative methods that take surface normals into account and tend to produce more accurate results, so I might add one of those later.
Ambient Occlusion

Atmospheric Scattering



Atmospheric scattering based on Eric Bruneton's Precomputed Atmospheric Scattering.

I really like the look this algorithm gives. I'm only part way through implementing it, so not everything is working, or at least not as well as I'd like.

His algorithm produces three colors: inscatter color, ground color, and sun color. Although Bruneton's sun color looks good in his screenshots, in the demo it looks rather bad; I believe the demo is an incomplete version.

Part of the algorithm requires the distance from the viewer. Here I am reconstructing it from the depth buffer, and to get this working I went ahead and implemented a proper deferred renderer, so yeah!

Oh, and the first time I ran the atmospheric code this is what my planet looked like (the precomputed table wasn't right)...

Deferred Texturing?

One method I tried that didn't end up working very well was deferred texturing for the terrain.

The terrain texture coordinates are based on its world coordinates and normal.  Using the GBuffer normal + depth buffer to reconstruct world position gave me what I needed.

And while this worked and produced identical texture coordinates, the gradients were not always correct, resulting in some nasty-looking aliasing anywhere the depth buffer contained a large difference between adjacent pixels (because originally they were separate objects).

My understanding is that the graphics card shades pixels in 2x2 quads--

AB
CD

So the world coordinates are calculated for all four pixels, and the difference between adjacent pixels' world coordinates is used as the derivative for MIP selection.

But if pixel A came from a completely different mesh than pixel C, and was 1000 meters closer to the camera, the quad ends up with really large derivatives and selects the lowest MIP--this is not what you want!

 So everything ends up with a blocky/pixelated outline that shimmers and looks just awful.

 If you passed down the derivative information perhaps you could work around it, but that is quite a large amount of extra data, and I didn't want to go down that route.

Terrain Video

I made a short little video flying around the terrain; it's 1080p if you click on the word YouTube.

Terrain Blending

NOTE: the methods in this post are old & moronic, don't use them:)

 Blending:

 I've updated the terrain so that it blends seamlessly between different LOD's.  Previously it would just switch from one LOD to the next without any attempt to blend, which looked pretty bad, especially if running the terrain at a fairly low detail setting.

 Textures

For now I'm just using some random textures off the internet. Texturing is applied using triplanar texturing. Currently it is just a global set of 3 textures, but eventually I plan to store a per-vertex texture ID, so that the terrain can be, at least to some degree, painted upon.


Seamless blending & basic texturing
 
  Some blending details...


Typically terrain blending is done via vertex morphing & manual texture blending; at least, this is how most height-field based algorithms work.

I am using voxels though, rendered as chunks of triangles, so each LOD's meshes are unrelated to the next LOD's meshes. There is no real obvious way to morph between these meshes since they have nothing in common.

What I have seen done before is alpha blending between LOD's; Nvidia did this in their GPU terrain demo. I plan to use a deferred renderer, which generally runs counter to alpha blending, so I've modified this idea somewhat. Currently I am rendering high detail terrain into one set of buffers and low detail/parent terrain into another, then in post I blend them together based on a few different things. The overhead for this is actually surprisingly small, and eventually I will be able to blend the data together prior to lighting being applied, essentially dumping the result into the deferred gbuffer.


and cracks...

Are gone, mostly. I added a check so that if no data is found in the high detail, it tries to fill it from the low detail; this fixes most cracks. I will probably expand on this to remove all cracks at some point in the future, but it already removes most of them.

Blue represents any location that was hole-filled or crack-patched, red represents places where fixes failed. (The sky is red, which is good since it shouldn't be patched.)

Components system + Flow Graph

 Like lots of other games, my game uses a component system.

This system is designed to allow for improved cache usage and easy parallelization.
It is loosely based on this article published by Insomniacs.

For cache friendly behavior all components of a given type live in contiguous memory, and are all updated together.

To specify attributes for components I have a Lua file, which for each component type indicates, among other things, how it should update (parallel or serial) and which other component types it depends on.

I use TBB Flow Graph to express dependencies: any component that depends on another being updated first just lists that component as a dependency in the Lua file.


For instance this is the declaration of my Occlusion Rasterizer component in Lua.

reg("occlusion_rasterizer",    no_inteface,   TM.MULTI,  0.0,  3, {"commands"}) 

The parameters in order are:

name      - name in C++, this matches them up
interface - specifies which interface, if any, the component implements; if Occlusion Rasterizer had implemented an interface, this would be the name of that component. This allows for things like finding all components that implement a given interface.
parallel      - either SINGLE or MULTI; SINGLE means each component of this type is updated sequentially, MULTI means they are all updated in parallel
time delay - how long between updates for this type; if you want a given type to only update 4 times per second, make this value 0.25, etc.
count      - how many components to reserve; this allocates a contiguous block of memory. It will grow as needed, but this allows for the equivalent of a vector.reserve()
dependencies - list of any components that must update prior to this component; TBB flow graph guarantees they will finish updating first. Occlusion Rasterizer depends on a command list being generated, so it lists "commands" as a dependency.

 Each component type derives from a base component type in C++, currently this adds 12 bytes of overhead to each instance in 32 bit builds, and 16 bytes in 64 bit builds.

The base component contains this:
vtable ptr -  4/8 bytes. Removing this is possible but makes the system somewhat clumsier to use.
chain index  - 4 bytes, which chain it is on, chains are a series of components
type            - 2 bytes, each component type has a unique ID
offset          - 2 bytes,  index into the pool containing components of its type


I've been pretty happy with this system. My components are small, self-contained little objects; they are cache friendly and can be allocated and destroyed in large numbers. Adjusting dependencies as new types are added is also very simple since no recompilation is needed, just a change to a Lua script file.


A major advantage of component systems, which I think does not get stated clearly enough, is how much glue code they save you from writing. Creating classical C++ game objects, populating them with the appropriate member data, worrying about all the fragile and arbitrary hierarchies, and then writing all the systems that control them is a huge amount of unnecessary code.