Sunday, August 22, 2010

Is Carmack working on a ray tracing based game engine?

At least that's what his tweets (http://twitter.com/id_aa_carmack) seem to suggest:

# There has GOT to be a better way to exactly test triangle-in-AABB than what I am currently doing.

# Idea: dynamically rewrite tree structures based on branch history to linearize the access patterns. Not thread safe…

# The valuation strategy behind the Surface Area Heuristic (SAH) used in ray tracing is also adaptable for rasterization engines.

# It is often better to use a global spherical sampling pattern and discard samples behind a plane instead of local hemispheres.

# @Wudan07 equal area but not shape. Got a better non iterative one? Poisson disc for higher quality.

# To uniformly sample a sphere, pick sample points on the enclosing cylinder and project inwards. Archimedes FTW!

# I need to decide if I am going to spend some quality time with CUDA or Android. Supercomputing and cell phones, about equally interesting…

# @castano triangle intersection is 33% of the wall clock time and a trace averages 12 triangle tests, so gather SIMD tests could win some.

# Doing precise triangle tests instead of just bounds tests when building KD trees makes a large difference with our datasets.

# For our datasets, my scalar KD tree trace code turned out faster than the SSE BVH tracer. Denser traces would help SSE.
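The Archimedes tweet refers to the hat-box theorem: the axial coordinate of a uniformly distributed point on a sphere is itself uniformly distributed, so you can pick a point on the enclosing cylinder and project inward. Here's a minimal Python sketch of that, plus the "global sphere instead of local hemisphere" idea from the other tweet (function names are mine; flipping instead of discarding the below-plane samples is an equivalent-by-symmetry shortcut):

```python
import math
import random

def sample_sphere_uniform():
    """Uniform direction on the unit sphere via Archimedes' hat-box theorem:
    z is uniform on [-1, 1], azimuth is uniform on [0, 2*pi); projecting
    from the enclosing cylinder onto the sphere preserves area."""
    z = random.uniform(-1.0, 1.0)               # height on the cylinder
    theta = random.uniform(0.0, 2.0 * math.pi)  # angle around the cylinder
    r = math.sqrt(1.0 - z * z)                  # project inward onto the sphere
    return (r * math.cos(theta), r * math.sin(theta), z)

def sample_hemisphere_uniform(normal):
    """Sample the full sphere with one global pattern, then flip samples
    that fall behind the plane defined by 'normal' (flipping is equivalent
    to discarding by symmetry, and wastes no samples)."""
    x, y, z = sample_sphere_uniform()
    dot = x * normal[0] + y * normal[1] + z * normal[2]
    if dot < 0.0:
        x, y, z = -x, -y, -z
    return (x, y, z)
```

Every step is a closed-form expression with no rejection loop, which is presumably what "non iterative" means in the tweet above.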


If Carmack's pioneering work still has the same impact and influence on the game industry as it had in the '90s, ray-traced games could become the standard "soon" (which in id terms means about 5-8 years :-).

UPDATE: I wasted a couple of hours and transcribed the part of Carmack's keynote at Quakecon 2010 where he specifically talks about raytracing (from http://www.youtube.com/watch?v=LOdfox80VDU&feature=related at 3:17). English is not my mother tongue, so there are some gaps here and there. This part is not about real-time raytracing, but about precomputing the lightmaps and megatextures with raytracing instead of rasterization:


"We were still having precision issues with shadow buffers and all this and that, and finally I just sort of bit the bullet and said “Alright, we’re on the software side, let’s just stop rasterizing, let’s just raytrace everything. I’m gonna do a little bit of rasterizing for where I have to, but just throw lots and lots of rays." And it was really striking from my experience how much better a lot of things got. There are a lot of things that we live with for rasterization as we do in games with making shadows, trying to get shadow buffers working right, finding shadow acne vs. Peter Pan effect on here and this balance that never quite gets there as things move around. Dealing with having to vastly oversize this, trying to use environment maps for ambient lighting. And these are the things that people just live and breathe in games, you just accept it, this is the way things are, these are the issues and things will always be like this, just getting higher and higher resolution. But it was pretty neat to see a lot of these things just vanish with ray tracing, where the shadows are just, the samples are right. We don’t have to worry about the orientation of some of these things relative to the other things. The radiosity gets done in a much better way and there are far fewer artifacts. And as we start thinking about things in those terms, we can start thinking about better ways to create all of these different things. So that experience has probably raised in my estimation a little bit the benefit of raytracing in the future of games. Again, there’s no way it’s happening in this coming generation, the current platforms can’t support it. 
It’s an open question about whether it’s possible in the generation after that, but I would say that it’s almost a foregone conclusion that a lot of things in the generation after that will wind up being raytraced, because it does look like it’s going to be performance reasonable on there and it makes the development process easier, because a lot of problems that you fight with just aren’t there. There’s a lot of things that, yeah, it’s still you could go ahead and render five times or ten times as many pixels, but we’re gonna reach this point where are ten times as many pixels or fragment samples going to give us that much more benefit, or would we really like to have all the local shadows done really really right and have better indirect ambient occlusion. Or you can just have the reflections and highlights go where they’re supposed to, instead of just having a crummy environment map that reflects on the shiny surfaces there. So, I can chalk that up as one of those things where I definitely learned some things in the last six months about this and it modified some of my opinions there, and the funny thing coming back around about that is, so I’m going through a couple of stages of optimizing our internal raytracer, making things faster, and the interesting thing about the processing was, what we found was, it’s still a fair estimate that the GPUs are going to be five times faster at some task than the CPUs. But now everybody has 8 core systems and we’re finding that a lot of the stuff running software on this system turned out to be faster than running the GPU version on the same system. And that winds up being because we get killed by Amdahl’s law there, where you’re throwing the very latest and greatest GPU and your kernel amount (?) goes ten times faster. 
The scalability there is still incredibly great, but all of this other stuff that you’re dealing with, virtualizing textures and managing all of that, did not get that much faster, so we found that the 8 core systems were great, and now we’re looking at 24 thread systems where you’ve got dual thread six core dual socket systems (?), it’s an incredible amount of computing power, and that comes around to another important topic, where PC scalability is really back now where we have, we went through sort of a ..."

Friday, August 13, 2010

Arnold is back!

During the last Siggraph, a lot of presentations were centered around Arnold, the production renderer that is going to kick RenderMan's and mental ray's collective asses. The VFX industry is slowly discovering this marvel, which does Monte Carlo path tracing at incredible speeds and eats huge, complex scenes with motion blur and DOF for breakfast.

The argument that ray tracing is only used sparingly in the movie industry no longer holds: almost all big film companies are shifting towards more and more (and sometimes full) ray tracing as the method of choice. I'm sure this industry-wide shift will trickle down to the game industry as well.

Some interviews and presentations:

interview Marcos Fajardo: http://tog.acm.org/resources/RTNews/html/rtnv23n1.html#art3

HPG2009 Larry Gritz talking about Arnold: http://www.larrygritz.com/docs/HPG2009-Gritz-Keynote-clean.pdf

more on Larry Gritz' talk: http://www.realtimerendering.com/blog/hpg-2009-report/

interesting interview with Larry Gritz about the decision to move to Arnold: http://news.cnet.com/8301-13512_3-10306215-23.html

basic features of Arnold: http://www.graphics.cornell.edu/~jaroslav/gicourse2010/giai2010-02-marcos_fajardo-slides.pdf

many details on Arnold (don't miss this one): http://renderwonk.com/publications/s2010-shading-course/martinez/s2010_course_notes.pdf

Thursday, August 12, 2010

Carmack loves ray tracing

Some excerpts from his QuakeCon keynote (different live blogs):

"There are a lot of things we live with as rasterization in games...Ray tracing was the better way to go. The shadows and samples were right.. the radiosity gets done in a much better way."
http://kotaku.com/5611429/quakecon-2010-keynote-liveblog-aka-man-i-hope-they-mention-doom-4

Ray tracing is definitely not coming in this generation, but it's definitely the future of gaming once the technology becomes performance reasonable. It will make development easier. A lot of his opinions on ray tracing have changed in the past six months. Now that multicore CPUs are mainstream, they're starting to catch back up to GPUs in rendering performance.
http://ve3d.ign.com/articles/news/56520/John-Carmack-Quakecon-Live-Feed-Coverage

and from Carmack's own twitter:
"Reading Realistic Ray Tracing 2nd Ed. I discourage the use of .x .y .z style members and favor arrays. Many bugs due to copy-paste."
http://twitter.com/id_aa_carmack

Wednesday, August 4, 2010

Faster raytraced global illumination by LOD tracing

Just read this Siggraph presentation about PantaRay, a GPU-accelerated raytracer to compute occlusion, which was used in Avatar and developed by Nvidia and Weta: http://bps10.idav.ucdavis.edu/talks/13-fascionePantaleoni_PantaRay_BPS_SIGGRAPH2010.pdf

The idea is simple and elegant: use the full-res geometry for tracing primary rays and lower-LOD geometry for tracing secondary rays. The number of triangles to test for intersection is significantly reduced, which speeds up the GI computation considerably.
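In code, the scheme boils down to selecting the geometry set by ray generation: camera rays intersect the full-resolution scene, and anything spawned at a hit point (shadow and GI rays) intersects the decimated version. A toy Python sketch with spheres standing in for triangle meshes (scene layout and function names are hypothetical, and ray directions are assumed normalized):

```python
import math

def intersect_sphere(origin, direction, center, radius):
    """Nearest positive hit distance of a (normalized) ray with a sphere,
    or None if the ray misses."""
    oc = [origin[i] - center[i] for i in range(3)]
    b = sum(oc[i] * direction[i] for i in range(3))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > 1e-6:                      # ignore hits at/behind the origin
            return t
    return None

def nearest_hit(origin, direction, spheres):
    """Closest intersection over a list of (center, radius) spheres."""
    best = None
    for center, radius in spheres:
        t = intersect_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, (center, radius))
    return best

def trace(origin, direction, full_res, coarse_lod, depth=0):
    """Two-level LOD tracing: primary rays (depth 0) test the full-res
    geometry, secondary rays test the much cheaper LOD set."""
    scene = full_res if depth == 0 else coarse_lod
    return nearest_hit(origin, direction, scene)
```

In a real renderer the two sets would be two separate acceleration structures, and the coarse one can be orders of magnitude smaller since secondary rays only need approximate visibility.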

These papers use the same principle:

"R-LODs: Fast LOD-Based Ray Tracing of Massive Models" by Yoon et al.
"Two-Level Ray Tracing with Reordering for Highly Complex Scenes" by Hanika, Keller and Lensch

The idea is similar to Lightcuts and point-based color bleeding (which both use a hierarchy to cluster sets of lights or points to reduce the computational cost of tracing) but applies the principle to geometry instead of lights. It is also being used by DreamWorks:

Picture from Siggraph presentation by Tabellion (PDI Dreamworks)

From http://www.graphics.cornell.edu/~jaroslav/gicourse2010/giai2010-04-eric_tabellion-notes.pdf:

We describe here one of the main ways in which we deal with geometric complexity. When rays are cast, we do not attempt to intersect the geometry that is finely tessellated down to pixel-size micropolygons - this would be very expensive and use a lot of memory. Instead we tessellate geometry using a coarser simplified resolution and the raytracing engine only needs to deal with a much smaller set of polygons. This greatly increases the complexity of the scenes that can be rendered with a given memory footprint, and also helps increase ray-tracing efficiency. To avoid artifacts due to the offset between the two geometries, we use a “smart bias” algorithm described in the next slides.

Since rays originate from positions on the micropolygon tessellation of the surface and can potentially intersect a coarser tessellation of the same surface, self intersection problems can happen.

The image above illustrates cross-section examples of a surface tessellated both into micropolygons and into a coarse set of polygons. It also shows a few rays that originate from a micropolygon with directions distributed over the hemisphere. To prevent self-intersection problems from happening, we use a ray origin offsetting algorithm. In this algorithm, we adjust the ray origin to the closest intersection before / after the ray origin along the direction of the ray, within a user-specified distance. The ray intersection returned as a result of the ray intersection query is the first intersection that follows the adjusted ray origin. The full algorithm is described in [Tabellion and Lamorlette 2004].

Here is a comparison with a reference image that was rendered while raytracing the micropolygons. The image in the center was rendered with our technique while raytracing the coarser polygon tessellation illustrated in the image on the right.

It has been shown in [Christensen 2003] that it is possible to use even coarser and coarser tessellations of the geometry to intersect rays with larger ray footprints. This approach is also based on using only coarser tessellation and is not able to provide very coarse tessellations with much fewer polygons than there are primitives (fur, foliage, many small overlapping surfaces, etc.), which would be ideal for large ray footprints. This problem is one of the main limitations of ray-tracing based GI in very complex scenes and is addressed by the clustering approach used in point-based GI, as discussed in later slides.
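Reduced to one dimension along the ray, the origin-offsetting ("smart bias") idea from the excerpt can be sketched like this. This is a toy illustration of the principle, not Tabellion's actual code; the input is simply the list of parametric hit distances of one ray against the coarse geometry:

```python
def smart_bias_trace(hit_ts, bias):
    """Ray-origin offsetting against coarse geometry: find the coarse-surface
    intersection closest to the true ray origin (t = 0) within +/- bias,
    move the origin just past it, and return the first intersection after
    that adjusted origin (or None if nothing is hit).

    hit_ts: parametric distances of all ray / coarse-geometry intersections,
    where negative t means behind the shading point."""
    near = [t for t in hit_ts if abs(t) <= bias]
    t_origin = min(near, key=abs) if near else 0.0   # adjusted ray origin
    after = [t for t in hit_ts if t > t_origin]
    return min(after) if after else None
```

With bias = 0 a spurious coarse-surface hit just in front of the shading point would be returned as an occluder; with a small user-specified bias the origin is shifted past it and the real occluder is found instead.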


With this technique, you could trace primary rays against the full 1 million polygon scene, while secondary rays are traced against a 100K or 50K polygon LOD version. When using voxels, this becomes a walk in the park, since LODs are generated automatically (see http://voxelium.wordpress.com/2010/07/25/shadows-and-global-illumination/ for an awesome example).