Sunday, July 21, 2013

Pick a number

More than once, I get questions like "what is the max polycount for this new monster model?", or "what should the texture resolution be?". Not that I'm an art director, but since we lack one, I usually take the role of giving such advice. And of course, I (should) know the technical specs to answer questions like these.


Artists need reference materials, style guidelines, and also technical boundaries. The ultra-detailed meshes that roll out of programs such as ZBrush are usually not usable in a game, because they carry way too many polygons. For the non-technicians: an average (high polygon) ZBrush character can easily contain a few hundred thousand to a few million polygons. An in-game version of the same dude is usually somewhere between a thousand and ten thousand polygons. In other words, the detail has to be reduced by a factor of 10 to 100.

It's not that a (modern) computer can't render a 100k model, but the games I know usually show a bit more than just one character. Maybe the game has to render a whole army of the same characters, besides drawing the environment, doing some special FX, calculating physics, playing audio and figuring out what is going on in the heads of those characters (AI).

Same thing for textures. When drawing, it's common to start on a big canvas. Not only does this allow more detail, it's also just easier to work with. But a single 4096 x 4096 texture can eat up to 64 MB. Even though hardware gains more memory every generation, you would only be able to pump a handful of those into your videocard. Can't make a game that only uses 15 different textures, of course.



Luckily pretty much any painting program has a "stretch" function, and there are also plenty of tools to reduce a million-poly model to a low (game-ready) poly mesh that still looks close to its original (in combination with a normalMap to compensate for the lack of smaller details). Nothing new so far. But how to answer these "how many polygons?" questions? What are the magical quantities for this concoction?

Well, very hard to say really. First of all, I can't easily look into a GPU chip and see the desired throughput. There is no exact upper limit on the number of polygons that are allowed. In theory, each extra polygon reduces the speed a tiny bit, but in practice the performance is a result of many, many factors. What else is going on? Do you have one giant model, or thousands of tiny models? How smartly does the engine batch & render things? What are the bottlenecks? What can the hardware do? If you either have super-hardware, or the game doesn't do a whole lot, you can go berserk on the models or textures. If you need a lot of everything, on the other hand, you need to balance things carefully.

To make things worse (or actually this is a good thing), the hardware changes all the time. A 2,000 triangle puppet would be heavy shit in the Quake1 days; now the GPU eats it for breakfast. It's like answering how many calories you should eat over the next few years. If we spend our days on the couch watching baseball, we probably don't need that extra pizza slice. If we get a heart attack after too many pizza slices and start playing baseball instead of watching it, the extra calories may eventually be needed to fuel our sports-bodies. A stupid comparison maybe, but the saying "enough is as good as a feast" is true in both cases. Take what you need, but don't overdo it. Or you'll get fat and slow.


Ok... that's some vague, useless advice so far, so let's see if we can transform it into some more useful guidelines for both artists and programmers. Here are ten hints.

Quite some triangles were spent on this bad boy. No, we won't show you the whole character, but let's say he might play a little role in a next demo movie...

1>> You can always downscale
First, and maybe most important, don't worry about specs to begin with. At least not during the production of your initial model or texture. Downsizing a model or texture afterwards is easy. Upsizing, on the other hand, leads to quality loss, or is simply impossible.

As an art director, you should worry more about providing structure and space to process all this data. Preferably, the original "master file" should be in a shared location. Use a cloud, Dropbox or something similar to store the files. Everyone (privileged) can make a copy or review it, and if an artist's computer crashes (which happens suspiciously often...), you always have a back-up.

Warning though: the master files are usually insanely big, so a free (2-gig) Dropbox account will be full before you know it. As a stingy Dutchman, I haven't paid for an unlimited account so far either, but you'll be needing it sooner or later.


2>> Don't overdo it
"Drink in moderation". As with drinking beer, eating hamburgers, smoking pot, or whatever you do, enjoy but don't take more than needed... (writing this, still recovering from my hangover...). This is a golden rule for pretty much everything in life, including making textures and 3D models.

If you are an artist, you want to show awesome stuff, show your skills, and surpass yourself each time. That is a good attitude, but know when to stop. First of all, in the same amount of time you can either perfect a few pixels/polygons, or make two assets. Being a perfectionist costs a lot of time, and often you don't have that time when making a game that will consist of tens of thousands of assets.

Second, you, or your reviewer, will always find small flaws. But will the end-user really notice? Did you complain about the pixels in the Wolfenstein Nazis back in 1992? Are the fake reflections in Crysis 2 killing the vibe? Did you know that plenty of lights in games don't actually cast shadows? Are the non-perfectly-round shapes in 3D games a pain in the ass? The average gamer doesn't care or notice, just as long as the whole composition of things looks good.

The problem is, when the artist sends a near-finished object, say a chair, the reviewer will be 100% focused on that single chair. The lack of fart-marks in the seat makes it unrealistic, the pegs aren't nicely rounded, the texture is a bit blurry at the bottom side, it's 4 cm taller than a normal chair, and so on. I know this, because I'm a nitpicker as well. But maybe, and this is a lesson for me to learn as well, it's better to first finish a whole room and see how everything works out when combined.

If you look at a Rembrandt painting from a distance, you see one whole. A theme, a scene, a mood, a certain color palette. If the theme grabs you and the composition is good, the painting is a success. If you look closer you might notice Rembrandt was too lazy to paint the fucking vase in the background properly, but that doesn't matter. So artist, zoom out, lean back, put your object in an existing scene, and see if it does its job. Maybe downsize the texture, or use fewer polygons, and check again. If you don't see a real difference, your asset is still good to go.


3>> The illusionist
In addition to the previous hint, sometimes simple tricks can do the job as well. Even though it might feel cheap, dirty, wrong, and old-fashioned, simple 2D sprites or decals are still used a lot. An old industrial corridor needs details such as wires, switches, cables, cracks, dirt, broken tiles, oil smear, and so on. Modeling all those things takes more time, costs more performance, and sometimes doesn't even look better. A flat decal can look more detailed than a low poly object.

When exactly to use decals or sprites then? I usually use them for small stuff, or things you would only watch from a distance, so you can't really walk around them and break the illusion. Faking stuff also goes for a lot of special effects. Realtime G.I., real 3D particle clouds, or volumetric dust sure are cool, but also terribly expensive. Often they can be replaced with a much cheaper variant, and in some cases those even look better, as it can be very tricky to implement the realtime counterparts correctly. Pre-baked lightMaps versus realtime G.I. is a perfect example.

No matter how advanced your particle engine is, the 2D flame is still unbeatable. Though I still have to implement XZ rotation to keep it facing the camera - the left flame is starting to show its fakeness.
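For the programmers among us, such a camera-facing ("billboard") sprite is not much more than a single atan2. A minimal sketch, assuming a simple Vec3 struct; how you apply the resulting yaw to your sprite quad is up to your own engine:

#include <cmath>

struct Vec3 { float x, y, z; };

// Rotate the flat sprite around its vertical (Y) axis so it keeps facing
// the camera in the XZ plane. Pitch is left alone, which is exactly why
// the trick becomes visible when you look at the flame from above.
float BillboardYaw(const Vec3& spritePos, const Vec3& cameraPos)
{
    float dx = cameraPos.x - spritePos.x;
    float dz = cameraPos.z - spritePos.z;
    return std::atan2(dx, dz);   // yaw in radians, to feed into the sprite's rotation matrix
}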


4>> Stress! test!
If you can, just fill a room with heavy models and see what happens. I used to remove triangles with a microscope & tweezers, but in practice, high polycounts usually aren't the bottleneck. At least not in my case. The number of assets, whether those are small or big objects, does matter though. Videocards love to batch things, so rather than nitpicking about a few polygons, you'd better ensure the (engine) code sorts, groups and batches things as well as possible. Especially when you have to deal with dense stuff, like a jungle. Or garbage mountains.

On lower-end hardware, the polycounts may become a bottleneck much sooner. So yet again, stress-test if possible. Just to see where your limits are (and to test code improvements / changes). It avoids getting stuck with impossible scene concepts, or, on the other hand, having false fears of high polycounts.


5>> Guess the texture memory usage
If you worry about texture sizes, calculating the memory usage for a single texture is quite easy:
KB = ( pixelsWidth x pixelsHeight x channelCount ) / 1024
MB = KB / 1024
Textures are usually in the RGB8 or RGBA8 format, which means you multiply the width and height by 3 or 4 bytes per pixel. Thus a 1024 x 1024 RGBA texture takes 4 megabytes. That doesn't sound like much, assuming a modern card has at least 1 gig of video memory. BUT! There are some catches:
* Textures are often mipmapped, which increases the size by roughly 33%.
* Textures often come as a set (diffuse + normalMap + height + specular + whatever), which easily doubles or triples the requirements.
* Your program loads more than just textures into video card memory: buffers to draw on (FBOs), 3D models (VBOs), shaders, et cetera.
* Windows and other programs want a bit of the honeypot as well. Let's just say ~15% of the available memory isn't for you.
In other words, if you have 1 gig of memory, it doesn't mean you can supply your scene with 1 gig of texture data (simultaneously). Last but not least, there is also a thing called "bandwidth". If you render a room that is made of 128 MB of data in total - for example some geometry + ~16 1024x1024 RGBA texture sets (diffuse & normalMap) - it means that your videocard has to squish 128 MB of data through its hydraulic hoses each time you render. It's difficult to tell what the exact maximum is, as it varies per card and the GPU probably also caches stuff, but there is a limit somewhere. As long as you don't hit this ceiling (meaning some other factors are limiting your performance), there is nothing to worry about really. Just don't overdo things, to stay away from the critical limits (see hint 2).
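To make this a bit more hands-on, here is a back-of-the-envelope calculator for the formula above. A minimal sketch, assuming plain uncompressed RGB8/RGBA8 textures; it folds in the ~33% mipmap overhead, and the function name is just an example, not engine code:

#include <cstdio>

// Rough memory estimate for one uncompressed texture
// (RGB8 -> 3 channels, RGBA8 -> 4 channels).
double TextureMegabytes(int width, int height, int channels, bool mipmapped)
{
    double bytes = static_cast<double>(width) * height * channels;
    if (mipmapped)
        bytes *= 4.0 / 3.0;              // a full mipmap chain adds roughly one third
    return bytes / (1024.0 * 1024.0);    // bytes -> KB -> MB
}

int main()
{
    // 1024 x 1024 RGBA: ~4.00 MB without mipmaps, ~5.33 MB with them
    std::printf("%.2f MB\n", TextureMegabytes(1024, 1024, 4, true));
    return 0;
}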


6>> Texture Compression / LODs
Use DDS textures & compression (DXT, ...) to reduce the texture sizes to roughly a quarter. This allows you to load a lot more textures into your memory, and/or to reduce memory requirements and bandwidth. Careful though, compression doesn't always turn out well for every texture. NormalMaps especially can look "crunchy". Check your results.

When rendering very complex models, or just a lot of models in big (outdoor) scenes, you can also save bandwidth and triangles by switching to lower polygon versions of your models and (background) environments. Sure, you can render a monster with 20,000 triangles up close, but when stepping a few meters away you can halve the polycount without really noticing it. And forest trees might even do a good job as flat billboard sprites when rendered from a distance (that's how Far Cry did it, for example).
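Such a LOD switch doesn't have to be fancy either. A minimal sketch of picking a mesh version based on camera distance; the idea of storing a few versions per model and the thresholds are assumptions, not a recipe from any particular engine:

#include <cmath>
#include <cstddef>

struct Vec3 { float x, y, z; };

static float Distance(const Vec3& a, const Vec3& b)
{
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// LODs are sorted from highest detail (index 0) to lowest (e.g. a billboard).
// switchDistances[i] is the distance at which LOD i stops being used.
std::size_t SelectLod(const Vec3& cameraPos, const Vec3& objectPos,
                      const float* switchDistances, std::size_t lodCount)
{
    float d = Distance(cameraPos, objectPos);
    for (std::size_t i = 0; i + 1 < lodCount; ++i)
        if (d < switchDistances[i])
            return i;            // still close enough for this detail level
    return lodCount - 1;         // far away: lowest poly version or flat sprite
}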


7>> Batch & combine
Rendering the same palm tree a hundred times might go faster than rendering fifty different palm trees, even though the polycount could be the same. This is because you'll have to interrupt the videocard each time, telling it to switch tasks. Use grouping, sorting and tricks like instancing to avoid this as much as possible.
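To illustrate with a rough (plain OpenGL) fragment, this is the difference between nagging the videocard a hundred times and asking it once. Shader, buffer and per-instance attribute setup are omitted; treeVao, indexCount, palmTreePositions and SetModelMatrixUniform are placeholders, not real API:

// Naive: one draw call (and one uniform update) per palm tree.
glBindVertexArray(treeVao);
for (int i = 0; i < 100; ++i)
{
    SetModelMatrixUniform(palmTreePositions[i]);   // hypothetical helper, e.g. wrapping glUniformMatrix4fv
    glDrawElements(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, nullptr);
}

// Instanced: put the 100 positions in a per-instance attribute
// (see glVertexAttribDivisor) and interrupt the card only once.
glBindVertexArray(treeVao);
glDrawElementsInstanced(GL_TRIANGLES, indexCount, GL_UNSIGNED_INT, nullptr, 100);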

As an artist, you can help as well. Make asset sets that can be re-used. Sounds obvious, but sometimes objects are so "showy" that you shouldn't place too many of them. Another trick is to combine as much as possible in one model, and/or let the subparts share the same texture. Rendering a house as one big object with one texture(set) will go faster than constructing the house out of many sub-assets. Of course, this might give problems with things like texture quality, different required shaders, or destructibility. Stay in contact with the technicians or map designers to find out what works best.


8>> Make performance options
T22 has some graphical options that aren't too hard to implement. When using DDS texture files with mipmaps for example, you can easily create a "Texture Quality" setting. When choosing a lower setting, the highest resolution mipmap level(s) simply aren't loaded, which can save a serious amount of memory / bandwidth.
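As a sketch of how that could look when loading a mipmapped (DXT-compressed) DDS file: just start uploading at a lower mip level, so the biggest one never touches video memory. The MipLevel struct and the surrounding loader are assumptions; only the glCompressedTexImage2D call is actual OpenGL:

#include <vector>
// assumes an OpenGL header / extension loader is already included

struct MipLevel { GLenum format; GLsizei width, height, dataSize; const void* data; };

// qualitySetting 0 = high (upload everything), 1 = skip the top level, and so on.
void UploadDdsTexture(const std::vector<MipLevel>& mips, int qualitySetting)
{
    size_t firstMip = static_cast<size_t>(qualitySetting);
    if (firstMip >= mips.size())
        firstMip = mips.size() - 1;                 // always keep at least one level

    for (size_t i = firstMip; i < mips.size(); ++i)
    {
        const MipLevel& m = mips[i];
        glCompressedTexImage2D(GL_TEXTURE_2D,
                               static_cast<GLint>(i - firstMip),   // re-number the levels from 0
                               m.format, m.width, m.height,
                               0, m.dataSize, m.data);
    }
}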

Special effects such as SSAO, shadows, screen-space reflections or realtime GI can be switched on/off, though this is tricky. When tuning the graphics, the lighting matters a lot. When I disable SSAO and use another type of cheaper GI, the scene just looks different. So instead of switching things on/off, I usually give them a quality setting as well. Lower quality usually means that the effect is rendered on smaller buffers, reducing framerate issues.

You could also decrease particle cloud densities, or make IFDEFs in your shader code to skip some parts, or use fewer samples for effects such as blurring or raymarching. But as said, don't get lost in too many settings. While developing you should try to render the scene in a "representative" way. If it looks completely different on another computer, it gets very difficult for the artists to make a good-looking scene.


9>> Bottlenecks & Profilers
If the framerate stinks, then one or more factors have become a bottleneck. If you render very simple scenery but use gigantic textures, bandwidth could be the problem. If you are rendering dozens of layers of transparent particles on top of each other, the fillrate might be the killer. Just to name a few examples. Unfortunately, it's not always easy to tell why things are slow. In that case you can try using a profiler that inspects shader execution, what is being loaded onto your card, which API (OpenGL / DirectX) calls are made, et cetera. If you are using OpenGL like me, gDEBugger for example might be useful. AMD and nVidia also have tools, though I haven't managed to get those working in combination with OpenGL & good old Delphi.
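And if a full-blown profiler feels like overkill, OpenGL's timer queries can already tell you how long a suspicious block of draw calls takes on the GPU. A minimal fragment (needs a GL 3.3+ context; error checking and the actual rendering are omitted, RenderSuspiciousPass is a placeholder):

GLuint query;
glGenQueries(1, &query);

glBeginQuery(GL_TIME_ELAPSED, query);
RenderSuspiciousPass();                  // placeholder: the pass you want to measure
glEndQuery(GL_TIME_ELAPSED);

// Note: asking for the result right away stalls the CPU until the GPU is done.
GLuint64 nanoseconds = 0;
glGetQueryObjectui64v(query, GL_QUERY_RESULT, &nanoseconds);
printf("suspicious pass: %.3f ms\n", nanoseconds / 1.0e6);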


10>> When will the game release?
Tomorrow? Next year? .... "When it's done"? Unless your game is small or you have a clear deadline, chances are that you still have year(s) to go. If so, don't panic if the framerate is a bit low on your current computer. In fact, T22 runs like syrup with lard on my 4-year-old laptop. But when running it on another, modern computer, it goes surprisingly smoothly again.

That doesn't mean you can ignore the performance. If the speed suddenly drops seriously after implementing some new effect, make sure you're not doing something wrong. But then again, be careful not to waste many precious hours on increasing the performance by tiny bits. Chances are that those efforts are negligible one year later, and optimization often leads to bug-sensitive, ugly code. Make a solid basis, but save the tuning for the final stages, when you know the actual target hardware.

Come to think of it, I remember playing leaked Doom 3 and Half-Life 2 versions. And well, they performed terribly even though their maps, shaders & effects weren't even finished. Just saying.



Well ladies and gentlemen, what fun we had again. But it's time to go on a little vacation, so no more posts this month. After the vacation the pressure at work will hopefully drop a bit, which also gives more time to work on our next demo movie. It has been way too long since we showed a new clip or an interesting screenshot, but we're working on it. Ciao!