Step 2: Detection

Submitted by epreisz on Sun, 02/11/2007 - 08:04.

When you optimize for the first time, detecting an optimization opportunity is sometimes the toughest part of the job. It shouldn’t be! Many tools exist to help us along in the process. As we stated earlier, it is easy to spend time on useless optimizations if you haven’t detected the most critical hotspot or bottleneck.

So where do we start? Consider Figure 1 below. The first task in detection is to determine which part of the graphics subsystem is limiting our performance. Earlier we stated that every PC contains at least two processors: the GPU and the CPU. The piece that connects the two is the bus, usually AGP or PCI Express. These three resources are the most common bottlenecks, and since they act as a multi-processor system, our first step is to determine which one is limiting our performance.

Figure 1: First, we must determine if we are bound by the GPU, CPU, or bus.

Are We GPU bound?

You can start with any of the three, but I believe the easiest resource to eliminate is the GPU. To determine if we are GPU bound, we measure the percentage of time the GPU spends idle. In theory, if the GPU is busy 100% of the time, then you are GPU bound. If the GPU sits idle, then it is likely waiting for the CPU or for a bus transfer.

Notice I said “in theory”. In reality, the GPU will never run at 100%, because of the overhead of syncing between the GPU and CPU. However, if your GPU is running at 90% or more, chances are very high that you are GPU limited. If your GPU is running at 85% to 90%, the call is more difficult: at that level, you are on the borderline between being GPU bound and being limited by the CPU or bus.
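The thresholds above can be written down as a small rule of thumb. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes you have already obtained a GPU busy fraction from a vendor profiling tool.

```python
def classify_gpu_utilization(busy_fraction):
    """Interpret a measured GPU busy fraction (0.0 to 1.0).

    The 0.90 and 0.85 cut-offs are the rule-of-thumb values from the
    text above; the function itself is a hypothetical helper.
    """
    if not 0.0 <= busy_fraction <= 1.0:
        raise ValueError("busy_fraction must be between 0.0 and 1.0")
    if busy_fraction >= 0.90:
        return "likely GPU bound"
    if busy_fraction >= 0.85:
        return "borderline"
    return "likely CPU or bus bound"
```

A reading in the borderline band is a cue to run the CPU and bus experiments as well, rather than a verdict on its own.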

Are We CPU or bus bound?

Determining if we are CPU bound or bus bound can be a more difficult question to answer. On one hand, an application that is not limited by the GPU is almost always CPU limited. On the other hand, in the rare cases where the graphics bus is the limitation, that can be difficult to confirm.

One way to test whether we are CPU limited is to adjust the speed of the CPU. If adjusting the CPU speed causes a change in frame rate, we are CPU limited. How to adjust the CPU speed depends on the hardware you have. Some machines allow you to adjust the rate in the BIOS; some don’t. Do not adjust your CPU speed without knowing the territory. Changing the speed of your CPU, especially increasing it above the recommended rates, can cause instability and can damage your processor.

Another way to determine if you are CPU limited is to comment out code in your application that does not change the amount of rendering that occurs. For example, you may be able to reduce the AI processing of a scene without changing the rendering. If you see an increase in frame rate, then you are CPU bound.
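One way to make this experiment less subjective is to record frame times for both runs and compare the average frame rates. The sketch below assumes you have collected per-frame times in seconds for a baseline run and for a run with the CPU work reduced. The function names and the 5% noise threshold are assumptions for illustration, not values taken from any tool.

```python
from statistics import mean


def frames_per_second(frame_times_s):
    """Average frame rate for a list of per-frame times in seconds."""
    return 1.0 / mean(frame_times_s)


def cpu_bound_by_experiment(baseline_times_s, reduced_cpu_times_s,
                            noise_threshold=0.05):
    """Compare a baseline run against a run with less CPU work.

    Returns (looks_cpu_bound, relative_gain). The noise threshold is an
    assumed cut-off for ignoring run-to-run variation.
    """
    base_fps = frames_per_second(baseline_times_s)
    reduced_fps = frames_per_second(reduced_cpu_times_s)
    gain = (reduced_fps - base_fps) / base_fps
    return gain > noise_threshold, gain
```

For example, if frames take 20 ms at baseline and 16 ms with the AI work removed, the frame rate rises by about 25%, which points to a CPU limitation. The rendering workload must be identical in both runs for the comparison to mean anything.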

There are some methods for detecting if you are bus bound, but with a modern game engine these methods are not always easy to implement. Try reducing the byte size of your vertex formats. If you have first determined that the GPU is not the bottleneck and you notice a performance increase after reducing vertex size, then you are bus bound. The same applies to index formats. Indices can usually be set to 16-bit or 32-bit formats. If you have set the formats to 32 bits, try reducing them to 16 bits. If you notice a frame rate increase, then you are bus limited.
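To get a feel for how much data these changes can save, it helps to work through the arithmetic. The sketch below is illustrative: the two vertex layouts, the mesh size, and the function name are hypothetical, and the totals only matter for bus traffic if the geometry is actually sent across the bus every frame.

```python
import struct

# Hypothetical vertex layouts as struct format strings ("<" means no padding).
FAT_VERTEX = "<3f3f4f2f"   # position, normal, RGBA color as 4 floats, one UV set
SLIM_VERTEX = "<3f3f4B2f"  # same attributes, color packed into 4 bytes


def geometry_bytes(vertex_format, vertex_count, index_count, index_bits):
    """Total bytes of vertex and index data for one mesh."""
    if index_bits not in (16, 32):
        raise ValueError("index_bits must be 16 or 32")
    if index_bits == 16 and vertex_count > 65536:
        raise ValueError("16-bit indices can address at most 65,536 vertices")
    vertex_bytes = struct.calcsize(vertex_format) * vertex_count
    index_bytes = (index_bits // 8) * index_count
    return vertex_bytes + index_bytes


# A hypothetical mesh: 50,000 vertices and 150,000 indices.
before = geometry_bytes(FAT_VERTEX, 50_000, 150_000, 32)
after = geometry_bytes(SLIM_VERTEX, 50_000, 150_000, 16)
print(f"before: {before} bytes, after: {after} bytes")
```

With these assumed numbers the data shrinks from 3,000,000 to 2,100,000 bytes, a 30% reduction. Note that 16-bit indices only work for meshes with at most 65,536 vertices.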

If you understand the layout of the graphics pipeline, chances are you will not be bus limited. Later in these readings we will discuss the often non-intuitive “rules” involved in using graphics hardware.

If this seems like a lot of information in one section, don’t worry. We will cover these topics in greater detail later. The main concept is that you don’t want to waste your time optimizing for the CPU if the GPU is the limiting resource. Likewise, you don’t want to waste your time optimizing for the GPU if the CPU is the limiting resource.

Detecting CPU Optimizations

If you have ruled out the GPU and the bus, chances are that your application is CPU bound. Applications can be CPU bound for many different reasons. Your application code can be using all of the CPU cycles performing intensive AI, collision, culling, or physics. You could be using your graphics API incorrectly, causing CPU-intensive work inside your drivers. Lastly, your CPU calculations may be executing slowly because of poor use of memory.

Using VTune, a tool for machines with Intel processors, we can determine which process is consuming our CPU.

If your machine is running on an AMD processor, we can use CodeAnalyst to gain insight into how our application is performing on the CPU.

Detecting GPU Bottlenecks

If you have determined that you are GPU bound, our process is to move backwards through the graphics pipeline, adjusting the workload of each stage. By elimination, the stage whose adjustment changes the frame rate is the offending stage. Although we can do this by adjusting the code directly, tools exist that make the job a bit easier. We will cover detecting and solving GPU issues for both shader and non-shader rendering pipelines.

Both nVidia and ATI offer tools for detecting bottlenecks on the GPU. NVPerfHUD, by nVidia, makes easy work of disabling stages of the pipeline to find the troublesome stage.

We can also perform this process by altering our code. For example, to determine if we are framebuffer bandwidth limited, we can reduce the color and depth buffers from 32 bits to 16 bits. If we notice a difference in frame rate, then we can conclude that our application is framebuffer bandwidth limited.
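The reason this test works is that halving the bits per pixel halves the bytes the GPU must move for each pixel it writes. The sketch below shows the arithmetic for a single pass over the screen. The function name and the 1024x768 resolution are assumptions, and real traffic is higher because pixels are often read and written more than once per frame.

```python
def framebuffer_bytes(width, height, color_bits, depth_bits):
    """Bytes touched when every pixel's color and depth are written once."""
    bytes_per_pixel = (color_bits + depth_bits) // 8
    return width * height * bytes_per_pixel


# Hypothetical 1024x768 render target.
full = framebuffer_bytes(1024, 768, 32, 32)
half = framebuffer_bytes(1024, 768, 16, 16)
print(f"32-bit buffers: {full} bytes per pass, 16-bit buffers: {half} bytes per pass")
```

With these numbers a pass drops from 6,291,456 to 3,145,728 bytes. If the frame rate does not respond to a change of that size, framebuffer bandwidth is probably not the limiting stage.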

Detecting Bus Optimizations

If you have determined that you are bus bound, the next step is to find ways to use less bandwidth. If you are bus limited, it is likely that you are using your graphics API incorrectly. By using the API more efficiently, we can reduce the amount of bandwidth we consume every frame.
