Motivation and Problem
Our vision is that shaders should be the easiest part of the graphics pipeline to debug and understand. If a developer wants to know how a pixel was modified by a shader, she should be able to click on the output pixel for a specific frame and see how its value was computed. It should be straightforward to see how textures are sampled, view shader uniforms and interpolated inputs, and inspect a variable’s runtime values at any program point. She should be able to step through execution to understand control flow. Lastly, it should be simple for her to visualize intermediate runtime states for specific pixels, all pixels within a frame, or over time.
We propose to design and implement a GLSL shader simulator as the key technology to enable these debugging tools for WebGL. Since shader programs run on the GPU, there is no way to pause execution or inspect runtime values in the conventional sense. To obtain this information, we can instead simulate the shader’s execution with a specific set of inputs (exploiting the determinism of shader programs). Runtime data needed by debugging affordances can then be regenerated on demand by running the shader simulator with the necessary inputs. These inputs (uniforms, varyings, gl_Position, textures, etc.) are easy to instrument in modern WebGL implementations. Since shaders are short and must run quickly on hardware, this runtime data can be produced with low latency during debugging sessions.
The most common debugging method is “printf-style” debugging, where the state to inspect is encoded as a color and assigned to gl_FragColor, then later visualized. This has serious limitations, such as supporting only a vec3 of debug output per shader. However, it can run on the GPU and is portable.
Mesa is the original software-based graphics stack (C/C++). It has a full parser, optimizer, and software interpreter for GLSL and ARB assembly code.
The paper “Step-through debugging of GLSL shaders” proposes to extend printf debugging for two purposes: to simulate stepping through a shader for the full screen, and stepping through control flow for a specific pixel. The authors use rewriting techniques to save intermediate values, assign to gl_FragColor/gl_Position, then return early.
We envision glsl-simulator as one piece of a larger debugging story provided by other tools. To this end, we enumerate some of the relevant design constraints and considerations.
We aim to simulate the execution of vertex and fragment shaders (and their interconnection via rasterization), but not other parts of the pipeline such as GL calls to manipulate texture or render state, nor per-fragment operations that require access to the framebuffer and GL state machine. Our rationale is that GL calls are already traced and replayed by debugging tools, while GLSL execution is not. Per-fragment operations such as alpha blending, culling, scissor test, etc. can be supported by debuggers by appending post-processing shaders that simulate these hardware-implemented operations. Lastly, it is feasible to simulate GLSL programs in isolation: the shader program's inputs (textures, uniforms, attributes, etc.) can be easily taken from captured GL call traces and rendered frames.
Intended Use Cases
We want to support understanding unfamiliar shader programs and debugging catastrophic shader failures (e.g., all pixels rendered black). Both use cases require the ability to inspect arbitrary runtime state at any program point. Less important is the goal of exact simulation: smaller rendering differences (e.g., in antialiasing, rasterization, or blending) may be useful to point out, but graphics programmers don't expect pixel-perfect results (different graphics hardware produces different results already!). The main value proposition of a simulator is that it can provide insight into control flow and intermediate runtime states.
Embedding a Simulator
We assume that any developer tool which embeds the simulator will want access to semantic information such as types of expressions, line and column numbers, and so on. Embedders should be able to reconstruct shader control flow, and insert logging code before and after specific statements or expressions. Unlike traditional debuggers, we do not plan to support pausing the simulator's execution at arbitrary statements; instead, the embedder will be able to present a "paused" line in the user interface, while repeatedly re-simulating the shader with different debugger hooks to obtain the necessary data. For example, if a user steps from one statement to the next, the embedder will move its debugger hooks for local variable values so that they trigger on the next line.
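The re-simulation idea can be illustrated with a minimal sketch. Everything here is hypothetical and is not the glsl-simulator API: the "shader" is modeled as one JavaScript function per source line, and `simulateWithHook` and `localsAtLine` are names invented for this example. Stepping is emulated by re-running the whole shader with the hook moved to the next line.

```javascript
// Hypothetical stand-in for a translated shader: one function per source line,
// each mutating a shared object of local variables.
const shaderLines = [
  (locals) => { locals.uv = [0.25, 0.5]; },
  (locals) => { locals.brightness = locals.uv[0] + locals.uv[1]; },
  (locals) => { locals.color = [locals.brightness, 0, 0, 1]; },
];

// Re-run the whole shader from the start, calling `hook` just before the
// statement at index `pausedLine` executes.
function simulateWithHook(lines, pausedLine, hook) {
  const locals = {};
  lines.forEach((run, index) => {
    if (index === pausedLine) {
      // Pass a deep copy so later statements cannot alter the snapshot.
      hook(JSON.parse(JSON.stringify(locals)));
    }
    run(locals);
  });
  return locals;
}

// The embedder never pauses execution; it only chooses where the hook fires.
function localsAtLine(lines, pausedLine) {
  let snapshot = null;
  simulateWithHook(lines, pausedLine, (locals) => { snapshot = locals; });
  return snapshot;
}

// "Stepping" from line 1 to line 2 is two independent simulations.
const beforeLine1 = localsAtLine(shaderLines, 1);
const beforeLine2 = localsAtLine(shaderLines, 2);
console.log(beforeLine1, beforeLine2);
```

Because shader execution is deterministic for fixed inputs, each re-simulation observes the same state a real pause would have shown.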
Lexing and Parsing
The simulator performs very basic semantic analysis to disambiguate user-defined and built-in function calls, and to extract input/output variables for display in the demo user interface. In future work, typechecking would be performed at this point.
We implement GLSL built-in functions (builtins.js) and GLSL built-in types (vec.js, matrix.js) by following the GLSL specification. Built-in functions are mostly mathematical: angle functions, exponential functions, minimums and maximums, and directions of reflection and refraction. Implementing the mathematical aspects of built-in functions is straightforward. Implementing built-in types (vectors and matrices) involves implementing basic operations on these types such as product, dot product, and division, which is also straightforward. We created a simple test framework to check that these functions and types behave as expected.
The biggest challenge in implementing built-in functions and types is that we need to typecheck the inputs all the time. As we previously mentioned, this is because we didn't typecheck during semantic analysis. For example, in computing min(x, y), we check whether x and y are numbers or vectors. If they are vectors, we need to check whether they are of the same dimension. If not, we throw an error; otherwise we do an element-wise min() operation on each element of the vector. Typechecking at runtime makes our already slow simulator even slower.
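The checks described for min(x, y) can be sketched as follows. This is an illustrative version, not the code in builtins.js: vectors are modeled as plain arrays of numbers, and `glslMin` is a name chosen for this example. It also accepts the vector/float overload that GLSL defines for min.

```javascript
// Hypothetical helper: GLSL vectors are modeled here as plain arrays of numbers.
function glslMin(x, y) {
  const xIsVec = Array.isArray(x);
  const yIsVec = Array.isArray(y);

  if (!xIsVec && !yIsVec) {
    // Scalar case: both arguments must be numbers.
    if (typeof x !== "number" || typeof y !== "number") {
      throw new TypeError("min(): arguments must be numbers or vectors");
    }
    return Math.min(x, y);
  }

  if (xIsVec && yIsVec) {
    // Vector case: dimensions must agree before the element-wise min.
    if (x.length !== y.length) {
      throw new TypeError("min(): vector dimensions differ");
    }
    return x.map((value, i) => Math.min(value, y[i]));
  }

  if (xIsVec && typeof y === "number") {
    // GLSL also defines min(genType, float): compare each element to y.
    return x.map((value) => Math.min(value, y));
  }

  throw new TypeError("min(): unsupported argument types");
}

console.log(glslMin(1, 2));
console.log(glslMin([1, 5], [3, 2]));
console.log(glslMin([1, 5, 9], 4));
```

Every call pays for the `Array.isArray` and `typeof` branches, even when the shader's static types already determine which branch will be taken.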
access.js exhaustively implements all the ways that vectors and matrices can be accessed. For example, a vector v can be accessed as v.xz, which returns a vector whose first element is the first element of v (v[0]) and whose second element is the third element of v (v[2]); a matrix m can be accessed as m.xy, which likewise returns a vector built from the selected elements of m. We summarize the ways that vectors and matrices can be accessed (gets and sets) on this page.
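As a rough illustration of the vector case, swizzled gets and sets can be sketched like this. The sketch is not taken from access.js: `swizzleGet`, `swizzleSet`, and `componentIndex` are hypothetical names, vectors are plain arrays, and only the x/y/z/w component names are handled.

```javascript
// Component names mapped to array indices (only the xyzw naming set is covered).
const COMPONENT_INDEX = { x: 0, y: 1, z: 2, w: 3 };

function componentIndex(v, name) {
  const index = COMPONENT_INDEX[name];
  if (index === undefined || index >= v.length) {
    throw new RangeError(`invalid component '${name}' for a vec${v.length}`);
  }
  return index;
}

// Get: v.xz becomes swizzleGet(v, "xz"). A single component yields a number.
function swizzleGet(v, selector) {
  const out = [...selector].map((name) => v[componentIndex(v, name)]);
  return out.length === 1 ? out[0] : out;
}

// Set: v.zx = values becomes swizzleSet(v, "zx", values).
// GLSL forbids repeating a component on the left-hand side of an assignment.
function swizzleSet(v, selector, values) {
  const names = [...selector];
  if (new Set(names).size !== names.length) {
    throw new TypeError("swizzle assignment cannot repeat a component");
  }
  const source = Array.isArray(values) ? values : [values];
  if (source.length !== names.length) {
    throw new TypeError("swizzle assignment size mismatch");
  }
  // Resolve all indices first so an invalid selector leaves v untouched.
  const indices = names.map((name) => componentIndex(v, name));
  indices.forEach((index, i) => { v[index] = source[i]; });
  return v;
}

const v = [1, 2, 3];
console.log(swizzleGet(v, "xz"));
console.log(swizzleSet(v, "zx", [9, 8]));
```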
In summary, we have implemented all the built-in functions and types that the simulator needs in order to work. However, the runtime library could be made more efficient with different design decisions, which we did not have time to implement.
Running the Simulator
The simulator has a few important objects: a Shader (vertex or fragment), a Program (consisting of one vertex and one fragment shader), and an Environment. The environment serves as a container for shader inputs (attributes, uniforms, varyings) and outputs (varyings, builtin variables). The program is the main entry point for execution. It supports single (vertex, fragment) shader execution and mass fragment shader execution over an image buffer.
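To make "mass fragment shader execution over an image buffer" concrete, here is a minimal sketch. It does not use the simulator's actual Shader, Program, or Environment classes; `fragmentMain` and `renderToBuffer` are hypothetical, the environment is a plain object, and iResolution is assumed to be a two-element array.

```javascript
// Hypothetical fragment shader, already translated to JavaScript. It reads
// inputs from `env` and writes its output to env.gl_FragColor.
function fragmentMain(env) {
  const u = env.gl_FragCoord[0] / env.iResolution[0];
  const v = env.gl_FragCoord[1] / env.iResolution[1];
  env.gl_FragColor = [u, v, 0, 1];
}

// Run the fragment function once per pixel and pack the results as RGBA bytes.
function renderToBuffer(fragment, width, height, uniforms) {
  const pixels = new Uint8ClampedArray(width * height * 4);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      // A fresh environment per fragment: uniforms plus per-pixel built-ins.
      // Fragment coordinates are sampled at pixel centers.
      const env = Object.assign({}, uniforms, {
        gl_FragCoord: [x + 0.5, y + 0.5, 0, 1],
        gl_FragColor: [0, 0, 0, 0],
      });
      fragment(env);
      const offset = (y * width + x) * 4;
      for (let c = 0; c < 4; c++) {
        // Uint8ClampedArray rounds and clamps the scaled value to 0..255.
        pixels[offset + c] = env.gl_FragColor[c] * 255;
      }
    }
  }
  return pixels;
}

const image = renderToBuffer(fragmentMain, 4, 2, { iResolution: [4, 2] });
console.log(image.length);
```

Inspecting a single pixel amounts to running the same function once with that pixel's gl_FragCoord.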
We briefly describe the inputs and outputs on our demo page. iGlobalTime is the current timestamp relative to the start time, iMouse is the relative position (coordinates) of the mouse, and iResolution determines the size of the output. These inputs mean that the output varies with the current time and the mouse position. For controlled debugging, we allow results to be inspected with fixed inputs. The output is gl_FragColor, which holds the color of each pixel of the output image. A single pixel can be inspected by mousing over it. Variables that start with gl_ are builtin inputs/outputs of GLSL shaders.
Evaluation
We have evaluated our GLSL simulator both quantitatively and qualitatively. A user study would be ideal, but it would take considerable time to carry out. Most of our effort went into the implementation, so we leave user studies as future work.
What can the simulator simulate?
Our simulator can simulate any single frame of a single shader, which gives developers more control over the factors that can cause bugs. To make sure that our simulator works for reasonably complex shaders, we took a few shaders from Shadertoy, a web site that hosts sample shaders with elaborate visual effects. Our simulator produces results similar to those shown on the web site. We discuss the differences between our simulator's results and the expected results, and their causes, later.
Can we get intermediate states?
Currently, we rely on the browser's builtin step-through debugging features (available at least in Chrome and Safari). In the future, we plan to implement general-purpose instrumentation or logging APIs to enable debugging in older browsers that don't provide step-through functionality.
Are the computed results correct?
Here, we discuss how our results can be different from the original GLSL shader's and why.
Another reason is that the GLSL specification intentionally leaves some behaviors undefined, for example texture sampler results, antialiasing, and rasterization. For this reason, there is no way to make our results exactly match those of a GLSL shader running on hardware, since the hardware's own results can differ slightly between runs.
How fast is generated code?
We evaluate the elapsed time and CPU time of generating and running the sample shader.
We measure elapsed time by running our demo shader (both vertex and fragment) on a Mac OS X machine with a 2.7GHz processor and 16GB memory, using Chrome Version 39.0.2171.71. The elapsed times are 2.1s (10K pixels), 4.1s (20K pixels), and 8.3s (40K pixels). Elapsed time grows almost linearly with the number of pixels (roughly 0.2 ms per pixel), suggesting that the time spent on code generation is minimal compared to the time spent executing the generated code.
We further break down the CPU time spent in each function, using Chrome's builtin CPU profiler to obtain these results. The sample shader spends 19.36% of CPU time constructing vectors, meaning that constructing vectors is the slowest part of our simulator. This is because the vector constructor implements typechecking and overloading, which could instead be done in the translation phase. In comparison, the sample shader spends 4.24% of CPU time on clamp(), 3.75% on multiply(), and 3.65% on abs(). While these numbers and functions will differ from shader to shader, their relative values indicate how slow each function is compared to the others.
Using semantic information during generation
The simulator prototype does not perform typechecking or other semantic analysis of the GLSL program, except for its inputs and outputs. The generated code does not change based on the types of expressions (e.g., vec3, mat4, or float). Thus, many slow runtime type checks and branches must be performed to compute the correct result. Since types can be determined at translation time, most of these checks can be omitted, removing most nonessential conditional branches from the generated code.
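The difference between type-agnostic and type-specialized output can be sketched as follows. This is an assumption-laden illustration, not the prototype's code generator: `genericMultiply` and `emitMultiply` are hypothetical names, vectors are plain arrays, and only float and vecN operands are handled (no matrices).

```javascript
const VEC_SIZE = { vec2: 2, vec3: 3, vec4: 4 };

// Type-agnostic runtime helper: every call re-discovers the operand types.
function genericMultiply(a, b) {
  const aIsVec = Array.isArray(a);
  const bIsVec = Array.isArray(b);
  if (aIsVec && bIsVec) {
    if (a.length !== b.length) throw new TypeError("dimension mismatch");
    return a.map((value, i) => value * b[i]);
  }
  if (aIsVec) return a.map((value) => value * b);
  if (bIsVec) return b.map((value) => a * value);
  return a * b;
}

// Translation-time alternative: given static types, emit branch-free source.
function emitMultiply(typeA, typeB, exprA, exprB) {
  const sizeA = VEC_SIZE[typeA];
  const sizeB = VEC_SIZE[typeB];
  const known = (type, size) => type === "float" || size !== undefined;
  if (!known(typeA, sizeA) || !known(typeB, sizeB)) {
    throw new TypeError(`unsupported operand types: ${typeA}, ${typeB}`);
  }
  if (sizeA && sizeB && sizeA !== sizeB) {
    throw new TypeError("dimension mismatch");  // reported once, at translation
  }
  const size = sizeA || sizeB;
  if (!size) return `(${exprA} * ${exprB})`;
  const component = (expr, isVec, i) => (isVec ? `${expr}[${i}]` : expr);
  const parts = [];
  for (let i = 0; i < size; i++) {
    parts.push(`${component(exprA, Boolean(sizeA), i)} * ${component(exprB, Boolean(sizeB), i)}`);
  }
  return `[${parts.join(", ")}]`;
}

const source = emitMultiply("vec3", "float", "v", "s");
const specialized = new Function("v", "s", `return ${source};`);
console.log(source);
console.log(specialized([1, 2, 3], 2), genericMultiply([1, 2, 3], 2));
```

The specialized expression contains no type tests at all, and a dimension mismatch surfaces once during translation rather than on every evaluation.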
Generating optimizable code
Testing simulator results
Our prototype only has a few unit tests for the runtime.js library. To improve performance and track feature completeness, glsl-simulator should have compilation/runtime benchmarks as well as a regression test suite. Correctness of semantic analysis, builtin operators, and functions can be tested more systematically by using glsl-simulator as a backend option in the WebGL conformance suite's test harness.
Authors and Contributors
Brian Burg (@burg)
Sophia Wang (@xiaosophiawang)