When building interactive WebGL applications, object selection (picking) becomes critical. I implemented a GPU-accelerated solution based on color index encoding.
The Traditional Picking Problem
Standard CPU-based picking involves:
1. Cast a ray from the camera through the mouse position
2. Test intersection against every object's bounding box/mesh
3. Return the closest hit object
This approach suffers from:
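- **CPU bottleneck**: Every candidate object must be tested on the CPU
- **Memory transfer**: Geometry usually lives in GPU buffers, so the CPU needs its own copy or an expensive readback
- **Performance scaling**: Cost grows linearly with scene complexity unless you also maintain a spatial acceleration structure
- **Latency**: In large scenes the intersection tests can take longer than a frame, which delays the selection response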
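To make the linear cost concrete, here is a minimal sketch of the three CPU steps listed above. It assumes a hypothetical scene format in which each object carries an `id`, a bounding-sphere `center`, and a `radius`, and that the ray direction is already normalized; a real engine would test actual meshes or boxes.

```javascript
// Distance along the ray to the sphere, or null on a miss.
// Assumes `dir` is a unit vector.
function intersectRaySphere(origin, dir, center, radius) {
  const ox = origin[0] - center[0];
  const oy = origin[1] - center[1];
  const oz = origin[2] - center[2];
  const b = ox * dir[0] + oy * dir[1] + oz * dir[2];
  const c = ox * ox + oy * oy + oz * oz - radius * radius;
  const disc = b * b - c;
  if (disc < 0) return null;          // ray misses the sphere
  const sqrtDisc = Math.sqrt(disc);
  const tNear = -b - sqrtDisc;
  const tFar = -b + sqrtDisc;
  if (tFar < 0) return null;          // sphere lies behind the ray origin
  return tNear >= 0 ? tNear : tFar;   // origin inside the sphere: use exit point
}

// Tests every object and returns the id of the closest hit (or null).
function pickOnCpu(origin, dir, objects) {
  let closestId = null;
  let closestT = Infinity;
  for (const obj of objects) {        // one intersection test per object
    const t = intersectRaySphere(origin, dir, obj.center, obj.radius);
    if (t !== null && t < closestT) {
      closestT = t;
      closestId = obj.id;
    }
  }
  return closestId;
}
```

The loop in `pickOnCpu` is where the linear scaling comes from: doubling the object count doubles the work per click.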
Color Index Encoding Solution
Instead, we render the scene to an off-screen texture where:
- **Each object** is drawn with a unique color
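- **Color encodes the object ID** (24 bits across the red, green, and blue channels, enough for millions of objects)
- **This pass** renders only to the off-screen selection framebuffer, so it never appears on screen
- **We read back the color** at the mouse position and decode it to get the ID of the object under the cursor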
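As a rough sketch of what the off-screen target could look like in WebGL 2, the helper below creates an RGBA8 color texture plus a depth renderbuffer and attaches both to a framebuffer. The function name `createPickingTarget` and the shape of the returned object are my own choices for illustration; the `gl` calls are standard WebGL 2.

```javascript
// Creates an off-screen framebuffer for the picking pass.
// `gl` is assumed to be a WebGL2RenderingContext.
function createPickingTarget(gl, width, height) {
  // Color attachment: 8 bits per channel, which is what the ID encoding needs.
  const texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA8, width, height, 0,
                gl.RGBA, gl.UNSIGNED_BYTE, null);
  // NEAREST avoids blending neighboring IDs if the texture is ever sampled.
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.NEAREST);
  gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MAG_FILTER, gl.NEAREST);

  // Depth attachment so the frontmost object wins at each pixel.
  const depth = gl.createRenderbuffer();
  gl.bindRenderbuffer(gl.RENDERBUFFER, depth);
  gl.renderbufferStorage(gl.RENDERBUFFER, gl.DEPTH_COMPONENT16, width, height);

  const framebuffer = gl.createFramebuffer();
  gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer);
  gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                          gl.TEXTURE_2D, texture, 0);
  gl.framebufferRenderbuffer(gl.FRAMEBUFFER, gl.DEPTH_ATTACHMENT,
                             gl.RENDERBUFFER, depth);

  const status = gl.checkFramebufferStatus(gl.FRAMEBUFFER);
  gl.bindFramebuffer(gl.FRAMEBUFFER, null); // restore the default framebuffer
  if (status !== gl.FRAMEBUFFER_COMPLETE) {
    throw new Error("Picking framebuffer is incomplete: " + status);
  }
  return { framebuffer, texture, depth, width, height };
}
```

The target should match the size of the drawing buffer so that mouse coordinates map one-to-one onto picking pixels; it needs to be recreated when the canvas is resized.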
Implementation Details
Encoding Objects to Colors
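objectID = 0x00RRGGBB
For example:
- Object 0 → Color 0x000000 (red = 0, green = 0, blue = 0)
- Object 1 → Color 0x000001 (red = 0, green = 0, blue = 1)
- Object 255 → Color 0x0000FF
- Object 256 → Color 0x000100 (red = 0, green = 1, blue = 0)
Using three 8-bit color channels, we can represent up to 2^24 = 16,777,216 (about 16.7 million) unique objects in a single render target. Because object 0 encodes to pure black, the background has to be told apart some other way, for example by clearing the picking target with alpha 0 while objects write alpha 1.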
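The mapping is easy to check on the CPU side. The two helpers below are a small sketch of the encoding and its inverse; the names `idToRgb` and `rgbToId` are mine and are not part of any API.

```javascript
const MAX_OBJECT_ID = 0xFFFFFF; // 24 bits: 8 each for red, green, blue

// Splits a 24-bit object ID into [r, g, b] bytes (0-255 each).
function idToRgb(objectID) {
  if (!Number.isInteger(objectID) || objectID < 0 || objectID > MAX_OBJECT_ID) {
    throw new RangeError("objectID out of range: " + objectID);
  }
  return [(objectID >> 16) & 0xFF, (objectID >> 8) & 0xFF, objectID & 0xFF];
}

// Reassembles the object ID from the three color bytes.
function rgbToId(r, g, b) {
  return (r << 16) | (g << 8) | b;
}
```

For instance, `idToRgb(256)` yields `[0, 1, 0]`, and `rgbToId(0, 1, 0)` returns `256` again.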
Rendering the Picking Texture
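#version 300 es
// Fragment shader for the picking texture (GLSL ES 3.00, WebGL 2)
precision highp float;
// highp is required here: the default mediump int only guarantees 16 bits
uniform highp int objectID;
out vec4 outColor;
void main() {
// Split the 24-bit ID into three 8-bit channels, normalized to [0, 1]
float r = float((objectID >> 16) & 0xFF) / 255.0;
float g = float((objectID >> 8) & 0xFF) / 255.0;
float b = float(objectID & 0xFF) / 255.0;
outColor = vec4(r, g, b, 1.0);
}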
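On the JavaScript side, the picking pass might look roughly like the sketch below. Several things here are assumptions for illustration: `target` is an object holding the picking `framebuffer` and its `width` and `height`, each scene object has a numeric `id`, and `drawGeometry` is a hypothetical callback that binds an object's buffers and issues its draw call. Blending is disabled so that colors are written exactly as encoded, and the target should not be multisampled for the same reason.

```javascript
// Renders every object into the picking target using its ID as the color.
// `program` is assumed to be the linked program containing the picking shader.
function renderPickingPass(gl, target, program, objects, drawGeometry) {
  gl.bindFramebuffer(gl.FRAMEBUFFER, target.framebuffer);
  gl.viewport(0, 0, target.width, target.height);

  gl.disable(gl.BLEND);        // blending would corrupt the encoded IDs
  gl.enable(gl.DEPTH_TEST);    // the frontmost object wins at each pixel

  // Alpha 0 marks "no object"; the shader writes alpha 1 for real objects.
  gl.clearColor(0, 0, 0, 0);
  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);

  gl.useProgram(program);
  const idLocation = gl.getUniformLocation(program, "objectID");
  for (const obj of objects) {
    gl.uniform1i(idLocation, obj.id);  // feeds `uniform highp int objectID`
    drawGeometry(gl, obj);             // hypothetical per-object draw callback
  }

  gl.bindFramebuffer(gl.FRAMEBUFFER, null); // back to the default framebuffer
}
```

Setting one uniform per object keeps the sketch simple; with many objects, passing the ID as a per-instance or per-vertex attribute would cut down on draw calls.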
Reading the Selection Result
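// Read one pixel at the mouse position. The picking framebuffer must be
// bound, and mouseX/mouseY must be in drawing-buffer pixels, not CSS pixels.
const pixels = new Uint8Array(4);
gl.readPixels(
mouseX,
canvas.height - mouseY - 1, // WebGL's origin is bottom-left; rows run 0..height-1
1,
1,
gl.RGBA,
gl.UNSIGNED_BYTE,
pixels
);
// Decode the object ID
const objectID = (pixels[0] << 16) | (pixels[1] << 8) | pixels[2];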
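Mouse events report CSS pixels relative to the viewport, while `gl.readPixels()` expects drawing-buffer pixels with the origin at the bottom-left. The helper below is a sketch of that conversion; `clientToBufferPixel` is a name I made up, and `rect` is assumed to have the shape returned by `canvas.getBoundingClientRect()`.

```javascript
// Converts a mouse event position to readPixels coordinates.
function clientToBufferPixel(clientX, clientY, rect, bufferWidth, bufferHeight) {
  // Position inside the canvas element, in CSS pixels.
  const cssX = clientX - rect.left;
  const cssY = clientY - rect.top;

  // Scale to drawing-buffer pixels (handles high-DPI and CSS-resized canvases).
  const x = Math.floor((cssX * bufferWidth) / rect.width);
  const yFromTop = Math.floor((cssY * bufferHeight) / rect.height);

  // Flip Y: row 0 is the bottom row in WebGL.
  const y = bufferHeight - 1 - yFromTop;

  // Clamp so clicks on the very edge stay inside the buffer.
  const clamp = (v, lo, hi) => Math.min(Math.max(v, lo), hi);
  return {
    x: clamp(x, 0, bufferWidth - 1),
    y: clamp(y, 0, bufferHeight - 1),
  };
}
```

In a click handler this would typically be called with `event.clientX`, `event.clientY`, `canvas.getBoundingClientRect()`, `gl.drawingBufferWidth`, and `gl.drawingBufferHeight`.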
Advantages
- **Fast feedback**: Selection typically resolves within 1-2 frames
- **Scalable**: The lookup is a single-pixel read no matter how many objects are in the scene, and the ID space covers up to 16.7 million of them
- **GPU-accelerated**: Leverages existing rendering pipeline
- **Cheap**: Single render pass to an offscreen texture
- **Accurate**: Pixel-precise; with depth testing enabled, the frontmost object under the cursor wins, even for complex geometry
Trade-offs
- **Extra rendering**: Need to maintain separate picking shader
- **Readback cost**: `gl.readPixels()` forces a GPU-CPU sync; the transfer itself is tiny for a single pixel, but the CPU still waits for the GPU to finish pending work
- **Limited to 16.7M objects**: Sufficient for most interactive applications
- **Picking texture maintenance**: Must update as scene changes
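One way to soften the maintenance and readback costs is to re-render the picking texture only when something has actually changed. The class below is a minimal sketch of that idea, not a description of my production code: `PickingCache`, `renderPass`, and `readPixel` are hypothetical names, and it assumes the picking target is cleared with alpha 0 so that empty pixels can be detected.

```javascript
// Re-renders the picking texture lazily, only after the scene was invalidated.
class PickingCache {
  // renderPass(): draws the picking texture.
  // readPixel(x, y): returns [r, g, b, a] bytes at that pixel.
  constructor(renderPass, readPixel) {
    this.renderPass = renderPass;
    this.readPixel = readPixel;
    this.dirty = true; // nothing rendered yet
  }

  // Call whenever objects move, appear, disappear, or the camera changes.
  invalidate() {
    this.dirty = true;
  }

  // Returns the object ID at (x, y), or null for background.
  pick(x, y) {
    if (this.dirty) {
      this.renderPass();
      this.dirty = false;
    }
    const [r, g, b, a] = this.readPixel(x, y);
    if (a === 0) return null; // assumption: background was cleared with alpha 0
    return (r << 16) | (g << 8) | b;
  }
}
```

With this in place, repeated clicks on a static scene cost one readback each, and the extra render pass only runs after `invalidate()` has been called.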
Applications
This technique is ideal for:
- **Game engines** needing rapid object selection
- **3D editors** for modeling and manipulation
- **Data visualization** with interactive elements
- **CAD software** with complex scenes
The color index encoding approach combines the best of both worlds: GPU rendering performance with CPU-side selection logic.