When building interactive WebGL applications, object selection (picking) becomes critical. I implemented an elegant GPU-accelerated solution using color index encoding.

The Traditional Picking Problem

Standard CPU-based picking takes three steps:

1. Cast a ray from the camera through the mouse position

2. Test intersection against every object's bounding box/mesh

3. Return the closest hit object

This approach suffers from:

  • **CPU bottleneck**: All geometry must be tested on the CPU
  • **Memory transfer**: Geometry often lives only in GPU buffers, so a CPU-side copy must be kept or read back
  • **Performance scaling**: Without a spatial index, cost grows linearly with the number of objects and triangles
  • **Latency**: Slow intersection tests add frame lag between input and selection response
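
For comparison, the three CPU steps above can be sketched as a linear scan. This is a minimal illustration, not the code from any particular engine: it assumes a hypothetical scene format where each object carries an `id` and a bounding sphere (`center`, `radius`), and that the ray direction is already normalized.

```javascript
// Distance along the ray to the nearest intersection with a sphere, or null on a miss.
// `dir` is assumed to be normalized.
function raySphere(origin, dir, center, radius) {
  const oc = [origin[0] - center[0], origin[1] - center[1], origin[2] - center[2]];
  const b = oc[0] * dir[0] + oc[1] * dir[1] + oc[2] * dir[2];
  const c = oc[0] * oc[0] + oc[1] * oc[1] + oc[2] * oc[2] - radius * radius;
  const disc = b * b - c;
  if (disc < 0) return null;            // ray misses the sphere
  const root = Math.sqrt(disc);
  if (-b - root >= 0) return -b - root; // nearest hit in front of the origin
  if (-b + root >= 0) return -b + root; // origin is inside the sphere
  return null;                          // sphere is behind the ray
}

// Steps 2 and 3: test every object and keep the closest hit.
// Returns the object's id, or -1 if nothing was hit.
function pickWithRay(origin, dir, objects) {
  let closestId = -1;
  let closestT = Infinity;
  for (const obj of objects) {          // O(n): every object is visited
    const t = raySphere(origin, dir, obj.center, obj.radius);
    if (t !== null && t < closestT) {
      closestT = t;
      closestId = obj.id;
    }
  }
  return closestId;
}
```

Every pick walks the whole object list, which is where the linear scaling comes from.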

Color Index Encoding Solution

Instead, we render the scene to an off-screen texture where:

  • **Each object** is drawn with a unique color
  • **Color encodes the object ID** (with enough bits for millions of objects)
  • **The scene** is rendered a second time into this selection framebuffer, which is never shown on screen
  • **We read back the color** at the mouse position and decode it to get the ID
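
One way to set up such an off-screen target is sketched below. It assumes a WebGL2 context passed in as `gl`; the function name `createPickingTarget` and the returned object shape are illustrative, not part of any API. A depth renderbuffer is attached so that the nearest object wins at each pixel.

```javascript
// Sketch: create an off-screen framebuffer with an RGBA8 color texture and a depth buffer.
// Assumes `gl` is a WebGL2RenderingContext.
function createPickingTarget(gl, width, height) {
  // Color attachment: one 8-bit channel each for R, G, B (the ID) plus alpha.
  const texture = gl.createTexture();
  gl.bindTexture(gl.TEXTURE_2D, texture);
  gl.texImage2D(gl.TEXTURE_2D, 0, gl.RGBA8, width, height, 0,
                gl.RGBA, gl.UNSIGNED_BYTE, null);

  // Depth attachment, so occluded objects do not overwrite nearer ones.
  const depth = gl.createRenderbuffer();
  gl.bindRenderbuffer(gl.RENDERBUFFER, depth);
  gl.renderbufferStorage(gl.RENDERBUFFER, gl.DEPTH_COMPONENT16, width, height);

  const framebuffer = gl.createFramebuffer();
  gl.bindFramebuffer(gl.FRAMEBUFFER, framebuffer);
  gl.framebufferTexture2D(gl.FRAMEBUFFER, gl.COLOR_ATTACHMENT0,
                          gl.TEXTURE_2D, texture, 0);
  gl.framebufferRenderbuffer(gl.FRAMEBUFFER, gl.DEPTH_ATTACHMENT,
                             gl.RENDERBUFFER, depth);

  const status = gl.checkFramebufferStatus(gl.FRAMEBUFFER);
  gl.bindFramebuffer(gl.FRAMEBUFFER, null); // restore the default framebuffer
  if (status !== gl.FRAMEBUFFER_COMPLETE) {
    throw new Error('Picking framebuffer is incomplete');
  }
  return { framebuffer, texture, depth, width, height };
}
```

The target would normally match the drawing buffer size and be recreated when the canvas is resized.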

Implementation Details

Encoding Objects to Colors

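objectID = 0x00RRGGBB

For example:
- Object 0 → Color 0x000000 (red = 0, green = 0, blue = 0)
- Object 1 → Color 0x000001
- Object 255 → Color 0x0000FF
- Object 256 → Color 0x000100 (green = 1, blue = 0)

Because object 0 encodes to black, it cannot be told apart from a picking buffer cleared to black. A common convention is to reserve ID 0 for "no object" and number real objects from 1.

Using three 8-bit color channels, we can represent 2^24 = 16,777,216 (about 16.7 million) unique IDs in a single render target.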
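
The mapping can be checked on the CPU with a small round-trip sketch. The helper names `idToRgb` and `rgbToId` are made up for this illustration; they mirror the `0x00RRGGBB` layout described above.

```javascript
// Number of distinct IDs that fit in three 8-bit channels.
const MAX_ID_COUNT = 1 << 24; // 16,777,216

// Split a 24-bit ID into [r, g, b] bytes (0x00RRGGBB layout).
function idToRgb(id) {
  if (!Number.isInteger(id) || id < 0 || id >= MAX_ID_COUNT) {
    throw new RangeError(`object ID ${id} does not fit in 24 bits`);
  }
  return [(id >> 16) & 0xff, (id >> 8) & 0xff, id & 0xff];
}

// Reassemble the ID from three bytes, the inverse of idToRgb.
function rgbToId(r, g, b) {
  return (r << 16) | (g << 8) | b;
}
```

For instance, `idToRgb(256)` yields `[0, 1, 0]`, and `rgbToId(0, 1, 0)` gives back `256`.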

Rendering the Picking Texture

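#version 300 es
// Fragment shader for picking texture (GLSL ES 3.00, WebGL2)
precision highp float;
precision highp int; // mediump int may be too narrow for 24-bit IDs

uniform int objectID;
out vec4 outColor;

void main() {
  // Split the 24-bit ID into three 8-bit channels, normalized to [0, 1]
  float r = float((objectID >> 16) & 0xFF) / 255.0;
  float g = float((objectID >> 8) & 0xFF) / 255.0;
  float b = float(objectID & 0xFF) / 255.0;
  outColor = vec4(r, g, b, 1.0);
}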
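
On the JavaScript side, the picking pass could be driven roughly as follows. This is a sketch under several assumptions: `gl` is a WebGL2 context, `program` is the linked picking program containing the `objectID` uniform, `target` is a hypothetical `{ framebuffer, width, height }` object, each scene object has a hypothetical `draw(gl)` method that binds its geometry and issues the draw call, and IDs start at 1 so that a black clear color means "no object".

```javascript
// Sketch: render every object into the picking target with its ID as the color.
function renderPickingPass(gl, program, target, objects) {
  gl.bindFramebuffer(gl.FRAMEBUFFER, target.framebuffer);
  gl.viewport(0, 0, target.width, target.height);
  gl.useProgram(program);

  gl.enable(gl.DEPTH_TEST); // the frontmost object must win
  gl.disable(gl.BLEND);     // blending would mix IDs into meaningless colors

  // Clear to black: decodes to ID 0, assumed here to mean "no object".
  gl.clearColor(0, 0, 0, 0);
  gl.clear(gl.COLOR_BUFFER_BIT | gl.DEPTH_BUFFER_BIT);

  const idLocation = gl.getUniformLocation(program, 'objectID');
  for (const obj of objects) {
    gl.uniform1i(idLocation, obj.id); // consumed by the fragment shader
    obj.draw(gl);                     // hypothetical per-object draw method
  }

  // Unbind so later draws go to the canvas; the caller restores its own viewport.
  gl.bindFramebuffer(gl.FRAMEBUFFER, null);
}
```

Setting one uniform per object keeps the sketch simple; a scene with very many objects would more likely pass the ID as a per-instance or per-vertex attribute.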

Reading the Selection Result

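// Read the pixel under the mouse. The picking framebuffer must be bound when
// this runs, and mouseX/mouseY must be in drawing-buffer pixels (not CSS pixels).
const pixels = new Uint8Array(4);
gl.readPixels(
  mouseX,
  canvas.height - mouseY - 1, // WebGL's origin is the bottom-left corner
  1,
  1,
  gl.RGBA,
  gl.UNSIGNED_BYTE,
  pixels
);

// Decode the object ID
const objectID = (pixels[0] << 16) | (pixels[1] << 8) | pixels[2];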
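
Mouse events report positions in CSS pixels relative to the page, while `gl.readPixels()` expects drawing-buffer pixels with the origin at the bottom-left. A conversion helper might look like the sketch below; the name `clientToBufferPixel` is hypothetical, and it assumes the picking target has the same size as the drawing buffer.

```javascript
// Convert a mouse position (CSS pixels, origin top-left of the page) into a
// readPixels coordinate (drawing-buffer pixels, origin bottom-left of the canvas).
// `rect` is the canvas bounding rectangle: { left, top, width, height }.
function clientToBufferPixel(clientX, clientY, rect, bufferWidth, bufferHeight) {
  // Scale from CSS pixels to buffer pixels (they differ on high-DPI displays).
  const xFromLeft = Math.floor((clientX - rect.left) * bufferWidth / rect.width);
  const yFromTop = Math.floor((clientY - rect.top) * bufferHeight / rect.height);

  // Clamp so positions on the canvas border never index outside the buffer.
  const clamp = (value, size) => Math.min(Math.max(value, 0), size - 1);
  return {
    x: clamp(xFromLeft, bufferWidth),
    y: clamp(bufferHeight - 1 - yFromTop, bufferHeight), // flip the Y axis
  };
}
```

In a mouse handler this would typically be called with `canvas.getBoundingClientRect()`, `gl.drawingBufferWidth`, and `gl.drawingBufferHeight`, and the resulting `x` and `y` passed to `gl.readPixels()`.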

Advantages

  • **Instant feedback**: Selection response within 1-2 frames
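  • **Scalable**: Readback cost is constant regardless of object count, so scenes with millions of objects stay pickable as long as they can be rendered
  • **GPU-accelerated**: Leverages existing rendering pipeline
  • **Cheap**: Single render pass to an offscreen texture
  • **Accurate**: Pixel-exact; the depth test ensures the frontmost object under the cursor wins, however complex the geometry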

Trade-offs

  • **Extra rendering**: A separate picking shader and render pass must be maintained
  • **Readback cost**: `gl.readPixels()` forces a GPU-CPU sync; the one-pixel transfer is tiny, but the call still blocks until pending GPU work finishes
  • **Limited to 16.7M objects**: Sufficient for most interactive applications
  • **Picking texture maintenance**: Must be re-rendered whenever the scene or camera changes
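
The maintenance cost can be limited by re-rendering the picking texture lazily, only when a pick is requested after something has changed. The sketch below shows one possible way to do that with a dirty flag; the `PickingCache` class is hypothetical, and `renderPass` stands for whatever function performs the picking render.

```javascript
// Sketch: re-render the picking texture only when it is stale.
class PickingCache {
  constructor(renderPass) {
    this.renderPass = renderPass; // callback that redraws the picking texture
    this.dirty = true;            // nothing has been rendered yet
  }

  // Call whenever the scene or the camera changes.
  invalidate() {
    this.dirty = true;
  }

  // Call right before reading a pixel. Returns true if a re-render happened.
  ensureFresh() {
    if (!this.dirty) return false;
    this.renderPass();
    this.dirty = false;
    return true;
  }
}
```

With this arrangement, repeated picks on a static scene reuse the same texture, and the extra render pass is paid only after `invalidate()` has been called.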

Applications

This technique is ideal for:

  • **Game engines** needing rapid object selection
  • **3D editors** for modeling and manipulation
  • **Data visualization** with interactive elements
  • **CAD software** with complex scenes

The color index encoding approach combines the best of both worlds: GPU rendering performance with CPU-side selection logic.