Cached Software Rendering

2020.04.02
updated: 2020.04.03
Introduction
Overview
Command Buffer
Hash Grid
Rendering
Conclusion

Introduction

This write-up details an approach to 2d software rendering useful for typical application UIs where between most frames little-or-none of the screen's content changes. This approach is used in the lite text editor.

The approach provides the performance advantages of dirty rectangles but abstracts any of the mental overhead or code complexity usually associated with them — the application's code can act as if it is redrawing everything each frame while the renderer takes care of only redrawing regions of the screen which have changed. This leads to simpler, less error prone application code without the waste of CPU time which would result from continually redrawing.

Overview

The approach exists in 3 parts: the Command Buffer, Hash Grid and Renderer. For each frame where the application would typically draw, it instead pushes "draw commands" to a command buffer; at the end of the frame the command buffer is iterated and the commands themselves are added to the hash grid. Finally each cell of the hash grid is compared with the previous frame's and the regions for the cells which have changed are redrawn.

Command Buffer

The command buffer is a simple byte array which each draw command is pushed to:

char command_buf[1024*1024];
int  command_buf_idx;

A draw command must contain a rectangle representing the area of the screen which the draw operation would effect when performed (this is used later by the hash grid), as well as any additional information required to draw the item (eg. an image pointer, color, or string for text rendering). A base command at minimum might be represented by the following struct:

typedef struct { int x, y, w, h; } Rect;

typedef struct {
  int type, size;
  Rect rect;
} Command;

This would be used as a base for any other commands which would extend it with additional information they require:

typedef struct { unsigned char r, g, b, a; } Color;

typedef struct {
  Command cmd;
  Color color;
} RectCommand;

Whenever we call the push_rectangle function we would use the arguments passed it to fill the RectCommand struct and push it to the renderer's command buffer; no actual drawing would be done at this point:

void push_rectangle(Rect r, Color clr) {
  RectCommand *c = (void*) &command_buf[command_buf_idx];
  c->cmd.type = RECT_COMMAND;
  c->cmd.size = sizeof(*c);
  c->cmd.rect = r;
  c->color = clr;
  command_buf_idx += c->cmd.size;
}

We would do the same for any other draw commands which occur in the frame.

Hash Grid

The Hash Grid is a 2d grid of cells where each cell is represented by a hash value (a 32bit unsigned integer and simple hash function such as FNV-1a should suffice) which is associated with an NxN pixel region of the screen. There must be enough cells to account for the whole screen, eg. a screen resolution of 1920x1080 and cell size of 100x100 pixels would require a 20x11 grid of cells. A second hash grid of the same size is also stored as we'll need to compare the current hash grid with the one of the previous frame.

unsigned  cell_buf1[CELLS_X * CELLS_Y];
unsigned  cell_buf2[CELLS_X * CELLS_Y];
unsigned *cells = cell_buf1;
unsigned *cells_prev = cell_buf2;

At the end of each frame when the command buffer has been filled by the application's code, we reset the hash grid to an initial state; that is, setting each cell's hash value to the initial value used by our hash function.

for (int i = 0; i < CELLS_X * CELLS_Y; i++) {
  cells[i] = HASH_INITIAL;
}

We then iterate the buffer of commands and for each one we hash the command itself. Each hash value for the cells of the hash grid which the command's rect overlaps would then be updated with the command's hash value:

Command *cmd = (void*)  command_buf;
Command *end = (void*) &command_buf[command_buf_idx];

while (cmd != end) {
  unsigned h = HASH_INITIAL;
  hash(&h, cmd, cmd->size);
  update_overlapping_cells(cmd->rect, h);
  cmd = (void*) (((char*) cmd) + cmd->size);
}

With the implementation of update_overlapping_cells:

void update_overlapping_cells(Rect r, unsigned h) {
  int x1 = r.x / CELL_SIZE;
  int y1 = r.y / CELL_SIZE;
  int x2 = (r.x + r.w) / CELL_SIZE;
  int y2 = (r.y + r.h) / CELL_SIZE;
  for (int y = y1; y <= y2; y++) {
    for (int x = x1; x <= x2; x++) {
      hash(&cells[x + y * CELLS_X], &h, sizeof(h));
    }
  }
}

Rendering

Once all the commands have been iterated the hash grid is complete. We then iterate each cell of the hash grid and compare it to the previous frame's value, if the cells differ we need to redraw the region of the screen that the cell represents.

For each changed region we set our renderer to clip to that region and iterate the draw commands again, for every command where the rect overlaps that cell we perform the command's draw operation:

for (int y = 0; y < CELLS_Y; y++) {
  for (int x = 0; x < CELLS_X; x++) {
    int idx = x + y * CELLS_X;
    if (cells[idx] != cells_prev[idx]) {
      Rect rect = { x * CELL_SIZE, y * CELL_SIZE, CELL_SIZE, CELL_SIZE };
      redraw_region(rect);
    }
  }
}

As a simple optimisation each changed cell shouldn't draw immediately, but instead an effort should be made to merge the rectangle regions for adjacent cells into larger rectangle regions which are then redrawn.

Once we have finished redrawing the changed regions we swap the current frame's hash grid with the previous frame's (such that next frame we would be comparing with this frame's state) and reset the command buffer index to the start of the buffer:

unsigned *tmp = cells_prev;
cells_prev = cells;
cells = tmp;
command_buf_idx = 0;

Although the approach is aimed towards software rendering, it is not tied to it specifically and can be used to minimize redrawing when using hardware rendering; though it's questionable in reality how many situations would be improved by the approach when using hardware rendering.

Conclusion

The result of the approach means that rendering each frame is reduced to pushing a series of commands to a linear buffer and performing a fast hash function. For a typical UI most frames will result in nothing being redrawn, and most redraws will do so to only a small portion of the screen. Consider the difference in redrawing a transparent 512x512 rectangle which would require blending 262,144 pixels (1,048,576 bytes), versus hashing 180 bytes of data (the 40 byte command and 36 hash grid cells; note: the number of cells can be reduced by increasing the cell size if you're typically drawing larger elements). The more costly the drawing operation would have been the more benefit the approach yields.