SPARC Delayed Branching

When a branch or CALL instruction executes on a SPARC, the target address is loaded into the nPC (the next program counter), not the PC.

Effect: The instruction following the branch is executed before the branch takes effect.

The position immediately following any branch or call instruction is called the "delay slot", and the instruction in that position is the "delay instruction".

Example:

Addr  Code
----  ----------------------
1000  addcc %g0,%g0,%g0
1004  be    where
1008  subcc %g0,123,%L1
100C  st    %L1,[%i0+%i1]
...
2000  where: add %L1,%G0,%L2
2004         ld [%i0+%i1],%L3

Effect:

 PC   nPC  What's happening
---- ----  -----------------------
1000 1004  addcc executing, be    being fetched
1004 1008  be    executing, subcc being fetched
1008 2000  subcc executing, add   being fetched
2000 2004  add   executing, ld    being fetched
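
The PC/nPC table above can be reproduced with a short Python model. This is only a sketch: the function name trace and the dictionary layout are hypothetical, registers and condition codes are not modelled, and the "be" is simply assumed to be taken (addcc %g0,%g0,%g0 produces zero, so the Z flag is set).

```python
def trace(program, pc, steps):
    """Return a (pc, npc, mnemonic) row for each executed instruction."""
    npc = pc + 4
    rows = []
    for _ in range(steps):
        mnemonic, target = program[pc]
        rows.append((pc, npc, mnemonic))
        # A taken branch changes only nPC. PC still advances to the old nPC,
        # so the instruction in the delay slot is executed next.
        next_npc = target if target is not None else npc + 4
        pc, npc = npc, next_npc
    return rows

# Address -> (mnemonic, branch target or None). Every branch is assumed taken.
program = {
    0x1000: ("addcc", None),
    0x1004: ("be", 0x2000),
    0x1008: ("subcc", None),   # the delay instruction
    0x100C: ("st", None),      # skipped when the branch is taken
    0x2000: ("add", None),
    0x2004: ("ld", None),
}

rows = trace(program, 0x1000, 4)
for pc, npc, mnemonic in rows:
    print(f"{pc:04X} {npc:04X}  {mnemonic} executing")
```

In this model the "st" at 100C never appears in the trace, while the "subcc" at 1008 does, which is the delayed-branch effect described above.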

***ALWAYS FILL THE DELAY SLOT!!***

What can be put into the delay slot?

What MUST NOT be put into the delay slot?

Example 1:

Loop: ...
      ...
      add %L1,%L2,%L1 !Add %L2 to the sum in %L1
      add %L3,1,%L3   !Increment the counter
      ba  Loop        !Back to the Loop again
      nop             !The delay slot

This should not be done that way, because the delay slot is wasted on a nop. Instead, move the counter increment into the delay slot:

Loop: ...
      ...
      add %L1,%L2,%L1 !Add %L2 to the sum in %L1
      ba  Loop        !Back to the Loop again
      add %L3,1,%L3   !Increment the counter (delay slot)

Example 2:

Loop: add   %L3,1,%L3   !Increment the counter
      ...
      subcc %L2,%L3,%G0 !Compare %L2 to %L3
      bne   Loop        !Loop until they are equal
      nop               !The delay slot

This should not be done that way, because the nop is executed on every pass through the loop. Instead, use:

      add   %L3,1,%L3   !Increment the counter
Loop: ...
      subcc %L2,%L3,%G0 !Compare %L2 to %L3
      bne   Loop        !Loop until they are equal
      add   %L3,1,%L3   !Increment the counter (delay slot)

This version is better, because the number of instructions inside the loop has been decreased by 1. However, it increments %L3 once too often: the add in the delay slot also executes on the final pass, when the branch is not taken. If the value of %L3 is not important after the loop is finished, then that is not a problem. If the value of %L3 is important, then you could add the following instruction after the loop:

      sub   %L3,1,%L3 !Undo the last addition.

This looks like a poor solution, but it is better to do an extra instruction once after the end of a loop than an extra instruction every time through the loop.
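
The trade-off can be checked with a little arithmetic, sketched here in Python. The function names are hypothetical, "body" stands for the unknown number of instructions hidden behind the "..." in Example 2, and the model counts executed instructions only (it assumes nothing about cycle timing).

```python
def executed_nop_version(iterations, body):
    # Every pass executes: add, the body, subcc, bne, nop.
    return iterations * (body + 4)

def executed_filled_version(iterations, body, fix_up=True):
    # One add before the loop, then every pass executes:
    # the body, subcc, bne, add (delay slot).
    # The optional sub after the loop undoes the extra increment.
    return 1 + iterations * (body + 3) + (1 if fix_up else 0)

iterations, body = 100, 5   # illustrative values only
saved = executed_nop_version(iterations, body) - executed_filled_version(iterations, body)
print(saved)
```

Under these assumptions the saving is iterations - 2 instructions, so the rewritten loop pays off as soon as the loop runs three or more times.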

(There is also another solution to this problem - the "bne,a" instruction - but we will not cover it in this course.)

Example 3: This program contains a bug:

Loop: set   Array,%L4   !Put Array's address in %L4
      ...
      subcc %L2,%L3,%G0 !Compare %L2 to %L3
      bne   Loop        !Loop until they are equal

! -------------------
! Another line has been finished.
! Increment the line counter in %G1
!--------------------

      add   %G1,1,%G1
      ...

The correct code is:

Loop: set   Array,%L4   !Put Array's address in %L4
      ...
      subcc %L2,%L3,%G0 !Compare %L2 to %L3
      bne   Loop        !Loop until they are equal
      nop

! -------------------
! Another line has been finished.
! Increment the line counter in %G1
!--------------------

      add   %G1,1,%G1
      ...

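Without the "nop" in the delay slot, the "add" will be done every time through the loop (the add will be in the delay slot). We can't put the "subcc" in the delay slot, because the branch depends on the condition codes it sets. We can't put the "set" in the delay slot either, because "set" is a synthetic instruction that may expand into two machine instructions (sethi and or), and the delay slot holds only one. So we'll have to settle for a "nop".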
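
The effect of the missing "nop" can be seen in a small Python model of Example 3. This is a sketch under stated assumptions: the function run and the list-of-mnemonics program format are hypothetical, each list entry is treated as one instruction (so "set" is counted as a single step), and the "bne" is simply told how many times it is taken.

```python
def run(program, taken_count):
    """Execute a list of mnemonics and return them in execution order.

    'bne' branches back to index 0 while taken_count is positive.
    """
    executed = []
    pc, npc = 0, 1
    while pc < len(program):
        op = program[pc]
        executed.append(op)
        next_npc = npc + 1
        if op == "bne" and taken_count > 0:
            taken_count -= 1
            next_npc = 0   # only nPC changes; the delay slot still runs
        pc, npc = npc, next_npc
    return executed

buggy = ["set", "subcc", "bne", "add %G1"]
fixed = ["set", "subcc", "bne", "nop", "add %G1"]

# Three passes through the loop: the branch is taken twice, then falls through.
buggy_adds = run(buggy, 2).count("add %G1")
fixed_adds = run(fixed, 2).count("add %G1")
print(buggy_adds, fixed_adds)
```

In the buggy version the line counter is incremented on every pass through the loop; with the "nop" in place it is incremented once, after the loop ends.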