Branchless Programming: Difference between revisions

(17 intermediate revisions by the same user not shown)

Line 3:

If the CPU predicts incorrectly, it will have to discard all pipelined results and go down the other route.

Furthermore warps on GPUs must all go through both branches if there are threads which go down each branch.

~~This can significantly slow things down.~~

In general, I consider avoiding if statements to be a micro-optimization.

Line 14:

Line 13:

==Switch Statements==

Technically, switch statements are still branches.

However, they can lead to a negligible performance gains if there are lots of cases.

Only convert if to ~~switch if~~ switch statements are significantly more readable in your scenario.

Only convert if statements to switch statements when they are significantly more readable in your scenario.

==Polymorphism==

Line 31:

Line 30:

This is one way to simulate switch statements.

Technically these do not have branching but will have a memory read which could be just as slow as a single branch.

<~~pre~~>

# Cache the following at the start of your program or in the constructor

# Assume we have some functions batch_norm(), group_norm()

Line 39:

Line 38:

"identity": identity

}

# At runtime, you can do:

x = my_cache[operation](x)

</~~pre~~>

</syntaxhighlight>

==Booleans==

Multiplications by booleans can be used instead of switch statements.

~~==Multiplications==~~

Rather than doing

<~~pre~~>

if age > 10:

if (age > 10) {

x = ~~x +~~ 10

x += 10;

else:

} else {

x = ~~x -~~ 10

x -= 10;

</~~pre~~>

}

</syntaxhighlight>

you can do

<~~pre~~>

y = 2 * (age > 10) - 1

x = x + (20 * (age > 10) - 10);

~~x = x +~~ 10 * y

// or

y = age > 10

y = age > 10;

x = x + 10 * y - 10 * (~~1-y~~)

x = x + 10 * y - 10 * !y;

</~~pre~~>

</syntaxhighlight>

Similarly, instead of doing

if (age > 10) {

x++;

}

</syntaxhighlight>

you can just write <code>x += age>10;</code>.

These are micro optimizations but they ~~are sometimes~~ useful for speeding up GPU shaders.

These are micro optimizations but they could be useful for speeding up GPU shaders in some cases.

In high-level programming languages such as Python, these tricks can actually slow down your code<ref name="branchless-slower-python">https://stackoverflow.com/questions/72118249/why-are-the-branchless-and-built-in-functions-slower-in-python</ref>.

==Resources==

* [https://francescocirillo.com/pages/anti-if-campaign Anti If Campaign]

* [https://code.joejag.com/2016/anti-if-the-missing-patterns.html Anti-If: The missing patterns]

==References==

@@ Line 3: / Line 3: @@
 If the CPU predicts incorrectly, it will have to discard all pipelined results and go down the other route.
 Furthermore warps on GPUs must all go through both branches if there are threads which go down each branch.
-This can significantly slow things down.
 In general, I consider avoiding if statements to be a micro-optimization.
@@ Line 14: / Line 13: @@
 ==Switch Statements==
 Technically, switch statements are still branches.
 However, they can lead to a negligible performance gains if there are lots of cases.
-Only convert if to switch if switch statements are significantly more readable in your scenario.
+Only convert if statements to switch statements when they are significantly more readable in your scenario.
 ==Polymorphism==
@@ Line 31: / Line 30: @@
 This is one way to simulate switch statements.
 Technically these do not have branching but will have a memory read which could be just as slow as a single branch.
-<pre>
+<syntaxhighlight lang="python">
 # Cache the following at the start of your program or in the constructor
 # Assume we have some functions batch_norm(), group_norm()
@@ Line 39: / Line 38: @@
    "identity": identity
 }
+# At runtime, you can do:
 x = my_cache[operation](x)
-</pre>
+</syntaxhighlight>
+==Booleans==
+Multiplications by booleans can be used instead of switch statements.
-==Multiplications==
 Rather than doing
-<pre>
+<syntaxhighlight lang="cpp">
-if age > 10:
+if (age > 10) {
-   x = x + 10
+   x += 10;
-else:
+} else {
-   x = x - 10
+   x -= 10;
-</pre>
+}
+</syntaxhighlight>
 you can do
-<pre>
+<syntaxhighlight lang="cpp">
-y = 2 * (age > 10) - 1
+x = x + (20 * (age > 10) - 10);
-x = x + 10 * y
 // or
-y = age > 10
+y = age > 10;
-x = x + 10 * y - 10 * (1-y)
+x = x + 10 * y - 10 * !y;
-</pre>
+</syntaxhighlight>
+Similarly, instead of doing
+<syntaxhighlight lang="cpp">
+if (age > 10) {
+  x++;
+}
+</syntaxhighlight>
+you can just write <code>x += age>10;</code>.
-These are micro optimizations but they are sometimes useful for speeding up GPU shaders.
+These are micro optimizations but they could be useful for speeding up GPU shaders in some cases.
+In high-level programming languages such as Python, these tricks can actually slow down your code<ref name="branchless-slower-python">https://stackoverflow.com/questions/72118249/why-are-the-branchless-and-built-in-functions-slower-in-python</ref>.
 ==Resources==
 * [https://francescocirillo.com/pages/anti-if-campaign Anti If Campaign]
+* [https://code.joejag.com/2016/anti-if-the-missing-patterns.html Anti-If: The missing patterns]
+==References==
+<references />