I have an embedded application with a time-critical ISR that needs to iterate through an array of size 256 (pr
All the following instructions do the same thing: set %eax to zero. Which way is optimal (requiring fewest mac
While writing an optimized ftol function I found some very odd behaviour in GCC 4.6.1. Let me show you the cod
Background: While optimizing some Pascal code with embedded assembly language, I noticed an unnecessary MOV i
I wrote these two solutions for Project Euler Q14, in assembly and in C++. They implement identical brute forc
I was looking for the fastest way to popcount large arrays of data. I encountered a very weird effect: Changin
I am doing some numerical optimization on a scientific application. One thing I noticed is that GCC will optim
This is the best algorithm I could come up with. def get_primes(n): numbers = set(range(n, 1, -1)) primes
I'm looking for the fastest way to determine if a long value is a perfect square (i.e. its square root is
I was implementing an algorithm in Swift Beta and noticed that the performance was very poor. After digging de
The following are two methods of building a link that has the sole purpose of running JavaScript code. Which i