This is the context. The heart of the matter is that D has a very easy-to-use way to let the compiler decide how operations on arrays should be carried out, based on what is available to it at compile time: Array Operations. The performance gains were tested following Other Dev tools: Valgrind Helper. The screenshots are from KCachegrind.
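For readers who have not seen the syntax, here is a minimal sketch of what an array operation looks like. The helper names `addScalar` and `addArrays` are hypothetical and exist only for this illustration. The point to notice is that the destination slice must already have the same length as the operands.

```d
// Hypothetical helpers illustrating D array operations.
int[] addScalar(const int[] src, int k)
{
    auto dst = new int[src.length]; // destination must match the source length
    dst[] = src[] + k;              // element-wise: dst[i] = src[i] + k
    return dst;
}

int[] addArrays(const int[] x, const int[] y)
{
    assert(x.length == y.length);   // operands must have equal lengths
    auto dst = new int[x.length];
    dst[] = x[] + y[];              // element-wise: dst[i] = x[i] + y[i]
    return dst;
}

void main()
{
    assert(addScalar([1, 2, 3], 1) == [2, 3, 4]);
    assert(addArrays([1, 2, 3], [10, 20, 30]) == [11, 22, 33]);
}
```

The `[]` on both sides is what marks the statement as an array operation rather than a plain slice assignment.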
The test code:
module tests.vector_operations;
void main(){
    test_loop();
    test_vector();
}
void test_vector() {
    int[1000000] test;
    int[1000000] result; // same length as test: array operations require matching lengths
    result[] = test[] + 1;
}
void test_loop() {
    int[1000000] test;
    int[1000000] result;
    for (size_t i = 0; i < test.length; i++) {
        result[i] = test[i] + 1;
    }
}
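Before reading a profile, it can help to confirm that the two variants really compute the same values and to get a rough wall-clock figure as a cross-check. The following is only a sketch under assumptions: the buffer and function names are hypothetical, the arrays live on the heap rather than the stack, and `benchmark` from `std.datetime.stopwatch` is used in place of callgrind, so its numbers are not comparable to instruction counts.

```d
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;

enum N = 1_000_000;          // same element count as the test above
int[] src, viaLoop, viaOp;   // hypothetical module-level buffers

void runLoop()
{
    for (size_t i = 0; i < src.length; i++)
        viaLoop[i] = src[i] + 1;
}

void runArrayOp()
{
    viaOp[] = src[] + 1;
}

void main()
{
    src = new int[N];
    viaLoop = new int[N];
    viaOp = new int[N];
    foreach (i, ref v; src)
        v = cast(int) i;     // non-trivial input so a wrong result is visible

    auto times = benchmark!(runLoop, runArrayOp)(10); // 10 runs of each
    assert(viaLoop == viaOp);                         // both variants must agree

    writefln("loop:     %s", times[0]);
    writefln("array op: %s", times[1]);
}
```

The timings depend heavily on compiler and flags (for example `dmd -g` versus an optimized build), so treat them as a sanity check rather than a result.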
Compiling, running valgrind, demangling:
$ dmd -g vector_operations.d
$ valgrind --tool=callgrind --dump-instr=yes --collect-jumps=yes --callgrind-out-file=callgrind_out ./vector_operations
$ ddemangle callgrind_out > callgrind_out.demangled
Inspecting the results in KCachegrind:

