I am absolutely happy with Scala and just love it :)
But sometimes I really want to go a bit more "low level", without a JVM and using "cool" CPU features like SSE etc.
So what would be a good second language besides Scala?
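It should be:
- Compiled to machine code
- Easy usage of C libraries
- Possible to program very close to the hardware
- Possible to program in a very highlevel-way when I want to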
So basically I want a Scala where I can just throw in inline assembler when I want to :) I assume that such a language does not exist, but maybe there are some that come close.
So what would be a good choice? C++? D? OCaml?
I programmed a bit in C++ (15 years ago) and very little in OCaml. In both cases, I only solved a few problems and never got very "deep" into the language itself.
You're pretty much describing D.
Compiled to machine code: Check. There is an experimental .NET VM implementation, but all three major implementations (DMD, LDC, GDC) compile directly to native code and the language is designed to make native compilation feasible.
Easy usage of C libraries: D supports the C ABI and all C types. Pretty much all you have to do is translate the header file and link in the C object file. This can even be partially automated.
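To give an idea of what "translate the header file" means in practice, here is a minimal sketch that declares two C standard library functions by hand and calls them from D. The wrapper name `cLength` is made up for this example; real projects would normally use the ready-made bindings or a translation tool rather than retyping declarations.

```d
import std.string : toStringz;

// Hand-translated from the C headers:
//   size_t strlen(const char *s);   (string.h)
//   int    abs(int j);              (stdlib.h)
extern (C) nothrow @nogc
{
    size_t strlen(scope const char* s);
    int abs(int j);
}

// Hypothetical helper: length of a D string as measured by C's strlen.
size_t cLength(string s)
{
    // toStringz yields a NUL-terminated copy, which is what C expects.
    return strlen(s.toStringz);
}

void main()
{
    assert(cLength("hello") == 5);
    assert(abs(-42) == 42);
}
```

The C runtime is linked in by default, so nothing extra is needed here; for a third-party library you would add its object file or library to the link step.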
Possible to program very close to the hardware: Check. D is what I'd call an idiomatic superset of C. It does not support every piece of C syntax, its module system is completely different, static arrays are value types in D2, etc. However, for any construct in the C language proper (i.e. excluding the preprocessor) there is an equivalent construct in D or its standard library. Any piece of C code that doesn't abuse the preprocessor has a canonical D translation that looks roughly the same and should generate the same assembly language instructions if you're using the same compiler backend. In other words, every C idiom can be translated to D in a straightforward way.
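As a rough illustration of the "looks roughly the same" claim, here is a typical C pointer-and-length loop next to its D translation. The function name `sumArray` is invented for the example.

```d
// C original, for comparison:
//
//   int sum_array(const int *p, size_t n) {
//       int total = 0;
//       for (size_t i = 0; i < n; ++i)
//           total += p[i];
//       return total;
//   }

// D translation: same raw pointer, same loop, same indexing.
int sumArray(const(int)* p, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; ++i)
        total += p[i];
    return total;
}

void main()
{
    int[4] data = [1, 2, 3, 4];
    // .ptr gives the raw pointer to the first element, as in C.
    assert(sumArray(data.ptr, data.length) == 10);
}
```

Idiomatic D would usually take a slice (`const(int)[]`) instead of a pointer plus length, but the C-style version is still valid D.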
The reference implementation of D also supports inline ASM, so you can mess with SSE, etc.
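Here is a minimal sketch of what an inline asm block looks like, assuming DMD-style syntax on x86-64 (other compilers may use a different asm syntax, hence the fallback branch). It only adds two integers, so it shows the mechanics rather than anything SSE-specific; SSE instructions and XMM registers go in the same kind of block. The function name `addAsm` is made up.

```d
int addAsm(int a, int b)
{
    version (D_InlineAsm_X86_64)
    {
        int result;
        asm
        {
            mov EAX, a;       // load the first argument
            add EAX, b;       // add the second one
            mov result, EAX;  // store into the D local
        }
        return result;
    }
    else
    {
        // Fallback when this asm dialect is not available.
        return a + b;
    }
}

void main()
{
    assert(addAsm(2, 3) == 5);
}
```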
Possible to program in a very highlevel-way when I want to: Check. D is designed to be a primarily garbage-collected language (though you can use manual memory management if you insist and are careful not to use library/runtime features that assume GC). Other than that, high-level programming is mostly implemented via template metaprogramming. Before you run away, please understand that template metaprogramming in D is greatly improved compared to C++. Doing template metaprogramming in D vs. C++ is like doing object-oriented programming in C++ vs. C: in D it is designed into the language, whereas in C++ there are just enough features that you can use clever hackishness to make it barely work. The std.algorithm and std.range modules of Phobos are good examples of the high-level subset of D.
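To show the flavour of that high-level subset, here is a small sketch of a lazy pipeline built from std.range and std.algorithm. Coming from Scala it should look familiar. The function name `sumOfEvenSquares` is invented for the example.

```d
import std.algorithm : filter, map, sum;
import std.range : iota;

// Sum of the squares of all even numbers in 1 .. n (inclusive).
int sumOfEvenSquares(int n)
{
    return iota(1, n + 1)          // lazy range 1, 2, ..., n
        .filter!(x => x % 2 == 0)  // keep the even numbers
        .map!(x => x * x)          // square each one
        .sum;                      // fold into a single int
}

void main()
{
    // 2*2 + 4*4 + 6*6 + 8*8 + 10*10 = 220
    assert(sumOfEvenSquares(10) == 220);
}
```

`filter` and `map` are templates that take the lambda as a compile-time argument (the `!(...)` part), which is the template metaprogramming mentioned above doing the work behind a fairly ordinary-looking API.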