Speed comparison of C and C# binaries - Part 1

Saša Barišić, Mid Software Developer
07.03.2023.

This post compares the speed of two binaries, both targeting 64-bit (x64) Windows. One is written in C# (.NET Framework 4.8); the other is written in C (Legacy MSVC standard) and compiled with Microsoft's Visual C compiler. All benchmarks were run on an Intel Core i7-8550U CPU.

Both .dll binaries export the same API and can be used from any other external program. You might be familiar with how something like that can be done in a native language like C. It's easy. All you need to do is the following:

__declspec(dllexport) void test_func() {
    // DO SOMETHING
}

And any other program wanting to use the exported function can do that by using the WinAPI functions LoadLibrary/GetProcAddress, like this:

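#include <windows.h>

int main(void) {
    // LoadLibraryA takes a narrow string regardless of the UNICODE setting.
    HMODULE module = LoadLibraryA("test.dll");
    if (module == NULL) return 1;

    // Look up the export by name and cast it to the matching function type.
    void (*test_func)(void) = (void (*)(void))GetProcAddress(module, "test_func");
    if (test_func == NULL) return 1;

    test_func();

    FreeLibrary(module);
    return 0;
}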

And C#? Isn't C# backed by some sort of intermediate language running on some sort of virtual machine? Well, it is! But that doesn't mean you can't export functions from dll files written in C# and use them in a pure-C program. No, you don't even have to set up the .NET runtime yourself to handle that.

.NET exe programs contain a native stub function (main(), basically) that bootstraps the .NET runtime in the background, and the .NET runtime handles all the custom "VM" stuff: Just-in-Time compilation, optimization and much more. This can be very powerful, because things like vector and matrix operations can be just-in-time compiled to the most optimized native code for the specific platform (Intel vs AMD, AVX512, AVX2, AVX, SSE4.1, SSE2, SSE, and so on).

The same can be done with any other static function: the .NET intermediate language already supports directives for adding functions, with their appropriate stubs, to the dll export table. The catch is that the C# language itself does not offer an attribute for exporting by default, only DllImport for importing.

The tool used to add DllExport functionality in this demo is the following: https://github.com/3F/DllExport

Demo

The demo application is a simple software renderer. It loads a model of a cup in OBJ format and a texture from a PNG file, some math magic happens, and you get a stream of pixels out. The resulting image is displayed in a window using GLFW, which also handles keyboard input, window creation, event handling, and so on.
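
The post does not show the renderer's source, so the following is only a rough illustration of the kind of integer-heavy inner loop a software renderer spends its time in. It is a minimal sketch of Bresenham's line algorithm, which a wireframe mode could use to draw triangle edges; the framebuffer struct and the function names are hypothetical and not taken from the demo.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical framebuffer: 32-bit pixels stored row by row. */
typedef struct {
    int width, height;
    unsigned int *pixels;
} framebuffer;

/* Write one pixel, silently skipping coordinates outside the buffer. */
static void put_pixel(framebuffer *fb, int x, int y, unsigned int color) {
    if (x >= 0 && x < fb->width && y >= 0 && y < fb->height)
        fb->pixels[y * fb->width + x] = color;
}

/* Bresenham's line algorithm, integer-only, valid for all directions. */
static void draw_line(framebuffer *fb, int x0, int y0, int x1, int y1,
                      unsigned int color) {
    assert(fb != NULL && fb->pixels != NULL);
    int dx = abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;  /* combined error term for both axes */
    for (;;) {
        put_pixel(fb, x0, y0, color);
        if (x0 == x1 && y0 == y1) break;
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }  /* step along x */
        if (e2 <= dx) { err += dx; y0 += sy; }  /* step along y */
    }
}
```

Drawing the three edges of every triangle in the model with a routine like this would give a wireframe image; the full rendering mode has to fill and texture each triangle as well, which is far more work per frame.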

The launcher uses either the C binary or the C# binary, depending on what the user enters at a prompt when the program starts.

It binds the exported functions from the dll files to a set of delegates (basically, function pointers). After that, the main launcher program loads the model and makes the same API calls to whichever backend was chosen, so both produce identical output on the screen.
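
The binding step can be sketched in C as a table of function pointers filled in by name. This is a minimal sketch under stated assumptions: the API struct, the export names (renderer_init, renderer_render_frame) and the resolver callback are all hypothetical, and the real launcher uses delegates rather than this code. On Windows the resolver would typically wrap GetProcAddress; here a stand-in backend is used so the sketch runs without a real dll.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Generic function pointer type, similar in spirit to WinAPI's FARPROC. */
typedef void (*generic_fn)(void);

/* Resolver: returns the address of a named export, or NULL if missing. */
typedef generic_fn (*resolver_fn)(void *module, const char *name);

/* Hypothetical renderer API; these names are illustrative only. */
typedef struct {
    int  (*init)(int width, int height);
    void (*render_frame)(unsigned int *pixels);
} renderer_api;

/* Fill the table from one backend.
   Returns 0 on success, -1 if any export could not be resolved. */
static int bind_renderer(renderer_api *api, resolver_fn resolve, void *module) {
    assert(api != NULL && resolve != NULL);
    api->init = (int (*)(int, int))resolve(module, "renderer_init");
    api->render_frame =
        (void (*)(unsigned int *))resolve(module, "renderer_render_frame");
    return (api->init && api->render_frame) ? 0 : -1;
}

/* Stand-in backend so the sketch works without loading a dll. */
static int fake_init(int width, int height) { return width * height; }
static void fake_render_frame(unsigned int *pixels) { pixels[0] = 0xFF00FF00u; }

static generic_fn fake_resolve(void *module, const char *name) {
    (void)module;  /* a real resolver would pass this to GetProcAddress */
    if (strcmp(name, "renderer_init") == 0)
        return (generic_fn)fake_init;
    if (strcmp(name, "renderer_render_frame") == 0)
        return (generic_fn)fake_render_frame;
    return NULL;
}
```

Once bind_renderer has filled the table, the rest of the launcher can call api.init and api.render_frame without knowing which backend is behind them, which is what makes a like-for-like comparison possible.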

There is almost no difference between the C and C# versions: the C# code is basically copy-pasted from the C code, with minor changes just to make it compile. This is also why you can see actual old-school function pointers in .NET code.

Benchmarks

The whole solution was compiled in debug mode with all optimizations turned off, and in release mode with all optimizations turned on. All benchmark numbers were taken at the 1000th rendered frame to keep them consistent. Stats were printed to the console once every 2 seconds.
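
The measurement scheme described above can be sketched as a small per-frame bookkeeping routine. This is a minimal sketch, not the demo's actual code: the struct and function names are hypothetical, and the elapsed time is passed in as a parameter so the timing source stays an assumption (GLFW's glfwGetTime() would be one natural choice in a GLFW program).

```c
#include <assert.h>
#include <stdio.h>

#define TARGET_FRAME    1000  /* numbers are read at the 1000th frame */
#define REPORT_INTERVAL 2.0   /* stats are printed every 2 seconds */

typedef struct {
    int    frames;       /* frames rendered so far */
    double last_report;  /* elapsed time at the previous report, in seconds */
} frame_stats;

/* Average frames per second since the start of the run. */
static double average_fps(int frames, double elapsed_seconds) {
    return elapsed_seconds > 0.0 ? frames / elapsed_seconds : 0.0;
}

/* Call once per rendered frame with the time elapsed since the run started.
   Prints a stats line whenever the report interval has passed.
   Returns 1 once the target frame has been reached, otherwise 0. */
static int frame_done(frame_stats *s, double elapsed_seconds) {
    assert(s != NULL);
    s->frames++;
    if (elapsed_seconds - s->last_report >= REPORT_INTERVAL) {
        printf("frame %d, %.1f s, %.2f FPS avg\n", s->frames, elapsed_seconds,
               average_fps(s->frames, elapsed_seconds));
        s->last_report = elapsed_seconds;
    }
    return s->frames >= TARGET_FRAME;
}
```

A render loop would call frame_done after each frame and record the elapsed time at the moment it first returns 1. Reading the numbers at a fixed frame count rather than after a fixed duration means both backends are compared on exactly the same amount of work.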

Debug mode, wireframe

Performance of the two was basically identical. Both rendering backends delivered the 1000th frame in about 26 seconds.

Release mode, wireframe

Performance is very similar; the native C version is about 4% faster here. That is still not worth sacrificing all the new language features for such a small gain.

Debug mode, full rendering

Interesting! With no compiler optimizations, the .NET version is actually faster. The C binary rendered the 1000th frame in 692 seconds, a 1.45 FPS average, while the C# binary did the same in about 422 seconds, a 2.38 FPS average.
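
As a quick sanity check on these figures, average FPS is just frames divided by seconds. Dividing 1000 frames by the quoted times gives roughly 1.45 and 2.37 FPS, in line with the averages above (the C# time is only approximate), and puts the debug C# build at about 1.64 times the speed of the debug C build. The helper names in this small sketch are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Average FPS for a run that rendered `frames` frames in `seconds` seconds. */
static double fps(double frames, double seconds) {
    assert(seconds > 0.0);
    return frames / seconds;
}

/* How many times faster run B was than run A for the same amount of work. */
static double speedup(double seconds_a, double seconds_b) {
    assert(seconds_b > 0.0);
    return seconds_a / seconds_b;
}

/* Prints the figures derived from the debug-mode, full-rendering times. */
static void print_debug_full_rendering(void) {
    printf("C:  %.2f FPS\n", fps(1000, 692));
    printf("C#: %.2f FPS\n", fps(1000, 422));
    printf("C# speed relative to C: %.2fx\n", speedup(692, 422));
}
```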

Release mode, full rendering

And now the actual results that everybody is interested in. The C version delivered the 1000th frame in 41 seconds. The C# version took twice as long.

Conclusion

The example problem in this demo is written to push a single CPU core to its limit. Functions that are very hard to parallelize, take a lot of processor time to complete and hog the main application's performance might benefit from being rewritten in a lower-level language like C. In future posts, we're going to look into simple optimizations we can make to the C# code to get it just as fast as, if not faster than, the C equivalent. After all, nobody uses pointers in modern .NET code these days.
