iOS. Metal rendering. .Net worse performance than Xamarin when use IMTLDevice.CreateBuffer #21290

Closed
Alexandr-Gurinovich opened this issue Sep 24, 2024 · 9 comments
Labels
need-info Waiting for more information before the bug can be investigated

Comments

@Alexandr-Gurinovich

Apple platform

iOS

Framework version

net8.0-*

Affected platform version

17.11.4.40609, dotnet 8.0.401

Description

In my Xamarin.iOS project I use Metal for rendering.

After migrating from Xamarin to .NET 8, I noticed rendering performance issues: FPS drops on low-end devices, higher energy consumption, and increased device heating (according to user complaints).

I profiled the render loop and found that IMTLDevice.CreateBuffer(IntPtr pointer, UIntPtr length, MTLResourceOptions options) takes much longer to execute after the migration to .NET 8.

Steps to Reproduce

  1. Have a Xamarin application that calls IMTLDevice.CreateBuffer(IntPtr pointer, UIntPtr length, MTLResourceOptions options) several times every frame.
  2. Migrate the project to .NET 8.
    Result: The render loop runs slower and causes performance issues.

Did you find any workaround?

The Xamarin implementation probably has some caching layer for buffer creation. A sample solution is attached. Caching the native constructor saves about a third of the execution time (roughly 34% in the run below).

MetalSample[999:257283] Time spent with regular CreateBuffer call: 100.1 ms
MetalSample[999:257283] Time spent with constructor cached: 66.2 ms

MetalSample.zip
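
The attached sample is not reproduced in this thread, so the following is only a generic illustration of what "caching the constructor" can mean in plain .NET: resolve a wrapper type's constructor once per type and reuse it, instead of resolving it on every call. `FakeBuffer` and `WrapperFactory` are hypothetical names invented for this sketch and do not correspond to any Xamarin or .NET for iOS API.

```cs
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Both paths build the same wrapper; only the constructor lookup differs.
var handle = new IntPtr(42);
var slow = (FakeBuffer)WrapperFactory.CreateUncached(typeof(FakeBuffer), handle);
var fast = (FakeBuffer)WrapperFactory.CreateCached(typeof(FakeBuffer), handle);
Console.WriteLine(slow.Handle == fast.Handle);

// Hypothetical stand-in for a managed wrapper around a native handle.
sealed class FakeBuffer
{
    public IntPtr Handle { get; }
    public FakeBuffer(IntPtr handle) => Handle = handle;
}

static class WrapperFactory
{
    // One resolved constructor per wrapper type.
    static readonly ConcurrentDictionary<Type, ConstructorInfo> Cache = new();

    static ConstructorInfo Lookup(Type type) =>
        type.GetConstructor(new[] { typeof(IntPtr) })
        ?? throw new MissingMethodException(type.FullName, ".ctor(IntPtr)");

    // Resolves the constructor via reflection on every call.
    public static object CreateUncached(Type type, IntPtr handle) =>
        Lookup(type).Invoke(new object[] { handle });

    // Resolves the constructor once, then reuses the cached ConstructorInfo.
    public static object CreateCached(Type type, IntPtr handle) =>
        Cache.GetOrAdd(type, t => Lookup(t)).Invoke(new object[] { handle });
}
```

Whether the Xamarin or .NET runtime performs a comparable lookup internally is not established in this issue; the sketch only shows the shape of the workaround.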

Relevant log output

No response

@Alexandr-Gurinovich Alexandr-Gurinovich changed the title iOS. Metal rendering. .Net worth performance than Xamarin when use IMTLDevice.CreateBuffer iOS. Metal rendering. .Net worse performance than Xamarin when use IMTLDevice.CreateBuffer Sep 24, 2024
rolfbjarne added a commit that referenced this issue Sep 25, 2024
…eturn retained objects.

Instead of doing this:

```cs
var obj = Runtime.GetINativeObject<T>(handle, false);
NSObject.ReleaseDangerous (handle);
```

we now do this:

```cs
var obj = Runtime.GetINativeObject<T>(handle, true);
```

Less generated code, and better performance too.

This was found while investigating #21290.
@rolfbjarne
Member

Can you try adding the following to your csproj to see if it makes a difference?

<PropertyGroup>
    <MtouchExtraArgs>--require-pinvoke-wrappers=true</MtouchExtraArgs>
    <Registrar>managed-static</Registrar>
</PropertyGroup>

You didn't provide a Xamarin sample, so I can't compare it with your .NET project to see if these changes make it reach the Xamarin performance.

@rolfbjarne rolfbjarne added the need-info Waiting for more information before the bug can be investigated label Sep 25, 2024
@rolfbjarne rolfbjarne added this to the Future milestone Sep 25, 2024
@Alexandr-Gurinovich
Author

Alexandr-Gurinovich commented Sep 25, 2024

<PropertyGroup>
    <MtouchExtraArgs>--require-pinvoke-wrappers=true</MtouchExtraArgs>
    <Registrar>managed-static</Registrar>
</PropertyGroup>

Hi, I've tested the additional parameters and did not observe any change in performance.

I'm attaching the Xamarin sample and an updated dotnet sample. I'm unable to run the Xamarin sample on my local machine right now, so I apologize for not providing comparison logs.

dotnet_and_xamarin.zip

Please share a link to the documentation for the require-pinvoke-wrappers parameter.

@microsoft-github-policy-service microsoft-github-policy-service bot added need-attention An issue requires our attention/response and removed need-info Waiting for more information before the bug can be investigated labels Sep 25, 2024
@rolfbjarne
Member

I've found a minor performance improvement we can do (#21297), and that cuts down a little bit of the difference between your factory implementation and the built-in Runtime.GetINativeObject implementation.

Additionally, your profiling is not entirely accurate: one of the scenarios runs one more loop than the other, which skews the results somewhat.

Something like this solves that:

var counter = 10000;
var factoryItems = new IDisposable [counter];
var apiItems = new IDisposable [counter];

// test CreateBuffer call performance
var apiTime = Stopwatch.StartNew();
for (var i = 0; i < counter; i++)
{
    apiItems [i] = _device.CreateBuffer(bufferPtr, (UIntPtr)size, MTLResourceOptions.CpuCacheModeDefault);
}
apiTime.Stop ();

// test CreateBuffer call with cached constructor performance
var factoryTime = Stopwatch.StartNew();
for (var i = 0; i < counter; i++)
{
    factoryItems [i] = _cachedFactory.CreateBuffer(bufferPtr, (UIntPtr)size, MTLResourceOptions.CpuCacheModeDefault, _device.Handle);
}
factoryTime.Stop ();


for (var i = 0; i < counter; i++) {
    factoryItems [i].Dispose ();
    apiItems [i].Dispose ();
}

Console.WriteLine($"Time spent with regular CreateBuffer call: {(long) apiTime.Elapsed.TotalMilliseconds} ms");
Console.WriteLine($"Time spent with constructor cached: {(long) factoryTime.Elapsed.TotalMilliseconds} ms");

With these two changes (and the ones in the csproj I previously mentioned) the Runtime.GetINativeObject implementation is 85-90% of your factory implementation according to the timing in the code.

Interestingly, the numbers I get in Instruments are different: there, the Runtime.GetINativeObject implementation is 96-97% of your factory implementation.

[Screenshot 2024-09-25 at 13 54 54]

Digging further into each of these call stacks reveals that the difference in time is actually in low-level iOS code, so I have no idea what's going on here.
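
The timing pattern in this comment can be generalized into a small helper so that both scenarios are guaranteed to run the same number of iterations and disposal stays outside the timed region. This is a minimal sketch, not code from the attached sample: `Fill` and `Release` are hypothetical helper names, the `MemoryStream` lambdas are placeholders for the two real allocation paths, and the warm-up pass is an added assumption that the original snippet does not include.

```cs
using System;
using System.Diagnostics;
using System.IO;

const int iterations = 10_000;

// Placeholder workloads (assumption): swap in the two allocation paths being compared.
Func<IDisposable> regularPath = () => new MemoryStream(16);
Func<IDisposable> cachedPath = () => new MemoryStream(16);

// Warm up both paths so one-time costs do not land in only one measurement.
Release(Fill(regularPath, 100, out _));
Release(Fill(cachedPath, 100, out _));

// Both paths run exactly `iterations` times; disposal happens after timing.
var regularItems = Fill(regularPath, iterations, out var regularTime);
var cachedItems = Fill(cachedPath, iterations, out var cachedTime);
Release(regularItems);
Release(cachedItems);

Console.WriteLine($"Regular: {regularTime.TotalMilliseconds:F1} ms");
Console.WriteLine($"Cached:  {cachedTime.TotalMilliseconds:F1} ms");

// Runs `create` exactly `count` times and reports only the time spent in that loop.
static IDisposable[] Fill(Func<IDisposable> create, int count, out TimeSpan elapsed)
{
    var items = new IDisposable[count];
    var sw = Stopwatch.StartNew();
    for (var i = 0; i < count; i++)
        items[i] = create();
    sw.Stop();
    elapsed = sw.Elapsed;
    return items;
}

// Disposes everything once the measurements are complete.
static void Release(IDisposable[] items)
{
    foreach (var item in items)
        item.Dispose();
}
```

Because the loop lives in one function, the two scenarios cannot drift apart in iteration count, which is the skew described above.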

@rolfbjarne
Member

> I'm attaching the Xamarin sample and an updated dotnet sample. I'm unable to run the Xamarin sample on my local machine right now, so I apologize for not providing comparison logs.
>
> dotnet_and_xamarin.zip

I modified these a bit to make sure they test exactly the same thing: https://gist.github.com/rolfbjarne/1c6239078d9cfd186c4e9d8f145ae9c4

And the output I get with the Xamarin sample is something like this:

[...]
Time spent per second with regular CreateBuffer call: 95.6878 ms. Buffer Count: 21290 Time per buffer: 0.0782467167684359
Time spent per second with regular CreateBuffer call: 94.4411 ms. Buffer Count: 22490 Time per buffer: 0.0782709515340151
Time spent per second with regular CreateBuffer call: 93.2856 ms. Buffer Count: 23690 Time per buffer: 0.078243955255382
Time spent per second with regular CreateBuffer call: 95.4141 ms. Buffer Count: 24890 Time per buffer: 0.0783050783447168
Time spent per second with regular CreateBuffer call: 96.7903 ms. Buffer Count: 26090 Time per buffer: 0.0784133269451897

and the output I get with the .NET sample is:

[...]
Time spent per second with regular CreateBuffer call: 65.4067 ms. Buffer Count: 30710 Time per buffer: 0.05200690003256268
Time spent per second with regular CreateBuffer call: 66.06529999999997 ms. Buffer Count: 31910 Time per buffer: 0.0521215042306487
Time spent per second with regular CreateBuffer call: 64.91640000000002 ms. Buffer Count: 33110 Time per buffer: 0.05219310178193899
Time spent per second with regular CreateBuffer call: 66.15220000000002 ms. Buffer Count: 34310 Time per buffer: 0.05229570970562518
Time spent per second with regular CreateBuffer call: 64.8674 ms. Buffer Count: 35510 Time per buffer: 0.052355201351731905

In other words: the .NET version is performing better than the Xamarin version.
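
As a rough check of that conclusion, the first "Time per buffer" value from each log above can be compared directly. This is only illustrative arithmetic: the variable names are made up for the sketch, and the unit of the logged values is assumed to be milliseconds.

```cs
using System;

// First logged "Time per buffer" value from each run (unit assumed to be ms).
const double xamarinPerBuffer = 0.0782467167684359;
const double dotnetPerBuffer = 0.05200690003256268;

// Fraction of the Xamarin per-buffer time that the .NET build needs.
double ratio = dotnetPerBuffer / xamarinPerBuffer;
Console.WriteLine($".NET time per buffer as a fraction of Xamarin: {ratio:F2}");
```

On these numbers the .NET build needs roughly two thirds of the Xamarin time per buffer.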

Could there be something else in your .NET app that's slower than the Xamarin app now?

@rolfbjarne rolfbjarne added need-info Waiting for more information before the bug can be investigated no-auto-reply For internal use and removed need-attention An issue requires our attention/response labels Sep 26, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot removed the no-auto-reply For internal use label Sep 26, 2024
@Alexandr-Gurinovich
Author

I'm really surprised by these results. I think the sample I prepared does not reflect the problem I faced in my project. I apologize for the confusion.

As I said in the issue description, in my project the CreateBuffer call took much longer to execute after the project was migrated to .NET 8, and the workaround I provided simply restored the previous performance. Here are the statistics from Xcode Organizer:
[image: Xcode Organizer statistics]

I see the same results for FPS in internal monitoring tools.

@microsoft-github-policy-service microsoft-github-policy-service bot added need-attention An issue requires our attention/response and removed need-info Waiting for more information before the bug can be investigated labels Sep 27, 2024
@rolfbjarne
Member

Are these numbers with these additions to the csproj?

<PropertyGroup>
    <MtouchExtraArgs>--require-pinvoke-wrappers=true</MtouchExtraArgs>
    <Registrar>managed-static</Registrar>
</PropertyGroup>

@Alexandr-Gurinovich
Author

> Are these numbers with these additions to the csproj?
>
> <PropertyGroup>
>     <MtouchExtraArgs>--require-pinvoke-wrappers=true</MtouchExtraArgs>
>     <Registrar>managed-static</Registrar>
> </PropertyGroup>

Without the additions to the csproj.

I will measure with them added to my real project, but it will take time.

@rolfbjarne
Member

> I will measure with them added to my real project, but it will take time.

OK, sounds good.

If this issue ends up auto-closing before you get the results, feel free to comment once you've got the results and we'll reopen.

@rolfbjarne rolfbjarne added need-info Waiting for more information before the bug can be investigated no-auto-reply For internal use and removed need-attention An issue requires our attention/response labels Sep 27, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot removed the no-auto-reply For internal use label Sep 27, 2024
rolfbjarne added a commit that referenced this issue Oct 4, 2024
…eturn retained objects. (#21297)

Instead of doing this:

```cs
var obj = Runtime.GetINativeObject<T>(handle, false);
NSObject.ReleaseDangerous (handle);
```

we now do this:

```cs
var obj = Runtime.GetINativeObject<T>(handle, true);
```

Less generated code, and better performance too.

This was found while investigating #21290.
Contributor

Hi @Alexandr-Gurinovich. Due to inactivity, we will be closing this issue. Please feel free to re-open this issue if the issue persists. For enhanced visibility, if over 7 days have passed, please open a new issue and link this issue there. Thank you.
