Massive GPU bottleneck on frame copy operation on Linux

Minerscale

New Member
Nov 14, 2025
Hi,

I was testing the game under Proton on my Arch Linux machine, with the latest NVIDIA graphics drivers and a 3070 Ti. (I understand that Linux isn't supported, but I figured I would report this anyway since I spent some time tracking down where the bug might be.) It works pretty great! Except that I'm seeing about a third to a half of the performance I would expect.

This led me on a RenderDoc frame-timing bender to see if I could spot any obvious problems. I found two adjacent draw commands that took 5.7 ms and 6.2 ms respectively (Colour Pass #4 and Colour Pass #5, if this means anything to you). Even funnier, the command appears to be a copy operation: the offending shader is 'Content/Core/Shaders/Copy.frag'. I'm not exactly sure why these operations would be particularly slow.

Maybe it's better to copy the images some other way? Perhaps without using the fancypants subpassInput? I've thought about it for some time and I've decided I have no idea, but I hope that this is enough information for you all to possibly know what's going on.

This is in the default scene with just Earth in it, by the way. I believe it's reproducible with all graphics settings.

I hope this is useful and thank you for your time. Thanks for your beautiful work so far too, it's gorgeous.

If you need any more information feel free to ask.
 

Attachments

  • timings.png
Upon further reflection and testing, I've found that there's a bit over 2 GiB/s of upload over PCIe to my graphics card. Since I use an eGPU, I expect that this is pretty much all the bandwidth available, which would explain the thrashing of the GPU. I think the deeper cause is that some framebuffers are accidentally being created with Vulkan memory that is HOST_COHERENT or HOST_CACHED instead of DEVICE_LOCAL, so some frame data lives in system RAM instead of on the GPU, causing catastrophic slowdowns. The effect is especially bad on my hardware because the eGPU link makes RAM -> VRAM transfers particularly slow.
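
A rough sanity check of those numbers, as a sketch only: the resolution and pixel formats below are assumptions (nothing here was read from the capture), and the link speed is rounded down to exactly 2 GiB/s. It estimates how long moving one full-frame image across such a link would take, which lands in the same few-millisecond range as the 5.7 ms and 6.2 ms passes.

```python
# Back-of-envelope estimate: time to move one full-frame image over a slow link.
GIB = 1024 ** 3


def transfer_ms(width, height, bytes_per_pixel, bandwidth_bytes_per_s):
    """Milliseconds needed to push one image of the given size across the link."""
    image_bytes = width * height * bytes_per_pixel
    return image_bytes / bandwidth_bytes_per_s * 1000.0


# Observed: "a bit over 2 GiB/s" of PCIe upload (rounded down here).
link_bytes_per_s = 2 * GIB

# Assumed frame size and formats, purely for illustration.
for name, bytes_per_pixel in (("RGBA8", 4), ("RGBA16F", 8)):
    ms = transfer_ms(1920, 1080, bytes_per_pixel, link_bytes_per_s)
    print(f"1920x1080 {name}: {ms:.2f} ms per copy")
```

If the real render targets are larger or use wider formats, the per-copy time grows proportionally, so this is only an order-of-magnitude argument.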

Anyway, something to look into.
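
For anyone looking into it, here is a minimal model of the memory-type selection logic that would be worth checking. It is plain Python, not real Vulkan calls: the flag values match `VkMemoryPropertyFlagBits`, but `find_memory_type` and the memory-type layout are hypothetical stand-ins, and I have no knowledge of how the game actually allocates its framebuffers.

```python
# Bit values of VkMemoryPropertyFlagBits, as defined in the Vulkan headers.
DEVICE_LOCAL = 0x1
HOST_VISIBLE = 0x2
HOST_COHERENT = 0x4
HOST_CACHED = 0x8


def find_memory_type(type_bits, memory_types, required, avoid=0):
    """Pick a memory type index (hypothetical helper).

    type_bits    -- bitmask of allowed types (as in VkMemoryRequirements.memoryTypeBits)
    memory_types -- list of property flags, one entry per memory type
    required     -- flags that must all be present
    avoid        -- flags we would rather not have; used only as a preference
    """
    fallback = None
    for index, flags in enumerate(memory_types):
        if not (type_bits >> index) & 1:
            continue  # this resource cannot live in that memory type
        if flags & required != required:
            continue  # missing a required property
        if flags & avoid == 0:
            return index  # best case: required flags and nothing to avoid
        if fallback is None:
            fallback = index  # acceptable, but keep looking for better
    return fallback


# Assumed layout, loosely resembling a discrete GPU.
types = [
    DEVICE_LOCAL,                                 # 0: VRAM
    HOST_VISIBLE | HOST_COHERENT,                 # 1: system RAM
    HOST_VISIBLE | HOST_COHERENT | HOST_CACHED,   # 2: system RAM, cached
    DEVICE_LOCAL | HOST_VISIBLE | HOST_COHERENT,  # 3: host-visible VRAM
]

# Render targets should ask for DEVICE_LOCAL and prefer non-host-visible memory.
good = find_memory_type(0b1111, types, DEVICE_LOCAL, avoid=HOST_VISIBLE)
# Asking only for HOST_VISIBLE puts the image in system RAM.
bad = find_memory_type(0b1111, types, HOST_VISIBLE)
print(good, bad)
```

With this layout the first request lands in VRAM (index 0), while the second lands in system RAM (index 1), which is the kind of mix-up that would force every read of that image across PCIe.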
 
Yeah, this would also explain why some of us on laptops with an RTX 4060M and a Ryzen 7840HS see system slowdowns: the game is using a huge amount of PCIe bandwidth, especially when it crashes and tries to unload itself. Hopefully someone can bring this up with the devs on the Discord somehow.