High load on RTX Ada, low on Intel Arc
-
My laptop, for example, has two sets of ports. The set on the left connects directly to the NVIDIA card, while the set on the back of my machine connects through the Intel card.
If you look at the PhysX section of your NVIDIA control panel, you can see which physical ports connect to which GPU. Some laptops allow these assignments to be changed via GPU settings; my machine, for instance, allows three modes: active power saving, which switches between GPUs; low power usage, which doesn't use the dedicated GPU at all; and dedicated-only, where the Intel GPU is never used. So the PhysX section of the control panel will show whether any connected display is running through the Intel GPU, and which port facilitates the connection (for me, these only update after I make a change and reboot the system). Unfortunately, not all PCs offer the same variety of setups.
-
Many adjustments and reboots later, and still no luck. I've found the diagram of connections, and whether I connect to the HDMI or USB-C ports, they all show as connected to the Intel Arc, with only PhysX pointing to the NVIDIA RTX 2000. I've played with all the performance/quality options I could find in the NVIDIA panel, the Intel Arc panel, and the Windows power/graphics settings, and all the connections still go to the Arc card.
I tried disabling the Arc card in Device Manager. Windows still knew the NVIDIA card existed but refused to send anything to it, disabled all outputs except the laptop screen, and reported everything as running on a Basic Display Adapter.
Any further ideas? This has me stumped.
-
Have you talked to Dell support? In the past I have found them to be knowledgeable when I had issues (not related to yours) with a Precision mobile workstation. -
Not yet, but a good idea after I do some more testing. My latest switch to the Game Ready series of drivers seems to have stabilized load at under 30%, but that's still well above the Arc's 4%. If I have time before vacation I'll compare to a desktop with a GeForce/UHD combo, otherwise more results to come in a few weeks.
-
@kfriedberg said:
30%
It may be that there is some delivery delay going through the Arc to the NVIDIA GPU. However, a heavy patch that maxes out Intel Arc-only processing might be handled easily via the Arc-to-NVIDIA path. I would suggest running some heavy test files where the Intel processing maxes out, then seeing how high the load goes on the NVIDIA.
-
It certainly took longer than expected to get back to it, but here's some load results from a different computer.
Specs:
Dell OptiPlex Tower Plus 7020
Windows 11
Core i7-14700
32 GB RAM
512 GB SSD
Intel UHD Graphics 770
NVIDIA GeForce RTX 4060 8 GB
Under three conditions on each test:
Idle, not showing stages (same result on empty sketch or using an existing show)
Empty sketch, showing stages
Running the show that I was testing in the first post
In these tests, the primary control output was plugged into the RTX, and the stage output was plugged into the given card. Also, using the Windows graphics settings to tell Isadora to run on Power Saving (Intel) or High Performance (RTX) had no effect on these results.
Intel idle: 0.2%
Intel show stages: 55%
Intel running show: 34-64%
RTX idle: 0.2%
RTX show stages: 33%
RTX running show: 18-37%
Here's a different set of results. This time the primary output, showing and controlling the Isadora sketch, is plugged into the Intel output, and the stage output is plugged into the given card. Now the Windows graphics settings do make a difference.
Intel Power Saving idle: 0.2%
Intel Power Saving show stages: 1.2%
Intel Power Saving running show: 1.3-3.5%
Intel High Performance idle: 0.2%
Intel High Performance show stages: 77%
Intel High Performance running show: 65-94%
RTX Power Saving idle: 0.3%
RTX Power Saving show stages: 14%
RTX Power Saving running show: 10-17%
RTX High Performance idle: 0.2%
RTX High Performance show stages: 33%
RTX High Performance running show: 22-54%
It's tough to say what all that means behind the scenes, other than that the graphics pipeline is complicated. The rules seem to be:
1) Don't mix graphics cards
2) Prefer Intel to Nvidia
3) If you have to mix graphics cards, put the control output and as many stage outputs as possible on Intel, then further stages on Nvidia -
These are logical results. One important point is that if you use two graphics cards (i.e. a control screen plugged into one and stages plugged into another), the data (including every frame of image or video) must reside on both cards. In Isadora, the control screen has access to the video streams that are sent to the stage outputs (if you hover over a connection, it provides a real-time preview). Whenever this happens, the data must be copied from one GPU, passed through the CPU, and then copied to the other GPU.
Downloading from a GPU is always very slow and uses a lot of resources, whereas uploading is much quicker because it is a more common accelerated function. Typically, data is calculated on the CPU and then sent to a single GPU for rendering; this covers most GPU use cases. In your setup, though, every piece of video data needs to be on both cards. This means uploading to one card, performing the render, downloading to the CPU, and uploading to the other card. Lots of extra work.
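To make that copy cost concrete, here is a rough timing sketch in Python using PyTorch. It assumes a machine with two CUDA devices (not the Intel-plus-NVIDIA mix in this thread, where the same staging through the CPU applies), and the frame size and iteration count are arbitrary:

```python
# Hypothetical timing sketch: upload (CPU -> GPU), download (GPU -> CPU),
# and a cross-device copy. Assumes two CUDA devices are present.
import time
import torch

def timed(label, fn, iters=20):
    fn()  # warm up so allocation isn't measured
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        fn()
    torch.cuda.synchronize()
    print(f"{label}: {(time.perf_counter() - t0) / iters * 1000:.2f} ms/frame")

# One 1920x1080 RGBA float frame, like a texture headed for a stage output.
frame_cpu = torch.randn(1080, 1920, 4)
frame_gpu0 = frame_cpu.to("cuda:0")

timed("upload   CPU  -> GPU0", lambda: frame_cpu.to("cuda:0"))
timed("download GPU0 -> CPU ", lambda: frame_gpu0.to("cpu"))
# Without peer-to-peer support, this last copy is staged through host
# memory: a download plus an upload for every frame.
timed("copy     GPU0 -> GPU1", lambda: frame_gpu0.to("cuda:1"))
```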
Integrated GPUs do not need to be fed via a PCIe slot, since they share system memory, so that path can be a bit faster (which explains some of your results). NVLink used to be a way to avoid the overhead of this kind of copying by linking GPUs directly, but it is now only available on professional cards and requires software to be written specifically to use it. TouchDesigner has some very specific tools that let two Quadro cards take advantage of NVLink, but only under strict limitations and in certain scenarios.
Overall, the integrated GPU in your system is much less powerful. To make the best use of your computer, I suggest using a single GPU unless you know for certain that you can avoid transferring data between them (you cannot with Isadora, or really with any other software), or you know that the cost of moving all the data around will be worth it for some reason. Isadora cannot stop data moving between cards (maybe if the previews and thumbnails were disabled, but that's very counterintuitive); almost no media software can do that. Disable the integrated card in the BIOS and only use the GeForce card. Also try running the tests again properly, disabling each card in turn, to see what each one actually does and to get more logical results, free of the overhead from copying data. If you need more outputs from your GeForce, you might consider a Datapath device or a video wall splitter.
In short, this is not a limitation of Isadora—this is simply how computers work. If you look at high-end media servers like Disguise, they typically use a single card with specialised internal splitters to provide multiple outputs.
As an aside, outputting video via ArtNet to LED pixel displays requires downloading the textures from the GPU and then creating network packets on the CPU, which is slow on any system. However, the unified memory in Apple Silicon chips means this is very fast because no download is needed; the GPU and CPU share the same memory space. This also brings massive benefits for uploading data from the CPU to the GPU. For example, in a normal rendering pipeline or a non-GPU-accelerated video codec decode, once the CPU finishes decoding a frame, it is instantly available to the GPU without any extra steps. This is a big advantage over PCIe-connected GPUs when large amounts of data need to move around.
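For a sense of the CPU-side work this involves, here is a minimal Python sketch of the Art-Net packetization step for one frame of pixels. The packet layout follows the published ArtDmx format, but the function names, frame size, and node IP are placeholders of my own:

```python
# Hedged sketch: slice an RGB frame (already downloaded to CPU memory)
# into DMX universes and send each as an ArtDmx UDP packet.
import socket
import struct

ARTNET_PORT = 6454

def artdmx_packet(universe, data, sequence=0):
    header = b"Art-Net\x00"                    # protocol ID
    header += struct.pack("<H", 0x5000)        # OpDmx, little-endian
    header += struct.pack(">H", 14)            # protocol version
    header += struct.pack("BB", sequence, 0)   # sequence, physical
    header += struct.pack("<H", universe)      # SubUni + Net
    header += struct.pack(">H", len(data))     # data length, big-endian
    return header + bytes(data)

def send_frame(sock, pixels, node_ip):
    # One RGB pixel = 3 DMX channels, so 170 pixels fill a 510-byte universe.
    for universe, start in enumerate(range(0, len(pixels), 510)):
        sock.sendto(artdmx_packet(universe, pixels[start:start + 510]),
                    (node_ip, ARTNET_PORT))

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# A 100x100 RGB frame: 30,000 channels, so ~59 packets every single frame.
send_frame(sock, bytearray(100 * 100 * 3), "192.168.1.50")  # placeholder IP
```

All of that byte-shuffling happens on the CPU, for every universe, every frame - which is why the download-then-packetize path is slow on any PCIe system.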
TL;DR: don't use two different graphics cards at once - it's not Isadora's fault that this is slow.
-
@fred said:
Disable the integrated card in the BIOS and only use the GeForce card.
Done. Now the integrated card isn't in Device Manager, so as far as Windows is concerned only the GeForce exists. The results are identical to the case above where both outputs are plugged into the RTX:
RTX idle: 0.2%
RTX show stages: 33%
RTX running show: 18-37%This is still unexpected when compared to effectively only using the Intel card from the previous run
Intel Power Saving idle: 0.2%
Intel Power Saving show stages: 1.2%
Intel Power Saving running show: 1.3-3.5%
I'm no longer mixing GPUs, and from what you've said the path off the CPU to the PCIe slot is less efficient, but in theory the GeForce is significantly better for graphics than the integrated. So the numbers showing that it's that much worse for Isadora purposes are surprising.
So now I know what to do - use integrated graphics, and don't bother spending money on a GPU next time. But for my own sanity and future readers of this thread, what's happening here?
-
@kfriedberg In general the RTX card does have more power and here are a few questions - sorry if they are obvious.
1. Do you have the same numbers of monitors connected to the RTX as the integrated GPU when you do these tests? Just connecting monitors to outputs and having them active means a lot of GPU resources need to be allocated.
2. Where do you get these statistics from? What tool are you using? The percentage use is a bit vague; you will also want to look at clock speed and memory use - each time you do any processing it uses memory, so the GeForce card will start to shine as you add effects, layers, and mapping. GPU-Z has some more comprehensive stats: https://www.techpowerup.com/gp... MSI Afterburner is also popular: https://www.msi.com/Landing/af... (there's also a small polling sketch at the end of this post for reading the same counters programmatically). I should have paid more attention to your earlier posts, but Windows Task Manager has a very bad reputation for reporting GPU usage - a quick Google search will reveal a lot of discussions.
3. Do you see a difference in performance (not reporting) between the two cards - if you start adding more content, layers and processing, which card starts to reduce frame rate first - these real world tests can be more relevant than the percentage use, which doesn't tell us what the bottleneck is.
4. From the start of this thread you have been discussing the percentage use scores from windows, which are kind of voodoo. Did you embark on this research to understand the percentage use scores or because you found a real world performance difference?
These perspectives, especially real-world load tests showing which card starts to limit frame rates first, will give a better picture. In general, as a resource the GeForce card will offer a lot more power for a lot of different tasks - if you need to edit, render, or prep video on that machine, the GeForce will smash the integrated card: https://www.videocardbenchmark...
I suspect that you will see better performance, i.e. actual ability to process video (rather than a percentage report of resource use), from the GeForce card. The laptop from your original post is more complicated: the GPUs share some hardware, each gets access to specific outputs, and they are not easy to control. The desktop is much more flexible to configure.
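On point 2, here is a rough Python polling sketch using the pynvml bindings (pip install nvidia-ml-py) that reads the same load, memory, and clock counters GPU-Z displays. It is NVIDIA-only, and the one-second interval is arbitrary:

```python
# Minimal NVML poller: utilisation %, memory use, and graphics clock.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first NVIDIA GPU
try:
    while True:
        util = pynvml.nvmlDeviceGetUtilizationRates(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        clock = pynvml.nvmlDeviceGetClockInfo(handle, pynvml.NVML_CLOCK_GRAPHICS)
        print(f"load {util.gpu:3d}%  "
              f"mem {mem.used / 2**20:6.0f}/{mem.total / 2**20:.0f} MiB  "
              f"core {clock} MHz")
        time.sleep(1.0)
finally:
    pynvml.nvmlShutdown()
```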
-
I have seen these same general numbers, and Fred has really covered most of what you need to understand here.
What I will add (and I think Fred was getting to this) is that once you start adding effects, rendering, and mappings, the GPU load will increase very quickly, and the dedicated card's load will climb much more slowly than the integrated card's.
In general (a loose estimate), I can run double the effects, rendering, and mapping (or more) on my dedicated card compared to my integrated one. The integrated card will show a LOAD over 100% in Isadora rather quickly, which means the processing is taking longer than the available time for frame delivery (a rough sketch of that arithmetic follows below).
My laptop has settings allowing me to use only the dedicated graphics card or a variety of other power-saving options. I generally build projects using one of the hybrid settings so I have access to both GPUs (Isadora runs on the dedicated card due to the system settings, while browsers etc. may run on the integrated), and I switch to dedicated-only when I am doing a performance/show.
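To illustrate the arithmetic behind a LOAD over 100%: assuming the meter is roughly the processing time per frame divided by the per-frame time budget (my reading of it, not documented behaviour), the numbers work out like this:

```python
# Back-of-envelope frame-budget arithmetic, assuming LOAD is approximately
# (processing time per frame) / (frame budget) - an assumption, not spec.
def load_percent(processing_ms, fps=30):
    budget_ms = 1000 / fps  # e.g. 33.3 ms per frame at 30 fps
    return processing_ms / budget_ms * 100

print(load_percent(10))  # 30.0  -> comfortable headroom
print(load_percent(40))  # 120.0 -> frames take longer than the budget: drops
```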
-
For Fred's questions:
1. Always the same number of monitors connected, one control and one stage, and reconnecting them as needed on the desktop. On the original laptop, there's the laptop's built-in screen for control and an external monitor for stage.
2. Except for the one sentence about task manager and resource monitor in the original post, all the percentages and usage stats so far are from Isadora's load meter.
Going with GPU-Z, the results are interesting in different ways. When on the RTX, GPU-Z GPU Load roughly matches Isadora load, at around 30% on average with my test shows. When on Intel, GPU-Z GPU Load is about 10x the Isadora load, averaging 15% and 1.5% respectively.
3. I'll try that test hopefully towards the end of this week.
4. Real-world performance difference. The original reason was an actual problem: my TD was doing all the things we'd assumed worked when running a show, and it was stuttering to a freeze. Since the laptops I'd spec'd should handle shows even better than our desktops from 6 years ago, that was weird. Then discovering that forcing Isadora onto the integrated graphics - against the advice of one of the Isadora troubleshooting articles - made it run essentially perfectly made me think that more experienced users might have some insight into what we were doing wrong.
At this point, the goal is to confirm that the numbers we're seeing are normal and to find the best way to set up our hardware in the future.
Now that DusX has confirmed my numbers, I'm feeling more confident about what I'm seeing. And once I'm able to test both your theories by throwing a lot of layers and effects at the cards, I'll have a good idea of which numbers are telling the truth.