Isadora, 2013 Mac Pro, dual GPUs and CPU issues
-
@fred
Thank you Fred. I am still working on a solution with the Mac Pro, but a new PC with the GTX1080 will arrive tomorrow. This project only uses three HD video projectors, so I can use the 4th output from the GTX1080 as a control screen. The Isadora patch currently uses a 3840x1080 preview stage with Syphon before splitting it into three stages for the projectors, so it appears that using the built-in video output for a control screen is not an option. The three video projectors for development are now running at 1920x1080, but this will change to 1920x1200 when I connect to the large projectors. I have never used an MST hub, but I have now read about it. Is it correct that I can attach an MST hub to a DisplayPort on the GTX1080 and get two independent 1920x1080 signals, so that I can then drive 4 HD projectors and one control monitor?
Many thanks,
Don
-
@dritter Yes, the MST hub will appear as separate monitors that come from a single DisplayPort output.
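A rough way to sanity-check whether two monitor streams fit on one DisplayPort link is to add up their bit rates. The sketch below is only a back-of-envelope estimate: it assumes a 60 Hz refresh rate, 24 bits per pixel, a 20% allowance for blanking, and a DisplayPort 1.2 link carrying about 17.28 Gbit/s of payload. The function names are made up for illustration, and the actual limits of a given GPU and hub should be checked against their specifications.

```python
# Assumed link budget: DisplayPort 1.2 (HBR2, 4 lanes) carries roughly
# 17.28 Gbit/s of payload. Your GPU or hub may differ.
DP12_PAYLOAD_GBPS = 17.28
BITS_PER_PIXEL = 24  # assumed: 8 bits per RGB channel


def stream_gbps(width, height, refresh_hz, blanking_overhead=0.2):
    """Approximate bit rate of one video stream in Gbit/s.

    blanking_overhead is a rough allowance for blanking intervals
    (assumed 20%, not an exact timing calculation).
    """
    active_bits = width * height * refresh_hz * BITS_PER_PIXEL
    return active_bits * (1 + blanking_overhead) / 1e9


def fits(streams, budget_gbps=DP12_PAYLOAD_GBPS):
    """True if the summed stream bit rates stay within the link budget."""
    return sum(stream_gbps(*s) for s in streams) <= budget_gbps


if __name__ == "__main__":
    two_hd = [(1920, 1080, 60), (1920, 1080, 60)]
    two_wuxga = [(1920, 1200, 60), (1920, 1200, 60)]
    print("2 x 1920x1080:", fits(two_hd))
    print("2 x 1920x1200:", fits(two_wuxga))
```

Under these assumptions both pairs fit comfortably, which is consistent with the hub showing up as two separate monitors.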
-
Thank you Fred, I will try it. I now have the PC and have been working on it all day. Unfortunately I have many problems, which I will post in two minutes.
-
@dritter said:
@fred
@bonemap
Fred and Bonemap,
After 24 hours of troubleshooting, I was able to get my Mac Isadora patch working on a 3.5GHz 6-core PC with a GTX1080ti GPU. In one mode of playback, the patch plays 90 3D texture-mapped objects, 90 3D particles (each with a particle count of 50) and 6 3D Model particles (each with a particle count of 15), all controlled with live xyz data from 3 Kinects and composited onto a 3840x1080 video (ProRes or MP4) that is output to three mapped and blended HD video projections. The maximum frame rate for the PC with this configuration was 18.7 fps, and for the Mac Pro with a Radeon D700 it was 11.4 fps. Comparing various tests with this setup, the PC was 45-79% faster than the Mac Pro, but the Mac was much more stable. Is this what you expected?
Sincerely,
Don
-
Hi Don,
It is really great that you have done this comparison on your particular set of variables, actors and processes, as it is very close to the kind of work that I am also developing in Isadora. I believe that you are being very ambitious with the amount of realtime data processing being attempted, and I would say that getting a frame rate around 18fps is a great and encouraging result. There is no question that the Nvidia GPU is superior for this purpose.
I would be interested to hear how the patch set-up performs over time, particularly if the amount of OSC data creates a bottleneck and performance issues (other than fps) become a factor. I am seeing something that contributes to a slowdown of the OSC data and is not associated with frame rate. The issue only occurs after the system has been running for some time.
Thanks for posting about your experiences with the development of your work. There are numerous comparative reviews of AMD vs Nvidia, mostly focused on gaming performance, and the Nvidia GPUs are clearly outperforming the AMD graphics for realtime rendering. With a new series of very expensive cards including the Nvidia GeForce RTX 2080 Ti, the gap in performance and cost appears to be widening further still.
Best wishes
bonemap
-
@bonemap
Hello Bonemap, thank you for your various comments and for sharing your patches. They were very helpful. Good luck with your projects.
sincerely,
Don
-
@dritter Good to hear you made some progress. I was thinking about @jhoepffner's reply regarding vertices, and I had a chance to do some checking and reading. Although vertices are computed on the CPU, they are then uploaded to the GPU. For a 3D scene, the rendering system then transforms and projects every vertex to calculate what would be seen in a 2D perspective view of your 3D scene. The more vertices, the more power needed and the more time needed (= a lower frame rate). This matters enough that game developers have a vertex budget for characters and scenes. We are not using a game engine, but the pipeline for showing 3D models is exactly the same. Here is one of many articles about optimising assets and scenes for the GPU:
https://docs.unity3d.com/Manua...
Optimisation of assets is a very serious task and, done well, makes the difference between a playable game and something that eats frame rate for no reason. Done well, the optimisation will not look any different. When creating a model (mesh), artists often use settings and tools that produce many extra vertices that are not needed to show the shape you want; many of these extras come from bending tools and uniform mesh sizes. When it is time for the GPU pipeline to play the model back, we do not need them. The same can be applied to your work with Isadora.
I would guess you can get a lot more power out of your system (considering what I have seen the 1080ti do) by optimising your models and reducing the number of vertices as much as possible. You can see some hints here https://help.sketchfab.com/hc/... but in general MeshLab's decimation tool can be the first thing to try.
With a bit of work I am sure you can reach 25fps or higher.
And an edited note: power-of-2 texturing is also much more efficient, even if it means a larger texture file. This means the dimensions of your texture should be powers of 2, like 128, 512, 1024 etc. This can really reduce the load on the GPU as well.
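As a small illustration of the power-of-2 rule, here is a sketch in Python that checks a texture dimension and rounds it up to the next power of 2. The function names and the sample sizes are hypothetical and are only there to show the arithmetic; resizing the actual image would still be done in an image editor.

```python
def is_power_of_two(n: int) -> bool:
    """True when n is a positive power of two (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is greater than or equal to n."""
    if n < 1:
        raise ValueError("texture dimension must be at least 1")
    return 1 << (n - 1).bit_length()


def padded_size(width: int, height: int) -> tuple:
    """Round each texture dimension up to a power of two."""
    return next_power_of_two(width), next_power_of_two(height)


if __name__ == "__main__":
    # Hypothetical texture sizes, chosen only to show the rounding.
    for w, h in [(60, 60), (256, 512), (500, 300)]:
        print((w, h), "->", padded_size(w, h))
```

A size that is already a power of 2 in both dimensions, such as 256x512, comes back unchanged.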
-
hi,
You say: 90 3D texture-mapped objects.
Can you tell us what your textures are?
90 different textures?
Image files? Size? Format?
Thanks
Mehdi
-
I completely agree with what Fred says. It is necessary to ask Mark and the other developers to implement object instancing in Isadora. With it you dramatically reduce the bottleneck between CPU and GPU, because you send only one group of vertices and then multiply it, changing position, rotation, texture etc. for each instance. I think this is what is used for 3D particles, but that actor is very difficult to use for precise behavior.
I have asked for it many times; perhaps a collective request could be fulfilled?
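To give a feel for why instancing helps, here is a back-of-envelope sketch in Python comparing how much vertex data is sent with and without it. This is not how Isadora works internally; the vertex layout (position, normal, UV), the one-matrix-per-instance format, and the example counts are all assumptions made for illustration.

```python
FLOAT_BYTES = 4       # one 32-bit float
VERTEX_FLOATS = 8     # assumed layout: position (3) + normal (3) + UV (2)
INSTANCE_FLOATS = 16  # assumed per-instance data: one 4x4 transform matrix


def bytes_without_instancing(vertices_per_object: int, copies: int) -> int:
    """Every copy sends its own full vertex list."""
    return vertices_per_object * VERTEX_FLOATS * FLOAT_BYTES * copies


def bytes_with_instancing(vertices_per_object: int, copies: int) -> int:
    """The vertex list is sent once, plus one small transform per copy."""
    mesh = vertices_per_object * VERTEX_FLOATS * FLOAT_BYTES
    transforms = copies * INSTANCE_FLOATS * FLOAT_BYTES
    return mesh + transforms


if __name__ == "__main__":
    # Hypothetical scene: 90 copies of a 1000-vertex model.
    plain = bytes_without_instancing(1000, 90)
    instanced = bytes_with_instancing(1000, 90)
    print(f"without instancing: {plain} bytes")
    print(f"with instancing:    {instanced} bytes")
```

In this hypothetical case the instanced version sends a small fraction of the data, because only the per-copy transforms grow with the number of copies.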
Jacques
-
@dritter Can you provide a sample of one of your models (say one of the biggest, most complex ones)? Maybe we can have a go at optimising it. Or you can also look at some more targeted tools (maybe the designer who made the models can output some simplified versions).
http://www.mootools.com/plugin...
https://www.okino.com/conv/pol...
And many more
-
Hello Mehdi, my texture maps for the 3D objects and 3D particle objects are PNG files (256x512 resolution), and for 2D particles they are PNG files with an alpha channel (60x60 resolution). I use the same texture files repeatedly on multiple objects. I tried having independent texture files for each object under OSX, but I did not see any difference in performance. Other users have found that using independent texture maps will increase performance.
Don
-
@dritter 64*64 resolution instead of 60*60 can give you a performance boost.
-
Hi Fred, many thanks for your suggestions and links. I plan on using independent texture map files and objects under Windows, and will also change the 2D maps to 64x64 as you suggest. 95% of my 3D objects are spheres, each having 32 segments and made in 3DS Max. I tried MeshLab a few days ago, but I am not familiar with that software and was unable to reduce a sphere while keeping its roundness. I would greatly appreciate it if you could look at the .3ds files, and I will send them to the email address listed on your website.
sincerely,
Don
-
I am also interested in your problem. Can you also send me some samples of the mesh and the textures?
Thank you, Jacques
-
Hello Jacques, I have just emailed you two examples.
merci,
Don
-
Hi Don,
I assume the 3ds configuration you are attempting is the one represented in the images shown previously in another of your posts?
I still believe you can find efficiency by using the single 3D Model Particles actor and the 'group index' parameter with one 3ds file containing all the geometry for one of your puppets. Using the 'group index' in this way is a new approach that I have begun to develop, and my tests indicate that it is efficient at manipulating multiple 3D geometries. This, together with optimising your models as much as you can tolerate, will give you the maximum performance for what you are attempting.
Jacques is right though, it is not trivial to set up and control the geometry through the 3D Model Particles actor. However, I have demonstrated, to myself at least, that it is possible to do it with the user actors that I have shared previously in a tutorial package.
Good luck with the continuation of your project development.
Best wishes
Bonemap
-
@dritter Hi, I had a quick look at one of the files, the ballRod 3ds. The object started with 1042 vertices and 1384 faces. After some light decimation I got 788 vertices and 899 faces, so about a 20% reduction, with no real artefacts. If the 3D rendering is the bottleneck, you should get a 15-20% higher frame rate if you can achieve this for all models. I was using MeshLab and its decimate function (Quadric Edge Collapse Decimation (with texture); if you use the version without texture, the texture map coordinates will get messed up and make it look super low poly), and set the target number of vertices to about 8% of the total vertices. This could certainly be done better, and with better-looking results. If you can get the system running at or above your target frame rate, it would mean you are not pushing your system and heating it up (as well as giving a smoother image). I would guess that with a bit of work on all these models you can get a much better result.
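For anyone repeating this on other models, the size of the reduction can be checked directly from the before and after counts. A minimal sketch in Python, using the ballRod figures quoted above (the function name is just for illustration):

```python
def percent_reduction(before: int, after: int) -> float:
    """Percentage by which a count shrank, relative to the original."""
    return (before - after) / before * 100.0


# ballRod model counts, before and after decimation.
vertex_reduction = percent_reduction(1042, 788)
face_reduction = percent_reduction(1384, 899)

if __name__ == "__main__":
    print(f"vertices: {vertex_reduction:.1f}% fewer")
    print(f"faces:    {face_reduction:.1f}% fewer")
```

With these counts the vertices drop by roughly 24% and the faces by roughly 35%, so the rough figure quoted is, if anything, on the conservative side.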
I emailed you the decimated model.
-
Hi Don
I have re-threaded my patch to work with mesh instances using the 3D Model Particles actor. This is not using the 'group index', because it is just the one model rather than a number of models in a single 3ds file. I thought I had better capture the test on video as evidence of the outstanding result in terms of improved frame rate. I was able to instance over 90 representations of the same complex 3D model of a tree, add lighting and assign coordinates from NI Mate OSC, getting a range of 50-60 fps. If I am able to achieve this frame rate on a Mac Pro 2013 with AMD graphics, I imagine the same technique through a top-flight Nvidia card will be phenomenal.
best wishes
bonemap
-
thanks bonemap, it looks good.
-
Thank you Jacques, Fred and Bonemap for your suggestions regarding optimizing the patch. I have spent the last few days comparing the results on a PC (with GTX1080ti GPU) and a Mac Pro (with dual D700 GPUs). Most of this time was spent troubleshooting Isadora on the PC, which is now somewhat usable but very unstable.
During the tests I attempted to determine the optimal use of this media: a 3840x1080 @29.97 background video, eighty-seven 3D objects each with 512x256 texture maps, six 3D particle objects (15 particle count), and forty-five 2D particle objects (42 particle count), all controlled by 135 OSC streams from three Kinects. The methods for optimization were:
1. Optimizing 3D objects by reducing the number of vertices (each sphere was reduced from 988 to 203 vertices)
2. Using independent 3D object files and texture maps for the ninety-three 3D objects
3. Comparing different codecs for the background video: [MOV ProRes 422HQ and 422LT] vs [MOV/MJPEG@75%]. I could not test the HAP codec because I do not have software to create HAP AVI files.
The frame rate for each test was determined by calculating the average fps from 100 samples, each sample being one second apart. The control data for the test was either live OSC data from 1 Kinect connected to a MacBook Pro running NI Mate 2.14, or three simultaneous OSC recordings of a Kinect playing back from within the patch. I was not in the large studio this weekend, so I did not have the physical space or the people to test 3 live Kinects simultaneously.
I did about 40 different comparisons, but for me these are the most relevant results:
Test 1: all 3D objects optimized + playback from 3 OSC recordings + no background video: [mac: 13.7 fps] vs [pc: 21 fps] difference: 53%
Test 2: all 3D objects not optimized + playback from 3 OSC recordings + ProRes 422HQ background video: [mac: 11.2 fps] vs [pc: 16.7 fps] difference: 49%
Test 3: all 3D objects not optimized + playback from 3 OSC recordings + MJPEG-75% background video: [mac: 11.8 fps] vs [pc: 19.6 fps] difference: 66%
Test 4: all 3D objects optimized + playback from 3 OSC recordings + MJPEG-75% background video: [mac: 12 fps] vs [pc: 20.5 fps] difference: 75%
Test 5: all 3D objects optimized + 2D particle animation + 1 live Kinect OSC sent to all 3D and 2D objects + MJPEG-75% background video: [mac: 10.6 fps] vs [pc: 17.8 fps] difference: 68%
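The averaging and the percentage differences above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic, with made-up function names; it assumes "difference" means how much faster the PC is relative to the Mac.

```python
from statistics import mean


def average_fps(samples):
    """Average frame rate over a list of fps samples (e.g. 100 readings)."""
    return mean(samples)


def percent_faster(mac_fps, pc_fps):
    """How much faster the PC is, as a percentage of the Mac frame rate."""
    return (pc_fps - mac_fps) / mac_fps * 100.0


# (mac fps, pc fps) as listed for each test above.
RESULTS = {
    "Test 1": (13.7, 21.0),
    "Test 2": (11.2, 16.7),
    "Test 3": (11.8, 19.6),
    "Test 4": (12.0, 20.5),
    "Test 5": (10.6, 17.8),
}

if __name__ == "__main__":
    for name, (mac, pc) in RESULTS.items():
        print(f"{name}: PC is {percent_faster(mac, pc):.0f}% faster")
```

Run on the listed figures, this gives 53%, 49%, 66%, 71% and 68%. The Test 4 value comes out at about 71% rather than the 75% quoted, so one of the Test 4 numbers may have been rounded or mistyped.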
I was surprised that using MJPEG for the background video gave a significantly better frame rate than ProRes 422HQ (and also 422LT) on both the Mac and the PC. I was also surprised that using independent 3D object files and texture maps for the ninety-three 3D objects did not improve the frame rate on the Mac, but it did on the PC. I tested this last result a few times.
Thanks again,
Don