Author Topic: Rendering performance issue - Corona Bitmap vs Max Bitmap

2018-01-09, 17:36:32

Fluss

Hi guys,

I'm currently running into a performance issue on the project I'm working on.

I'm working on a small animation and I've noticed some weird render times. Corona 1.7.2, Max 2018.

The animation is rendered on two PCs:

Main workstation: dual Xeon 2696v4, 44 cores/88 threads @ 2.2 GHz (turbo 2.8 GHz), 64 GB RAM, Corona Benchmark 1.3: 36 s.
Slave: dual Xeon 2683v3, 28 cores/56 threads @ 2.0 GHz (turbo 2.5 GHz), 64 GB RAM, Corona Benchmark 1.3: 57 s.

While rendering (submitted through Deadline), I noticed that render times on my workstation were equal to or longer than on the slave, which shouldn't be the case, as the workstation is considerably more powerful:

[Deadline monitor screenshot - CCO: workstation / vizDR: slave]
So I opened the scene on both computers and made some test renders. Here are the results:

workstation: [render stats screenshot]

slave: [render stats screenshot]
I created a fresh new scene with a subdivided plane, some random extrudes, and a basic material with bump and SSS, and I got the expected render times:

workstation: [render stats screenshot]

slave: [render stats screenshot]

So it seems to be scene-related, but I can't figure out what is wrong. Any hints on what is going on?


2018-01-10, 10:54:29
Reply #1

Fluss

OFFTOPIC: How can I embed the uploaded images directly in the topic?

Edit: done by copying the image address after the first post submission; I wonder if there is a better way to do it directly.



I forgot to mention that I tried importing the gearVR into the test scene and got similar results (workstation and slave on par render-time-wise). I assume there is not a lot of GI calculation in the gearVR scene, which doesn't benefit a high core count, but that does not explain why I get worse performance from a higher-clocked, higher-core-count CPU on the exact same scene.
Something really strange is going on here.



2018-01-10, 11:20:03
Reply #2

maru (Corona Team)
To sum up:

-in a simple test scene (extruded plane) you get the expected rendering performance - the workstation is faster than the slave
-in your production scene, the two computers are getting similar render times, or the workstation is slower, which is not expected

Some questions:
1) You mentioned using Deadline - can you explain how exactly you are using it? Are the two computers rendering two completely separate frames, or are they both rendering one frame? Is there some kind of region/tile rendering involved?
2) Are you using the newest version of Deadline?
3) In the "simple test" and "production scene" examples - is there some network rendering involved (so the image rendered on the slave is merged with the image rendered on the workstation), or are the computers just rendering two images on their own?

2018-01-10, 12:04:58
Reply #3

Fluss

Quote from: maru
-in a simple test scene (extruded plane) you get the expected rendering performance - the workstation is faster than the slave
-in your production scene, the two computers are getting similar render times, or the workstation is slower, which is not expected

Exactly that.

Quote from: maru
1) You mentioned using Deadline - can you explain how exactly you are using it? Are the two computers rendering two completely separate frames, or are they both rendering one frame? Is there some kind of region/tile rendering involved?
2) Are you using the newest version of Deadline?
3) In the "simple test" and "production scene" examples - is there some network rendering involved (so the image rendered on the slave is merged with the image rendered on the workstation), or are the computers just rendering two images on their own?

1) I submitted the animation job with the Deadline script, then ran the Deadline client on both computers, so each computer was rendering different consecutive frames. Standard render, pass limit, no region/tile rendering, etc.

2) Deadline 10.0.1.1

3) The example images were rendered as single frames on each computer - both machines rendering the same file and the same frame # on their own. No Deadline, no DR there. So the issue happens whether it is a network job or not.

2018-01-10, 12:26:57
Reply #4

PROH

Hi. Very interesting. Could this be caused by the bitmaps used in the GearVR scene? Are the bitmaps placed on a shared server? What happens if the environment and materials from the GearVR scene are used in the "simple" scene?
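One quick way to check where the scene's bitmaps actually live is to dump every referenced texture path. A minimal pymxs sketch (3ds Max 2017+); the UNC-prefix check is just a simple heuristic for "loaded over the network":

Code:
# Minimal sketch (pymxs): list every texture file the current scene
# references and flag the ones loaded over the network (UNC paths).
import pymxs
rt = pymxs.runtime

for path in rt.usedMaps():                      # all referenced map files
    p = str(path)
    tag = "NETWORK" if p.startswith("\\\\") else "local"
    print("{0:8} {1}".format(tag, p))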

2018-01-10, 12:41:00
Reply #5

Fluss

Quote from: PROH
Hi. Very interesting. Could this be caused by the bitmaps used in the GearVR scene? Are the bitmaps placed on a shared server? What happens if the environment and materials from the GearVR scene are used in the "simple" scene?

Indeed, the bitmaps are stored on a shared server. Same issue with no environment at all. And I get similar results with a default Corona material in the override material slot:
(Scene parsing took a bit longer on the slave; only a 2 s difference in rendering time.)

2018-01-10, 13:00:34
Reply #6

PROH

Hi. I think there is a problem with your last test, since all the textures/bitmaps are still referenced in the file. You should either remove all texture references (including the material editor) from the GearVR scene, or use those bitmaps in the simple scene.

2018-01-10, 15:03:26
Reply #7

Fluss

Quote from: PROH
I think there is a problem with your last test, since all the textures/bitmaps are still referenced in the file. You should either remove all texture references (including the material editor) from the GearVR scene, or use those bitmaps in the simple scene.

I redid the test without any maps; it's pretty self-explanatory:

workstation: [render stats screenshot]

slave: [render stats screenshot]

We see some improvement without any textures, but there is also no HDRI anymore.

2018-01-10, 15:21:36
Reply #8

Fluss

Hmmm, that's interesting. I replugged the HDRI into the Corona environment slot and the performance gap between the two computers almost disappears:

workstation: [render stats screenshot]

slave: [render stats screenshot]
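For A/B tests like this it can help to rewire the environment slot from a script rather than by hand. A hedged pymxs sketch, assuming CoronaBitmap exposes a filename property like the standard Bitmaptexture does; the HDRI path is hypothetical:

Code:
# Hedged sketch (pymxs): plug the HDRI into the 3ds Max environment slot
# through a CoronaBitmap loader. The "filename" property name and the UNC
# path below are assumptions - adjust to your scene.
import pymxs
rt = pymxs.runtime

hdri = rt.CoronaBitmap()
hdri.filename = r"\\server\textures\studio.hdr"  # hypothetical shared path
rt.environmentMap = hdri                         # standard Max env slot
rt.useEnvironmentMap = True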

2018-01-11, 09:55:13
Reply #9

maru (Corona Team)
There is a massive drop in rays/s when you enable the HDRI.

-How are you loading the HDRI? What kind of loader are you using? Corona Bitmap, Max bitmap, VRayHDRI?
-The same question for all other textures: which loader?
-What happens when you switch everything to Corona Bitmap? What happens when you switch everything to Max Bitmap? (See the sketch below.)
-Do not use VRayHDRI. It is not fully supported yet and can often cause problems.
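A hedged pymxs sketch of the kind of bulk swap the Corona converter script automates - replacing every standard Bitmaptexture with a CoronaBitmap pointing at the same file. It deliberately ignores per-map settings (cropping, output curves, etc.), and the filename property on CoronaBitmap is an assumption, so run it on a copy of the scene:

Code:
# Hedged sketch (pymxs): swap every standard Bitmaptexture in the scene for
# a CoronaBitmap that loads the same file. Mapping/output settings are NOT
# copied - this is only a rough A/B test, not a full converter.
import pymxs
rt = pymxs.runtime

for old in rt.getClassInstances(rt.Bitmaptexture):
    if not old.filename:
        continue                       # skip empty loaders
    new = rt.CoronaBitmap()
    new.filename = old.filename        # reuse the same texture file
    rt.replaceInstances(old, new)      # rewire all dependent materials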

2018-01-11, 13:54:42
Reply #10

PROH

+ What happens if all the textures are loaded from a local drive?

2018-01-16, 17:46:20
Reply #11

Fluss

Quote from: maru
There is a massive drop in rays/s when you enable the HDRI.
-How are you loading the HDRI? What kind of loader are you using? Corona Bitmap, Max bitmap, VRayHDRI?
-The same question for all other textures: which loader?
-What happens when you switch everything to Corona Bitmap? What happens when you switch everything to Max Bitmap?
-Do not use VRayHDRI. It is not fully supported yet and can often cause problems.

I was using the standard 3ds Max bitmap loader for all the textures. I switched the HDRI to Corona Bitmap and BOOM! From 48 s down to 24 s on the workstation, and from 51 s to 35 s on the spawner!
Holy cow! I was not aware there was such a substantial gap between those two loaders! I assumed they performed pretty much the same and kept using the standard 3ds Max loader out of habit. That's a huge time saver!

2018-01-16, 20:06:59
Reply #12

PROH

  • Active Users
  • **
  • Posts: 1219
    • View Profile
Thanks for sharing this. There's definitely something I'll have to test myself here :)
Thanks!

2018-01-17, 16:24:09
Reply #13

Fluss

Quote from: PROH
Thanks for sharing this. There's definitely something I'll have to test myself here :)

I don't know if there is such a performance gap between the loaders in every use case, but it's definitely worth a try.

Back to the original production scene, with everything converted to CoronaBitmap (using the Corona converter):

Workstation: [render stats screenshot]

Slave: [render stats screenshot]



Let's try to summarise my tests and draw some conclusions:

Workstation: dual Xeon 2696v4, 44 cores/88 threads @ 2.2 GHz (turbo 2.8 GHz)
Slave: dual Xeon 2683v3, 28 cores/56 threads @ 2.0 GHz (turbo 2.5 GHz)


Corona Benchmark 1.3
Workstation: 36 s with all-core turbo @ 2.6 GHz
Slave: 57 s with all-core turbo @ 2.5 GHz
Performance index: 1.58 (slave render time / WS render time)

V-Ray Benchmark 1.0.6
Workstation: 26 s with all-core turbo @ 2.8 GHz
Slave: 44 s with all-core turbo @ 2.5 GHz
Performance index: 1.69

Cinebench R15
Workstation: 5288 pts with all-core turbo @ 2.8 GHz
Slave: 3273 pts with all-core turbo @ 2.5 GHz
Performance index: 1.62 (WS score / slave score)

My test scene, Corona light planes only
Workstation: 176 s with all-core turbo @ 2.6 GHz
Slave: 272 s with all-core turbo @ 2.5 GHz
Performance index: 1.55

My test scene, HDRI only (with Corona Bitmap)
Workstation: 214 s with all-core turbo @ 2.6 GHz
Slave: 309 s with all-core turbo @ 2.5 GHz
Performance index: 1.44

My test scene, HDRI + light planes (with Corona Bitmap)
Workstation: 203 s with all-core turbo @ 2.6 GHz
Slave: 314 s with all-core turbo @ 2.5 GHz
Performance index: 1.55

My production scene, HDRI + light planes (with standard bitmap)
Workstation: 110 s with all-core turbo @ 2.6 GHz
Slave: 112 s with all-core turbo @ 2.5 GHz
Performance index: 1.02

My production scene, HDRI + light planes (with Corona Bitmap)
Workstation: 50 s with all-core turbo @ 2.6 GHz
Slave: 64 s with all-core turbo @ 2.5 GHz
Performance index: 1.28
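The indices above are straightforward ratios; here is a tiny Python check of a few of them, with the times copied from the tests above:

Code:
# Recompute a few of the performance indices quoted above
# (index = slave render time / workstation render time).
tests = {
    "Corona Benchmark 1.3":          (36.0, 57.0),
    "test scene, light planes only": (176.0, 272.0),
    "test scene, HDRI only":         (214.0, 309.0),
    "production, standard bitmap":   (110.0, 112.0),
    "production, CoronaBitmap":      (50.0, 64.0),
}
for name, (ws, slave) in tests.items():
    print("{0:32} {1:.2f}".format(name, slave / ws))
# -> 1.58, 1.55, 1.44, 1.02, 1.28 - matching the values listed above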



Conclusions
-The standard Max bitmap loader has a tremendous impact on performance.
-Corona triggers AVX downclocking on the Xeon v4 but not on the Xeon v3. What's more, the other benchmarks don't trigger AVX downclocking at all, which seems to free up more computing power according to the performance index (not sure about that, as Max is not involved in the benchmarks).
-There is probably still something in my production scene that slows down rendering, as the performance index is noticeably lower there.

2018-01-25, 10:20:20
Reply #14

maru (Corona Team)
I am moving this to bug reporting, as it would be "nice to" look into this and see whether some more performance could be squeezed out.

By the way, have you tried rendering:
-locally on the master with CoronaBitmap
-locally on the master with Max bitmap
-locally on the slave with CoronaBitmap
-locally on the slave with Max bitmap?

What I mean is: is there any difference between CoronaBitmap and Max bitmap depending on whether network rendering is on or off? (A timing sketch follows below.)
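A hedged sketch of how that four-way matrix could be timed locally on each machine, assuming the loader swap from the earlier sketch is wrapped in a hypothetical swap_loaders() helper:

Code:
# Hedged sketch (pymxs): time a local render of the current frame once per
# loader. swap_loaders() is a hypothetical helper built on the earlier
# replaceInstances() sketch; run this on the master and the slave separately.
import time
import pymxs
rt = pymxs.runtime

for loader in ("Bitmaptexture", "CoronaBitmap"):
    # swap_loaders(loader)             # hypothetical: rewire all maps
    t0 = time.time()
    rt.render(vfb=False)               # same frame, same settings, no DR
    print("{0:16} {1:.1f} s".format(loader, time.time() - t0))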