Author Topic: DR render problem.  (Read 1297 times)

2018-08-16, 13:45:21
Reply #15

nicolasZ

  • Active Users
  • **
  • Posts: 16
    • View Profile
@dj_buckley: Could you please do the same test as me, so we can compare results?

On the master, set the number of threads to 1 in System Settings, and use 1 node.

Thanks

2018-08-17, 04:58:46
Reply #16

thehay95

  • Users
  • *
  • Posts: 1
    • View Profile
I have encountered the same problem

2018-08-21, 18:05:11
Reply #17

LorenzoS

  • Active Users
  • **
  • Posts: 53
    • View Profile
Hi all,
it seems I have a similar problem with DR.
Workstation: Ryzen Threadripper 1950X on Windows 10, 64-bit.
3 nodes: i7 2700K on Windows 7, 64-bit.

The contribution of the 3 nodes is much lower than it should be, and sometimes I also notice long parsing times.

2018-08-29, 16:39:15
Reply #18

nicolasZ

  • Active Users
  • **
  • Posts: 16
    • View Profile

2018-08-29, 20:30:50
Reply #19

cgifarm

  • Active Users
  • **
  • Posts: 55
  • Your Brand New RenderFarm
    • View Profile
    • CGIFarm
Hey guys,

I see your render times are very short, and here is my experience with distributed rendering on our render farm:

We use distributed rendering on frames which take longer than 30 minutes to render. I say "render" because
it might take 12-20 minutes just to load the scene into memory, and that doesn't count as rendering.

If you are working with a scene which has lots of assets, all those assets must be loaded into memory and transported
over the network (the network can be a bottleneck too, as you mentioned you are only using 100 Mbit switches).

My recommendation is to use distributed rendering for scenes which take longer to render; otherwise I recommend submitting
1 frame per machine if they are 6-10 minute renders.
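As a rough rule of thumb, the thresholds from this post (over 30 minutes for DR, 6-10 minute frames submitted one per machine) could be sketched like this — the cutoffs are illustrative only, not Corona rules:

```python
def choose_strategy(frame_minutes):
    """Rough strategy picker; cutoffs are illustrative, not Corona rules."""
    if frame_minutes > 30:
        return "distributed render"     # loading overhead is amortised
    if 6 <= frame_minutes <= 10:
        return "one frame per machine"  # skip DR overhead entirely
    return "judgement call"             # the post gives no rule here

print(choose_strategy(45))  # distributed render
print(choose_strategy(8))   # one frame per machine
```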

When using the DR server you should not expect rendering twice as fast, because the master machine sends "pixel" coordinates
for each node to render, then receives the data back and adds that information to the main image. This happens over the network for each job sent
to the nodes. While this is very fast on a local PC, with the CPU communicating with RAM, over a network it is a different story.

Good luck!

Alex
Working on a Renderfarm Platform - checkout our website cgifarm.com and our cost calculator : https://www.cgifarm.com/renderfarm-cost-calculator

2018-08-29, 22:34:15
Reply #20

dj_buckley

  • Active Users
  • **
  • Posts: 184
    • View Profile
Makes sense, but when you only have two PCs connected in the same room through a small (but fast) switch, I'd expect to see a fairly instant speed improvement when the node kicks in, especially when the node renders a single frame twice as fast as the host. I'm also rendering images which take 5+ hours.

Admittedly, I need to find a day where I can leave a render going on one machine until completion and then render again with DR on, and do a full comparison to completion. It's just not always practical in a production environment. It was obvious in V-Ray, where you could see the buckets; it's guesswork in Corona (or any progressive renderer) without actually sitting there for a full render to complete as described above.

2018-08-29, 23:08:17
Reply #21

cgifarm

  • Active Users
  • **
  • Posts: 55
  • Your Brand New RenderFarm
    • View Profile
    • CGIFarm
Quote from: dj_buckley on 2018-08-29, 22:34:15

Makes sense, but when you only have two PCs connected in the same room through a small (but fast) switch, I'd expect to see a fairly instant speed improvement when the node kicks in, especially when the node renders a single frame twice as fast as the host. I'm also rendering images which take 5+ hours.

Admittedly, I need to find a day where I can leave a render going on one machine until completion and then render again with DR on, and do a full comparison to completion. It's just not always practical in a production environment. It was obvious in V-Ray, where you could see the buckets; it's guesswork in Corona (or any progressive renderer) without actually sitting there for a full render to complete as described above.

The best thing is to see how many passes your DR node did and subtract that from the total amount. Then you will know how much of the image each node rendered.

You need to keep an eye on the stats while it renders, or set up a screenshot program to take screenshots every minute or so.
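That pass-count bookkeeping is just arithmetic; a minimal sketch (the function name is made up, the numbers come from whatever the DR stats show):

```python
def node_shares(total_passes, node_passes):
    """Split the rendered passes between master and DR nodes.

    node_passes: {"node name": passes it reported}; the master is
    credited with whatever is left of the total.
    """
    master = total_passes - sum(node_passes.values())
    shares = {"master": master, **node_passes}
    return {name: p / total_passes for name, p in shares.items()}

# e.g. 200 passes total, the DR node's stats show 80 of them
print(node_shares(200, {"node1": 80}))  # {'master': 0.6, 'node1': 0.4}
```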

It's a pain to test, but yes, V-Ray with buckets shows the machine name while rendering. This is different, because each machine gets the full image to work on for a full pass.

It's better to set up test scenes which are known to render in, say, 1 hour on one machine, then test with DR. In my experience you get about 80% of the capacity of the second machine, because of the extra waiting time for network traffic and job management from the master. It takes time to do these tests, but it's good to know your tools and what to expect,
so you know how to set deadlines and costs for clients. For me, an extra 30% on top of what I estimate is the golden number which I always add; even if I do a great job planning, there is always something which requires more attention if it's a new project.
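Putting those rules of thumb into numbers (the 80% node efficiency and the 30% buffer are the estimates from this post, not measured constants, and the function names are made up):

```python
def estimated_dr_hours(solo_hours, node_count, node_efficiency=0.8):
    """Estimate DR wall time from the single-machine render time.

    Assumes identical machines; each node contributes only about
    node_efficiency of its capacity (network + job management overhead).
    """
    effective_machines = 1 + node_count * node_efficiency
    return solo_hours / effective_machines

def padded_estimate(hours, buffer=0.30):
    """Add the 30% safety margin on top of the raw estimate."""
    return hours * (1 + buffer)

est = estimated_dr_hours(1.0, 1)       # 1 h scene, master + 1 node
print(round(est, 3))                   # 0.556
print(round(padded_estimate(est), 3))  # 0.722
```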

I would also recommend making the master machine the strongest if it is also rendering; otherwise it won't have enough spare cores to distribute the job to the nodes, and a node might wait much longer for a job.

In some situations, if you use more DR nodes, you can set up the master machine to do no rendering at all, so it just spawns jobs for the nodes and assembles the final image.

Good luck!
Working on a Renderfarm Platform - checkout our website cgifarm.com and our cost calculator : https://www.cgifarm.com/renderfarm-cost-calculator

2018-08-29, 23:17:22
Reply #22

cgifarm

  • Active Users
  • **
  • Posts: 55
  • Your Brand New RenderFarm
    • View Profile
    • CGIFarm
Quote from: LorenzoS on 2018-08-21, 18:05:11

Hi all,
it seems I have a similar problem with DR.
Workstation: Ryzen Threadripper 1950X on Windows 10, 64-bit.
3 nodes: i7 2700K on Windows 7, 64-bit.

The contribution of the 3 nodes is much lower than it should be, and sometimes I also notice long parsing times.

Hi,

Long parsing times can occur when all 3 nodes are copying files over the network at the same time. Please check your network speed; a slow hard drive
or a lack of RAM on the network storage can make things worse.

If the master is sharing the assets over the network and rendering at the same time, that can hurt performance a lot. The best thing is to use a dedicated
file server for the assets, and at least a 1 Gbit network connection, if not 10 Gbit fibre, to make sure the network is not the bottleneck when doing DR.

To test network performance, start with 1 extra node, check the parsing time, then add the other 2, and so on.

You also need to take into account that a render node starts a 3ds Max instance and loads the scene from scratch. On your workstation
the scene is already loaded, so you might be deceived into thinking it starts rendering right away while the others are still parsing.

Not sure what your real problem is, but these are just my thoughts based on the little info provided.

Good luck!

Alex
Working on a Renderfarm Platform - checkout our website cgifarm.com and our cost calculator : https://www.cgifarm.com/renderfarm-cost-calculator

2018-09-03, 10:32:54
Reply #23

LorenzoS

  • Active Users
  • **
  • Posts: 53
    • View Profile
Hi Alex,
thanks for the reply.
After changing the switch to a 10/100/1000 one, I noticed an improvement.
I will check the other aspects you suggested.


2018-09-03, 11:27:02
Reply #24

cgifarm

  • Active Users
  • **
  • Posts: 55
  • Your Brand New RenderFarm
    • View Profile
    • CGIFarm
Quote from: LorenzoS on 2018-09-03, 10:32:54

Hi Alex,
thanks for the reply.
After changing the switch to a 10/100/1000 one, I noticed an improvement.
I will check the other aspects you suggested.

You're welcome!
Working on a Renderfarm Platform - checkout our website cgifarm.com and our cost calculator : https://www.cgifarm.com/renderfarm-cost-calculator

2018-09-21, 12:07:04
Reply #25

Luis.Goncalves

  • Active Users
  • **
  • Posts: 33
    • View Profile
Guys, something that helped a lot with this issue for me is going to the System tab and lowering the synchronization interval from 60 s to 5 s.

I've done tests with this and here are my results

Sync 1 s = 2:54
Sync 5 s = 3:02
Sync 10 s = 3:44
Sync 60 s = 5:26

I've also sent the same image out with Backburner strips, and the strip that took the longest was 3:47, so anything lower than that (with DR) means the render farm is being used well.
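For reference, the speedups in those sync-interval tests work out as follows (assuming the times are minutes:seconds):

```python
def to_seconds(mmss):
    """Convert an 'M:SS' time string to seconds."""
    m, s = mmss.split(":")
    return int(m) * 60 + int(s)

# sync interval (s) -> measured render time
results = {1: "2:54", 5: "3:02", 10: "3:44", 60: "5:26"}
baseline = to_seconds(results[60])
for sync, t in sorted(results.items()):
    print(f"sync {sync:>2} s: {t}  ({baseline / to_seconds(t):.2f}x vs 60 s)")
```

So dropping from 60 s to 5 s is roughly a 1.8x improvement on this particular scene.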
« Last Edit: 2018-09-21, 12:38:02 by Luis.Goncalves »

2018-12-06, 08:07:11
Reply #26

Anadt

  • Users
  • *
  • Posts: 1
    • View Profile

2018-12-06, 13:24:20
Reply #27

hrvojezg00

  • Active Users
  • **
  • Posts: 217
    • View Profile
    • www.as-soba.com
It seems we have the same problem. This wasn't the case with Corona 1.7 or 2, but 3 has a lot of issues with DR which are slowing us down a lot! Corona team, please make progress on this issue.

Thanks,
Hrvoje

2018-12-17, 12:19:56
Reply #28

maru

  • Corona Team
  • Active Users
  • ****
  • Posts: 8239
  • Marcin
    • View Profile
We are working on improving DR. Unfortunately, as you know, network-related issues are complex, so it will take time.
Meanwhile, I would appreciate it if you could report your issues over at the support portal: https://coronarenderer.freshdesk.com/support/tickets/new
Do not forget to include the DR logs and the Backburner log, as they may prove useful: https://coronarenderer.freshdesk.com/support/solutions/articles/12000002065

What you can try:

1) Make sure that you are running the same version of Corona on all PCs (e.g. Corona Renderer 3) and the corresponding version of the DR server on all nodes (e.g. DR server 3). Keep in mind that in Corona Renderer 3 the DR server is installed by default into C:\Program Files\Corona\DR Server, while in older versions it was just C:\Program Files\Corona\

2) After installing a newer version of Corona and the DR server (e.g. 2 > 3), your antivirus and firewall software may treat it as new software, so you may need to add it as an exception once again; otherwise it will be blocked. Adding firewall rules manually also works for some users: https://coronarenderer.freshdesk.com/support/solutions/articles/12000050816