Sister blog of Physicists of the Caribbean. Shorter, more focused posts specialising in astronomy and data visualisation.

Sunday 30 June 2019

Unleash the render from hell

Ever since I got a VR headset I've been meaning to make more content for it but never quite manage to get started. Now I'm trying to make amends, but the process has been considerably... less smooth than I would have liked. So come, CGI enthusiasts of the internet, and let me regale you with tales of daring renders, dashing rogues, beautiful princesses and murder on the high seas ! Or, well, some of those, anyway...

(This is also going to be, if not a fully-fledged tutorial, then at least a highly practical guide with many hints and tips for those interested in doing VR content in Blender, complete with Python scripts.)

One of the first things I wanted to do was to recreate the ALFALFA Sky videos I did in the glory of 360 3D VR. Here's one of the originals for reference :


Clearly very well-suited indeed to VR. A bunch of galaxies, scientific authenticity, and an incredibly simple setup. Great ! Let's just update the file and turn on VR, right ? Wrong. In practise, this became something of an albatross, but unlike the Ancient Mariner I would happily see it dead.


Galaxies in 3D that are really in 2D

Making the 2D version was trivial. I even used the venerable Blender 2.49 for the first one, since I was much more comfortable with its Python operations and materials settings at the time. It just needs a few things :
1) The ALFALFA catalogue itself, found on the survey website.
2) A query to the SDSS to get the optical size of each galaxy. I had to split the galaxy table into seven because the SDSS doesn't large "large" queries, but honestly in this era 30,000 positions and radii shouldn't be large by anyone's standards, let alone the SDSS.
3) Using the catalogue file from step two I obtained images of the optical counterparts of each detection via this script. The optical size isn't terribly accurate but it's good enough.
3) A Blender script to produce a plane mesh for each galaxy, with a shadeless material and the correct image texture, scaled according to the optical size of the galaxy. The old 2.49 script can be found here and the version for Cycles in 2.79 (may need minor changes to run in 2.8) is here.
4) A constraint applied to each galaxy object to face the camera.

Okay, there are quite a few steps to follow, but none of them are very difficult or take very long. And that's really it. To do this in VR would, you'd think, simply be a matter of using the correct camera settings. Strictly speaking this is true, but there are a number of whacking great ugly-as-being-hit-in-the-face-with-a-sock-full-of-Ebola complications.


Galaxies in proper 3D

The first is that Blender only supports spherical stereo rendering using Cycles. And believe you me, using Cycles is necessary. Yes, you can render conventional side-by-side 3D in Blender Internal, but spherical stereo is different. The thing is that when you look around, the position of your eyeballs changes as you rotate. The Cycles Spherical Stereo mode accounts for this, but doing it in BI requires ancient eldritch knowledge the like of which has long since gone out of the world. Maybe if you sacrifice enough chickens to B'aalzathak, Demon God of Renders, you can make it work, but don't. No really, don't. Just bite the damn bullet and accept that spherical stereo rendering requires Cycles.

Fortunately it was quite easy to produce the script to make Cycles-compatible galaxy meshes and materials. You can find them in the above list.

You would think that's the hard part. You would think that learning how to do the Cycles materials and the new internal Python syntax in Blender was difficult enough, and that surely by now we should have reached the green fields and sunny uplands of VR utopia. You'd be wrong, poor fool.

Now normally to render VR content all you have to do is is enable "spherical stereo" for the camera and its associated options, and set which "view" you want to render. For some reason Blender still* can't render left and right views and automatically composite them, so you have to render each one separately and composite them together later. This is very easy - here's an example of how to do this using Blender's sequencer.

* Caveat - we'll get back to this later.

Where it gets complicated is not with VR rendering, but with a particularly horrid combination of local circumstances. Although my gaming laptop is powerful enough, I tend to use it quite a lot. So I don't like to use it for much in the way of system-hogging rendering unless it's not going to take too long (I'm also a little worried about leaving it on continuously). Fortunately, my desktop in work is even more powerful, so I use my laptop at home to create files and my work machine to render them. It can chug away for days on end without breaking a a sweat, if need be.

But I can't do that here. See, there was a stupid limit in Blender <= 2.78 where Cycles couldn't handle more than 1,024 image textures. That's been removed in 2.79, but the glibc version my work machine uses is too old to support 2.79. Can I fix it ? I guess so... but as far as I can tell this is a big job and could end up with me needing to reinstall Linux from scratch. And I really don't want to do that while I'm at the revising stage of a paper. Afterwards, fine, but not during the process. That would be silly.

(Quite honestly I haven't got a soddin' clue what a "glibc" is. When anyone tells me about libraries and suchlike, my eyes glaze over and I come out in a nasty rash).

So if a straightforward render on my laptop is out, and my work PC is unusable, what are the options ?


1) Get it rendering faster on the laptop
This is an incredibly simple scene - just a bunch of transparent textured planes. So absolutely all settings can be set to the bare minimum, with even Samples set right down to 1. And once rendering begins, it's blazing fast. The problem is not the objects themselves but simply their vast number, which Blender does not like. Not one bit.

One thing I notice about Cycles is that it spends friggin' ages on the "synchronising objects" stage before even starting to render anything. Typically the total render time is about 5.5 minutes or so (per half-frame, that is, for a Left or Right image, so the actual time is double that), of which almost all consists of synchronising objects, updating shaders and loading images. Only a few seconds are needed for the render itself. Frustratingly, in Preview mode there's no need to do the preliminary stages, but you can't screen capture that like you can with OpenGL view for BI. So you have to render.

Somehow I found that there's a way to significantly decrease the synchronising speed : create a blank file with linked (not appended !) copies of all objects. I have no idea how or why, but it works - it gets the total rendering speed down to maybe 3.5 minutes per half-frame. Rendering the linked file in 2.8 gets things down to 2 min 50 seconds per half-frame, with the synchronisation stage almost eliminated but the "updating shaders" phase taking somewhat longer.

The penalty for this is that it makes the file even slower to open. It was already slow, maybe 10 minutes or so, but now that approximately doubles. Not nice, but worth it for the render times. And in 2.8 the files open very much more quickly, so that's not much of a factor.

Other tricks are less successful. GPU rendering isn't an option due to memory limitations. Command line rendering is more annoying, because Blender insists on printing out the status of the object synchronisation. And because there are 30,000 objects, the print statements cause a significant slowdown, making it actually slower than rendering with the GUI. Googling how to suppress the command line output for Blender didn't help because all that comes back was for Linux. Eventually I realised I should be less specific and look for how to suppress terminal output in general under Windows. That turns out to be easy : just add >$null to the end of a command sequence in PowerShell (I don't think this works in the regular command prompt).

Does this actually help with render times ? Annoyingly, no it does not. Command line rendering is still slower, inexplicably.

Another trick I found online somewhere was to use PNG images instead of jpegs. This does reduce the synchronisation stage (for some reason), but significantly increases the updating shader time, and the two approximately balance out so it wasn't worth doing. But I found it quite frustratingly difficult to figure out how to access Cycles nodes (e.g. image texture nodes) via Python, so in case anyone wants it, here's a script that will go through a file and replace Cycles image textures with PNG versions (if you need to convert the files in the first place, this script can be easily modified - but watch out as it'll overwrite existing files).


2) Make it render in 2.78 and run it in work
Option 1 having only limited success, is there any hope at all for rendering the file using a version of Blender my beefy work desktop can handle ?

The answer to that one is "yeah but don't". The only way to overcome the texture limit is to render in passes, about 30 per frame. So I wrote a script that hides (or better yet changes the layer) of the most distant 1,000, 2,000, 3000 etc. objects, progressively rendering each set of galaxies such that compositing is a simple matter of alpha-overing every different pass. This does work. The problem is there's a huge speed penalty, such that I estimate the total rendering time to be about six weeks. And if anything went wrong, that'd be bad.

While removing objects from view (this works best on layers rather than hiding them) does cause a big decrease in the synchronisation time, there are two reasons this doesn't help. The first is the increased number of passes. The second is that each rendered image must be saved with transparency, and to get a decent-looking result (in 2.79 - this is not so much an issue in 2.8) necessitates using at least 15-20 samples, slowing down the rendering stage substantially. So this does work, but it's really not worth doing.

A somewhat related, less serious issue is that rendering an image with transparency and then overlaying on a black background gives different, substantially paler results than rendering directly with a black background. I don't know why this is, but I guess it relates to the emission node and how it computes colour values based on what's behind it.


3) What about a render farm ?
Well now I mean this should be a project ideally suited to such a thing. It's not a complicated file, it just has a bit of a long loading time but thereafter renders quite quickly. All it needs it a bunch of a computers thrown at it.

A very nice gentleman offered me the use of his SheepIt renderfarm account. SheepIt is an absolutely amazing service and once my work desktop is properly updated I plan to leave the client running in the background in perpetuity, unless I need to run something intensive. Unfortunately, my file is more than double the 500 MB upload limit. And they're not fooled by zip files easier, frustratingly (as the .blend file compresses down to just 200 MB). Sheepit doesn't allow Python scripts either, but nor would I want for anyone else to suffer the several hours it takes to recreate the original file from the catalogue (Blender doesn't handle large numbers of objects well, no matter how good your system specs are).

I briefly considered a truly insane plan : for every frame, create a new Blender file with sets of objects at different distances. That would get the file size down and possibly increase the rendering time nonlinearly due to the reduced file size needing less memory. While deleting objects didn't take too long on my work machine, on my laptop this could run for hours and still not complete (at least in 2.79). So I wrote a script to delete objects one by one and print its progress to the terminal. This didn't work in 2.79 (or rather it took about ten minutes to delete one object) but did work in 2.8 (deleting thousands of objects in about half an hour). But I realised this would be far too slow and far too labour intensive, so I abandoned this hair-brained scheme on the grounds of Nope.

I briefly tried a commerical renderfarm, Blendergrid. This looks very nice, doesn't have a file upload limit, and the staff were very helpful and pro-active in dealing with issues. I deem them to be first rate when it comes to customer service. It has an especially nice feature that you can upload a file to get a price quote, so I tried that. Alas ! After about an hour it was still trying to run the file, but giving generic error messages. But no biggie : it can email you the result when it's done... but not, it seems, if the file just keeps crashing.

Ultimately we might have been able to solve that one and do it commercially. How much it would have cost, though, we'll never know.


4) Work the problem
Alack poor Rhys ! At this stage we find him distraught, having tried every trick under the Sun and still stuck with an absurdly long render time. The only solutions appeared to be six weeks running on a work computer or 150 hours on a much-needed laptop. Most frustrating of all was that the render time was absolutely dominated by unnecessary processes, not complex calculations : once the preliminaries have been done, the render completes in about five seconds or less. Never have I felt so near and yet so far.

Then I had two crucial breakthroughs. First, because the camera is moving continuously through a three-dimensional cloud of galaxies, there are no obvious ways to cut out a single set of objects and render multiple groups separately for compositing later. That's why I had to use a script for my distance-based rendering approach earlier. But then I realised that actually, thanks to the nature of the survey, there is a natural break point where the render can be split :

This early version looks very dull nowadays, sorry about that.

When the camera is on the left side, galaxies in the left-hand wedge will always be in front of those on the right (and vice-versa), even given the spherical stereo camera. That means the render can be split after all. Now the split isn't even, unfortunately, but it's still substantial, with about 20,000 galaxies on one side and 10,000 on the other. That means that the smaller file is easily small enough to render on SheepIt, and the reduced size of the file for the bigger half makes it substantially faster to render on the laptop : about 1.5 minutes per half-frame. So my object-deletion script came in useful after all.

And second, this business of rendering left and right separately is (somewhat) nonsense. It's true that for some godforsaken reason Blender can't display top/down left/right renders in the rendered image window... but if you save the image to a file then it's correct. That's totally counter-intuitive but it works. Similarly when animated it works as well. The settings needed for this are simple :
Stereo mode : Stereo 3D
Left and right both enabled
Views format : stereo 3D
Stereo mode : top/bottom
Squeezed frame : enabled
The "squeezed frame" is important. If you don't do this it will render both images at the size specified and stick them together, so the final image will be twice the vertical size requested. That gives odd results, so the squeezed option should be enabled to keep everything in correct proportion. The image size should be square, i.e. accounting for the fact that you'll have two rectangular 2:1 images stuck together.

This discovery might not have been so fortuitous for most other scenes. But this one is massively dominated by the preliminary stages (synching objects and updating shaders), which, it turns out, do not have to be recalculated for each viewpoint. This means that the render time is very nearly halved, with the full image still only taking 1.5 minutes to render. And that gets me exactly where I want to be : the render isn't as fast as it could be in principle, but it's certainly fast enough for my purposes.

Later, I found that some simple scenes (like the opening text) render very much slower at 4k than at 2k for no obvious reason. So it turned out to be much faster to render the left and right images separately and composite them in the sequencer, so that sequencer setup file is still quite useful. Why this doesn't apply to the main rendering sequence I have absolutely no clue.

So that was the solution : splitting the file, rending part on a render farm and part locally, rendering both left and right together, and using 2.8. Whew.


Thirty Thousand Galaxies At Five Hundred Trillion Times The Speed Of Light

Without further ado then, here's the final result that took an insane amount of time :


The final rendering was bestraught with all the usual difficulties. I rendered it once, taking about 37 hours, and it looked very nice except that the camera froze for the last ten seconds so I had to render the whole thing again. Annoyingly that increased the rendering time non-linearly to about 48 hours. But that version worked. And the short sequence where the galaxies first appear made use of a modified version of the the multi-pass rendering script, so I got some use out of that after all.

The final sting in the tale was that my VR headeset (an obscure but affordable device by the name of Magicsee M1) decided, as I was about the view the movie for the very first time, that now would be a good time for a firmware update. It's had only one other update since I got it, and since it's obscenely obscure I presumed it was now dead. Apparently not. Foolishly I decided to accept the update since the render wasn't quite done yet... and that broke everything. All my apps, gone. The button to enable VR mode no longer working.

I was not happy. Not at all.

Fortunately it wasn't nearly as bad as it appeared. The built-in movie player simply had a different option to enable VR mode, and my apps were still there but took longer to show up. The VR mode button still enables VR in apps that need it. Christmas was saved, joy was unconfined, there was dancing in the streets, I decided the movie looked quite nice, and that was the end of the whole sorry business.

No comments:

Post a Comment

Back from the grave ?

I'd thought that the controversy over NGC 1052-DF2 and DF4 was at least partly settled by now, but this paper would have you believe ot...