Is High Fidelity best positioned for "cooperative rendering"?


In Neal Stephenson’s Snow Crash Metaverse, the Black Sun is known for its enhanced rendering capacity compared to the rest of the “street”. That is why Sushi K goes there to show off his sophisticated avatar, and why Juanita’s avatar’s facial expressions in the Black Sun can convey emotions that allow people to “condense fact from the vapor of nuance” and understand what is going on inside the other person’s head while in VR - in today’s parlance, they allow bridging the uncanny valley.

I think the Ready Player One “Oasis” vision of the Metaverse is utopian, and I don’t believe a single company or platform will ever own the “metaverse”, just as no single company today “owns” the internet, despite the fact that many have tried. However, creating a contained and defined “thing” like the Black Sun is possible, and I think High Fidelity today is well ahead of its competitors in getting there.

The first prerequisite is the capacity to handle a large number of users, all inside a VR simulation that does not rely on pre-defined or pre-loaded 3D environments the way online games do. VRChat, Facebook Spaces and others are far behind, and are certainly realizing how hard it is to achieve what High Fidelity already has.

But the second, most important requirement for a Black Sun experience is the capacity to leverage a form of cooperative rendering between the cloud and the client end-user device (EUD) - one that allows cloud computing capacity to compensate for the limitations of the client hardware. Snow Crash clearly describes the Black Sun’s computers allowing for this, as opposed to the limited computers of the “street” - the difference is in the back end.

This type of cooperative rendering is going to be key, as the upcoming generation of standalone headsets, such as the Oculus Quest or HTC Focus, will be limited in their rendering capacity - and this will likely remain true for several years, if not forever.

The fact that Oculus is planning to restrict access to the Quest store is likely meant to guarantee a good VR user experience, which means only apps heavily optimized to overcome the rendering limitations will be allowed. Unfortunately, even if and when High Fidelity is released on the Quest or Focus, the very nature of High Fidelity means that rendering performance will still depend on the content of the domain, not to mention users showing up with heavy avatars. In any case, even with powerful PC-driven EUDs, there will always be room to leverage shared cloud services to give users a better visual experience.

So, back to the subject of this post: I believe there are several architectural reasons why High Fidelity is well positioned to implement a Black Sun type of cooperative rendering. High Fidelity domain servers “know” the nature and status of all the entities in the domain, “know” the position of the users, and know both in real time! Thanks to this, High Fidelity servers, potentially equipped with GPUs, could do things like:

a. Pre-bake objects, not just within themselves as the current baking process does, but also “among” themselves within the scene, e.g. baking shadows into textures for static light sources (as in Nefertari’s tomb).

b. Process meshes for LOD optimization, decimating and eliminating polygons based on the actual position of entities within the world (e.g. hidden surfaces or embedded objects), but also with respect to the common locations of users. Intelligence derived from user location heatmaps could help refine this.

c. Combine objects to reduce entity count, for example combining objects that are parented, non-dynamic, non-grabbable, etc.

d. Give the EUD hints about which versions of optimized objects should be used, beyond what the EUD already does with its own LOD logic. Again, this is possible because the server knows where users are and what they are looking at.

e. And so on.
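To make ideas (c) and (d) a bit more concrete, here is a minimal sketch of what such a server-side pass could look like. Everything in it is my own illustration - the entity fields, distance thresholds and LOD names are hypothetical, not High Fidelity’s actual API or baking pipeline:

```python
# Illustrative sketch: group static, parented entities as candidates for
# merging (idea c), and derive a per-entity LOD hint from the distance to
# where users actually congregate (idea d). All names/thresholds invented.
import math
from dataclasses import dataclass
from typing import Optional

@dataclass
class Entity:
    name: str
    position: tuple                 # (x, y, z) in world metres
    dynamic: bool = False
    grabbable: bool = False
    parent: Optional[str] = None

def mergeable_groups(entities):
    """Group children under their parent when the whole set is static."""
    groups = {}
    for e in entities:
        if e.parent and not e.dynamic and not e.grabbable:
            groups.setdefault(e.parent, []).append(e)
    return groups

def lod_hint(entity, heatmap_centroids, near=10.0, far=50.0):
    """Pick an LOD level from the distance to the closest user hotspot."""
    d = min(math.dist(entity.position, c) for c in heatmap_centroids)
    if d < near:
        return "high"
    if d < far:
        return "medium"
    return "low"

entities = [
    Entity("chair_leg", (2, 0, 3), parent="chair"),
    Entity("chair_seat", (2, 0.5, 3), parent="chair"),
    Entity("ball", (40, 1, 60), dynamic=True),
]
hotspots = [(0.0, 0.0, 0.0)]                  # where users tend to gather
print(mergeable_groups(entities))             # the chair parts can be merged
print(lod_hint(entities[2], hotspots))        # far from users -> "low"
```

The point is simply that both decisions only need data the domain server already has: entity properties and user positions.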

In essence, the idea is to look at what happens during the rendering process in the EUD, and identify the kind of repetitive or non-time-critical rendering work that could be performed on the server instead, either at asset load time or within an acceptable latency time frame. Some of this already happens on the client, for example when the EUD is overloaded.
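As a toy way of expressing that split: classify each piece of rendering work by how often it changes and how latency-critical it is. The categories and thresholds below are my own illustration, not a High Fidelity mechanism:

```python
# Toy classifier: where should a piece of rendering work run?
# Per-frame, view-dependent work must stay on the client; static work can
# be baked server-side at load time; slow-changing work can be done on the
# server within an acceptable latency window. All thresholds are invented.
def placement(task_name, changes_per_second, deadline_ms, needs_viewpoint):
    if needs_viewpoint and deadline_ms is not None and deadline_ms <= 11:
        return "client"                          # ~90 Hz frame budget
    if changes_per_second == 0:
        return "server (at asset load time)"     # static: bake once
    return "server (async, acceptable latency)"  # slow-changing: push updates

print(placement("per-frame culling", 90, 11, True))                # client
print(placement("shadow baking for static lights", 0, None, False))
print(placement("heatmap-driven LOD refresh", 0.1, 500, False))
```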

And by the way, cooperative rendering is not game streaming like Google Stadia. I am personally skeptical of game streaming, as it requires extremely reliable networks with very low latency and no jitter, and that is not common. Cooperative rendering, on the contrary, is far more network resilient, as frame rates are still guaranteed locally, albeit with occasionally reduced quality when major changes happen.

With over 2,500 ATP assets in Fumbleland, and over 5,000 entities in the domain, I know a lot of optimization will be required in the future. We are fine for the current TV production objective, as it is cheaper to buy RTX 2080 Tis than to optimize hundreds of objects. But looking forward to a wider VR user base, I don’t want to start optimizing until I know what my target is, and that will not happen until I actually see how Fumbleland behaves on standalone HMDs. The reality, though, is that this optimization requirement will never end, and I am sure I am not the only one facing this problem. We cannot expect ordinary users to have a team of 3D artists on standby as objects are added to a domain, or when HMDs change - this needs to be a feature of the platform.

I am also quite sure this is not as easy as I make it sound, but neither was getting High Fidelity to where it is now. And before the comments start, let me openly state that I have only a superficial understanding of what is under the hood in High Fidelity - I have not coded since the 80s and my understanding of modern rendering technology is limited. But I have been involved in VR since the mid 90s (speaker at Avatars98) and have dealt with multiple evolutionary transitions of computing in my 40 years of professional IT work. I see cooperative rendering as another evolutionary transition - the further application of cloud technologies to VR, beyond server hosting. But like many, I am still surprised that what Neal Stephenson wrote in ’92 is still happening and still current - sort of like an anthropic principle of VR.

In any case, I am sure other forum members will have a better and more in-depth understanding of the subject matter, and will either complement or kill this idea :blush:


With the launch of 5G wireless internet I think this is going to become the default model for how VR worlds are served. The 5G targets are 1 ms latency and peak speeds of up to 20 Gbit/s - so realistically you could do 80 or 90 Hz with all the processing happening remotely.

That’s going to enable augmented and virtual reality with a very thin client (for low battery and weight) - basically all the processing will happen in the cloud.
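A rough sanity check on those numbers: would an uncompressed, fully remote-rendered stereo stream even fit in a 5G link? The display figures below are assumptions (roughly a current standalone-headset-class panel), not any product’s spec:

```python
# Back-of-the-envelope bandwidth for a fully remote-rendered VR stream.
# Panel resolution and colour depth are assumed, not a product spec.
width, height = 1440, 1600        # pixels per eye (assumed)
eyes = 2
bytes_per_pixel = 3               # 24-bit colour, uncompressed
fps = 90

bits_per_second = width * height * eyes * bytes_per_pixel * 8 * fps
gbps = bits_per_second / 1e9
print(f"Uncompressed stereo stream: {gbps:.1f} Gbit/s")   # ~10.0 Gbit/s
```

So even uncompressed, such a stream sits under the 20 Gbit/s peak target - though peak rates are exactly that, peaks, which is why latency and jitter, not raw bandwidth, remain the harder part of the streaming argument.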