"criticism" received on 360 avatars record interacting on the same domain


While I am sometimes quite critical myself of the hype that still exists around VR and HF itself, and am aware of various intrinsic limits behind the VR/HF over-enthusiasm, I recently received an interesting “criticism” from a friend.

His skepticism is about whether having 360 people gathered together in a single space really constitutes something great or not.

The first criticism was that he noticed that over 90% of the people were actually using the desktop version and were not really in VR. The proof, in his view, was that when Philip asked everyone to raise their hands, only very few participants did so in a “natural” way.

The second criticism was that other games like World of Tanks have reached 10000 concurrent users, and VRChat normally has 7000 concurrent users…
From his observation:

"How many of us were with HMD? None :wink: and then just see the movement of the hands that follows the controllers. Rosedale’s is a fluid sign that it was with HMD. When asked to raise their arms few did so immediately and fluently. The rest simply jumped by pressing on the space bar. Finally, similar cases, when VRChat has made the attendance record (not on the same server), many (the majority) have confessed to being via Desktop. And there were many many more than the HighFidelity test. …

World of Tanks still has the record of 10,000 tank avatars at the same time. Discord with voice safely reaches 1000. Where is the innovation exactly? Not to mention SpatialOS, which can easily handle much higher numbers but unfortunately is not having much commercial success. It’s a great result but we have seen its limits … "

Since I am not an expert in this kind of VR debate myself, what do you think about these criticisms?


I doubt there are 10000 tanks on the same server. Maybe 10000 across their network, but not playing on the same map.

Same thing for VRChat… there are not 7000 people in the same instance (it doesn’t go over 40). Sometimes it lags with just 20.

This also applies to Sansar, Second Life, Sinespace… none of them can manage that many people in the same location (without splitting into instances).

It is difficult to say whether the desktop/VR percentage makes a real difference to the achievement. I suspect it is negligible.


Actually, I’d be curious to know if there are any statistics for MMO games in the same category as the record currently held by HF.
If the 10000 concurrent tanks is a false number, what is the real maximum un-sharded concurrency it reached? The reference article was this one: https://worldoftanks.com/en/news/general-news/world-tanks-sets-new-guinness-world-record/

which is a bit old (2013) and states:

On January 21st, 2013, we set a record for “Most Players Online Simultaneously on one MOG Server” when we hit 190,541 players on the RU2 server (Russia).

The previous record for this category, that also belonged to World of Tanks, was established on January 23, 2011 when we hit 91,311 players, less than half of the new record!

But I cannot unpack the underlying technical details and subtleties.


I can kind of shed some light on the numbers.

I can’t attest to how VRChat works, but under the hood in High Fidelity, all kinematic data is transmitted from the client to the server to be broadcast to everyone else. Things like Second Life rely on animation downloading, which is why at clubs you will have moments where the avatars are almost suspended in time while your viewer grabs the animation.

The reason this somewhat matters is the key fact that all the data transmitted from a user on desktop or on mobile is roughly the same as from an avatar in VR, since everyone more or less has the same skeleton. This also means that if someone had written a custom animation, they could play it from anywhere, including their own HDD, and everyone could see it in real time. The key challenge with that data, which is mostly processed by the Avatar Mixer, is how to distribute all that information quickly and efficiently to everyone else. In fact, this was the first key thing brought up during the first stress test, with audio being the second issue (which resulted in having an outside Zaru party where everyone would mingle and occasionally cheer).
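
To put rough numbers on why distributing that data is the hard part, here is a minimal back-of-the-envelope sketch. The joint count, bytes per joint and update rate are purely hypothetical values chosen for illustration, not High Fidelity's actual figures, and the function name is made up. It only shows how a naive "send everyone to everyone" broadcast grows with the square of the number of avatars, whether they are on desktop or in VR.

```python
def avatar_mixer_fanout(num_avatars, joints=60, bytes_per_joint=6, updates_per_sec=30):
    """Naive broadcast cost for skeleton data (all defaults are hypothetical)."""
    # Size of one avatar update: same skeleton for desktop and VR users.
    payload = joints * bytes_per_joint
    # Every client sends its own skeleton to the server.
    inbound = num_avatars * payload * updates_per_sec
    # The server forwards every skeleton to every *other* client.
    outbound = num_avatars * (num_avatars - 1) * payload * updates_per_sec
    return payload, inbound, outbound


payload, inbound, outbound = avatar_mixer_fanout(360)
print(f"payload per update: {payload} bytes")
print(f"server inbound:  {inbound / 1e6:.1f} MB/s")
print(f"server outbound: {outbound / 1e6:.1f} MB/s")
```

With these made-up numbers the inbound side stays small while the naive outbound side lands above a gigabyte per second, which is why any real mixer has to cull, prioritise or compress rather than forward everything.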

That all being said, your friend does raise an interesting point: how these optimizations fare against an unpredictable user in VR versus a desktop user is worth noting. Things like the wave challenge the server’s ability to handle mass changes all at once on a rather large scale.

However, I would like to add something extra to this idea: while more niche, one could say that having users with full body tracking would also be a unique stress test in itself. It is one thing to have the usual tracking of the head and hands, but full body tracking means having to possibly deal with even more immediate changes that may not have been accounted for. Imagine if the “wave” was instead a jump in the air. Users with headsets and nothing more would just pop up into the air, with the legs having no idea what to do. Users with at least 2 trackers (Vive) would be able to suspend into the air and have their knees bend as the legs aren’t at the same height. Users with 3 trackers could even have their knees in front of their virtual bodies as matched in VR (assuming a higher jump). The examples go on and on, posing a whole new challenge for the testing.

So perhaps something that could be introduced for desktop users is something to help REALLY stress the system. Just like how there was the feedback app, perhaps a set of ‘tools’ for the user to use to really try to bring the house down, like a bone rotation randomizer. This would give desktop users the ability to impose as much, if not more, stress compared to a VR user.

To close though, the current culprits seem to be two things: audio and download balancing. A key thing that could help with the latter is to have avatars near the user load first, at a higher priority than ones further away, which would make sense. Another could be to see (assuming this can be done) if the avatar being downloaded is KTX baked, since the mesh data is small and the texture data can be streamed in over time.

As for audio mixing, it will be interesting to see what can be done to improve it. One idea is the whispering feature, which I don’t know all the details for, but it may be interesting to have two audio mixers as an option: one as the global and the other for whispering from user A to user B, giving the primary server the information in a single, smaller stream containing all the ambient information when it is used. This could even impose another kind of test, where a game of telephone is done to stress this concept out.
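
The "load nearby avatars first" idea mentioned above can be sketched in a few lines. This is only an illustration of distance-based prioritisation: the dictionary layout, the names and the function `load_order` are hypothetical and do not come from the High Fidelity codebase.

```python
import math


def load_order(viewer_pos, avatars):
    """Return avatars sorted so the ones closest to the viewer download first."""
    return sorted(avatars, key=lambda a: math.dist(viewer_pos, a["position"]))


# Hypothetical crowd: names and positions are made up for the example.
crowd = [
    {"name": "far_avatar", "position": (40.0, 0.0, 25.0)},
    {"name": "near_avatar", "position": (1.0, 0.0, 0.5)},
    {"name": "mid_avatar", "position": (8.0, 0.0, 3.0)},
]

for avatar in load_order((0.0, 0.0, 0.0), crowd):
    print(avatar["name"])
```

A real client would probably also weigh things like whether the avatar is in view, but plain distance is enough to show the principle.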


We can’t really compare dynamic worlds (HF, SL, Sansar, VRChat) with MMO games (WoW, WoT, SWTOR, Entropia Universe…) where the environments are not modifiable at runtime (other than on the client side).


It wasn’t a record for the number of people in an HMD.
Show me a photo of more people together?


Dredging from my questionable memory, non-VR MMOs and environments like Second Life seem to max out at about 50-200 users per server, with things like Unity and Unreal based MMOs running far fewer.

Those products tend to run 10s, 100s, or even 1000s of servers to gain the large concurrency numbers you hear about.
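
As a quick sanity check on that arithmetic, here is a tiny sketch using the 50-200 users per server range from my memory above and the 10000 figure quoted at the start of the thread. The function name is just for illustration.

```python
def servers_needed(concurrent_users, users_per_server):
    """Smallest number of servers that can hold the given concurrency."""
    # Integer ceiling division, so a partly filled server still counts as one.
    return -(-concurrent_users // users_per_server)


print(servers_needed(10000, 200))  # best case from the range above
print(servers_needed(10000, 50))   # worst case from the range above
```

So a headline number like 10000 concurrent users would already imply somewhere between 50 and 200 separate servers under those assumptions.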


Last I recall, WoT matches are usually around 30 tanks in a single game instance (as matches are 15 vs 15). So it’s not really 10000 in a single game instance.

Nothing other than the login and chat servers would be able to handle those numbers, whereas modern MMOs are usually instanced across multiple server areas, with on-the-fly switching between them.

The only game I know to have 7.5k in a single game instance running over hundreds of servers is Eve Online, but they can do fun stuff with time dilation and slow down time to process all that mess :smiley: (look up Battle of B-R5RB), which you can’t really do when VR headsets are bound to meatspace time instead of ticks.


Pfft, haven’t you seen the matrix?!

In all seriousness, I have to agree. Also, Second Life sims are only limited to 110 agents and, again, any animations are not done in real time, as they’re based on pre-recorded animation files.

As for MMOs and whatnot, they’re really simple to write if you know what you are doing. When it comes to players, depending on the game, you mostly just need to update positions, aiming directions, and any delta changes since the last update, along with any other game data that other players need to know, assuming it has changed.

As for Eve, it can use time dilation because of how it handles lag: reaction timing is not relevant. Secretly, it only runs at 1 FPS, so dilation is performed when calculations take longer than 1 second. This is why most guides state users should round up anything that looks like a fraction, since that’s what the server will be doing anyway. This design choice improves scalability, which also works out for the players, who in those situations need to make more involved decisions: servers get an excuse to take a break and players get more time to make decisions. That’s why everyone says time dilation is like having a big party in space.
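
To illustrate the idea (and only the idea, this is not CCP's actual implementation), here is a minimal sketch assuming a 1 second tick as described above. The function names are made up: when a tick's calculations overrun the tick length, game time is slowed by the same ratio.

```python
def dilation_factor(compute_seconds, tick_seconds=1.0):
    """Fraction of real-time speed the simulation runs at (1.0 = no dilation)."""
    if compute_seconds <= tick_seconds:
        return 1.0  # the server keeps up, so no slowdown is needed
    return tick_seconds / compute_seconds


def wall_clock_seconds(game_seconds, factor):
    """Real seconds that pass while `game_seconds` elapse in the dilated world."""
    return game_seconds / factor


factor = dilation_factor(compute_seconds=4.0)
print(factor)
print(wall_clock_seconds(60.0, factor))
```

This only works because nothing in the game depends on split-second reactions. A VR headset cannot be told that one second now lasts four.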


An architectural description of World of Tanks can be viewed at these links:



Other audio servers (also open source) can manage 500/1000 people:

500 for Discord: https://www.reddit.com/r/discordapp/comments/466ys9/maximum_number_of_users_in_a_voice_channel/

1000 for Mumble Server: https://gaming.stackexchange.com/questions/3115/what-is-the-max-slots-for-a-mumble-server

Mumble does not have a slot limit you could not change yourself.

The number of slots of a server is set by the admin of the server by the users setting. It is not restricted by Mumble itself or the (FOSS) license.

Aside from this setting of Mumble server, the actual usability of slots may depend on:

  • Systems ulimit (Which should only start causing problems with > 8000 users - and then you can adjust your systems settings.) - (Actually, I’m not even sure this would be a problem; but I expect it to not be just with multiple vservers.)
  • Bandwidth
  • CPU usage - but you will very probably hit a bandwidth limit before hitting a CPU limit

One success story is an Eve Online guild using Mumble with > 1000 concurrently connected users for meetings, raids and whatnot. They have reported Mumble running perfectly fine even at peak times.

In the back of my head I remember of some issue with a vserver hoster that was reported, which had bad I/O performance. I am not sure if that was just an admin usability issue though, when managing a lot of user accounts which then lead to delayed messages or audio. An issue of bad I/O vserver hosting then though.


SpatialOS example:

At the moment of writing, the as-yet-unnamed project is scheduled for release in 2018, with the first playable content slated for the Spring of that year. The game is set to use SpatialOS to help empower a game world that will support:

  • 1,000 concurrent players occupying a detailed shared world
  • A huge 12km x 12km explorable, graphically realistic and highly dynamic environment, enabling advanced tactical gameplay
  • Strong character progression, social hubs and global-scale player-driven narrative that will shape a unique MMO experience
  • An additional last-man standing Player vs Player arena combat mode, with up to 400 players in direct combat.
  • Unprecedented world simulation and effects, including environmental destruction, roaming wildlife, dynamic weather, foliage displacement, tracks, blood trails, fire and water effects, fully immersing players into the experience of being the hunter – or the hunted.


How good, or how poor, is the sound quality from Mumble? Is it hi-fi quality, or more like old POTS telephone line quality? Maybe Mumble uses less bandwidth too.


Mumble uses Opus (I think Discord does too):



Opus can handle a wide range of audio applications, including Voice over IP, videoconferencing, in-game chat, and even remote live music performances. It can scale from low bitrate narrowband speech to very high quality stereo music. Supported features are:

  • Bitrates from 6 kb/s to 510 kb/s
  • Sampling rates from 8 kHz (narrowband) to 48 kHz (fullband)
  • Frame sizes from 2.5 ms to 60 ms
  • Support for both constant bitrate (CBR) and variable bitrate (VBR)
  • Audio bandwidth from narrowband to fullband
  • Support for speech and music
  • Support for mono and stereo
  • Support for up to 255 channels (multistream frames)
  • Dynamically adjustable bitrate, audio bandwidth, and frame size
  • Good loss robustness and packet loss concealment (PLC)
  • Floating point and fixed-point implementation


Standard quality:


The voice quality setting is all about balance. You don’t want to set it too low or you will get lots of lag and distorted voice, but you don’t want it set too high or some of your individual users will have lag. It is advised that you find a sweet spot that works for all of your users, but 72000 is generally the de facto standard and you won’t have any issues or lag at that setting. Setting the voice quality to anything above 130000 won’t increase it any further, because that is the maximum that Mumble can handle. You can read more about this setting on the official Mumble site: https://wiki.mumble.info/wiki/Murmur.ini#bandwidth


Here is the main difference though: Mumble and Discord audio do not have natural positional and rotational audio streams (by rotation I mean whether someone is facing you or not).

Sure, Mumble can do ‘positional’ with game engines that support it or have been modded for it, but it basically just controls the audio input from other users locally, in the client receiving the data. Source where this is done.

I have never been in a Discord or Mumble where 300+ users are speaking in the same room at the same time, because as soon as you have more than 30 active talkers, it tends to get out of control.

Even in Eve, the talking on such a server tends to be done by FCs (Fleet Commanders), as shown by how a voice chat sounds with a small fleet of 60+, or users are separated into wings for battle operations, never in a single room. So yes, you can have 1000+ users in an alliance server, but it tends to be separated into Alliances > Corps > Fleets > Wings. So you tend to have 3-5 people doing quick reports, with 1-3 people giving constant commands. If you had that many people in a channel talking, it would be quite inefficient.

So I feel like whoever said 500 users on Discord might be overestimating the number that were on the voice chat, or it just had a handful actually talking or issuing commands, due to no positional info.

So: rooms are the main way Mumble / Ventrilo / Discord control data flow. And you could technically have as many people as you want in the room as long as only a few are talking. But that would be like comparing Mumble/Discord audio streaming to Twitch video streaming, as everyone is listening to a select few users and using the low latency to receive commands.
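
A rough sketch of why "only a few are talking" matters so much for this kind of server. It assumes the server simply relays each talker's encoded stream to everyone else in the room without mixing, uses the 72000 bit/s setting quoted above, and ignores packet overhead. The function name is made up for the example.

```python
def relay_egress_bps(users_in_room, active_talkers, bitrate_bps=72000):
    """Server upload needed when each talker's stream is relayed unmixed."""
    # Each talker's stream goes to everyone in the room except the talker.
    return active_talkers * (users_in_room - 1) * bitrate_bps


few = relay_egress_bps(users_in_room=1000, active_talkers=3)
everyone = relay_egress_bps(users_in_room=1000, active_talkers=1000)
print(f"3 talkers:    {few / 1e6:.1f} Mbit/s")
print(f"1000 talkers: {everyone / 1e9:.1f} Gbit/s")
```

With three commanders talking, a 1000 user room is a couple of hundred Mbit/s of upload. If everyone talked at once it would be tens of Gbit/s, and every client would also have to receive and decode 999 streams.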

High Fidelity instead demonstrated that you can have hundreds of people having their own conversations at the same time in the same “room”, and still be able to somewhat hear each other even if you are not close. I was able to pick out people I recognize from the crowd just fine by how they speak.

However, their approach will use more CPU power on the server, as each user has their own processed audio stream.

So if you have 300 users, you have 300 separate streams, each calculated from the position and rotation of the 299 other users, with each having their own attenuation settings predetermined by their position. This is baked into a single audio stream, and the client then decides how it is heard depending on the audio zone (reverb, echo etc).

You can’t hear what someone is saying in the distance if the attenuation decides you can’t hear them, regardless of what you do to the client. The only way you can hear them is to get closer…

On the positive side, this ensures low-latency audio that is in sync with everyone else’s audio, and everyone’s bandwidth is taxed less because they only receive a single, combined stream regardless of the number of users. But it does mean it is really heavy on the server, especially if you want to process it for those 300 clients without latency.
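
To make the per-listener mixing described above concrete, here is a minimal sketch. The attenuation curve (inverse distance with a hard cutoff), the constants and the data layout are my own assumptions for illustration, not High Fidelity's actual audio mixer, and rotation is left out to keep it short.

```python
import math


def attenuation(distance, ref_distance=1.0, max_distance=50.0):
    """Hypothetical gain curve: full volume up close, silent beyond the cutoff."""
    if distance >= max_distance:
        return 0.0
    return ref_distance / max(distance, ref_distance)


def mix_for_listener(listener, users):
    """One output sample for one listener: every other user, scaled by distance."""
    mixed = 0.0
    for source in users:
        if source is listener:
            continue  # you do not hear your own voice in your mix
        distance = math.dist(listener["pos"], source["pos"])
        mixed += attenuation(distance) * source["sample"]
    return mixed


def mix_all(users):
    """The server produces a separate mix per user: N * (N - 1) gain calculations."""
    return [mix_for_listener(user, users) for user in users]


users = [
    {"pos": (0.0, 0.0), "sample": 1.0},
    {"pos": (2.0, 0.0), "sample": 0.5},
    {"pos": (100.0, 0.0), "sample": 1.0},  # too far away to hear or be heard
]
print(mix_all(users))
print(300 * 299)  # gain calculations per audio frame for 300 users
```

Each client then only receives one stream, but the server pays for it with work that grows with the square of the crowd size on every audio frame.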

Now regarding SpatialOS: they didn’t launch, nor have I heard from them recently. It sounds more like marketing speak without being able to test or use their tech.


SpatialOS works fine. For example, in a single “giant” shard: https://store.steampowered.com/app/322780/Worlds_Adrift__Early_Access_MMO/

and with other multiplayer games: https://improbable.io/games/made-with-spatialos


Ah Worlds Adrift is Spatial OS?

Interesting, I thought it was just… Oh, it’s Unity running on Improbable. This is why I haven’t heard about it that much, as it’s more of a network architecture thing for any game engine.

Regardless, I have yet to bump into more than a handful of players in it, so I haven’t seen the 1000 users on a single island while talking in voice. Even with a single shard, you can have multiple servers running instances within the shard, because you can’t have a single server running everything. Not even High Fidelity does, if you’ve seen the AWS monsters they run.

Now I get it. SpatialOS seems to be competing more with Amazon Lumberyard (without the engine pairing that Lumberyard has) than anything else. And that’s pretty much saying “throw more servers at it” and let them handle the problem, for game developers without much backend or DevOps experience in their teams.

It lightens the load on the client and instead puts it on the “cloud”, or the servers themselves. That’s basically what High Fidelity’s domain servers do, but it’s something we can host ourselves… which is distributed, just like what Improbable says they do.

It still doesn’t feel like this has actually been demonstrated the way High Fidelity has. It has only been marketed in private tests, just like how High Fidelity claims they’ve tested up to 600 users, yet during load tests they have issues at around 300+ with real users.

Granted though, I’d love to see another game tackle the same issues regardless of the engine used, but I doubt any would match in architecture or goal.

Show me another engine where everything is in-world, with positional and directional audio and joint streaming, and with 300+ users in the same domain and area talking and hearing each other? I’d love to see it.


This is always the trick in demos: what works in the lab and what works with troublesome users and their dodgy internet and flaky PCs are two different things.
I kinda think it’s not doing badly considering what they put together.


I agree that a simple audio conference (even if spatial) is not exactly the same kind of test that HF seems to have proved it can handle. So to me Mumble or Discord seem unrelated.

Regarding other MMOs, it would be interesting to have some references that are not just internal claims, but real events. Is the HF 360 concurrency record an internal claim itself, or can it be seen as an objective metric?

It’s not that I’m biased towards HF. HF may actually not be the best actor on the stage, but what I’d like to get is some kind of fair comparison with similar MMO products (this was actually the original curiosity that pushed me to write this post). Maybe there is no clear answer and everything is to be considered quite fuzzy?

So far it seems that nothing like that has been set up. Maybe this is where it would be interesting to ask the HF technicians whether there is an objective and fair metric, acknowledged by third parties, by which HF’s performance can be shown to be in first place, in second place, or in a category where there is only HF and nobody else :slight_smile:
This could help to avoid hype and subjective opinions.