Voice volume still not at a normal signal level


#1

I still have problems with the voice volume, which is too soft on the standard domain config. Sounds are often too loud.

It’s still something that needs to be fixed: the voice volume is too low.
In desktop mode it is three times worse. In the HMD it’s fine if you put the Steam volume at the max, which is way too loud for most other games.

High Fidelity needs to amplify the voice volume in software to a more normal level. It’s really about time this gets fixed!

It’s too soft. It’s better to be able to lower it a bit than to put everything above max and still be low on voice (but not on sounds).


Audio attenuation groups
#2

Read a few acoustics docs. The near field for voice audio in humans is approximately a 1m sphere. Then power drop-off is 6dB per distance doubled from the near field edge, so from the source it is: 3m -6dB, 5m -12dB, 9m -18dB, 17m -24dB. This assumes no directionality and no large reflective surfaces.
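
As a quick check of those figures, here is a minimal Python sketch. It assumes the fall-off is counted in doublings of the distance beyond the near-field edge, starting 1m past that edge; that reading is an assumption, chosen because it reproduces the numbers above. The function name and parameters are made up for illustration.

```python
import math

def falloff_db(distance_m, near_field_m=1.0, ref_m=1.0):
    """Free-field fall-off in dB for a source at distance_m (hypothetical helper).

    Assumption: 0 dB inside the near field and up to ref_m beyond its edge,
    then about -6 dB for every doubling of the distance beyond the edge.
    """
    beyond_edge = distance_m - near_field_m
    if beyond_edge <= ref_m:
        return 0.0
    return -20.0 * math.log10(beyond_edge / ref_m)

# Distances measured from the source, as in the post above.
for d in (3, 5, 9, 17):
    print(f"{d} m: {falloff_db(d):.1f} dB")
```

The printed values land within about 0.1dB of -6, -12, -18 and -24dB.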


#3

Those are maybe RL numbers. That does not completely work in VR.
By default, voice is just not amplified enough to reach a correct output signal on good cards. On the other hand, sounds are way too loud.

If I had not added, because of High Fidelity, a compressor/limiter in the audio path before it goes to the speakers or headphones, it would be really soft and not usable.

They only need to amplify the voice signal to a higher level; they do not need to change any other settings.


#4

I do not believe HF’s audio fall-off curve follows what I wrote. I believe they drop off too fast, and I will be measuring that in the next day. The fall-off in RL is shallower than what I wrote previously too because usually people are in rooms, and the wall reflection creates directionality, a fall-off reduction factor of 2.

My point is that there are a few things one could do to improve audio listenability. One is to implement a near field. It could be larger than 1m too. People tend to stand further apart in VR even when wearing HMD gear, so a 2-3m virtual near field where the fall-off is near zero would be great. Then the fall-off could follow the 6dB free-field fall-off rule. Just adding that one parameter, near-field distance, for zones could make a huge improvement in how people could model their rooms or areas. I hope this is given consideration. @Philip

And yes, the other problem is the odd case where sounds injected by non-voice sources always seem so much louder than voice. Is it that the source levels for voice or other sound-producing entities are incorrect?


#5

Here’s a mockup domain setting. Similar changes would go in the zone settings:


#6

Except that the user cannot change it when he’s in the domain.
They always pointed to that setting, but as you said before, other changes are needed.


#7

There are multiple items that should be addressed, but the most important one I believe needs attention is the mix and the levels relative to the avatar’s ear. A simple volume level is not enough because, as you, others and I have pointed out in the past, there are big problems in the mix between sources.

The last time I wrote and talked about this I suggested having different mixing levels based on sound type, but I now believe that bit of new complication, which would run throughout the entire audio mixing system, could be dispensed with by just implementing a higher fidelity mixing model. It needs, first, a near field model. Then it could use directionality effects, though that too could be ignored if the near field is a programmable distance. Then, finally, when the mix is closer to something realistic, an overall volume level would be nice.


Streaming Audio in highfidelity
#8

What about changing the volume of sources based on where you (the receiver) are looking? In other words, if there are a lot of people talking, and you are looking at one person, their audio is transmitted to you unattenuated. This is something we can easily do in the virtual world (because the audio mixer knows exactly where you are looking).
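
A minimal Python sketch of that idea, under stated assumptions: the mixer has the listener's position and look direction as 3D vectors, and a source counts as "looked at" when it lies within a small cone around the look direction. The function name, the 15 degree cone and the pass-through gain of 1.0 are all hypothetical choices for illustration, not how the actual mixer works.

```python
import math

def gaze_gain(listener_pos, look_dir, source_pos, distance_gain,
              focus_angle_deg=15.0):
    """Return the linear gain for one source (hypothetical mixer helper).

    Assumption: a source within focus_angle_deg of the listener's look
    direction is passed through unattenuated (gain 1.0); any other source
    keeps its normal distance-based gain.
    """
    to_source = [s - l for s, l in zip(source_pos, listener_pos)]
    dist = math.sqrt(sum(c * c for c in to_source))
    look_len = math.sqrt(sum(c * c for c in look_dir))
    if dist == 0.0:
        return 1.0            # source sits exactly at the listener
    if look_len == 0.0:
        return distance_gain  # no usable look direction, fall back
    cos_angle = sum(a * b for a, b in zip(to_source, look_dir)) / (dist * look_len)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    angle_deg = math.degrees(math.acos(cos_angle))
    return 1.0 if angle_deg <= focus_angle_deg else distance_gain

# Listener at the origin looking down -Z; one source ahead, one to the side.
ahead = gaze_gain((0, 0, 0), (0, 0, -1), (0, 0, -5), distance_gain=0.25)
side = gaze_gain((0, 0, 0), (0, 0, -1), (5, 0, 0), distance_gain=0.25)
```

A hard cone edge like this would make the volume jump as the head turns; a smooth blend between the two gains is one possible refinement.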

Having played with different attenuation/distance strategies, a couple of very difficult problems arise pretty quickly. One is that in our testing everyone seems to have a different preference, and therefore we’d likely need knobs that everyone could almost constantly ‘ride’, which would carry the risk that we’d lose a large proportion of people who are not interested in or capable of climbing that learning curve.

A second problem is that with only 16 bits of dynamic range, we are nowhere near the capability of the ear, and therefore no matter what we do, the audio will either be ‘always too loud’ or ‘inaudible almost immediately at greater distance’. We cannot take advantage of the huge dynamic range of the actual human ear, so it is very difficult to position the much narrower 16-bit ‘window’ in a way that makes everyone happy.


#9

Without some attenuation, that would lead to another kind of sonic chaos. I like the directionality approach but it would be best to follow how acoustics works with it. Directionality is a power multiplier (it can be defined as an attenuation reducer).

I do believe it is possible to come up with an attenuation curve that fits the 95dB range. It is a matter of deciding where infinity begins. It is a problem that can be characterized with nearness issues (near field). Things very near need a near zero or zero drop-off. This is indeed how things work in RL given where we usually find ourselves. Then we need attenuation beyond the near field, which is presently programmable.

The biggest problem with the too-loud phenomenon is twofold. One is that the audio injection of voice needs some auto-levelling. We have people with very different mike gains and different distances between mike and mouth, and at close-up distances it is entirely easy for a person to generate well over a 60dB difference. I just tried that with my Reed sound meter, placing the windscreen ball next to my mouth: speaking softly at 50dB, loudly to 80dB and shouting to 120dB. That kind of 70dB dynamic range, which is typical for the near-mouth pickup of gamer headsets, needs up-front levelling or compression in the interface app itself. It ought not range more than 20dB.
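
To make the 70dB to 20dB target concrete, here is a rough Python sketch of a static levelling curve. It assumes the 50dB to 120dB input range measured above and squeezes it linearly (in dB) into a 20dB window, which works out to a 3.5:1 ratio. The function name, defaults and the choice to anchor the window at the soft end are assumptions for illustration; a real compressor would also need attack and release smoothing, which this leaves out.

```python
def level_voice_db(input_db, in_min=50.0, in_max=120.0, out_range=20.0):
    """Map a measured input level to a levelled one (hypothetical static curve).

    Assumption: levels between in_min and in_max are squeezed linearly in dB
    into a window out_range wide that starts at in_min; levels outside the
    input range are clamped to its ends first.
    """
    ratio = (in_max - in_min) / out_range  # 70 / 20 = 3.5:1 with the defaults
    clamped = max(in_min, min(in_max, input_db))
    return in_min + (clamped - in_min) / ratio

soft = level_voice_db(50.0)    # soft speech stays where it is
loud = level_voice_db(80.0)    # loud speech is pulled down
shout = level_voice_db(120.0)  # shouting ends up 20 dB above soft speech
```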

The second problem is that of entities injecting audio. Sound effects might be too loud although that problem is more of a designer issue. It has always existed even in SL because people often adjust effects to be heard on small speakers. Put on headphones and it becomes all too clear how overly loud those SFX can be.

I think that just a few knobs could do it:

  • A near field distance between sound emitters and an avatar’s ears
  • An infinity distance where the attenuated level goes to near zero (-95dB for 16-bit audio)
  • An attenuation coefficient that applies after the near field distance and up to the infinity distance
  • A directionality factor (typically a 1/2 attenuation reduction, i.e. things you look at seem louder if they are outside the near field)

I think one can go quite far with that. The question is how difficult is it to set that up?
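
As a rough idea of how little is involved, the four knobs listed above could be sketched in Python like this. Every name and default here is hypothetical, and the fall-off is assumed to be counted in doublings of the distance beyond the near-field edge, starting 1m past it. Nothing here reflects the actual mixer code.

```python
import math
from dataclasses import dataclass

@dataclass
class AttenuationKnobs:
    """Hypothetical per-domain or per-zone settings; names and defaults are made up."""
    near_field_m: float = 2.0     # no fall-off inside this distance
    infinity_m: float = 200.0     # at or beyond this, the source is silent
    db_per_doubling: float = 6.0  # attenuation coefficient past the near field
    directionality: float = 0.5   # attenuation multiplier for sources looked at
    floor_db: float = -95.0       # treated as silence for 16-bit audio

def attenuation_db(distance_m, knobs, looked_at=False, ref_m=1.0):
    """Attenuation in dB (0 or negative) for a source at distance_m."""
    if distance_m >= knobs.infinity_m:
        return knobs.floor_db
    beyond_edge = distance_m - knobs.near_field_m
    if beyond_edge <= ref_m:
        return 0.0  # inside the near field, or within ref_m of its edge
    db = -knobs.db_per_doubling * math.log2(beyond_edge / ref_m)
    if looked_at:
        db *= knobs.directionality  # halve the attenuation by default
    return max(db, knobs.floor_db)

knobs = AttenuationKnobs()
print(attenuation_db(4.0, knobs))                  # 2 m past the edge
print(attenuation_db(4.0, knobs, looked_at=True))  # same source, looked at
```

With these defaults the level steps down abruptly to the floor at the infinity distance; a steeper coefficient or a fade near that distance would be needed to avoid the jump.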

[edit] brain fart, it’s a 95dB level range (48dB power range) with 16-bit audio
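
For reference, a short Python check of where figures like these can come from. It assumes the two numbers correspond to applying 20*log10 and 10*log10 to the 2^16 distinct values of a 16-bit sample; that mapping is a guess at what was meant, not something stated in the post.

```python
import math

levels = 2 ** 16  # distinct values in a 16-bit sample

level_range_db = 20 * math.log10(levels)  # about 96.3, close to the 95dB figure
half_range_db = 10 * math.log10(levels)   # about 48.2, close to the 48dB figure

print(f"20*log10(2^16) = {level_range_db:.1f} dB")
print(f"10*log10(2^16) = {half_range_db:.1f} dB")
```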


#10

@Balpien.Hammerer already said most of it.
The problem with adjusting voice volume based on direction is that when the speaker is on your right and something happens on your left that you need to look at, the volume changes in the wrong direction and you hear the speaker less well.

On top of that, there are people with microphone signals that are already too loud. With this idea it sounds like that problem would only get worse.

In desktop mode my external microphone is routed through a compressor/limiter so the signal is stable. Sadly I cannot route the Vive through it.