Using with Godot Engine
> [!TIP]
> Even though you can interact with the Godot part using GDScript (such as the multiplayer), I highly recommend you use C# for all the voice chat manipulation, for performance's sake.
Before we begin, make sure you've got the C# (Mono) version of Godot. Once that's done, you can proceed to installing the package.
To install OpenVoiceSharp, you first need to open a terminal in the folder containing your project and type:

```
dotnet add package openvoicesharp
```

Once that's done, you will notice that you can now use OpenVoiceSharp. However, you will also notice upon running that your game/app crashes, because the native library files are not in the folder.
For that, you need to make sure the library files are in the same folder as your project. But first, build your .NET project; this can be done through Visual Studio (or VS Code) or the CLI.
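For the CLI route, run this from your project folder:

```
dotnet build
```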
The following files are required (replace Debug with Release if needed):
| DLL file name | Path |
|---|---|
| OpusDotNet.dll | /.godot/mono/temp/bin/Debug/OpusDotNet.dll |
| RNNoise.NET.dll | /.godot/mono/temp/bin/Debug/RNNoise.NET.dll |
| WebRtcVadSharp.dll | /.godot/mono/temp/bin/Debug/WebRtcVadSharp.dll |
| opus.dll | /.godot/mono/temp/bin/Debug/runtimes/windows-x64/opus.dll |
| rnnoise.dll | /.godot/mono/temp/bin/Debug/runtimes/windows-x64/rnnoise.dll |
| WebRtcVad.dll | X:/Users/[user]/.nuget/packages/webrtcvadsharp/[version]/build/win-x64/WebRtcVad.dll |
> [!WARNING]
> You also need to make sure those files are present where your game is built & run. If needed, you can use a batch file to automate the process.
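For instance, a minimal batch file sketch (the source and destination paths here are assumptions; adjust Debug/Release and the destination to your setup):

```bat
:: copy the managed and native DLLs next to the exported game
:: (example paths; adjust to your build output and export folder)
xcopy /Y ".godot\mono\temp\bin\Debug\*.dll" "export\windows\"
xcopy /Y ".godot\mono\temp\bin\Debug\runtimes\windows-x64\*.dll" "export\windows\"
```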
Once this is out of the way, edit your .csproj file and add the following line to the `<PropertyGroup>`:

```xml
<CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies>
```

This is pretty annoying, I know. But I have not found an alternative so far.
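In context, your project file might look something like this (the Sdk version and TargetFramework are just examples; keep whatever your project already uses):

```xml
<Project Sdk="Godot.NET.Sdk/4.2.0">
  <PropertyGroup>
    <TargetFramework>net8.0</TargetFramework>
    <CopyLocalLockFileAssemblies>true</CopyLocalLockFileAssemblies>
  </PropertyGroup>
</Project>
```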
Now that you've done the first annoying part, let's take a break from the annoying stuff and start programming!
To begin, create a C# file containing a node or singleton able to manage the voice aspect of things. A barebones example project can be found here.
Here you have two choices: either use the capture and recorder classes and deal with the annoying stuff (just remember to convert to the right format later), or use the BasicMicrophoneRecorder class that does the job for ya.
For simplicity's sake, I've preferred to use BasicMicrophoneRecorder, but using the built-in Godot stuff is a viable option. I recommend you check out how another package does it (OLD, but works); it should not be that hard to use.
Why I prefer BasicMicrophoneRecorder over Godot's native recording API:
- Resampling is required: Godot's API does not resample automatically and supplies Vector2[] float samples, so you lose performance converting to the right format.
- It is not straightforward to use: you have to collect samples manually, and they do not fit the exact buffer size you wish to capture.
- It is not multithreaded: BasicMicrophoneRecorder, by contrast, runs on a separate worker thread for performance.
Again, remember to use the right format: convert to 16-bit PCM before encoding, and depending on what you're doing, store a reusable float[] or byte[] buffer for memory efficiency.
The following samples are going to come from VoiceChatManager.cs.
> [!NOTE]
> For simplicity's sake, I used Godot's default ENet high-level multiplayer to handle this job. But as shown, this event callback is not invoked on the main thread, meaning the data must be queued for the main thread to process and send. That behavior can of course be customized according to your needs.
```csharp
// microphone rec
MicrophoneRecorder.DataAvailable += (pcmData, length) => {
    // if not connected or not talking, ignore
    if (!Connected) return;
    if (!VoiceChatInterface.IsSpeaking(pcmData)) return;

    // encode the audio data and apply noise suppression.
    // you cannot call RPC outside the main thread, so we queue it.
    QueuedData.Add(VoiceChatInterface.SubmitAudioData(pcmData, length));
};

MicrophoneRecorder.StartRecording();
```

> [!TIP]
> I highly advise using a thread other than the main thread for this, for performance. Because this is meant to be barebones boilerplate for you to implement, I did not have crucial performance in mind; however, the following example should be fast and efficient enough for most CPUs.
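To give an idea of the main-thread side, here's a minimal sketch of draining that queue, assuming QueuedData is a ConcurrentBag of encoded packets; ReceiveVoiceData is a hypothetical RPC name for this example, not part of OpenVoiceSharp:

```csharp
// drain the packets queued by the recorder thread and send them from the main thread.
// assumes QueuedData is a ConcurrentBag<(byte[], int)> (System.Collections.Concurrent).
public override void _Process(double delta)
{
    while (QueuedData.TryTake(out (byte[] encodedData, int encodedLength) packet))
        Rpc(nameof(ReceiveVoiceData), packet.encodedData, packet.encodedLength);
}
```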
Note, however, that memory efficiency could be better: the float sample buffer for each player/peer could be pre-allocated (using VoiceUtilities.GetSampleSize()) and stored somewhere else, like in a dictionary.
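A minimal sketch of that idea (the GetSampleSize() call is shown as mentioned above; double-check its actual signature in OpenVoiceSharp):

```csharp
// one pre-allocated float buffer per peer, created once and reused for every packet
private readonly Dictionary<long, float[]> PeerSampleBuffers = new();

private float[] GetOrCreateBuffer(long peerId)
{
    if (!PeerSampleBuffers.TryGetValue(peerId, out float[] buffer))
    {
        buffer = new float[VoiceUtilities.GetSampleSize()]; // check the actual signature
        PeerSampleBuffers[peerId] = buffer;
    }

    return buffer;
}
```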
The workflow should look a little bit like this:

```
VoiceChatManager node
 |_ AudioStreamPlayer [named via player id] for each peer, with the continuous playback generator as its stream
```

And of course, the workflow can be customized to your needs.
Let's cover this via bullet points:
To play back the audio we receive, we have 3 major questions to answer:
- How do I play audio in Godot?
- How do I supply PCM samples (voice data)?
- How do I play back the samples?
For the first question, you can play audio using an AudioStreamPlayer. These instances contain a Stream, which can be one of multiple types.

But for this purpose, we need the AudioStreamGenerator class, which allows us to supply raw audio as it comes in, at a specific rate and buffer length; exactly what we are looking for.

I handle this in the CreateStreamPlayer() function:

```csharp
public void CreateStreamPlayer(long id) {
    AudioStreamPlayer streamPlayer = new() {
        Name = id.ToString(),
        Stream = new AudioStreamGenerator() {
            MixRate = 48000,
            BufferLength = 0.02f * 60.0f // increase if needed
        },
        Autoplay = true,
        Bus = "VC" // you can change it to what you want
    };

    AddChild(streamPlayer);
    streamPlayer.Play();
}
```

We procedurally (via code) generate those audio streams as players connect, and when we join (to sync with players that were connected before us).
Now, we have to get into how we create the AudioStreamGenerator and its properties:

- `MixRate = 48000` (for 48 kHz)
- `BufferLength = 0.02f * 60.0f` (adjustable later on)
> [!WARNING]
> Having a buffer length as low as 0.02f (20 ms) will cause audio crackling, as the CPU won't be able to keep up. This is why we allow a few more samples in our buffer: it causes slightly more latency/delay, but gives smooth and stable audio playback. You can increase this depending on the CPU usage/performance you're getting.
Once we've created our AudioStreamPlayer and the stream that will handle playback, we add it to our VoiceChatManager node and play it.
Make sure to enable Autoplay on the stream player so the samples play automatically as they arrive.
I also assign it to the "VC" bus (which I will get into later); this will allow us to apply real-time effects.
And now, continuing how and when we create those streams:
```csharp
Multiplayer.PeerConnected += (id) => {
    CreateStreamPlayer(id);
};

Multiplayer.PeerDisconnected += (id) => {
    if (!HasNode(id.ToString())) return;
    RemoveChild(GetNode(id.ToString()));
};

Multiplayer.ConnectedToServer += () => {
    Connected = true;

    int[] previousPeers = Multiplayer.GetPeers();
    for (int i = 0; i < previousPeers.Length; i++) {
        CreateStreamPlayer(previousPeers[i]);
    }

    SetStatus("Connected");
};
```

Great, we have prepared our streams; now we actually need to feed them voice data. However:
- Godot takes in float32 PCM data
- I use stereo and Godot takes in stereo
- The playback takes in Vector2 frames
When decoding the packets, we get a byte[] PCM array (16-bit PCM), while Godot wants float32 samples. This is why we have to decode the data and convert it to floats.
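If you're curious, the conversion VoiceUtilities.Convert16BitToFloat performs is roughly equivalent to this (an illustrative sketch, not OpenVoiceSharp's actual implementation):

```csharp
// 16-bit little-endian PCM -> normalized float32 samples
static void Convert16BitToFloatSketch(byte[] pcm, float[] output)
{
    for (int i = 0; i < output.Length; i++)
    {
        // read a signed 16-bit sample and normalize it to [-1, 1]
        short sample = (short)(pcm[i * 2] | (pcm[i * 2 + 1] << 8));
        output[i] = sample / 32768.0f;
    }
}
```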
Now, Godot takes in stereo, which is okay since we record the microphone data in stereo. But what if you used mono? Simple: whenever you push a frame, you supply the LEFT and RIGHT channels as a Vector2, whether the audio is mono or not.
Stereo:

```csharp
// left channel
sample.X = samples[i];
// right channel
sample.Y = samples[i + 1];
```

If you used mono audio, you would have an array half the size, meaning you'd have to duplicate the single (mono) channel.
Mono:

```csharp
// left and right channel (we use the same sample)
sample.X = samples[i];
sample.Y = samples[i];
```

If you wanted to get the average of both channels for whatever reason, you can always use (leftChannel + rightChannel) / 2.0f.
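Put together, a mono push loop could look like this (a sketch; it assumes samples holds mono float data and playback is the peer's AudioStreamGeneratorPlayback, as in the example below):

```csharp
// mono: one float per frame, duplicated to both output channels
Vector2 sample;
for (int i = 0; i < samples.Length; i++) {
    sample.X = samples[i];
    sample.Y = samples[i];

    playback.PushFrame(sample);
}
```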
Now, here's the example code:

```csharp
// decode data
(byte[] decodedData, int decodedLength) = VoiceChatInterface.WhenDataReceived(encodedData, encodedLength);

AudioStreamPlayer streamPlayer = GetNode(senderId.ToString()) as AudioStreamPlayer;
var playback = streamPlayer.GetStreamPlayback() as AudioStreamGeneratorPlayback;

// skip if the generator buffer is full
if (playback.GetFramesAvailable() <= 0) return;

// step 1, convert to float32
float[] samples = new float[decodedLength / 2]; // half it
VoiceUtilities.Convert16BitToFloat(decodedData, samples);

// step 2, convert to vector2
Vector2 sample;
for (int i = 0; i < samples.Length; i += 2) {
    sample.X = samples[i];
    sample.Y = samples[i + 1];

    playback.PushFrame(sample);
}
```

And to finish: we grab the playback we created earlier, check that it's not full, push the frames, and that's it!
> [!TIP]
> I highly advise you create a Vector2 that you reuse for each frame (as done above), for memory efficiency, instead of creating a new one every time.
Godot has its own way of handling audio effects (and audio mixing in general) with something called Audio Buses, via the AudioServer. It is a great and powerful tool that can be accessed at the bottom of the editor by default and can be set up there.

For this example, I created an example bus with the reverb effect (disabled by default) via the editor, but that can also be created and set up via code:
```csharp
// create reverb effect
int reverbBusIdx = AudioServer.BusCount;

AudioServer.AddBus(reverbBusIdx);
AudioServer.AddBusEffect(reverbBusIdx, new AudioEffectReverb(), 0);
AudioServer.SetBusName(reverbBusIdx, "VC");
AudioServer.SetBusEffectEnabled(reverbBusIdx, 0, false);

ReverbBox.Toggled += (toggled) => {
    AudioServer.SetBusEffectEnabled(reverbBusIdx, 0, toggled);
};
```

Here it is pretty straightforward: I make a new bus with the reverb effect added, and I enable/disable that effect whenever the reverb box is toggled.
As for noise suppression:

```csharp
NoiseSuppressionBox.Toggled += (toggled) => {
    VoiceChatInterface.EnableNoiseSuppression = toggled;
};
```

Very straightforward: just toggle EnableNoiseSuppression from the VoiceChatInterface on or off.
> [!TIP]
> The same logic can be applied to any kind of effect you wish to make. For instance, if you make proximity chat and your players are in an echoey room, you can enable the reverb effect shown here. If you wish to see more information about audio effects (with any engine), click here.
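For example, a proximity-style toggle could look like this (a sketch; EchoRoomArea and LocalPlayerName are hypothetical names for this example, not part of OpenVoiceSharp):

```csharp
// hypothetical: enable the VC bus reverb while the local player is inside an echoey room
EchoRoomArea.BodyEntered += (body) => {
    if (body.Name == LocalPlayerName)
        AudioServer.SetBusEffectEnabled(AudioServer.GetBusIndex("VC"), 0, true);
};

EchoRoomArea.BodyExited += (body) => {
    if (body.Name == LocalPlayerName)
        AudioServer.SetBusEffectEnabled(AudioServer.GetBusIndex("VC"), 0, false);
};
```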
(Demo video: Desktop.2024.04.12.-.02.33.04.14.mp4)
There you go! You should now have fully integrated OpenVoiceSharp in no time. Remember to check the barebones example project in case you're lost.