Getting relative loudness value from RecordingManager.StartRecordingStream? #430
Hello,

My goal: in a realtime session, get user microphone input and determine if it's mostly silence or not, so that the UI can be updated based on 'loudness'.

What I currently have: I edited RealtimeBehaviour.cs to be similar to the following:

private async void RecordInputAudio(CancellationToken cancellationToken)
{
var memoryStream = new MemoryStream();
var semaphore = new SemaphoreSlim(1, 1);
try
{
// we don't await this so that we can implement buffer copy and send response to realtime api
// ReSharper disable once MethodHasAsyncOverload
// Note: I'm using the PCM encoder instead of the Wav
RecordingManager.StartRecordingStream<PCMEncoder>(BufferCallback, 24000, cancellationToken);
...
...
...
if (bytesRead > 0)
{
IsAudioDetected(buffer);
await session.SendAsync(new InputAudioBufferAppendRequest(buffer.AsMemory(0, bytesRead)), cancellationToken).ConfigureAwait(false);
}
...
...
}
private async void IsAudioDetected(ReadOnlyMemory<byte> bufferCallback)
{
await Awaiters.UnityMainThread;
if (userInputAudioClip == null)
{
// 16000 comes from RecordingManager's settings
userInputAudioClip = AudioClip.Create("Microphone clip", 16000, 1, 16000, false);
source.clip = userInputAudioClip;
source.Play();
}
userInputAudioClip.DecodeFromPCM(bufferCallback.ToArray(), PCMFormatSize.SixteenBit, geminiInputSampleRateHz);
float[] spectrumData = new float[sampleCount];
source.GetSpectrumData(spectrumData, 0, windowType);
// Treatment: log, multiplier, offset
// This formula is taken from the Medium blog post
for (int i = 0; i < spectrumData.Length - 1; i++)
{
spectrumData[i] = Mathf.Lerp(spectrumData[i], (logScale ? Mathf.Log(spectrumData[i]) : spectrumData[i]) * ratio + offset, lerpSpeed * Time.deltaTime);
}
avg = spectrumData.Average();
isAudioDetected = avg > some value;
}

Problem I'm seeing: no matter the microphone input, I get essentially the same result. Is there a more efficient way to achieve my goal?
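For comparison, here's a more direct variant I've been considering: skip the AudioClip/GetSpectrumData round-trip entirely and compute an RMS loudness straight from the 16-bit PCM bytes the stream already delivers. A rough sketch (ComputeRmsLoudness is a hypothetical helper of mine, assuming little-endian 16-bit mono PCM as produced by PCMEncoder):

// requires: using System; using UnityEngine;
private static float ComputeRmsLoudness(ReadOnlySpan<byte> pcm16)
{
    // Each sample is two little-endian bytes; normalize to [-1, 1] and accumulate squares
    double sum = 0;
    var sampleCount = pcm16.Length / 2;
    for (var i = 0; i < sampleCount; i++)
    {
        var sample = (short)(pcm16[2 * i] | (pcm16[2 * i + 1] << 8)) / 32768f;
        sum += sample * sample;
    }
    // Root mean square: ~0 for silence, approaching 1 for full-scale audio
    return sampleCount > 0 ? Mathf.Sqrt((float)(sum / sampleCount)) : 0f;
}

Called from the existing buffer callback, e.g. isAudioDetected = ComputeRmsLoudness(buffer.AsSpan(0, bytesRead)) > threshold, this would stay off the main thread and avoid allocating a spectrum array per callback.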
Replies: 1 comment 1 reply
Make sure you put this component on the same GameObject as StreamAudioSource, and make sure it is the last component on the object in the inspector.

Sample from com.utilities.audio: AudioReactiveBehaviour

public class AudioReactiveBehaviour : MonoBehaviour
{
[SerializeField]
private Transform targetSphere;
[SerializeField]
private float scaleMultiplier = 10f;
[SerializeField]
private float smoothSpeed = 5f;
[SerializeField]
private float currentScale = 1f;
[SerializeField]
private float targetScale = 1f;
/// <summary>
/// Called automatically by Unity's audio system on the audio thread
/// </summary>
/// <param name="data"></param>
/// <param name="channels"></param>
/// <remarks>
/// IS NOT SUPPORTED IN WEBGL! <see href="https://docs.unity3d.com/ScriptReference/MonoBehaviour.OnAudioFilterRead.html"/>
/// </remarks>
private void OnAudioFilterRead(float[] data, int channels)
{
// Compute the RMS value (volume level)
var sum = 0f;
var length = data.Length;
for (var i = 0; i < length; i += channels)
{
var sample = data[i];
sum += sample * sample;
}
var rms = Mathf.Sqrt(sum / (length / (float)channels));
var volume = rms * scaleMultiplier;
// Thread-safe way to pass data to the main thread
targetScale = Mathf.Clamp(1f + volume, 1f, 3f);
}
private void Update()
{
// Smoothly interpolate to new scale on main thread
currentScale = Mathf.Lerp(currentScale, targetScale, Time.deltaTime * smoothSpeed);
if (targetSphere != null)
{
targetSphere.localScale = Vector3.one * currentScale;
}
}
}
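If you need a boolean for the original silence-detection question rather than a scale value, the same RMS can be converted to decibels and compared against a floor. A sketch along the same lines (the -40 dB threshold is only a starting guess to tune per microphone, and isAudioDetected is an illustrative field, not part of the package):

using UnityEngine;

public class SilenceDetector : MonoBehaviour
{
    [SerializeField]
    private float silenceThresholdDb = -40f; // tune per microphone/environment

    // Written on the audio thread, read on the main thread
    private volatile bool isAudioDetected;

    public bool IsAudioDetected => isAudioDetected;

    private void OnAudioFilterRead(float[] data, int channels)
    {
        // Same RMS computation as above, sampling the first channel of each frame
        var sum = 0f;
        for (var i = 0; i < data.Length; i += channels)
        {
            sum += data[i] * data[i];
        }
        var rms = Mathf.Sqrt(sum / (data.Length / (float)channels));
        // Convert to dBFS; clamp to avoid log(0) on pure silence
        var db = 20f * Mathf.Log10(Mathf.Max(rms, 1e-7f));
        isAudioDetected = db > silenceThresholdDb;
    }
}

Same caveats apply: OnAudioFilterRead runs on the audio thread and is not supported in WebGL.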