Add Audio Direction of Arrival / Sound Source Localization to the Sensor SDK
This is similar to a suggestion that was made here but prematurely declined: https://feedback.azure.com/forums/920053-azure-kinect-dk/suggestions/38328904-add-sound-source-localization-tracking-separatio
The Microsoft Speech Devices SDK does not provide the low-level access to audio direction of arrival (DoA) that sound source localization requires. It would be great if the Azure Kinect Sensor SDK could provide this beam information, as the old Kinect SDKs (1 and 2) did. It should be surfaced as a continuous stream of the direction of audio arrival in the horizontal plane (0 to 360 degrees).
This information would open up a ton of useful applications for the microphone array in conjunction with body tracking, e.g., for interactive robots. For example, if a robot is tracking multiple people in a scene, it would need DoA beam information to determine in real time which person is speaking, and therefore which person to turn to, respond to, etc. This scenario is currently not possible with the Azure Kinect, and it is not enabled by anything in the current Windows Audio architecture or Speech SDKs.
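To make the robot scenario concrete, here is a rough sketch of how an application might combine a DoA stream with body tracking, if such a stream existed. Everything in it is an assumption for illustration: the SDK does not currently expose a DoA angle, the function names (`azimuth_deg`, `angular_distance_deg`, `pick_speaker`) are hypothetical, body positions are assumed to be given as (x, z) in a sensor frame with +z pointing forward and +x to the right, and 0 degrees is assumed to mean straight ahead. It also ignores any offset between the microphone array and the depth camera.

```python
import math

def azimuth_deg(x, z):
    """Horizontal angle of a point in degrees, in [0, 360).

    Assumed convention: 0 is straight ahead (+z), angles grow toward +x.
    """
    return math.degrees(math.atan2(x, z)) % 360.0

def angular_distance_deg(a, b):
    """Smallest absolute difference between two angles, handling wrap-around at 360."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def pick_speaker(doa_deg, bodies, max_error_deg=15.0):
    """Return the id of the tracked body closest in azimuth to the DoA angle.

    bodies maps a body id to its (x, z) position in the assumed sensor frame.
    Returns None if no body lies within max_error_deg of the DoA angle.
    """
    best_id, best_err = None, max_error_deg
    for body_id, (x, z) in bodies.items():
        err = angular_distance_deg(doa_deg, azimuth_deg(x, z))
        if err <= best_err:
            best_id, best_err = body_id, err
    return best_id

# Hypothetical usage: two tracked people, one to the right and one to the left.
bodies = {1: (1.0, 1.0), 2: (-1.0, 1.0)}
speaker = pick_speaker(40.0, bodies)  # the DoA value would come from the requested stream
```

The angle threshold (`max_error_deg`) is an arbitrary placeholder; a real application would tune it to the accuracy of the beam estimate and would probably smooth the DoA stream over time before matching.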