Add an API that can transcribe any audio or video file input, e.g. a podcast MP3 or a video interview.
5 votes
Right now there's no easy way to export test cases with their steps. Our clients need test cases with steps in a printable, readable format. Exporting from a query after adding a column for steps inserts far too many HTML tags between the steps, and the result is not readable at all.
3 votes
Include speech phrase elements, profanity tags, and confusion network data in Speech to Text API results
Come on, I know you've got it. It would really do me a solid, help a brother out, if you could include the individual phrase elements with timings and perplexity measurements inside the results JSON for the speech.phrase socket event. Confusion network data would also be fantastic to have.
{
  "DisplayText": "Where are the speech phrase elements?",
  "LexicalForm": "where are the speech phrase elements",
  "InverseTextNormalizationResults": ["where are the speech phrase elements"],
  "MaskedInverseTextNormalizationResults": ["where are the speech phrase elements"]
}
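To make the gap concrete, here is a minimal Python sketch that parses the payload quoted above. It assumes the four fields are wrapped in braces to form one JSON object, and the check for per-word entries is only an illustrative heuristic, not part of any official SDK. With only phrase-level strings available, a client can split the lexical form into words but gets no timings or confidence for them.

```python
import json

# Payload copied from the post; wrapping it in braces is an assumption
# made here so that it parses as a single JSON object.
payload = """{
  "DisplayText": "Where are the speech phrase elements?",
  "LexicalForm": "where are the speech phrase elements",
  "InverseTextNormalizationResults": ["where are the speech phrase elements"],
  "MaskedInverseTextNormalizationResults": ["where are the speech phrase elements"]
}"""

result = json.loads(payload)

# The best a client can do today: split the lexical form on whitespace.
words = result["LexicalForm"].split()

# Illustrative heuristic: per-word elements would most likely arrive as a
# list of objects, so look for any field holding a non-empty list of dicts.
has_word_elements = any(
    isinstance(value, list) and value and isinstance(value[0], dict)
    for value in result.values()
)

print(words)
print("per-word elements present:", has_word_elements)
```

Every field in the payload is either a string or a list of strings, so there is nothing to attach a per-word timing or perplexity value to, which is what this request asks for.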
As with the Bing Speech API, I want a WebSocket endpoint.
wss://eastasia.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1
1 vote
Currently there doesn't appear to be a good way to give feedback for any of the speech or vision APIs.
1 vote