Speech recognition for business

Updated: 07.04.2026

2026. Google released an AI dictation app that works offline



Google has released a free iOS dictation app called Google AI Edge Eloquent. It will compete with apps such as Wispr Flow, SuperWhisper, Willow, and others. On installation, the app downloads a Gemma-based automatic speech recognition model to your phone. The app shows a real-time transcription of your speech, and when you pause, it automatically filters out filler words like "uh" and "ah" and refines the text. When cloud mode is enabled, the app uses an online Gemini model to clean up the text. The app keeps a history of your transcription sessions and lets you search through all of them. It can also show the words dictated in the last session, your speaking speed in words per minute, and the total number of words spoken.
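To illustrate the kind of cleanup and statistics described above, here is a minimal Python sketch. It is not how the app works internally (the source says it relies on Gemma and Gemini models); the filler list, the regex approach, and the function names `strip_fillers` and `words_per_minute` are all assumptions made for the example.

```python
import re

# Hypothetical filler list, chosen for illustration only.
FILLERS = ("uh", "um", "ah", "er")

# Match a standalone filler word, an optional trailing comma or period,
# and any whitespace that follows it.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, FILLERS)) + r")\b[,.]?\s*",
    flags=re.IGNORECASE,
)


def strip_fillers(text: str) -> str:
    """Remove standalone filler words and collapse leftover whitespace."""
    cleaned = _FILLER_RE.sub("", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def words_per_minute(text: str, seconds: float) -> float:
    """Speaking rate: number of whitespace-separated words per elapsed minute."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return len(text.split()) / (seconds / 60.0)
```

For example, `strip_fillers("Uh, I think, um, we should ship it")` yields `"I think, we should ship it"`. A pattern-based filter like this only handles the simplest cases; a language model can also fix punctuation and rephrase false starts.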


2026. Microsoft unveiled AI models for speech generation and recognition



Microsoft continues to develop its own line of AI models, MAI (to maintain independence from its partner OpenAI). The new MAI-Transcribe-1 speech-to-text model, according to the company, demonstrates the best accuracy on the FLEURS benchmark for 25 of the most commonly used languages and is 2.5 times faster than the previous Azure Fast solution. Microsoft claims the model is optimized for real-world conditions, including noise and unstable audio. The second new model, MAI-Voice-1, is designed for speech generation. It can create up to 60 seconds of audio in just one second, preserving intonation and voice characteristics. The developers have also added the ability to create a custom voice from a few seconds of recording, which simplifies the creation of voice interfaces and AI agents.


2021. Oki-Toki updated its Speech Analytics



The developers of Oki-Toki have stopped charging for transcripts and made them free. Their focus is on making transcripts not just a standalone feature but a tool for automated speech analytics for operators. Analysis and transcription of recordings are now available to the operator. You can create your own rules and dictionaries of monitored words, connect them to projects, and track violations, sales, and other important trigger words in real time. Combined with the quality control tool, this speeds up call review, because calls are tagged. If you only need to track specific calls, you can now filter by rules, just like hashtags in a CRM.
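The rules and monitored-word dictionaries described above can be pictured with a small Python sketch. This is not Oki-Toki's API or implementation; the `Rule` class and the `tag_transcript` and `filter_calls` functions are hypothetical names, and matching is done on whole words only.

```python
import re
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Rule:
    """A hypothetical rule: a tag plus a dictionary of monitored words."""
    tag: str
    words: FrozenSet[str]


def tag_transcript(transcript: str, rules: List[Rule]) -> Set[str]:
    """Return the tags of all rules whose monitored words occur in the transcript."""
    tokens = set(re.findall(r"\w+", transcript.lower()))
    return {rule.tag for rule in rules if rule.words & tokens}


def filter_calls(calls: Dict[str, str], rules: List[Rule], tag: str) -> List[str]:
    """Return the sorted IDs of calls whose transcript carries the given tag."""
    return sorted(
        call_id
        for call_id, text in calls.items()
        if tag in tag_transcript(text, rules)
    )
```

With rules such as `Rule("sale", frozenset({"discount", "invoice"}))`, calling `filter_calls(calls, rules, "sale")` returns only the calls that mention one of those words, which is the hashtag-style filtering the item describes. Monitored words must be stored in lowercase, because transcripts are lowercased before matching.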


2021. Microsoft is acquiring Nuance for $19.7B



Microsoft will acquire Nuance Communications, a leader in speech-to-text software, for $19.7 billion. Microsoft says the deal is about increasing its presence in the healthcare vertical, where Nuance has done well in recent years. Microsoft announced Microsoft Cloud for Healthcare last year, and this deal is meant to accelerate its push into that market. Nuance's products in this area include Dragon Ambient eXperience, Dragon Medical One, and PowerScribe One for radiology reporting. Nuance also has a number of other products, including Dragon Dictate, a consumer and business speech-to-text product that dates back to the early 1990s.


2019. GoToMeeting has improved its interface and speech recognition


LogMeIn has released a new version of its popular video conferencing system, GoToMeeting. It features a completely redesigned user interface, unified across devices. The developers have also improved audio quality, promising high quality even on slow internet connections. To capture the results of video meetings, the new version adds real-time notes and speech recognition, so users can read a meeting's transcript as a dialogue. Meeting organizers can now create branded virtual meeting rooms and gather their teams for communication and collaboration at any time. Integrations with Office 365, Outlook, Google Calendar, and Slack have been updated.