This is one of the most concrete uses of artificial intelligence: automatically transcribing an audio file into text. It is a revolution for students, doctors and journalists in particular, but also for businesses.
Over the past year, the number of AI-based automatic transcription solutions, both software and hardware, has exploded. The starting point, to be precise, was September 2022 and the release of Whisper, a speech recognition technology provided by OpenAI. That was two months before the public launch of the company's other baby, ChatGPT, which would attract even more attention. But alongside that global success, Whisper began to serve as the foundation for a multitude of applications that convert audio files into text.
Take the example of a journalist who conducts an interview and needs to transcribe the audio of the questions and answers: a tedious exercise that can take several hours. Now, thanks to AI, you simply import the audio file into an application to obtain the full transcript of the interview, in text form, within a few seconds.
The fidelity of this text depends on the audio quality, on the quality of the AI and on its model: the bigger the model, the more memory and computing power it requires and the longer the processing takes, but also the more accurate the transcription. Finally, fidelity also depends on the language: American AI models are trained on less French than English. So yes, there are always mistakes to correct, but the time savings are spectacular in every case.
To take advantage of AI transcription, you can use an application on your phone or computer. Dozens of them rely on Whisper or on other technologies such as IBM’s Watson. For example: Google Speechnotes on Android; Transcribe on iPhone; the voice recognition built into Windows on a PC; and MacWhisper on Mac. In this galaxy you will find several business models, from free to paid, including subscriptions with a monthly quota of transcription minutes, and some even offer translation into other languages.
In all cases, prefer applications that transcribe locally, that is, without using the cloud. One example is Chuchotis, offered on Mac by Denis Delbecq, a former researcher and journalist for the Swiss daily Le Temps, a colleague who is very attentive to confidentiality and the protection of sensitive information.
And then there is this accessory, the Plaud Note, launched in Europe last week at the Viva Technology show in Paris. Picture a revolutionary aluminum voice recorder, the size of a credit card and just as thin, held magnetically in its case on the back of your smartphone. Press a button and the Plaud Note uses its microphones and vibration sensors to record either the sound around you or your phone conversation. A haptic vibration and a red LED confirm that recording has started.
The mobile application then lets you obtain the transcript, and even an impressive summary, thanks to ChatGPT version 4, in several possible formats (conference, lecture, medical consultation, discussion, etc.). I tried it on a dissertation defense, and the result was spectacular. The only downside: it relies on a cloud service that remains opaque. A future update could make it possible to use a cloud hosted in France, for greater data security, and to choose another AI, such as the one from the French company Mistral AI.