Friday, December 5, 2025

I had a huge audio transcription problem – Gemini solved it, and ChatGPT didn’t


You know how they say, "It's not a competition!" Well, don't let them fool you; everything is a competition, especially when it comes to AI. There's rarely a day when I’m not testing AI capabilities across multiple chatbots, and I’m almost always surprised by the results. Some platforms really are better than others – at least for some tasks.

This journey began with Notes on my iPhone 17 Pro Max. Normally, I like to record interviews on an Android smartphone like the Google Pixel 10 Pro Fold, where the incredible Recorder app expertly captures every utterance and, in the transcription, does a deft job of separating and labeling each speaker.

However, I arrived for this interview with just my iPhone. I know that buried inside Notes, an app I use obsessively across my iPhone and desktop (I have almost 2,500 notes), are audio recording capabilities hidden under the attachment icon (a paperclip).

Notes does a good job of recording audio, and I found my 20-minute recording perfectly captured in a note. Included was what appeared to be a useful transcription. A quick scan confirmed its accuracy, but there was a big problem: it didn't label the speakers; everything blended into one long soliloquy. This would make it difficult to scan the text and pick apart my subject's quotes from my own questions and observations.

I resigned myself to a relisten, during which I would add my own labels… until I had a different thought: What if Gemini could help?

Gemini 3 Pro puts on its gloves

In recent months, I've been impressed with Google Gemini's capabilities, especially the latest 3 Pro models, and how it seems to handle almost any prompt request with aplomb.

Now that I had the idea, I had to figure out how to get Gemini to listen to the recording. Playing back the audio on my iPhone speakers and asking Gemini to listen was out because I worried about how well, say, my desktop mics might pick up the sound coming out of the iPhone speakers. Plus, I was in the office and didn't want people to overhear the private conversation (until I published a story).


First, I found that you can download the audio file from Notes. In playback, under the three dots, there's a Share button that lets me AirDrop the audio file to my 14-inch MacBook Pro. It comes down as an MPEG-4 (M4A) file.

Back in Gemini 3 Pro, I selected the "+" sign in the prompt field, chose the M4A audio file, and added this brief prompt: "Listen to this, transcribe it and be sure to identify the different speakers."


There was no back and forth. Gemini 3 Pro quickly started spitting out the full transcript with speakers identified as "Interviewer" and the name and title of my subject. It's worth noting here that the name is the one thing Gemini 3 Pro inexplicably got completely wrong. Although my subject spelled out his name at the end of the chat, Gemini chose a different one. Other than that, though, Gemini perfectly identified when it was me or my subject speaking. And the accuracy was truly impressive.

For the sake of completeness, I asked Gemini 3 Pro to correct the identification of my subject and list me as the "Interviewer". With that fixed, I happily used the transcript to help drive my full story.
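
For readers who would rather fix a wrong speaker label themselves than re-prompt the chatbot, here is a minimal Python sketch of the idea. It assumes the transcript has been saved as plain text with one "Label: text" turn per line; that layout, the function names, and the sample names below are all hypothetical, and real chatbot output may be formatted differently.

```python
import re

# Assumed (hypothetical) format: every turn is a single line, "Label: text".
# Anything before the first colon is treated as the speaker label.
TURN = re.compile(r"^(?P<speaker>[^:\n]{1,60}):\s*(?P<text>.+)$")


def parse_turns(transcript):
    """Return a list of (speaker, text) tuples, one per labeled line."""
    turns = []
    for line in transcript.splitlines():
        match = TURN.match(line.strip())
        if match:
            turns.append((match["speaker"].strip(), match["text"].strip()))
    return turns


def rename_speaker(turns, wrong, right):
    """Replace one speaker label everywhere it appears."""
    return [(right if speaker == wrong else speaker, text)
            for speaker, text in turns]


def quotes_from(turns, speaker):
    """Collect only the lines spoken by the given speaker."""
    return [text for who, text in turns if who == speaker]


# Made-up sample data, not taken from the actual interview.
sample = """Interviewer: How did the project start?
John Doe, CEO: It began as a weekend experiment.
Interviewer: And now?
John Doe, CEO: Now it is our main product."""

turns = rename_speaker(parse_turns(sample), "John Doe, CEO", "Jane Roe, CEO")
subject_quotes = quotes_from(turns, "Jane Roe, CEO")
```

Because the sketch treats everything before the first colon as a label, an unlabeled continuation line that happens to contain a colon would be misread as a new turn, so the result is worth a quick check against the original.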

In this corner, ChatGPT

Naturally, though, I was curious whether ChatGPT 5.1 (with a Plus account) could accomplish the same task.

In the ChatGPT prompt window, I selected the audio file and entered the exact same prompt. ChatGPT told me, "I can definitely transcribe audio, but I can’t access or play the .m4a file directly from the location you referenced."

What followed was an extensive back-and-forth in which ChatGPT kept suggesting different ways for me to upload the file, including converting it into a zip file. No matter what I did, ChatGPT would show the audio file in the prompt window, but it couldn't listen to it.

In this little competition, it seems, Gemini 3 Pro is the victor, turning a frustrating problem into an easy win. The less said about how useless Apple's Notes transcription is, the better.
