How I'm Using Google's AI Studio and Gemini in Podcast Post-Production

Andrew Mitrak

11 Mar 2025 • 2 min read

Google AI Studio and Gemini are my go-to tools when it comes to transcribing podcast interviews and formatting them so they’re worthy of a blog post and newsletter.

Here is my exact workflow, with the prompts I used for my most recent podcast.

I upload the MP3 to Google AI Studio, which excels at handling audio files. My prompt:

"The attached file is an interview for a podcast called A History of Marketing between Andrew Mitrak and Sergio Zyman, Chief Marketing Officer of Coca Cola. It is about the history of New Coke, the Cola Wars in the 1980s and early 1990s. Please generate a clean transcript and remove "um" and other filler words and accidentally repeated words but otherwise be as accurate as possible."

Providing context in the prompt (names, topic) makes for much more accurate output. I review the transcript in Google Docs using its error-checking features.
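If you reuse the same prompt for every episode, it can help to keep the context fields (names, role, topic) separate from the fixed instructions. Here is a minimal sketch of that idea, not part of my actual workflow: the function name `build_transcription_prompt` and its parameters are hypothetical, and the wording simply mirrors the prompt above.

```python
def build_transcription_prompt(
    podcast: str, host: str, guest: str, guest_role: str, topic: str
) -> str:
    """Assemble a transcription prompt that front-loads names and topic.

    Hypothetical helper: the field names are assumptions made for
    illustration; the sentence structure follows the prompt quoted above.
    """
    return (
        f"The attached file is an interview for a podcast called {podcast} "
        f"between {host} and {guest}, {guest_role}. "
        f"It is about {topic}. "
        'Please generate a clean transcript and remove "um" and other filler '
        "words and accidentally repeated words but otherwise be as accurate "
        "as possible."
    )


prompt = build_transcription_prompt(
    podcast="A History of Marketing",
    host="Andrew Mitrak",
    guest="Sergio Zyman",
    guest_role="Chief Marketing Officer of Coca Cola",
    topic="the history of New Coke, the Cola Wars in the 1980s and early 1990s",
)
print(prompt)
```

Only the five context values change between episodes, so the instructions about filler words and accuracy stay consistent from one interview to the next.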

I then upload this version of the transcript to my YouTube video, where it is a big improvement over YouTube's auto-generated subtitles.

Next, I use the Gemini app. I attach a PDF of journalistic transcription instructions and use this prompt, followed by the full text of the transcript:

"The following is an interview transcript. Please format this for a blog. It is a transcript of the conversation that was originally spoken. Please make edits that correct for grammar so it is easier to be read, while still being accurate to what was spoken. Please add line breaks when speakers alternate. When there is a longer answer, break it up as needed into separate paragraphs."

This cleans up the text, adds formatting, and attributes dialogue. The output at this point looks a lot like a blog post! I export it to Google Docs, where a 30-minute interview comes to about 10 pages.

For SEO and scannability, I use this prompt:

"Please suggest SEO-optimized headers to add to this blog. Make them descriptive of sections. Keep them short, but don't try to be cute. Make sure they improve scannability. Use H2 and H3 formats."

This generates headers that I insert into the blog post. I rewrite and edit them, but AI saves a lot of time by producing the first draft.
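Before editing the suggestions by hand, a quick script can flag headers that are too long to scan. This is a rough sketch under two assumptions: that the suggestions come back as Markdown-style `##` and `###` lines (the output format may differ), and that "short" means eight words or fewer, which is an arbitrary threshold. The function name `check_headers` and the sample headers are made up for illustration.

```python
import re

# Matches Markdown H2 ("## ") and H3 ("### ") lines only.
HEADER_RE = re.compile(r"^(#{2,3})\s+(.+?)\s*$")


def check_headers(text: str, max_words: int = 8) -> list[tuple[int, str, bool]]:
    """Return (level, title, is_short) for every H2/H3 line in the text."""
    results = []
    for line in text.splitlines():
        match = HEADER_RE.match(line)
        if match:
            level = len(match.group(1))
            title = match.group(2)
            results.append((level, title, len(title.split()) <= max_words))
    return results


# Hypothetical sample, not real output from Gemini.
sample = """## Why Coca-Cola Launched New Coke
Some transcript text.
### The Backlash
More transcript text.
## A Very Long Header That Goes On And On Without Ever Reaching Its Point
"""

for level, title, ok in check_headers(sample):
    print(f"H{level} {'ok' if ok else 'too long'}: {title}")
```

This only checks length and heading level. Whether a header is descriptive, or trying too hard to be cute, still needs a human read.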

Finally, I review the blog post while listening to the MP3. This lets me check both the transcript and the audio file for errors at the same time. At 2x speed, this process takes 15 to 30 minutes.
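The arithmetic behind that estimate is simple enough to sketch. Assuming a 30-minute interview (the length used as an example earlier, not a fixed rule), pure listening at 2x takes 15 minutes, so the upper end of the range leaves roughly 15 minutes for pausing and fixing errors. The function name `listening_minutes` is hypothetical.

```python
def listening_minutes(audio_minutes: float, playback_speed: float) -> float:
    """Time needed to play the audio straight through at a given speed."""
    if playback_speed <= 0:
        raise ValueError("playback_speed must be positive")
    return audio_minutes / playback_speed


audio_length = 30.0  # assumed interview length in minutes
base = listening_minutes(audio_length, 2.0)
print(base)  # 15.0 minutes of uninterrupted listening

# Whatever remains of a 30-minute review session goes to pausing and editing.
editing_budget = 30.0 - base
print(editing_budget)  # 15.0
```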

This workflow with Google AI Studio and Gemini has streamlined my post-interview process. It's not just about saving time; it's about producing something I wouldn’t have made without the help of AI.

I wouldn’t bother with transcripts if I had to do them manually. With this workflow, the interview is more accessible to audiences who prefer to read rather than listen or watch. It’s also more discoverable, and a better overall experience for everybody.

Hope this long-form, detailed post is useful to those learning to use AI tools. I'm continuing to make this process faster each time. Would appreciate any of your AI tips if you have them!