Keeping up with long-form content is one of the biggest time sinks for developers and knowledge workers.
Podcasts, conference talks, and YouTube tutorials are invaluable sources of information, but a 90-minute episode demands 90 minutes of your attention.
I built Audio Notes, a YouTube and podcast summarisation tool, to solve this problem for myself.
Audio Notes takes any YouTube or podcast URL and produces an AI-powered summary complete with key takeaways, action items, and topic tags.
Instead of watching an hour-long video, you read a 3-minute summary and decide whether the full content is worth your time.
In this post I walk through how the tool works, the technology behind it, and how I combine it with my AI Researcher agent to stay across new developments without drowning in content.
~
The Problem
The pace of change in AI and software development is relentless.
New frameworks ship weekly.
Conference talks pile up.
Podcast backlogs grow faster than you can listen.
The traditional options are:
- Watch everything and lose hours each day
- Skim titles and miss important content
- Rely on someone else’s summary and hope they captured what matters to you
None of these are great. I wanted a tool that could process the source material directly and surface the parts that matter.
~
How Audio Notes Works
The workflow is three steps.
1. Paste a URL
Drop in any YouTube video or podcast URL. The tool accepts public video links and extracts the audio for processing.

2. Transcribe and Summarise

The audio is sent to Azure Speech Services for batch transcription. Once the transcript is ready, OpenAI generates a structured summary using a tailored prompt. This produces:
- A concise written summary of the content
- Key takeaways pulled from the discussion
- Action items if any are mentioned
- Topic tags for quick categorisation
3. Review Insights
The summary page presents everything at a glance.
At the top, three stat cards show the original content duration, the estimated reading time for the summary, and the time saved.
For a 60-minute podcast, you typically get a summary that takes 3-4 minutes to read. That is a time saving of more than 90% on every piece of content you process.
Below the stats, the full summary is rendered with markdown formatting, followed by the key takeaways, action items, and topic badges.
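As a rough illustration, the three stat cards could be computed along these lines. This is a Python sketch, not the tool's actual .NET code: the reading speed of 230 words per minute and the names SummaryStats, estimate_reading_minutes, and build_stats are assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed average reading speed; the post does not say what Audio Notes uses.
WORDS_PER_MINUTE = 230


@dataclass
class SummaryStats:
    """Values behind the three stat cards (hypothetical shape)."""
    duration_minutes: float   # length of the original audio or video
    reading_minutes: float    # estimated time to read the summary

    @property
    def minutes_saved(self) -> float:
        # Never report a negative saving if the summary is longer than the source.
        return max(self.duration_minutes - self.reading_minutes, 0.0)

    @property
    def percent_saved(self) -> float:
        if self.duration_minutes <= 0:
            return 0.0
        return 100.0 * self.minutes_saved / self.duration_minutes


def estimate_reading_minutes(summary_text: str,
                             words_per_minute: int = WORDS_PER_MINUTE) -> float:
    """Estimate reading time from a whitespace-separated word count."""
    return len(summary_text.split()) / words_per_minute


def build_stats(duration_minutes: float, summary_text: str) -> SummaryStats:
    return SummaryStats(duration_minutes, estimate_reading_minutes(summary_text))
```

Under these assumptions, a 690-word summary of a 60-minute episode reads in about 3 minutes, which works out to 57 minutes and 95% saved.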
~
The Technology Stack
Audio Notes is built on .NET Core with the following services and libraries:
- Azure Speech Services
- OpenAI
- Azure Blob Storage
- SQL Server
- Semantic Kernel
The batch transcription pipeline runs asynchronously. You submit a URL, a background process handles the download, transcription, and file storage, and the web app polls for completion.
Once the transcript is available, summary generation takes a few seconds.
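The submit-then-poll pattern described above can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the tool's .NET code: the in-memory job store stands in for the database, the transcribe callable stands in for the download and Azure Speech batch transcription step, and names such as submit, get_status, and wait_for_completion are hypothetical.

```python
import threading
import time
import uuid
from enum import Enum


class JobStatus(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# In-memory stand-in for a job table (assumption; a real app would persist this).
_jobs = {}
_lock = threading.Lock()


def _process(job_id, url, transcribe):
    """Background worker: runs the placeholder transcription and records the result."""
    with _lock:
        _jobs[job_id]["status"] = JobStatus.RUNNING
    try:
        transcript = transcribe(url)
        with _lock:
            _jobs[job_id].update(status=JobStatus.SUCCEEDED, transcript=transcript)
    except Exception as exc:
        with _lock:
            _jobs[job_id].update(status=JobStatus.FAILED, error=str(exc))


def submit(url, transcribe):
    """Register a job, start it in the background, and return its id immediately."""
    job_id = str(uuid.uuid4())
    with _lock:
        _jobs[job_id] = {"status": JobStatus.QUEUED, "url": url}
    threading.Thread(target=_process, args=(job_id, url, transcribe), daemon=True).start()
    return job_id


def get_status(job_id):
    """Return a snapshot of the job, as a status endpoint might."""
    with _lock:
        return dict(_jobs[job_id])


def wait_for_completion(job_id, interval=0.05, timeout=5.0):
    """Poll until the job succeeds or fails, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = get_status(job_id)
        if job["status"] in (JobStatus.SUCCEEDED, JobStatus.FAILED):
            return job
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

The point of the design is that submit returns straight away, so the web request is never blocked by a long transcription; the caller only checks the stored status at intervals.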
~
Combining Audio Notes with the AI Researcher Agent

This is where things get interesting.
In a previous post I described how I built an AI Researcher and Newsletter Publisher using the Microsoft Agent Framework with background responses.
That agent searches for the latest developments across blogs, GitHub repositories, and news sources, then compiles a newsletter.
I use both tools together as part of my weekly learning workflow:
- The AI Researcher agent identifies what is new and noteworthy. It surfaces blog posts, release announcements, or conference talks I should pay attention to.
- When the researcher flags a long-form video or podcast, I feed the URL into Audio Notes to get the summary.
- I scan the key takeaways and decide whether the full content warrants a deeper look.
This combination means I can process a week’s worth of AI and development news in under an hour.
The researcher agent handles breadth, telling me what exists.
Audio Notes handles depth, telling me what each piece of content actually says.
Neither tool replaces the other. Together they cover a pipeline from discovery to quick understanding.
~
A Practical Example

A recent workflow looked like this.
The AI Researcher agent flagged a 6-hour Lex Fridman podcast discussion with David Heinemeier Hansson (DHH), the creator of Ruby on Rails.
Rather than blocking out time to listen, I pasted the URL into Audio Notes and within minutes I had:
- A summary of the episode
- Key takeaways on hiring, software development, work-life balance and more
- Action items to consider
The summary told me everything I needed to decide. Total time spent: 2 minutes instead of 6 hours. In the end, I decided to listen to the full episode another time, maybe on a long drive.
~
Time Saved at Scale

The time savings compound quickly. If you process 5 pieces of long-form content per week, each averaging 45 minutes, that is nearly 4 hours of listening.
With Audio Notes, the same content takes roughly 20 minutes to review as summaries.
Over a month that is close to 14 hours reclaimed. The summary page makes all this visible.
Each summary shows the original duration alongside the estimated reading time and a percentage indicator of time saved.
It is a small detail, but seeing “XX% time saved” on every summary reinforces that the tool is doing its job.
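The arithmetic in this section can be checked with a few lines of Python. The function name weekly_savings and the four-week month are assumptions made for the example; the inputs are the figures quoted above.

```python
def weekly_savings(items_per_week, avg_minutes, review_minutes, weeks=4):
    """Compare full listening time with summary review time.

    `weeks` is the number of weeks counted as a month (assumed to be 4).
    """
    listening = items_per_week * avg_minutes
    saved_per_week = listening - review_minutes
    percent = 100 * saved_per_week / listening if listening else 0.0
    return {
        "listening_hours_per_week": listening / 60,
        "saved_hours_per_week": saved_per_week / 60,
        "saved_hours_per_month": saved_per_week * weeks / 60,
        "percent_saved": percent,
    }


figures = weekly_savings(items_per_week=5, avg_minutes=45, review_minutes=20)
# listening_hours_per_week -> 3.75   (5 x 45 = 225 minutes, "nearly 4 hours")
# saved_hours_per_month    -> about 13.7 (205 minutes x 4 weeks, "close to 14 hours")
# percent_saved            -> about 91.1
```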
~
What’s Next and Ideas
I am continuing to refine the summarisation prompts to improve the quality of key takeaways and action item extraction.
I am also exploring the possibility of batch-processing multiple URLs or entire playlists in a single operation, so the AI Researcher agent could automatically feed its discoveries into Audio Notes without manual intervention.
If you are interested in the background responses pattern that powers the AI Researcher agent, I covered the implementation in detail in the background responses post.
~
Summary
Audio Notes turns long-form video and podcast content into structured, scannable summaries.
Combined with the AI Researcher agent for content discovery, it forms a complete pipeline for staying current without the time commitment of consuming everything in full.
You can learn more about the AI Researcher in the new Microsoft Agent Framework course.
~


