Welcome to my world!

UI/UX designer and automation developer building scalable products.

10+ years of UI/UX Design experience

English

Work info

Client: Transcrivers
Role: Automation Developer
Year: 2025

Project Overview

The client managed a large archive of audio and video recordings stored in Google Drive. Their goal was to transform these recordings into a searchable knowledge base that could be queried through a custom GPT. Existing tools imposed file size restrictions and produced transcripts without metadata, leaving valuable content locked away.


As the workflow architect, I shaped a system that could seamlessly process large files, extract transcripts, and feed them into a GPT-powered knowledge system. I designed a workflow that handled ingestion, transcription, and indexing with secure data handling.

Problem

The biggest challenge was that the available transcription models were limited to files under 25 MB, while the client's recordings often ran to gigabytes. Manual workarounds were unsustainable, leaving a bottleneck between raw media and usable transcripts. The client also needed time-coded metadata and speaker context for reference and learning purposes.
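To make the size limit concrete: before anything could be transcribed, each oversized recording had to be cut into pieces a size-capped API would accept. Below is a minimal sketch of that chunking step, assuming ffmpeg is installed and on PATH; the file names and the one-hour segment length are illustrative, not the client's actual settings.

```python
import subprocess
from pathlib import Path

def split_for_transcription(source: Path, out_dir: Path,
                            segment_seconds: int = 3600) -> list[Path]:
    """Strip the video track, downmix to mono 16 kHz audio, and cut the
    result into fixed-length segments small enough for size-capped APIs."""
    out_dir.mkdir(parents=True, exist_ok=True)
    pattern = out_dir / "chunk_%03d.mp3"
    subprocess.run(
        [
            "ffmpeg", "-i", str(source),
            "-vn",                       # drop the video stream
            "-ac", "1", "-ar", "16000",  # mono, 16 kHz: plenty for speech
            "-f", "segment", "-segment_time", str(segment_seconds),
            str(pattern),
        ],
        check=True,
    )
    return sorted(out_dir.glob("chunk_*.mp3"))
```

Downmixing to mono 16 kHz audio keeps each chunk small without hurting speech recognition quality.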


This meant the solution had to transcribe at scale, integrate directly with Google Drive, and push results into a vector database for GPT queries. Balancing accuracy, scalability, and ease of maintenance became the core problem to solve.

Solution

I designed an automated pipeline using n8n as the orchestration engine, FFmpeg for preprocessing, and AssemblyAI as the transcription service due to its ability to handle large files and return diarized, timestamped text. Alternatives such as Whisper with chunking workflows or self-hosted Whisper were also considered.
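For readers curious about the transcription call itself, here is a hedged sketch against AssemblyAI's public REST API: upload the audio, create a transcript job with speaker diarization enabled, and poll until it completes. The API key placeholder and five-second polling interval are illustrative, and error handling is trimmed for brevity.

```python
import time
import requests

API_KEY = "YOUR_ASSEMBLYAI_KEY"  # placeholder, not a real key
BASE = "https://api.assemblyai.com/v2"
HEADERS = {"authorization": API_KEY}

def transcribe(path: str) -> dict:
    # 1. Upload the (already chunked) audio file.
    with open(path, "rb") as f:
        upload = requests.post(f"{BASE}/upload", headers=HEADERS, data=f)
    audio_url = upload.json()["upload_url"]

    # 2. Request a transcript with speaker labels (diarization).
    job = requests.post(
        f"{BASE}/transcript",
        headers=HEADERS,
        json={"audio_url": audio_url, "speaker_labels": True},
    ).json()

    # 3. Poll until done; each word in the result carries start/end
    #    timestamps in milliseconds, plus a speaker label.
    while True:
        result = requests.get(f"{BASE}/transcript/{job['id']}",
                              headers=HEADERS).json()
        if result["status"] in ("completed", "error"):
            return result
        time.sleep(5)
```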


The pipeline ingested new files from Drive, converted and split them when needed, and submitted them for transcription. Transcripts were reassembled with adjusted timestamps, stored back into Drive, and indexed into a vector database, enabling the GPT to retrieve precise passages.
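The reassembly step is mostly timestamp arithmetic: each chunk's word timings are rebased by that chunk's offset within the original recording, so the merged transcript reads as one continuous, time-coded document. A sketch, assuming the fixed segment length from the chunking example above:

```python
def merge_transcripts(chunks: list[dict],
                      segment_seconds: int = 3600) -> list[dict]:
    """chunks: per-segment transcription results, in original order.
    Returns a flat word list with timestamps (ms) rebased onto the
    source recording's timeline."""
    merged: list[dict] = []
    for index, chunk in enumerate(chunks):
        offset_ms = index * segment_seconds * 1000
        for word in chunk.get("words", []):
            merged.append({
                "text": word["text"],
                "speaker": word.get("speaker"),  # set when diarization is on
                "start": word["start"] + offset_ms,
                "end": word["end"] + offset_ms,
            })
    return merged
```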

Impact

In an initial small-scale test, the pipeline cut turnaround from hours or days to a fully automated flow measured in minutes. It unlocked files that had previously been too large to process and enabled users to query entire video libraries through natural language.
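The case study doesn't name the vector database, so the sketch below uses a deliberately minimal in-memory stand-in to show the query path: embed time-coded transcript passages, then answer a natural-language question by cosine similarity. The embedding model, sample passages, and helper names are all assumptions for illustration, not the production setup.

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small",
                                    input=texts)
    return np.array([d.embedding for d in resp.data])

# Illustrative time-coded passages; in production these would come from
# the merged, metadata-enriched transcripts.
passages = [
    "[00:12:04] Speaker A explains the onboarding flow ...",
    "[01:03:41] Speaker B walks through the pricing model ...",
]
index = embed(passages)

def query(question: str, k: int = 1) -> list[str]:
    q = embed([question])[0]
    # Cosine similarity between the question and every indexed passage.
    scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    return [passages[i] for i in np.argsort(scores)[::-1][:k]]

print(query("How does onboarding work?"))
```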


With transcripts enriched by timestamps and metadata, the custom GPT provided more accurate and contextual answers. The workflow not only saved significant operational time but also created a foundation for future enhancements such as automated summarization and editing.


© Dominic Saraum 2026