Parallelize YouTube downloads

Problem statement

I wanted to download a certain YouTube playlist (close to 150 videos in total) for offline use.

Unfortunately yt-dlp and YouTube are in a sort of game of catch-up: YT tries to enforce trickle downloads, and yt-dlp (and youtube-dl and others) try to bypass that.

For a single video, I don’t mind waiting. But for 150 videos, this takes forever and a day:

yt-dlp \
  -f 'bestvideo[ext=mp4][vcodec!*=av01]+bestaudio[ext=m4a]/mp4' \
  "$PLAYLIST_URL"   # stand-in for the actual playlist URL

Fortunately, there is a better way.


The solution is simple:

  1. Get list of video IDs to download
  2. Parallel-download them
  3. Profit

Getting a list of video IDs to download

This is surprisingly easy:

yt-dlp --dump-json "$PLAYLIST_URL" \
  | tee vids.json
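--dump-json prints one JSON object per line, one per playlist entry, so a quick line count tells you whether the dump actually covered the whole playlist:

```shell
# One JSON object per playlist entry, so the line count should
# roughly match the playlist length (close to 150 in this case):
wc -l < vids.json
```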

(why tee? so you see progress, and don’t ^C it thinking it’s stuck)

Just the ID list, then, is1:

jq -r '[.id]|@csv' < vids.json | sed 's/"//g'
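To see what that filter does, feed it a hand-written stand-in for one --dump-json line (real dumps carry far more fields; the id below is just a sample):

```shell
# A --dump-json line boiled down to the one field we care about.
# [.id]|@csv wraps the id in CSV quotes; the sed strips them again.
echo '{"id":"dQw4w9WgXcQ","title":"Example"}' \
  | jq -r '[.id]|@csv' | sed 's/"//g'
# prints: dQw4w9WgXcQ
```

(For a single field, plain `jq -r '.id'` would do the same job without the sed, but the above matches the pipeline used below.)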

Parallel-download list of videos

With vids.json in hand, downloading the videos in parallel is quite easy. All you need is jq, sed, and xargs:

jq -r '[.id]|@csv' < vids.json | sed 's/"//g' | \
  xargs -n 1 -P 20 -I{} \
    /usr/local/bin/yt-dlp \
    -f 'bestvideo[ext=mp4][vcodec!*=av01]+bestaudio[ext=m4a]/mp4' -- {}

You could obviously run more than 20 workers in parallel… but I’m trying not to be a douche about it2.

Also note the flag terminator (--): omit that and you’ll be sorry, as many YT video IDs start with a dash.
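Both the fan-out and the dash problem are easy to see with echo standing in for yt-dlp (the ids here are made up):

```shell
# -P 20 keeps up to 20 workers busy; -I{} substitutes one id per
# invocation. The -- keeps ids like "-abc123" from being parsed as flags.
printf '%s\n' -abc123 xyz789 \
  | xargs -n 1 -P 20 -I{} echo fetching -- {}
```

With -P the output order can vary between runs; that’s the parallelism doing its job.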


Enough material to watch during those loong winter nights. :-)

Closing words

I re-ran the original yt-dlp command just to make sure my crude parallel downloader had reliably Pokemon’d them all. And yes, it surely did. \o/

If there’s a will, there’s a way. A crude way, sure. But not everything has to be production ready, yes?

Next-up: load-balancing across the IPv6 space I got assigned, to thwart possible per-IP throttling. Nah, too much work.

  1. And I’m sure there’s a better way; email me?

  2. And 20x speedup is good enough for the girls I go out with.