There's a Christmas song on YouTube that has a special place in my heart. I've been listening to it every holiday season since its release in 2011. Music has a way of taking us back to memories, and whenever I hear this song, I'm transported back in time. I used to take for granted that it would always be available to me on YouTube, but that all changed in Christmas 2021.
All of a sudden, the song was nowhere to be found. It was gone from YouTube and even from the musician's Bandcamp store. I searched for it everywhere but had no luck finding it. I even reached out to the musician but didn't hear back from him. Then I stumbled upon a tweet from someone who was also looking for the song, and the musician had responded that it was no longer available online. That was the proof. I'm sure he has a reason for unpublishing it. But for me, it was lost, and I was gutted.
Last Christmas, in 2022, a small miracle happened. I couldn't stop thinking about the song and desperately wanted to hear it again. Then, a ray of hope: I found the direct link to the YouTube video buried in an old tweet. I entered it into the Internet Archive, and voila! There it was. It played through my speakers, just like that. The Internet Archive had saved this YouTube video from 2011. I was so relieved. And when the song ended, I opened my eyes and immediately made sure to save it, so that I'll never lose it again.
I started to notice more videos disappearing from my playlists. YouTube doesn't even tell you what's been deleted, so who knows what I've lost. Maybe they were important to me. Maybe not. But that got me thinking: I don't wanna lose the stuff that means something to me, or just the stuff I like to watch from time to time.
Around the same time, Philipp wrote a post about how he downloads videos and songs to his Plex server automatically. It looked easy enough, but the best part is that it's super user-friendly. You just add videos to an unlisted playlist and they get downloaded automatically. It's seriously the easiest solution I've seen. Adding a video to a playlist takes just two clicks, whether you're on your phone or your computer.
Philipp's setup involves downloading videos to his Plex server, but for me, I just want to keep a backup of my favorite YouTube videos in case they disappear. Call it a ✨ Sicherheitskopie. ✨
As an Android user (boooooh!), I already have a Google One subscription to back up my photos, and I barely use 30% of my Google Drive storage, so it's a no-brainer for me. And if I ever wanna upgrade to 1TB in the future, the pricing is pretty reasonable.
How ethical is it to download a video from one Google server – YouTube – and upload it to Drive – just another Google server? I don't care. Google isn't scanning your personal files against Content ID and then blocking them or anything. That would be just bonkers and a scandal waiting to happen.
A few years back, I bought a Raspberry Pi, but it mostly just gathered dust in a box of random cables since then. This could be the perfect project to repurpose it: using it to automatically download and back up my favorite YouTube videos. Here's the plan:
- Mount my Google Drive as a network drive
- Download my favorite YouTube videos and save them to the network drive
- Set up an automatic schedule to regularly download new videos in my magic playlist
I will explain the details of how I've set everything up – also for me as a reference, in case I have to do it again – but maybe you're not interested in those details. No worries, let's skip the details and go straight to the end.
When I tried it out the first time, I was blown away by how fast it was. It took less than 5 minutes to download, recode, and upload a single 53-minute-long 720p video to my Google Drive. I thought it would take much longer, especially on my small Raspberry Pi.
In the next run, I downloaded 11 videos totaling 77 minutes, and it was done in 20 minutes, including recoding and uploading to my Google Drive. 20 minutes! Downloading, recoding, and uploading it again! I didn't expect this.
I wrapped the download script into a small Node.js script that sends me a Telegram message when it's done or encounters an error – which is also an idea I borrowed from Philipp. Thanks, Philipp!
Philipp handles playlist-fetching and memorizing of which videos were downloaded himself, whereas I opt for yt-dlp's built-in solution to download playlists. While it's easier for me, there are upsides to Philipp's approach as he has control over each video download step. On the other hand, I just tell yt-dlp to download whatever it thinks, and then have to parse its output to get some information from it. A tiny yikes!
Regardless of the details, I absolutely adore this solution. And what's even better is that I have this solution in place. Whenever I come across a video I want to keep, I add it to the playlist and don't worry about it again. It's very satisfying to know that my favorite videos can be safe and sound.
So, here's the deal. There's this sweet program called Rclone that can help you mount your Google Drive as a network drive on your Raspberry Pi. I followed this guide, but it was so easy to set up, that the guide wasn't really necessary. But it gave me the confidence that I'm doing the right thing.
When you use it for the first time, Rclone will ask you to set up a new remote, but this is a simple step-by-step process. The only thing that tripped me up was creating API credentials, and even though they have a good guide to walk you through it, I did it wrong the first time. Let me tell you: just make sure to configure the OAuth Consent Screen before creating the app credentials. Otherwise Google just creates a client ID but no client secret, and trust me, you'll need the client secret later on. Without the secret, it will look like it works, but it doesn't! Fun! ... I'm not blaming the guide. They explicitly tell you to first configure the OAuth Consent Screen, and I didn't do it.
At some point, you'll be asked which Google Drive permissions to grant your Pi. You can go ahead and give it full access, but I chose to limit it to read and write access for only the files it creates. That way, it can't see any of my other files on Drive. Does it matter? I don't know. It's one less attack vector. 🤷♂️
And last but not least, mount your Google Drive:
rclone mount mygdrive: ~/gdrive/ --daemon
mygdrive is just the name of your remote, which you set up earlier. ~/gdrive/ is the directory where I mounted my Drive. The --daemon flag just runs the mount in the background.
Boom! First puzzle piece completed. To make sure everything's working, I created a new file in my mount directory and it showed up on the Google Drive web interface. So far, so good!
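One thing worth keeping in mind: if the mount is ever missing (say, after a reboot), ~/gdrive is just an ordinary local folder, and anything written there ends up on the Pi instead of in Drive. Here is a minimal sketch of a check a script could run first. It assumes a Linux system with /proc/mounts, and the function name is_mounted is made up for this example:

```shell
#!/bin/sh
# Sketch: succeed only if the given absolute path is currently a mount point.
# Assumes Linux (/proc/mounts) and a path without a trailing slash.
is_mounted() {
    # Field 2 of every /proc/mounts line is the mount directory.
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' /proc/mounts
}

# Hypothetical usage before starting a download run:
# is_mounted "$HOME/gdrive" || rclone mount mygdrive: "$HOME/gdrive" --daemon
```

The usage line simply remounts with the same rclone command as above whenever the check fails.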
Side note: Installing both yt-dlp and ffmpeg on the Raspberry Pi was very quick and painless. In contrast, installing ffmpeg on my expensive MacBook M1X Pro using Homebrew takes ages because it often requires updating and installing numerous dependencies. It feels like it first compiles the Linux kernel from scratch and then mines a bitcoin.
Below is the complete command that I am using. Don't worry if it looks intimidating - I will explain each component in detail:
yt-dlp \
  -P "/home/pi/Videos" \
  -P "temp:tmp" \
  -o "../gdrive/Video/%(title)s - %(channel)s - %(id)s.%(ext)s" \
  --restrict-filenames \
  --download-archive "~/Videos/downloads_downloaded.txt" \
  -f "bestvideo[height<=1080][ext=mp4]+bestaudio[ext=m4a]/best[height<=1080]" \
  --recode-video mp4 \
  "https://www.youtube.com/playlist?list=xxx"
To begin with, we need to specify the working directory for yt-dlp. yt-dlp generates some intermediate files during the download process, and we don't want them to be synced to Google Drive via Rclone, which just slows us down. For this reason, we set the working directory to a local folder:
-P "/home/pi/Videos" \
-P "temp:tmp" \
The temp:tmp option specifies that temporary files should be stored in a separate directory called "tmp". Is this necessary? Meh, I don't think so, but it also won't hurt to have it.
Next, we define the output path for the downloaded video. As we want the downloaded video to be saved in the mounted Google Drive, we point to the relative path inside Google Drive.
-o "../gdrive/Video/%(title)s - %(channel)s - %(id)s.%(ext)s" \
According to the docs, the path must be relative; otherwise, the -P option won't work. Weird? Welp, anyways: It will yeet the downloaded video into a Video/ folder of my Google Drive, and give it a descriptive name.
- %(title)s will be replaced with the video's title
- %(channel)s with the name of the channel
- %(id)s with the ID of the video
- %(ext)s with the file extension
--restrict-filenames will remove any special characters in the video's name and replace them with underscores.
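If you're curious what the resulting names look like before committing to a download, yt-dlp can print them. The sketch below wraps that in a small function. preview_filenames and the YT_DLP variable are names I made up for this example; the options themselves are the ones from the command above plus --print filename:

```shell
#!/bin/sh
# Sketch: print the file names yt-dlp would use, without downloading anything.
# YT_DLP is a hypothetical override for this example; it defaults to the real binary.
YT_DLP="${YT_DLP:-yt-dlp}"

preview_filenames() {
    # --print filename shows the name after the -o template and
    # --restrict-filenames have been applied; printing implies a simulated run.
    "$YT_DLP" \
        --print filename \
        -o "../gdrive/Video/%(title)s - %(channel)s - %(id)s.%(ext)s" \
        --restrict-filenames \
        "$1"
}

# Hypothetical usage:
# preview_filenames "https://www.youtube.com/playlist?list=xxx"
```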
We want to execute the above command again and again, but not download the videos again and again. We only need to download them once. And of course youtube-dl (and its cool cousin yt-dlp) has a built-in solution for that: the "Download Archive". This little file stores the IDs of all the videos you've already downloaded. All you need to do is add this option to your command:
--download-archive "~/Videos/downloads_downloaded.txt" \
And voila! yt-dlp will check the archive before downloading each video and only download it if it's not already on the list. I store this file in a local directory and not inside the Drive, because when I tried a location inside the network drive, it didn't work. 🤷♂️
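For reference, the archive is a plain text file with one line per finished download, made up of the extractor name and the video ID (for YouTube that is "youtube" followed by the ID). Here is a small sketch of checking it by hand. in_archive is a hypothetical helper for illustration, not something yt-dlp ships:

```shell
#!/bin/sh
# Sketch: succeed if a YouTube video ID is already listed in a download archive.
# Assumes the "youtube <id>" line format that yt-dlp writes for YouTube videos.
in_archive() {
    # -x matches whole lines, -F treats the ID literally, -q suppresses output
    grep -qxF "youtube $1" "$2"
}

# Hypothetical usage:
# in_archive "dQw4w9WgXcQ" /home/pi/Videos/downloads_downloaded.txt && echo "already saved"
```

Deleting a line from the file is also the simple way to make yt-dlp fetch that video again on the next run.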
I don't know a lot about this part. But I asked ChatGPT to create it for me, and it just worked.
-f "bestvideo[height<=1080][ext=mp4]+bestaudio[ext=m4a]/best[height<=1080]" \
--recode-video mp4 \
My aim is to download videos with a maximum resolution of 1080p, and only as mp4. I don't trust webm as long as it isn't a commonly shared file format.
- bestvideo[height<=1080][ext=mp4]+bestaudio[ext=m4a] requests the best mp4 video stream up to 1080p, plus the best m4a audio stream to go with it. The resulting video and audio are then combined into a single file using ffmpeg. I'm trying to directly get the mp4 here, so that it doesn't even need to be recoded from webm.
- /best[height<=1080] is the fallback: if nothing matches the above criteria (probably because only a webm is available), I just want the best single file that contains both video and audio, with a maximum resolution of 1080p.
- --recode-video mp4 tells it to please recode the result to mp4 if it isn't one yet. When it already is an mp4, it will just skip the recoding.
The final part is to specify the URL of your playlist. Philipp explains this well in his post: you'll need to ensure that the playlist is set to unlisted instead of private, and that the videos are sorted from newest to oldest.
Once you've set up the playlist, the script will go through it, check which videos haven't yet been downloaded, and proceed to download them. Once the downloads (and maybe recoding) are complete, the files will be transferred to Google Drive, and it's all done.
To get this running automatically, you'll want to set up a cronjob. You can edit your cronjobs using crontab -e, and paste in something like this:
0 5,11,18,23 * * * sh /home/pi/yt-download.sh
This will run the download script every day at 5:00, 11:00, 18:00 and 23:00. Maybe too often, but I don't really think it matters. Why not? The thing is otherwise idling anyway.
I also added an extra step where I created a shell script to run yt-dlp, just to keep things neat and tidy. That way, if I need to change anything, I can just update the script instead of messing around with the crontab file.
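For illustration, such a wrapper could look roughly like the sketch below. This is not my exact script. The variable names, the log file location and the run_download function are assumptions made for this example, and the archive path is written out in full rather than with ~:

```shell
#!/bin/sh
# Hypothetical sketch of /home/pi/yt-download.sh.
# The variables below are assumptions so that the values can be overridden.
YT_DLP="${YT_DLP:-yt-dlp}"
PLAYLIST_URL="${PLAYLIST_URL:-https://www.youtube.com/playlist?list=xxx}"
LOG_FILE="${LOG_FILE:-/home/pi/Videos/yt-download.log}"

run_download() {
    # Same options as the command explained above. Output is appended to a log
    # file, because a cron job has no terminal to print to.
    "$YT_DLP" \
        -P "/home/pi/Videos" \
        -P "temp:tmp" \
        -o "../gdrive/Video/%(title)s - %(channel)s - %(id)s.%(ext)s" \
        --restrict-filenames \
        --download-archive "/home/pi/Videos/downloads_downloaded.txt" \
        -f "bestvideo[height<=1080][ext=mp4]+bestaudio[ext=m4a]/best[height<=1080]" \
        --recode-video mp4 \
        "$PLAYLIST_URL" >>"$LOG_FILE" 2>&1
}

# Start the download only when this file is run as yt-download.sh (as cron does),
# not when the functions are sourced or pasted somewhere else.
case "$0" in
    *yt-download.sh) run_download ;;
esac
```

Keeping the command in a function also means the exit status of yt-dlp becomes the exit status of the script, which is handy for anything that wants to react to failures.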
It's so satisfying to see how the pieces just work together, like gears that fit perfectly into each other. It's done, and we just used existing solutions. I didn't even have to write a single line of code.
To take it a step further, you can add more pieces to it. I'm parsing the output logs, so I can search for new downloads and trigger a Telegram notification when it finds one. But those are just neat gimmicks on top.
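If you'd rather not depend on the exact wording of yt-dlp's log output, one alternative is to compare the download archive before and after a run. To be clear, this swaps log parsing for an archive diff, so it is a different technique from the one I use. The function names and the TELEGRAM_TOKEN and TELEGRAM_CHAT_ID variables are assumptions for this sketch; the sendMessage endpoint is the regular Telegram Bot API method:

```shell
#!/bin/sh
# Sketch: detect new downloads by comparing the archive before and after a run,
# then send a Telegram message about them.

# Print archive lines that are in the second file but not in the first.
new_archive_entries() {
    # -f reads patterns from the old archive, -x and -F match whole lines
    # literally, and -v keeps only the lines that did not match.
    grep -vxFf "$1" "$2"
}

# Send a text via the Telegram Bot API's sendMessage method.
# TELEGRAM_TOKEN and TELEGRAM_CHAT_ID are assumed environment variables.
notify_telegram() {
    curl -sS "https://api.telegram.org/bot${TELEGRAM_TOKEN}/sendMessage" \
        --data-urlencode "chat_id=${TELEGRAM_CHAT_ID}" \
        --data-urlencode "text=$1" >/dev/null
}

# Hypothetical usage around a download run:
# cp /home/pi/Videos/downloads_downloaded.txt /tmp/archive.before
# sh /home/pi/yt-download.sh
# added="$(new_archive_entries /tmp/archive.before /home/pi/Videos/downloads_downloaded.txt)"
# [ -n "$added" ] && notify_telegram "New downloads: $added"
```

The message only contains video IDs this way, so parsing the logs is still the nicer option if you want titles in the notification.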