May 24, 2020 08:49 am GMT

Download a whole YouTube Playlist at one go

Have you ever been in a situation where your final exams are a week away and you haven't attended a single lecture (mindfully, of course)? Then you ask YouTube for help (happens to me every time), and what you get is a huge playlist (even larger than node_modules) to watch, with limited data or a slow connection at your place.

Or maybe you wanted to learn a new skill, language, or framework from YouTube and found a good playlist, but you have limited storage space on your phone.

Now, there are websites and applications for downloading YouTube videos.
Oops: either they download one video at a time, or, if they download a complete playlist at once, they are paid. You can download the playlist videos one by one (unless the playlist is huge). Exams are important, so you can pay for them, and if it is about learning, let it take some more time. We have our entire lives, right?

Wait a minute, you are a Python developer. Why pay for a service you can build with a few lines of code?
This post is about a simple project, "YoPlaDo - YouTube Playlist Downloader", built using Python. We will write a program that takes a YouTube playlist link, scrapes all the video links using BeautifulSoup, and downloads the videos using Pytube.

Web Scraping

Have you ever searched for and downloaded images? Or pressed Ctrl+C and Ctrl+V (if you know, you know)? Or submitted an assignment with solutions you found online? Basically, this is what scraping is.

Scraping

Collecting data, or more specifically extracting data from a website, is web scraping. Instead of doing things manually, you can automate them. That's what a web scraper does. You give it a list of things to extract, and then it goes shopping (scraping) on a website.
For example, if you need an image, it searches for img tags.
These days web scraping is used in many fields, from digital marketing to data science and AI.
So various languages provide libraries, frameworks, and tools for building a web scraper or a web crawler. Python has Beautiful Soup, Scrapy, and a few more.
In this post we will create a basic project with Beautiful Soup.

YoPlaDo

Beautiful Soup

"Beautiful Soup is a Python package for parsing HTML and XML documents. It creates a parse tree for parsed pages that can be used to extract data from HTML, which is useful for web scraping." - Wikipedia

As mentioned, the library helps you extract and use data from websites.
I will explain things in a simple way. For detailed information, you can go through the Docs.
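
To make this concrete, here is a minimal sketch of Beautiful Soup pulling data out of a page. The HTML snippet, the file names, and the variable names are made up for illustration; only BeautifulSoup, find_all, and get come from the library.

```python
from bs4 import BeautifulSoup

# A tiny, made-up HTML page standing in for a real website.
html = """
<html><body>
  <h1>Gallery</h1>
  <img src="cat.png" alt="A cat">
  <img src="dog.png" alt="A dog">
  <p>No image here.</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all("img") returns every <img> tag; get("src") reads its src attribute.
image_sources = [img.get("src") for img in soup.find_all("img")]
print(image_sources)
```

The playlist scraper later in this post follows the same pattern, only with anchor tags and href attributes instead of img tags and src.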

Pytube

"A pythonic library for downloading YouTube Videos." - Pypi

For detailed information, you can go through the Docs.

Coding Mode On

Let's get started,

We will extract the video links from a playlist (like https://www.youtube.com/playlist?list=) and download each video automatically with the program. This is just a basic project overview for beginners. You can read the documentation and make improvements.

- Import the libraries

If you want to follow along with the post, you can use "https://www.youtube.com/playlist?list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-" as the input link, since it is used in the example below. It will be easier to understand. Also make sure the playlist you use contains only downloadable videos and no deleted ones. This problem can be handled with an exception block, but to keep this post simple for beginners, the block hasn't been added.

Before importing, make sure you have installed the required libraries. You can use:

pip install pytube
pip install beautifulsoup4
pip install requests

from pytube import YouTube
import bs4
import requests

YouTube from pytube is used for working with YouTube videos.
bs4 is Beautiful Soup, used for web scraping.
requests is the Python HTTP library.

- Scraping all the video links from the playlist

playlist = []
url = input("Enter the Youtube Playlist URL : ")  # Takes the playlist link
data = requests.get(url)
soup = bs4.BeautifulSoup(data.text, 'html.parser')

First we create an empty list, "playlist", to store all the links we extract.

Next, the URL is taken as input and passed to requests.get().
requests.get() sends a GET request to the specified URL.

Then web scraping comes into play.
Put simply, the call bs4.BeautifulSoup(data.text, 'html.parser')
parses the HTML of the page so that you can extract and process the data. You can use a parser of your choice ("html.parser", "lxml", "html5lib"); each has its own advantages and disadvantages.

- Scraping (part 2)

for links in soup.find_all('a'):
    link = links.get('href')
    # Anchors without an href give None, so check before slicing
    if link and link[0:6] == "/watch":
        link = "https://www.youtube.com" + link
        playlist.append(link)
print(playlist)
"""
For example, a playlist with 6 videos
Enter the Youtube Playlist URL : https://www.youtube.com/playlist?list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-
['https://www.youtube.com/watch?v=iyL9-EE3ngk&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-',
 'https://www.youtube.com/watch?v=iyL9-EE3ngk&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-',
 'https://www.youtube.com/watch?v=iyL9-EE3ngk&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=2&t=0s',
 'https://www.youtube.com/watch?v=iyL9-EE3ngk&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=2&t=0s',
 'https://www.youtube.com/watch?v=G7E8YrOiYrQ&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=3&t=0s',
 'https://www.youtube.com/watch?v=G7E8YrOiYrQ&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=3&t=0s',
 'https://www.youtube.com/watch?v=79D4Y1cUK7I&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=4&t=0s',
 'https://www.youtube.com/watch?v=79D4Y1cUK7I&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=4&t=0s',
 'https://www.youtube.com/watch?v=MUe0FPx8kSE&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=5&t=0s',
 'https://www.youtube.com/watch?v=MUe0FPx8kSE&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=5&t=0s',
 'https://www.youtube.com/watch?v=UkpmjbHYV0Y&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=6&t=0s',
 'https://www.youtube.com/watch?v=UkpmjbHYV0Y&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=6&t=0s',
 'https://www.youtube.com/watch?v=WTOFLmB9ge0&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=7&t=0s',
 'https://www.youtube.com/watch?v=WTOFLmB9ge0&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=7&t=0s']
"""

soup.find_all('a') finds all the anchor tags in the parsed HTML.
While looping through the anchor tags, links.get('href') reads the link target of each tag.
The rest of the code is validation that keeps only the links to the videos of the selected playlist. In the output you can see duplicate and unnecessary links, which need to be removed.
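
The same filter can also be written with a CSS selector. This is only a sketch on a made-up HTML snippet, not the code from the repository, and it assumes a Beautiful Soup version where select() supports attribute selectors.

```python
from bs4 import BeautifulSoup

# Made-up markup: one anchor without href, one fragment link,
# one unrelated link, and two video links.
html = """
<a name="top">no href at all</a>
<a href="#menu">menu</a>
<a href="/watch?v=AAA&amp;list=PL123">first video</a>
<a href="/about">about</a>
<a href="/watch?v=BBB&amp;list=PL123&amp;index=2">second video</a>
"""

soup = BeautifulSoup(html, "html.parser")

# a[href^="/watch"] matches only anchors whose href starts with "/watch",
# so anchors with no href never reach the string concatenation.
video_links = [
    "https://www.youtube.com" + a["href"]
    for a in soup.select('a[href^="/watch"]')
]
print(video_links)
```

Whether you prefer the loop or the selector is a matter of taste; the selector just moves the "starts with /watch" check into the query itself.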

- Simple manipulation with the "playlist" list

del playlist[0:2]
playlist = set(playlist)
print(playlist)
"""
{'https://www.youtube.com/watch?v=79D4Y1cUK7I&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=4&t=0s',
 'https://www.youtube.com/watch?v=WTOFLmB9ge0&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=7&t=0s',
 'https://www.youtube.com/watch?v=UkpmjbHYV0Y&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=6&t=0s',
 'https://www.youtube.com/watch?v=iyL9-EE3ngk&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=2&t=0s',
 'https://www.youtube.com/watch?v=MUe0FPx8kSE&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=5&t=0s',
 'https://www.youtube.com/watch?v=G7E8YrOiYrQ&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=3&t=0s'}
"""

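This removes the extra and duplicate links: del playlist[0:2] drops the first two entries (the links without an index parameter), and set() removes the remaining duplicates.
Note: set() does not preserve the order of the videos. You can add your own code to remove the duplicate links from the list.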
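
If the video order matters, one option is to deduplicate by video ID instead of using set(). The sketch below is an alternative, not the code from the repository: unique_by_video_id is a hypothetical helper, and it assumes two links point to the same video when their "v" query parameter matches.

```python
from urllib.parse import parse_qs, urlparse


def unique_by_video_id(links):
    """Return links with duplicates removed, keeping first-seen order.

    Two links count as duplicates when their `v` query parameter matches.
    Links without a `v` parameter are skipped.
    """
    seen = {}  # dicts keep insertion order, so playlist order survives
    for link in links:
        video_id = parse_qs(urlparse(link).query).get("v", [None])[0]
        if video_id is not None and video_id not in seen:
            seen[video_id] = link
    return list(seen.values())


# A shortened version of the scraped list shown earlier in the post.
scraped = [
    "https://www.youtube.com/watch?v=iyL9-EE3ngk&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-",
    "https://www.youtube.com/watch?v=iyL9-EE3ngk&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=2&t=0s",
    "https://www.youtube.com/watch?v=G7E8YrOiYrQ&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=3&t=0s",
    "https://www.youtube.com/watch?v=G7E8YrOiYrQ&list=PLGzz7pyosmlJfx9ivigemSouoZR9uLT2-&index=3&t=0s",
]
unique_links = unique_by_video_id(scraped)
print(unique_links)
```

Because the helper keeps the first link it sees for each video, the slice deletion is no longer needed with this approach: the links without an index parameter simply win over their later duplicates.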

- Downloading Videos

vquality = input("Enter the video quality (1080,720,480,360,240,144):")
vquality = vquality + "p"
for link in playlist:
    yt = YouTube(link)
    # Keep only mp4 streams at the requested resolution
    videos = yt.streams.filter(mime_type="video/mp4", res=vquality)
    video = videos[0]  # first matching stream
    video.download("Downloads")
    print(yt.title + " - has been downloaded !!!")
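
The quality is used exactly as typed, so an input like "720p" would turn into "720pp" and match nothing. A small helper can tidy the input first. This is a sketch: normalize_quality and ALLOWED_QUALITIES are hypothetical names, and the allowed values are simply the ones listed in the prompt.

```python
ALLOWED_QUALITIES = ("1080", "720", "480", "360", "240", "144")


def normalize_quality(raw):
    """Turn user input such as ' 720 ' or '720P' into '720p'."""
    value = raw.strip().lower()
    if value.endswith("p"):
        value = value[:-1]  # accept both "720" and "720p"
    if value not in ALLOWED_QUALITIES:
        raise ValueError(f"Unsupported quality: {raw!r}")
    return value + "p"


print(normalize_quality("720"))
print(normalize_quality(" 480P "))
```

In the script, the result of normalize_quality(input(...)) could be passed as the res value in place of the manual string concatenation.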

Note: Some Pytube releases have bugs. Check the documentation for updates to the regex patterns. If you get KeyError: 'cipher', either pytube hasn't been installed properly or there is a bug. Refer to https://stackoverflow.com/questions/56382295/getting-error-with-pytube-signature-cipher-get-signaturejs-streams-key

"vquality" is used to select the video quality you want to download.

Here comes Pytube. Looping through the "playlist" list, each link is processed and downloaded using Pytube.

The "yt" object represents the video at the given link, and "yt.streams.filter()" returns the streams that match the given parameters.
"video.download("path/to/directory")" downloads the stream to the given directory.

Easy One

And with merely 30 lines of code, you saved a few dollars and gifted yourself a good project.
For more robust execution, you can add your own exception blocks and logic. I would suggest you go through the documentation.
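
As a starting point for those exception blocks, here is one possible shape. It is a sketch under assumptions, not the repository's code: download_with_pytube and download_all are hypothetical helpers, and the pytube calls mirror the ones used in this post plus first(), which may behave differently across pytube versions.

```python
def download_with_pytube(link, quality, out_dir="Downloads"):
    """Download one video with pytube and return its title (assumed wrapper)."""
    from pytube import YouTube  # imported here so the rest runs without pytube

    yt = YouTube(link)
    stream = yt.streams.filter(mime_type="video/mp4", res=quality).first()
    if stream is None:  # no mp4 stream at the requested resolution
        raise LookupError(f"No {quality} mp4 stream for {link}")
    stream.download(out_dir)
    return yt.title


def download_all(links, download_one):
    """Try every link; collect failures instead of stopping at the first one."""
    downloaded, failed = [], []
    for link in links:
        try:
            downloaded.append(download_one(link))
        except Exception as error:  # deleted, private, or unavailable videos
            failed.append((link, str(error)))
    return downloaded, failed


# Real use (needs pytube and network access):
# done, failed = download_all(
#     playlist, lambda link: download_with_pytube(link, vquality)
# )
```

Passing the downloader in as a function keeps the retry-and-report loop separate from pytube, so the loop can be tried out with a fake downloader before touching the network.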

For the packaged application, check the "dist" folder in the repository:

https://github.com/Dstri26/YoPlaDo-Youtube-Playlist-Downloader/

Happy Scraping! Happy Coding.

That's my first technical blog. Do correct me if I am wrong. <3 <3 <3


Original Link: https://dev.to/dstri26/download-a-whole-youtube-playlist-at-one-go-3331
