In this workshop you will learn even more about how to extract information from websites. Cool, right?


BeautifulSoup is a Python module that helps you parse and navigate HTML or XML documents, making it easier to extract and manipulate data from web pages. (Documentation: https://beautiful-soup-4.readthedocs.io/en/latest/)
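
To give a feel for what "parse and navigate" means in practice, here is a minimal sketch. It assumes the `beautifulsoup4` package is installed and works on a small, made-up HTML snippet (the artist names and links are placeholders, not taken from the workshop's example website).

```python
from bs4 import BeautifulSoup

# Hypothetical HTML snippet; the names and URLs are invented for illustration.
html = """
<html>
  <body>
    <h1>Artists</h1>
    <ul>
      <li class="artist"><a href="/artists/1">Ada Example</a></li>
      <li class="artist"><a href="/artists/2">Ben Sample</a></li>
    </ul>
  </body>
</html>
"""

# Parse the document with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# Navigate to a single element and read its text.
heading = soup.find("h1").get_text(strip=True)

# Select every link inside an <li class="artist"> via a CSS selector.
artists = [
    {"name": link.get_text(strip=True), "url": link["href"]}
    for link in soup.select("li.artist a")
]

print(heading)
print(artists)
```

On a real website the HTML would first be downloaded (for example with the `requests` library) and then passed to `BeautifulSoup` in the same way.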


The following topics will be covered:

BeautifulSoup:

  • Scraping data from an example website listing various artists
  • Storing movie information from the IMDb Top 250 website into Pandas DataFrames
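
As a rough preview of the second topic, the sketch below turns scraped table rows into a Pandas DataFrame. It assumes `beautifulsoup4` and `pandas` are installed. The table markup, movie titles, and column names are invented for illustration and do not reflect the actual structure of the IMDb page.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical table; real pages will use different markup.
html = """
<table id="movies">
  <tr><th>Rank</th><th>Title</th><th>Year</th></tr>
  <tr><td>1</td><td>Example Movie A</td><td>1994</td></tr>
  <tr><td>2</td><td>Example Movie B</td><td>1972</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

rows = []
for tr in soup.select("table#movies tr"):
    # The header row contains only <th> cells, so it yields no <td> values.
    cells = [td.get_text(strip=True) for td in tr.find_all("td")]
    if len(cells) == 3:
        rank, title, year = cells
        rows.append({"rank": int(rank), "title": title, "year": int(year)})

# Build the DataFrame from a list of dictionaries, one per movie.
movies = pd.DataFrame(rows, columns=["rank", "title", "year"])
print(movies)
```

Collecting one dictionary per row and building the DataFrame once at the end is usually simpler than appending to a DataFrame inside the loop.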

Requirements

  • Your own laptop
  • Internet connection
  • Google account
  • Google Colab installed in Google Drive