OUR EXPERT
Matt Holder has worked in IT support for over a decade, and is keen to utilise Linux alongside other installed systems.
Before we begin, a word of warning. Web scraping can be viewed as a negative endeavour – even a type of hacking – if what you’re trying to do is take somebody else’s intellectual property. When taking the ideas in this article further, be sure to consider any legal implications.
The web scraping that we’ll carry out in this article uses purely fictional data, so this won’t be a problem. Put simply, web scraping is a way to take information from a rendered web page, store it in variables, lists and other data types, and then use it for another purpose.
Okay, let’s get on to the article proper. In this tutorial we’ll be using Python and the lxml and Beautiful Soup modules to scrape information from a website. As previously stated, the data used is purely fictional, but once the concepts have been learned they can be applied to many different purposes. Once we’ve scraped data from the web page, we’ll use it to calculate some statistics.
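To give a feel for that workflow, here is a minimal sketch rather than the tutorial's own code. It assumes Beautiful Soup is installed, and it parses a small invented HTML string instead of fetching a real page. The table, its `scores` id, the names and the numbers are all hypothetical.

```python
from statistics import mean

from bs4 import BeautifulSoup

# Hypothetical, made-up page standing in for a real website.
HTML = """
<html><body>
  <table id="scores">
    <tr><th>Name</th><th>Score</th></tr>
    <tr><td>Alice</td><td>72</td></tr>
    <tr><td>Bob</td><td>85</td></tr>
    <tr><td>Carol</td><td>65</td></tr>
  </table>
</body></html>
"""

# "html.parser" ships with Python; "lxml" can be passed instead if it is installed.
soup = BeautifulSoup(HTML, "html.parser")

names = []
scores = []

# Walk every table row except the header and store the cell text in lists.
for row in soup.find("table", id="scores").find_all("tr")[1:]:
    cells = row.find_all("td")
    names.append(cells[0].get_text(strip=True))
    scores.append(int(cells[1].get_text(strip=True)))

# Use the scraped values for another purpose: some simple statistics.
average = mean(scores)
highest = max(scores)
print(names, scores, average, highest)
```

The same pattern should carry over to a live page: only the source of the HTML string changes, while the parsing and the statistics stay the same.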
X marks the spot
The first concept to