
Linux Format

Website and RSS feed Python scraping

OUR EXPERT

Matt Holder has worked in IT support for over a decade, and is keen to utilise Linux alongside other installed systems.

Before we begin, a word of warning. Web scraping can be viewed as a negative endeavour, even a type of hacking, if what you’re trying to do is take somebody else’s intellectual property. When taking the ideas in this article further, be sure to consider any legal implications.

The web scraping that we’ll carry out in this article uses purely fictional data, so this won’t be a problem. Put simply, web scraping is a way to take information from a rendered web page, store it in variables, lists and other data types, and then use it for another purpose.

Okay, let’s get on to the article proper. In this tutorial we’ll be using Python and the lxml and Beautiful Soup modules to scrape information from a website. As previously stated, the data used is purely fictional, but once the concepts have been learned they can be applied to many different purposes. Once we’ve scraped data from the web page, we’ll use it to calculate some statistics.
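To make the overall idea concrete before going further, here is a minimal sketch of the scrape-then-calculate workflow. It is not the code from this tutorial: so that it runs without installing anything, it uses the standard library's html.parser in place of lxml and Beautiful Soup, and the page content, the price class name and the PriceParser and scrape_prices helpers are all made up for illustration.

```python
from html.parser import HTMLParser
from statistics import mean

# Hypothetical, entirely fictional page content standing in for a downloaded page.
SAMPLE_HTML = """
<table>
  <tr><th>Item</th><th>Price</th></tr>
  <tr><td>Widget</td><td class="price">4.50</td></tr>
  <tr><td>Gadget</td><td class="price">12.00</td></tr>
  <tr><td>Doohickey</td><td class="price">7.50</td></tr>
</table>
"""


class PriceParser(HTMLParser):
    """Collect the text of every <td class="price"> cell as a float."""

    def __init__(self):
        super().__init__()
        self.in_price_cell = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        if tag == "td" and dict(attrs).get("class") == "price":
            self.in_price_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_price_cell = False

    def handle_data(self, data):
        text = data.strip()
        if self.in_price_cell and text:
            self.prices.append(float(text))


def scrape_prices(html):
    """Return the list of prices found in the given HTML string."""
    parser = PriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices


prices = scrape_prices(SAMPLE_HTML)
print("Prices:", prices)
print("Count:", len(prices), "Mean:", mean(prices), "Max:", max(prices))
```

The scraped values end up in an ordinary Python list, so any further calculation is plain Python from that point on. With Beautiful Soup the parser class would typically shrink to a one-line search for the matching cells, but the flow of fetch, extract, store and calculate stays the same.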

X marks the spot

The first concept to
