Idea for teaching:
If each student sets up a free GitHub account, they can make Gists, like this:
Could these be used for peer grading? Easy to share.
Once a student has a GitHub account, they can write code in CodePen (http://codepen.io/) and automatically save it to a Gist from there.
Glen McGregor (national affairs reporter with the Ottawa Citizen newspaper) wrote a helpful article about scraping for journalists:
But the most effective approach to real web scraping is to write your own custom scripts. Often, these are the only way to extract data from online databases that require user input, such as the vehicle recalls list or restaurant inspections site.
To do this, you will need to learn a little bit of computer programming in a language such as Python, Ruby, Perl or PHP. You only need to choose one.
Python, named after Monty Python, not the snake, is my favourite for its simple syntax and great online support from Pythonistas. Ruby is also popular with data journalists. …
A program to scrape the vehicle recalls database would submit search terms to the Transport website, one for each vehicle make on a list. It would capture the list of links the web server returns; then another part of the program would open each of those links, read the data, strip out all the HTML tags, and save the good stuff to a file.
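The "capture links, then strip tags" stage described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the page HTML and the `/recall/...` URLs are hypothetical stand-ins for what the real recalls site would return, not its actual markup.

```python
# A sketch of the link-capturing and tag-stripping stage of a scraper.
# The sample HTML below is invented for illustration; a real script
# would fetch each results page from the web server instead.
from html.parser import HTMLParser

class LinkAndTextExtractor(HTMLParser):
    """Collects href links and visible text, discarding all HTML tags."""
    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())

def parse_results_page(html):
    """Return (list of links to follow, tag-stripped text to save)."""
    parser = LinkAndTextExtractor()
    parser.feed(html)
    return parser.links, " ".join(parser.text_parts)

# Hypothetical results page for one vehicle make.
sample_results = """
<html><body>
  <h1>Recalls for ACME</h1>
  <a href="/recall/1001">Recall 1001</a>
  <a href="/recall/1002">Recall 1002</a>
</body></html>
"""

links, text = parse_results_page(sample_results)
print(links)  # the detail pages the next stage of the script would open
print(text)   # the tag-stripped "good stuff" to save to a file
```

In a full scraper, the loop over vehicle makes would call `parse_results_page` on each results page, then fetch every captured link and run the same tag-stripping pass on the detail pages before writing the rows to a single file.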
Depending on the number of records and the speed of the server, it might take hours to run the program and assemble all the data in a single file. (For journalists not inclined to learn a computer language, Scraperwiki.com brings together programmers with people who need scraping work done.)