How to write software that scrapes music off bandcamp and downloads it?

I don't wanna scrape anything off bandcamp, I just want to learn how these things work.

learn 2 regex

if you want example code look at flexget

look at bandcamp source code. Find where the link to the music file is and wget that file.
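A rough Python sketch of the "find the link in the source" step. It assumes the audio URL sits in a plain href/src attribute ending in .mp3/.ogg/.flac, which is a guess and not something checked against bandcamp. The sample page, the example.com URLs and the names AudioLinkFinder / find_audio_links are all made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AudioLinkFinder(HTMLParser):
    """Collects href/src attribute values that look like audio files."""

    AUDIO_EXTENSIONS = (".mp3", ".ogg", ".flac")  # assumed extensions

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Ignore any query string when checking the extension.
                path = value.split("?", 1)[0].lower()
                if path.endswith(self.AUDIO_EXTENSIONS):
                    # Turn relative links into absolute ones.
                    self.links.append(urljoin(self.base_url, value))


def find_audio_links(html, base_url):
    finder = AudioLinkFinder(base_url)
    finder.feed(html)
    return finder.links


# Made-up page; a real one would be fetched first.
SAMPLE_HTML = """
<html><body>
  <a href="/about">About</a>
  <audio src="/files/track01.mp3?token=abc"></audio>
  <a href="https://cdn.example.com/track02.ogg">Track 2</a>
</body></html>
"""

links = find_audio_links(SAMPLE_HTML, "https://example.com/album")
print(links)
```

Each URL in the resulting list could then be handed to wget or any other downloader. If the site builds the link with JavaScript instead, this finds nothing and you need a different approach.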

github.com/Otiel/BandcampDownloader

such a sexy boy

install gentoo

Fiddle around the page to see how it works. Sometimes it can be as easy as recursive wget (ignore robots), others you might have to code some logic on http requests, and sometimes (depending on how much of a cunt the webdev is), you might have to emulate a web browser with something like phantomjs.
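For the middle case ("code some logic on http requests"), the download step itself might look roughly like this in Python with only the standard library. The function name download and the User-Agent string are made up; whether a given site needs extra headers or cookies is something you have to find out by poking at it.

```python
import shutil
import urllib.request


def download(url, dest_path, user_agent="example-scraper/0.1"):
    """Stream the resource at `url` into the file `dest_path`."""
    # The User-Agent value here is an arbitrary placeholder.
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response, open(dest_path, "wb") as out:
        # Copy in chunks so large files are not held in memory.
        shutil.copyfileobj(response, out)
    return dest_path
```

Usage would be something like download(link, "track01.mp3") for each link you dug out of the page.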

>Parsing html with regex

what would you use instead of regex?

This, wget is very powerful if you know how to use it.

Well, how about a proper parser?

You can probably parse html with regex, but chances are that you're doing it wrong and working at least twice as hard. I certainly wouldn't recommend it.

>I certainly wouldn't recommend it.
so, how would you do it?

>Well, how about a proper parser?
way to sidestep the question. how would you do it?

Fetch the html, use vim to rip out all the relevant links, and forward them to a shell script that downloads them.

Check out BAS - Browser automation studio. Dunno if you can make it download music. But it is the easiest way to go when you have no coding skill and need web automation. And it's completely FREE.

>Google is your friend
If you don't know shit about technology, why do you come to Sup Forums?

>a proper parser
>'just do it all manually! that is what I would do'
the whole point is to automate the process.

Vim is not an ordinary text editor. You can run a Vim macro from a bash shell and it will do the work for you.

With a proper XML parser and xpath expressions.
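Something like this, as a sketch. It uses Python's xml.etree.ElementTree, which only supports a small subset of XPath and only accepts well-formed XML, so real-world HTML usually needs a more forgiving parser (lxml is a common choice). The markup and the class names tracklist / track are invented for the example.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed sample markup.
SAMPLE = """
<html>
  <body>
    <div class="tracklist">
      <a class="track" href="/track/one">One</a>
      <a class="track" href="/track/two">Two</a>
    </div>
    <a class="nav" href="/about">About</a>
  </body>
</html>
"""

root = ET.fromstring(SAMPLE.strip())

# Select only <a class="track"> elements directly inside the tracklist div.
track_links = root.findall(".//div[@class='tracklist']/a[@class='track']")
track_urls = [a.get("href") for a in track_links]
print(track_urls)
```

The point is that the expression describes where the links live in the document tree, so it keeps working when whitespace, quoting or attribute order changes.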

read the source code of soundscrape and you'll have a pretty good idea

My favorite answer on the entire site

I don't know the bandcamp website, but I build web scrapers with python+beautifulsoup.
When I need javascript, I use python+selenium

Not him, but XPath is meant to do that. You shouldn't try to parse HTML with regex
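To make the regex complaint concrete, here is a small made-up comparison in Python. The HTML snippet, the regex and the class name TrackLinks are all invented for illustration; the regex is deliberately the naive kind people tend to write first.

```python
import re
from html.parser import HTMLParser

# Invented snippet: two real track links and one commented-out link.
HTML = """
<a href="/track/one" class="track">One</a>
<a class='track' href='/track/two'>Two</a>
<!-- <a href="/track/old" class="track">Old</a> -->
"""

# Naive regex: assumes double quotes and href as the first attribute.
naive = re.findall(r'<a href="([^"]+)"', HTML)


class TrackLinks(HTMLParser):
    """Collects href values of <a class="track"> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "track":
            self.hrefs.append(attrs.get("href"))


parser = TrackLinks()
parser.feed(HTML)

print(naive)         # misses track two, picks up the commented-out link
print(parser.hrefs)  # both real links, comment ignored
```

You can keep patching the regex for quotes, attribute order and comments, but at that point you are writing a bad parser by hand.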