"Hacking"

I'm trying to get a textbook; the only copy I found online was on a Chinese site with a download paywall (but free viewing).

What I've done so far:
I inspected the source and wrote a script (in pic) to generate 500+ links, one for every page.

What I think I need to do:
Automate Chrome to do this for me: load each page, wait 5-10 seconds for it to load, then use the "Print to PDF" functionality and number each file 1.pdf, 2.pdf, etc. Only, I don't know how to do this (I don't know JS; if I need it, what would I need to learn?)
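Or maybe headless Chrome can print to PDF straight from the command line, no JS needed? A rough, untested sketch (no idea if it will render the Flash viewer, and the binary may be chrome/chromium/google-chrome depending on the system):

# untested: headless Chrome can print a URL to PDF from the CLI
# --virtual-time-budget gives the page roughly 10 s to load before printing
for i in {1..500}
do
    google-chrome --headless --disable-gpu \
        --virtual-time-budget=10000 \
        --print-to-pdf="$i.pdf" \
        "http://docread.mbalib.com/read/768f505c1c5ed5f21c5bc0874b711328?num=$i&code=35bd670302aea685074b3f8e29015d52&max=0"
done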

That's what I think I need to do. Any other ideas, maybe?

Both links:
Source of pages (replace the number after num= to get the desired page): docread.mbalib.com/read/768f505c1c5ed5f21c5bc0874b711328?num=1&code=35bd670302aea685074b3f8e29015d52&max=0

Original site: doc.mbalib.com/view/768f505c1c5ed5f21c5bc0874b711328.html


If you can get it in plain text or HTML, why not use something like Pandoc to convert it to PDF?
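Something like this, assuming pandoc plus a LaTeX engine is installed and page.html is whatever you saved:

pandoc page.html -o page.pdf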

Can't check the original site, it doesn't load for me.

It's not plaintext, it's something weird... (Flash)

Also, the second link should work.

Tried running it; seems like I'll have to clean it up manually before running it again:

Package inputenc Error: Unicode char 智 (U+667A)
(inputenc) not set up for use with LaTeX.
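Maybe switching pandoc to xelatex with a CJK font would get around the inputenc error instead of stripping the characters by hand. Untested; on older pandoc versions the option is --latex-engine, and the font name here is just an example of one that handles Chinese (it has to be installed):

pandoc page.html --pdf-engine=xelatex -V CJKmainfont="Noto Sans CJK SC" -o page.pdf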

Perhaps Flash to HTML5 to PDF? There should be some conversion tools around.

Okay, I cleaned it up and it kinda works. The only thing is that when you save the source HTML, Google Chrome doesn't save everything.

The content is saved in "unnamed" files (with no extension), and Chrome only saves like 7 or 8 of them.

How do I download everything after waiting for it all to load?
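One thing to try (untested): wget can pull a page plus everything it statically references, though it probably won't catch whatever the Flash viewer fetches at runtime. For those, copying the request URLs out of the Network tab in Chrome's DevTools and wget-ing them directly might be the only way.

# -p grabs page requisites, -k rewrites links for local viewing, -E fixes extensions
wget -p -k -E "http://docread.mbalib.com/read/768f505c1c5ed5f21c5bc0874b711328?num=1&code=35bd670302aea685074b3f8e29015d52&max=0"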


I wonder if Gnash could render these SWFs correctly; that would save a lot of time...

Why do you want the textbook split into 500 different PDFs? Why not just one PDF with 500 pages?

I guess there are separate programs that can append them together later...
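If the pages do end up as 1.pdf ... 500.pdf, merging should be a one-liner with something like pdfunite from poppler-utils (pdftk or Ghostscript would work too); bash brace expansion keeps them in page order:

pdfunite {1..500}.pdf book.pdf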

In bash:

for i in {1..500}
do
    # save each page under a numbered name instead of the raw query-string filename
    wget -O "$i" "http://docread.mbalib.com/read/768f505c1c5ed5f21c5bc0874b711328?num=$i&code=35bd670302aea685074b3f8e29015d52&max=0"
done

I think that should grab the whole page. It's behind that Flash embed, but I think it's just the PDF content, which you should be able to merge with some PDF tool.

I'm not on a Linux box, so I can't test it though.

That way you'll end up with 500 SWFs, which is pretty disgusting. We just need a way to render them to PDF.
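One possible route, untested: swftools' swfrender can rasterize an SWF to PNG, and ImageMagick can stack the PNGs into a single PDF. Assuming the pages got saved as 1.swf ... 500.swf:

# render each page's SWF to a PNG image
for i in {1..500}
do
    swfrender "$i.swf" -o "$i.png"
done
# bundle all page images into one PDF, in page order
convert {1..500}.png book.pdf

Text won't be selectable that way since it's rasterized, but at least it's one file.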