Old 04-04-2018, 04:34 PM   #1
saint825xtc
Junior Member
 
Join Date: Aug 2016
Posts: 6
Thanks: 5
Thanked 16 Times in 5 Posts
How to pull images from this site. Traditional methods don't work.

http://online.pubhtml5.com/vfof/guzu/#p=1

So I've tried all the methods that I'm aware of for bulk download but none of them have worked. I had to use Chrome developer view to find the source files and then click and save one at a time.

Using a bulk image downloader extension didn't work. It wasn't able to see any of the image files.

Any ideas?


Old 04-04-2018, 05:37 PM   #2
halvar
Journeyman
 
 
Join Date: Jan 2008
Location: HH
Posts: 934
Thanks: 56,605
Thanked 12,981 Times in 926 Posts

In cases like this I use wget, a command-line tool available for practically every operating system.

Using bash it can be invoked in a for loop:

HTML Code:
for x in $(seq 1 148); do wget http://online.pubhtml5.com/vfof/guzu/files/large/$x.jpg; done
If you don't know how to write a for loop, you can easily create a Windows batch (.bat) file, using a tool like Excel to generate one line for every single image:

HTML Code:
wget http://online.pubhtml5.com/vfof/guzu/files/large/1.jpg
wget http://online.pubhtml5.com/vfof/guzu/files/large/2.jpg
wget http://online.pubhtml5.com/vfof/guzu/files/large/3.jpg
....
wget http://online.pubhtml5.com/vfof/guzu/files/large/148.jpg
Execute the .bat file and it will download them all.

Maybe there are also tools with a nice GUI out there to do this.

Addendum: I forgot the -i parameter. You can create a text file containing URLs, one per line, and download them all with
HTML Code:
wget -i myurls.txt
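
If you would rather not build the list by hand, the same kind of for loop can write the file for you. This is only a minimal sketch, assuming the URL pattern and the 148 pages from the loop above; the variable name base is just for illustration.

```shell
#!/usr/bin/env bash
# Sketch: write one image URL per line into myurls.txt.
# Assumes the URL pattern and page count (148) used earlier in this post.
base="http://online.pubhtml5.com/vfof/guzu/files/large"

for x in $(seq 1 148); do
  # printf fills in the base URL and the current number, then adds a newline.
  printf '%s/%s.jpg\n' "$base" "$x"
done > myurls.txt

# Show the first and last entries as a quick sanity check.
head -n 1 myurls.txt
tail -n 1 myurls.txt
```

After that, wget -i myurls.txt reads the file and fetches each URL in turn.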

Last edited by halvar; 04-04-2018 at 05:41 PM.. Reason: forgot -i
Old 04-04-2018, 05:45 PM   #3
saint825xtc

Instead of a Windows .bat file, is there a Mac alternative?
Old 04-04-2018, 05:55 PM   #4
halvar

Mac is even easier, since you have a bash terminal and curl!

Run Terminal and type
HTML Code:
curl --version
to test if you have curl installed.

If it is installed then execute
HTML Code:
for x in $(seq 1 148); do curl -o $x.jpg http://online.pubhtml5.com/vfof/guzu/files/large/$x.jpg; done
I am doing this with Linux, but it should work on Mac too.
Old 04-04-2018, 07:31 PM   #5
saint825xtc

That worked great Halvar!

Just so I can educate myself, would you mind translating that code a bit? It looks like you are telling it, "whenever you see 'x' after $, write a sequential number starting at 1 and ending at 148."

do curl -o (is this one command or is this 2 different sections? ie. do curl and -o)

Then you use the URL but substitute the page number with "$x".

I know some basic HTML/CSS, but that's about the extent of my coding knowledge.

Would there be a way to do this if each image had a name and not a number?
Old 04-04-2018, 08:06 PM   #6
halvar

HTML Code:
for x in $(seq 1 148); do
means: run the following command once for each number from 1 to 148 (148 times).

This is because
Code:
seq 1 148
prints the numbers from 1 to 148, one per line. You can execute "seq 1 5" in Terminal to get the idea.

$x is a placeholder for the current value.

Code:
curl -o $x.jpg http://online.pubhtml5.com/vfof/guzu/files/large/$x.jpg;
is one command which is terminated by ";"
The "-o $x.jpg" is an option meaning save this as 1.jpg, 2.jpg and so on.

A simpler example, printing the numbers 3 to 5:
Code:
for foo in $(seq 3 5); do echo $foo; done;
Sometimes you need leading zeros:
Code:
for foo in $(seq -w 3 10); do echo $foo; done;
You can provide a list of names, but you have to spell them out:
Code:
for v in foo bar "foo bar"; do echo ${v}; done;
* If a value contains blanks it has to be quoted ("foo bar")
* Sometimes the placeholder has to be in curly braces ${v}

Just toy around a bit to get the hang of it. I am not a bash scripting expert myself, but I often find it rather useful. Here is a very good guide: http://tldp.org/LDP/Bash-Beginners-Guide/html/
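
To make the "names instead of numbers" case concrete, here is a rough sketch that keeps the names in a bash array and only prints the URLs it would fetch. The names and the example.com base URL are made up for illustration; nothing is downloaded.

```shell
#!/usr/bin/env bash
# Sketch: loop over file names instead of numbers.
# The base URL and the names are hypothetical placeholders.
base="http://example.com/gallery"
names=("cover" "index" "back")

list_named_urls() {
  local name
  # "${names[@]}" expands to each array element as a separate value.
  for name in "${names[@]}"; do
    printf '%s/%s.jpg\n' "$base" "$name"
  done
}

# Dry run: print the URLs rather than fetching them.
list_named_urls
```

Once the printed URLs look right, the printf line could be swapped for a curl -o "$name.jpg" call like the one in the numbered loop.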
Old 04-04-2018, 08:25 PM   #7
saint825xtc

That is super helpful Halvar. I appreciate it.

Do you have any ideas on how to pull the image files from these 2 sites? I couldn't find the source location of the images.

http://magzus.com/read/penthouse_let...pril_2017_usa/
This one has an option for "reading online" which opens up a frame and allows you to flip through the pages.


This other site I was able to find the source but it doesn't appear to be in sequential order and the file name structure seems to change. I'm not sure how to handle this one either.
http://openbook.hbgusa.com/openbook/9781455531356
Old 04-05-2018, 06:09 AM   #8
halvar

Quote:
Originally Posted by saint825xtc View Post
Do you have any ideas on how to pull the image files from these 2 sites. I couldn't find the source location of the images.
Sorry, I can't help here. The first site seems to use Adobe Flash, which I don't have. If Flash is used, you won't find the images in the HTML page; Flash bypasses the browser.

On the second, only the first couple of pages are available without buying, and what is visible is text, not images.

This does not surprise me; sites usually know how to protect their stuff.
Old 05-30-2018, 04:03 AM   #9
deepsepia
Widget Flummoxer
 
 
Join Date: Jul 2007
Location: Upper left corner
Posts: 4,026
Thanks: 17,634
Thanked 34,526 Times in 4,011 Posts

Quote:
Originally Posted by saint825xtc View Post
That is super helpful Halvar. I appreciate it.

Do you have any ideas on how to pull the image files from these 2 sites. I couldn't find the source location of the images.

http://magzus.com/read/penthouse_let...pril_2017_usa/
This one has an option for "reading online" which opens up a frame and allows you to flip through the pages.
Oh, these are fun. Not all of them work the same way, but this one can be cracked.

What you're going to want to do is turn on the "Developer Tools" option (I'm using Firefox) and then take a look at the GET requests listed under the "cached media" tab; there you can see the plain URLs of the images. These tools in Firefox (other browsers have similar ones, and they all work much the same way) are incredibly powerful: they let you watch as a particular webpage goes back to a server for graphics resources. Since the aim of the designers is to make this hard to do, there are a lot of wrinkles.



So that's what I'm seeing when I load that page. It may be hard to see, but I've selected the "Storage" tab (it's in blue because it's selected), so what I'm looking at are the GETs that this page does to store files locally on my machine, which include the URLs of all the actual JPGs of the pages in the magazine.

Notice that I've got the URLs to images, like this

Code:
http://image.issuu.com/170228092429-a44baae32e0c0ec0323085902a9faef1/jpg/page_17.jpg

and notice that the pattern


hXXp://image.issuu.com/170228092429-a44baae32e0c0ec0323085902a9faef1/jpg/page_ [somenumber.jpg]

is repeated for all the pages, so you can use a curl loop as halvar illustrated above and grab all the pages by iterating [somenumber] from 1 to the highest page number.

Copy and paste this into Terminal on a Mac (with curl), replacing 148 with the highest page number:

Code:
for x in $(seq 1 148); do curl -o $x.jpg http://image.issuu.com/170228092429-a44baae32e0c0ec0323085902a9faef1/jpg/page_$x.jpg; done

Last edited by deepsepia; 05-30-2018 at 04:59 AM..
Old 05-30-2018, 07:49 PM   #10
deepsepia

Quote:
Originally Posted by saint825xtc View Post
This other site I was able to find the source but it doesn't appear to be in sequential order and the file name structure seems to change. I'm not sure how to handle this one either.
http://openbook.hbgusa.com/openbook/9781455531356
Using the same techniques that I used above, what you see when you look at the GET requests is that this is a bunch of text blocks that get styled with Cascading Style Sheets to format them as a book.

So you get text that's coming in from a URL like:
http://openbook.hbgusa.com/openbook/...apter001.xhtml

. . . and you can use the same "fusking" trick that I used with the JPGs above: just plug it into halvar's curl code so that you iterate through the chapters, e.g.

..../chapter001.xhtml
..../chapter002.xhtml

. . . and so on.
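
Note that the chapter numbers are padded to three digits (001, 002), so a plain counter won't match. One way to build such names, sketched here with printf's %03d format, is shown below. The base URL is a hypothetical placeholder, because the real path is truncated above, and the script only prints the URLs rather than fetching anything.

```shell
#!/usr/bin/env bash
# Sketch: build zero-padded chapter URLs such as .../chapter001.xhtml.
# base is a hypothetical placeholder, not the real (truncated) path.
base="http://example.com/book"

chapter_urls() {
  local last=$1 n
  for n in $(seq 1 "$last"); do
    # %03d pads the number with leading zeros to three digits.
    printf '%s/chapter%03d.xhtml\n' "$base" "$n"
  done
}

# Dry run for the first three chapters.
chapter_urls 3
```

The highest chapter number is something you would have to find out by looking at the site; the 3 here is only an example.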

You'll then have some work to do if you want to format these the way they were in the original . . . you need to pair these downloaded resources with the stylesheet they were using on the site, which is, I think
http://openbook.hbgusa.com/openbook/...stylesheet.css

. . . but I haven't checked that. In general these CSS pages have a lot of similar looking files, and it takes a bit of trial and error to identify which parts of the puzzle go where.

But it's kinda fun. It is _not_ black-belt hacking by any means, not really "hacking" at all; all you're doing is saving stuff that the site is pushing to your machine. But you can get a lot done just by poking around the guts of a website. Lots of sites disable right-click, for example, but you can pretty much always find the resource they're hiding in the GETs.

The same is true of a thumbnail gallery that something like Imagehost Grabber can't resolve: open the page and look through the Developer Tools Inspector to see just what gets called.

Last edited by deepsepia; 05-30-2018 at 09:49 PM..