June 20th, 2017, 10:15 AM | #121 | |
Vintage Idiot
Join Date: Feb 2012
Location: History
Posts: 22,129
Thanks: 226,701
Thanked 356,685 Times in 21,624 Posts
|
Quote:
That is to say, if you call that constructor and pass in a value (in the appropriate positional argument) for filterPoster, then filterPoster will have that value, e.g. "andw" if that's what you passed it.
|
The Following 10 Users Say Thank You to effCup For This Useful Post: |
June 20th, 2017, 11:06 AM | #122 | |
Sunny Mod
Join Date: Jan 2016
Posts: 5,511
Thanks: 48,469
Thanked 53,317 Times in 5,482 Posts
|
Quote:
But I tend to reduce the batch size to 5, max. 10 threads, now that the timeout error is appearing more often. My problem is that in my last 30-thread batch, for example (each thread has only five posts), I got 92 timeouts, which means I have to go to each post, check it and save the thumbnail manually. Unfortunately the thumb isn't saved if the timeout error occurs. I still hope that the timeout problem will disappear, because there are still c. 4500 threads to go in MIR.
|
The Following 11 Users Say Thank You to deezer For This Useful Post: |
June 20th, 2017, 11:48 AM | #123 |
Vintage Idiot
|
If anyone's still having trouble running halvar's tool, please note that you'll need this:
here.
The Following 11 Users Say Thank You to effCup For This Useful Post: |
June 20th, 2017, 03:05 PM | #124 | |
Vintage Idiot
|
Quote:
I don't wish to complicate the program but it would be very helpful if it could check for existing filenames prior to writing/downloading them, and append, say, _00N before the dot-suffix--where N is an incremental counter/value? |
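A minimal sketch of that renaming rule, not halvar's actual code: the class and method names here are made up for illustration, and three-digit zero padding is one reading of the "_00N" pattern. It returns the requested name if it is free, and otherwise inserts _001, _002, ... before the dot-suffix until it finds an unused one.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Predicate;

// Hypothetical helper; names are illustrative and not taken from imagerescue.
final class UniqueNames {

    /** Returns desired if it is free, otherwise stem_001.ext, stem_002.ext, ... */
    static String uniqueName(String desired, Predicate<String> taken) {
        if (!taken.test(desired)) {
            return desired;
        }
        int dot = desired.lastIndexOf('.');
        // No dot, or only a leading dot, means there is no suffix to preserve.
        String stem = dot > 0 ? desired.substring(0, dot) : desired;
        String ext = dot > 0 ? desired.substring(dot) : "";
        for (int n = 1; ; n++) {
            String candidate = String.format("%s_%03d%s", stem, n, ext);
            if (!taken.test(candidate)) {
                return candidate;
            }
        }
    }

    /** Same rule, checked against the files that already exist in dir. */
    static Path uniquePath(Path dir, String desired) {
        return dir.resolve(uniqueName(desired, name -> Files.exists(dir.resolve(name))));
    }
}
```

One caveat: checking and then writing is not atomic, so if several workers save into the same folder at once, two of them could still pick the same free name. Some form of locking per folder would be needed to rule that out.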
|
The Following 11 Users Say Thank You to effCup For This Useful Post: |
June 20th, 2017, 06:00 PM | #125 | |
Blocked!
Join Date: Jan 2008
Location: HH
Posts: 1,963
Thanks: 115,040
Thanked 32,801 Times in 1,955 Posts
|
Quote:
The arguments would then be:
thread-id user password page-from page-upto storage-path poster
The more arguments we get, the more complicated it gets... I will look into it.
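To illustrate how that positional order could be read, here is a rough sketch. The class name, the field names and the idea that poster is an optional trailing argument are all assumptions made for the example, not how the tool actually does it.

```java
// Hypothetical argument holder; only the argument order comes from the post above.
final class RescueArgs {
    final String threadId;
    final String user;
    final String password;
    final int pageFrom;
    final int pageUpto;
    final String storagePath;
    final String poster; // null when no poster filter was given (assumed optional)

    private RescueArgs(String[] a) {
        threadId = a[0];
        user = a[1];
        password = a[2];
        pageFrom = Integer.parseInt(a[3]);
        pageUpto = Integer.parseInt(a[4]);
        storagePath = a[5];
        poster = a.length > 6 ? a[6] : null;
    }

    /** Accepts six mandatory arguments plus an optional seventh (poster). */
    static RescueArgs parse(String[] a) {
        if (a.length < 6 || a.length > 7) {
            throw new IllegalArgumentException(
                "usage: thread-id user password page-from page-upto storage-path [poster]");
        }
        return new RescueArgs(a);
    }
}
```

With purely positional arguments, every optional value has to go at the end, which is one reason the list gets harder to extend as it grows.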
|
June 20th, 2017, 06:06 PM | #126 | |
Blocked!
|
Quote:
I will try to come up with something. |
|
June 20th, 2017, 09:47 PM | #127 |
Blocked!
|
version 1.23 of imagerescue released
https://1fichier.com/?egemhhxsm6
Setting lower values results in aborting downloads that would have worked but took too long. These are the three timeout values that the Apache HTTP components use. The difference between them and their meaning are not clear to me. I would tend to set a lower value for the connect timeout and higher values for socket and request (20, 60, 60). But I am not sure tweaking these values is efficient. I would rather lose one download that takes 60 seconds to complete than the 5 downloads that would go through in the same time.
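On what the three values mean, going by the Apache HttpClient 4.x documentation: the connect timeout limits how long establishing the connection may take, the socket timeout limits the silence between two data packets once connected (not the total download time), and the connection request timeout limits the wait for a free connection from the client's own connection pool. The sketch below is not the tool's code. It swaps in the JDK's HttpURLConnection so that it runs without extra libraries, takes the values in seconds (an assumption), and uses made-up class and method names. The JDK's connect and read timeouts play the roles of the first two values; there is no counterpart to the third because no pool is involved.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper using the JDK client as a stand-in for Apache HttpClient.
final class TimeoutSketch {

    /** Prepares (but does not yet open) a connection with both timeouts set, given in seconds. */
    static HttpURLConnection open(String url, int connectSeconds, int readSeconds) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(connectSeconds * 1000); // give up if no connection by then
            conn.setReadTimeout(readSeconds * 1000);       // give up if no data arrives for this long
            return conn;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Since the read (socket) timeout only counts idle time, a slow but steady download that needs more than 60 seconds in total would not be cut off by a 60 second value; only a stalled one would.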
June 21st, 2017, 03:13 AM | #128 | |
Former Staff
Join Date: Jun 2007
Location: Germany
Posts: 11,875
Thanks: 19,210
Thanked 570,912 Times in 11,033 Posts
|
Quote:
"thumb_[file name]_1.gif/jpg" pics - all of them had different file names originally. And a minor request: would it be possible to implement some kind of progress bar or a percentage of work done (xy % processed) or the like in the GUI of your tool? It seems that version 1.23 is working noticeably slower than prior versions - on my PC (Windows 10), at least - I filled in only the mandatory fields, just one thread ID, no filter used. I simply wished to get an idea of how long it will take until the download is finished. It feels a bit like being caught in an endless loop. Not complaining, just reporting! Your efforts are great, halvar!
|
|
June 21st, 2017, 04:56 AM | #129 | |
Vintage Idiot
|
Quote:
It will be halvar's call to make but a progress bar can probably only tell us progress by posts within a thread? Or threads within a queue? That takes no account of the (varying) number of images in different posts, so may not be particularly helpful/informative for this task? |
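For what it's worth, a counter like the rough sketch below would work with whichever unit is chosen (posts, threads, or image links once a page has been parsed). The class name is made up, and it only assumes that the total is known before the work starts. It uses an atomic counter because the downloads appear to run on several worker threads.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical progress counter; the unit (post, thread, image) is up to the caller.
final class Progress {
    private final int total;
    private final AtomicInteger done = new AtomicInteger();

    Progress(int total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        this.total = total;
    }

    /** Marks one unit as finished and returns the new percentage (0 to 100). */
    int step() {
        int finished = Math.min(done.incrementAndGet(), total);
        return finished * 100 / total;
    }

    /** Current percentage without changing the count. */
    int percent() {
        return Math.min(done.get(), total) * 100 / total;
    }
}
```

Counting posts gives an early total but uneven steps; counting image links gives even steps but the total only becomes known after the pages have been fetched.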
|
The Following 10 Users Say Thank You to effCup For This Useful Post: |
June 21st, 2017, 05:14 AM | #130 |
Vintage Idiot
|
I'm not sure what's happened, but with 1.23 & this thread/post there were a couple of timeouts and as a result I got all images, although one of them was a thumb, and I got that thumb twice (with non-conflicting filenames).
I've not seen that before. Code:
2017-06-21 16:52:02 INFO: 2 threads specified: [280344, 313924]
2017-06-21 16:52:02 INFO: Starting Thread: 280344
[...]
2017-06-21 16:52:14 INFO: Finished Thread: 280344
2017-06-21 16:52:14 INFO: Starting Thread: 313924
2017-06-21 16:52:15 INFO: HTTP/1.1 200 OK
2017-06-21 16:52:15 INFO: IDstack cookie found. Successfully logged in to VEF
2017-06-21 16:52:15 INFO: Download first thread page: http://vintage-erotica-forum.com/t313924-p1-x.html
2017-06-21 16:52:15 INFO: HTTP/1.1 200 OK
2017-06-21 16:52:15 INFO: Thread name: Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus
2017-06-21 16:52:15 INFO: Download forum page: http://vintage-erotica-forum.com/t313924-p1-x.html to D:\saved from vef\t313924-Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus\t313924-p1-Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus.html
2017-06-21 16:52:16 INFO: End of Thread reached - no next page link found -
2017-06-21 16:52:16 INFO: [ForkJoinPool-1-worker-0]image-page: http://www.imagebam.com/image/32bdcc274532306
2017-06-21 16:52:38 SEVERE: [ForkJoinPool-1-worker-0] Error downloading 'http://www.imagebam.com/image/32bdcc274532306'
org.apache.http.conn.HttpHostConnectException: Connect to 101.imagebam.com:80 [101.imagebam.com/199.58.85.103] failed: Connection timed out: connect
    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:159)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:359)
    at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
    at vef.imgrescue.AbstactImageHost.downloadFile(AbstactImageHost.java:130)
    at vef.imgrescue.AbstactImageHost.download(AbstactImageHost.java:78)
    at vef.imgrescue.ImageLinkProcessor.lambda$downloadImages$8(ImageLinkProcessor.java:121)
    at java.util.ArrayList.forEach(Unknown Source)
    at vef.imgrescue.ImageLinkProcessor.downloadImages(ImageLinkProcessor.java:105)
    at vef.imgrescue.ImageLinkProcessor.lambda$null$1(ImageLinkProcessor.java:56)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(Unknown Source)
    at java.util.TreeMap$EntrySpliterator.forEachRemaining(Unknown Source)
    at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
    at java.util.stream.ForEachOps$ForEachTask.compute(Unknown Source)
    at java.util.concurrent.CountedCompleter.exec(Unknown Source)
    at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.util.concurrent.ForkJoinTask.doInvoke(Unknown Source)
    at java.util.concurrent.ForkJoinTask.invoke(Unknown Source)
    at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(Unknown Source)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(Unknown Source)
    at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
    at java.util.stream.ReferencePipeline.forEach(Unknown Source)
    at java.util.stream.ReferencePipeline$Head.forEach(Unknown Source)
    at vef.imgrescue.ImageLinkProcessor.lambda$processForumPages$2(ImageLinkProcessor.java:42)
    at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(Unknown Source)
    at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(Unknown Source)
    at java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
    at java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Caused by: java.net.ConnectException: Connection timed out: connect
    at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75)
    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
    ... 34 more
2017-06-21 16:52:38 INFO: [ForkJoinPool-1-worker-0]downloaded: http://thumbnails101.imagebam.com/27454/32bdcc274532306.jpg to D:\saved from vef\t313924-Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus\t313924-p1-Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus-post-1-3723923\thumb_newdoc254 - Copy.jpg
2017-06-21 16:52:38 INFO: [ForkJoinPool-1-worker-0]image-page: http://www.imagebam.com/image/da7234274532408
2017-06-21 16:52:41 INFO: [ForkJoinPool-1-worker-0]downloaded: http://102.imagebam.com/download/sFt1GIXnxdOynSko4nbopQ/27454/274532408/newdoc255%20-%20Copy.jpg to D:\saved from vef\t313924-Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus\t313924-p1-Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus-post-1-3723923\newdoc255 - Copy.jpg
2017-06-21 16:52:41 INFO: [ForkJoinPool-1-worker-0]image-page: http://www.imagebam.com/image/3a5f52274532476
2017-06-21 16:52:55 INFO: [ForkJoinPool-1-worker-0]downloaded: http://103.imagebam.com/download/Iobsbb06MVjngObTIgS7AA/27454/274532476/newdoc257.jpg to D:\saved from vef\t313924-Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus\t313924-p1-Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus-post-1-3723923\newdoc257.jpg
2017-06-21 16:52:55 INFO: [ForkJoinPool-1-worker-0]image-page: RETRY http://www.imagebam.com/image/32bdcc274532306
2017-06-21 16:53:17 SEVERE: [ForkJoinPool-1-worker-0] Error downloading 'http://www.imagebam.com/image/32bdcc274532306'
org.apache.http.conn.HttpHostConnectException: Connect to 101.imagebam.com:80 [101.imagebam.com/199.58.85.103] failed: Connection timed out: connect
    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:159)
    at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:359)
    at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:381)
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:237)
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185)
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111)
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
    at vef.imgrescue.AbstactImageHost.downloadFile(AbstactImageHost.java:130)
    at vef.imgrescue.AbstactImageHost.download(AbstactImageHost.java:78)
    at vef.imgrescue.ImageLinkProcessor.lambda$downloadImages$8(ImageLinkProcessor.java:121)
    at java.util.ArrayList.forEach(Unknown Source)
    at vef.imgrescue.ImageLinkProcessor.downloadImages(ImageLinkProcessor.java:105)
    at vef.imgrescue.ImageLinkProcessor.lambda$null$3(ImageLinkProcessor.java:62)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(Unknown Source)
    at java.util.HashMap$EntrySpliterator.forEachRemaining(Unknown Source)
    at java.util.stream.AbstractPipeline.copyInto(Unknown Source)
    at java.util.stream.ForEachOps$ForEachTask.compute(Unknown Source)
    at java.util.concurrent.CountedCompleter.exec(Unknown Source)
    at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.util.concurrent.ForkJoinTask.doInvoke(Unknown Source)
    at java.util.concurrent.ForkJoinTask.invoke(Unknown Source)
    at java.util.stream.ForEachOps$ForEachOp.evaluateParallel(Unknown Source)
    at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateParallel(Unknown Source)
    at java.util.stream.AbstractPipeline.evaluate(Unknown Source)
    at java.util.stream.ReferencePipeline.forEach(Unknown Source)
    at java.util.stream.ReferencePipeline$Head.forEach(Unknown Source)
    at vef.imgrescue.ImageLinkProcessor.lambda$processForumPages$4(ImageLinkProcessor.java:62)
    at java.util.concurrent.ForkJoinTask$AdaptedRunnableAction.exec(Unknown Source)
    at java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
    at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(Unknown Source)
    at java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
    at java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Caused by: java.net.ConnectException: Connection timed out: connect
    at java.net.DualStackPlainSocketImpl.waitForConnect(Native Method)
    at java.net.DualStackPlainSocketImpl.socketConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.doConnect(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.AbstractPlainSocketImpl.connect(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75)
    at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
    ... 34 more
2017-06-21 16:53:17 INFO: [ForkJoinPool-1-worker-0]downloaded: http://thumbnails101.imagebam.com/27454/32bdcc274532306.jpg to D:\saved from vef\t313924-Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus\t313924-p1-Mystery_Followup_Who_is_Wendy_Annie_Kerr_from_Aus-post-1-3723923\thumb_newdoc254 - Copy_1.jpg
2017-06-21 16:53:17 INFO: FINISHED!
2017-06-21 16:53:17 INFO: Finished Thread: 313924

From the above post: this image simply downloads to the browser window. This image didn't first time I tried, but did the second. Note also that although the image has finished downloading/displaying in the browser window, the browser still thinks it's transferring data. This is the one with the google-analytics message. That's also the image that halvar's tool gave me two thumbs of.

It's better to get two thumbs the same than no thumbs (or images) for some. Probably it's not easy to do the re-checking/re-attempting while also controlling for possible duplicates--because the tool perhaps won't easily tell the difference between clashing filenames caused by duplicate downloads, as opposed to clashing filenames caused by duplicate filenames already linked/in the post. One could perhaps check filesize or maybe(?) some image properties like dimensions but...

I don't think you'll want/need to see the html file it downloaded but if so let me know.
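One way to tell the two kinds of clash apart, sketched under the assumption that a repeated download of the same URL yields byte-identical data: when the target name is already taken, compare the new bytes with the existing file, cheapest check (size) first. In the log above, both thumbs came from the same URL and ended up as "thumb_newdoc254 - Copy.jpg" and "thumb_newdoc254 - Copy_1.jpg", so a check like this would have skipped the second one. The class, enum and method names are invented for the example and are not part of the tool.

```java
import java.util.Arrays;

// Hypothetical clash check; not taken from imagerescue.
final class DuplicateCheck {

    enum Action { WRITE, SKIP_DUPLICATE, RENAME }

    /**
     * existing: bytes of the file already stored under the wanted name, or null if the name is free.
     * incoming: bytes that were just downloaded.
     */
    static Action decide(byte[] existing, byte[] incoming) {
        if (existing == null) {
            return Action.WRITE; // no clash at all
        }
        // Size is the cheap first test; only equal-sized files need a full comparison.
        if (existing.length == incoming.length && Arrays.equals(existing, incoming)) {
            return Action.SKIP_DUPLICATE; // same picture fetched twice, e.g. after a retry
        }
        return Action.RENAME; // different picture that happens to share the name
    }
}
```

A byte comparison is stricter than comparing file size or image dimensions: it never merges two different pictures, but it also won't recognise the same picture if the host re-encodes it between attempts.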
The Following 11 Users Say Thank You to effCup For This Useful Post: |