November 4th, 2017, 04:01 PM | #171 |
Blocked!
Join Date: Jan 2008
Location: HH
Posts: 1,963
Thanks: 115,040
Thanked 32,801 Times in 1,955 Posts
|
Future Plans for forum-backup
I will continue to improve it gradually. If something would make things easier for you, drop me a note. Small things often achieve a lot.
Bugs have priority, of course: if you find one, drop me a line and I will look into it. Long-term topics are:
Duplicate Detection
I do not know of a way to specifically search for or get a list of updated posts. So one way to deal with it could be
Last week a couple of pages were deleted from a thread due to a DMCA takedown:
http://vintage-erotica-forum.com/sho...&postcount=718
http://vintage-erotica-forum.com/t58...n-phoenix.html
So locally there were pages 1 to 9, but on the server only 1 to 6. If new posts were added, pages 6-8 would not have been downloaded again. A workaround for this is
I am planning to do some of this in my week off work in January. |
November 4th, 2017, 10:06 PM | #172 |
Sunny Mod
Join Date: Jan 2016
Posts: 5,523
Thanks: 48,654
Thanked 53,449 Times in 5,494 Posts
|
Does forum-backup detect merges?
E.g., this thread was originally started by SIT (now post #4). Later, three posts by Jmailman were merged into the thread because they had been posted in the LKMM thread but belong in this one. Since Jmailman's posts have lower post IDs, they now sit at the top of the thread. If you parsed the thread after the merge, no update would be detected, because forum-backup only checks the thread for the latest update. So the newly added/merged posts wouldn't be saved, right? If the above is correct, wouldn't it make sense to compare the whole thread by post ID to detect updates, or would that be too time-consuming? Wouldn't this procedure solve the problem with Deleted Posts too?
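The post-ID comparison suggested above could look roughly like the following sketch. It is not taken from forum-backup: the class and method names are hypothetical, and it assumes the post IDs of the whole thread have already been collected from the server, which means fetching every page and is the costly part.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of forum-backup.
class PostIdDiff {

    /** Post IDs found on the server but not stored locally (new or merged-in posts). */
    static Set<Long> missingLocally(Set<Long> localIds, List<Long> serverIds) {
        Set<Long> result = new TreeSet<>(serverIds);
        result.removeAll(localIds);
        return result;
    }

    /** Post IDs stored locally that no longer appear on the server (deleted or moved posts). */
    static Set<Long> goneFromServer(Set<Long> localIds, List<Long> serverIds) {
        Set<Long> result = new TreeSet<>(localIds);
        result.removeAll(serverIds);
        return result;
    }

    public static void main(String[] args) {
        // Made-up IDs: 101-103 were merged in with lower IDs, 105 was deleted.
        Set<Long> local = Set.of(104L, 105L, 106L);
        List<Long> server = List.of(101L, 102L, 103L, 104L, 106L);
        System.out.println("to download: " + missingLocally(local, server)); // [101, 102, 103]
        System.out.println("gone: " + goneFromServer(local, server));        // [105]
    }
}
```

Because the comparison ignores position in the thread, merged posts with low IDs and deleted posts are both detected in one pass.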
|
November 5th, 2017, 03:04 PM | #173 |
Blocked!
|
I am hesitant to parse whole threads. It could be done for a few threads, but it is too much when hundreds of threads are configured, each with tens or hundreds of pages.
A solution could be:
But the page number, which is stored with the post, should be updated.
Addendum: one problem persists: a thread is only processed if the change results in an updated timestamp on the thread list:
Last edited by halvar; November 5th, 2017 at 03:13 PM.. Reason: fixed typo, addendum |
January 24th, 2018, 08:15 AM | #174 |
Blocked!
|
I have created a new version of the forum backup tool:
Version 0.0.18, https://1fichier.com/?p5izqj8leqw5m3c3jqjt
The tool is mainly created for my personal use. It would be nice if somebody else finds it useful, but if not, that is fine too. I know that it is not very easy to use.
What does it do
New versions pick up settings and data from previous versions.
New Installation
* unzip the zip file anywhere
* start forum-backup.bat, e.g. by double-clicking it. A console window opens
* open the URL http://localhost:3137/ with your browser
* click on the settings page link and enter your values
* click on 'Check Configuration'. The output should look like this (every line starting with 'OK')
* click on 'Save Configuration' to save the configuration. The configuration values are saved to <user-home>/forum-backup/forum-backup.properties, e.g. 'C:\Users\<your login name>\forum-backup\forum-backup.properties', depending on your operating system.
* click on the 'Back to main view' link and start by downloading a single post
There are three ways to download posts
Only posts and linked images are downloaded. Nothing is stored in the database. Enter one or more post numbers and click on 'Download Post'. The images are saved to <storage-path>/adhoc-posts/<post-folder>, e.g. 'C:\data\forum-storage\adhoc-posts\Judith_Ramirez-post-2230617'
threads
One or more threads are downloaded. For each thread
* during the download a simple progress bar is shown
* downloaded files are stored at the storage location:
thread backup
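To locate the files mentioned above from your own code, the two paths can be built as in this sketch. The class and method names are hypothetical, and no property keys are assumed, since the contents of forum-backup.properties are not described here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper, not part of forum-backup.
class BackupPaths {

    /** <user-home>/forum-backup/forum-backup.properties */
    static Path configFile(String userHome) {
        return Paths.get(userHome, "forum-backup", "forum-backup.properties");
    }

    /** <storage-path>/adhoc-posts/<post-folder> */
    static Path adhocPostFolder(String storagePath, String postFolder) {
        return Paths.get(storagePath, "adhoc-posts", postFolder);
    }

    /** Loads the saved configuration; returns empty properties if the file does not exist yet. */
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        Path config = configFile(System.getProperty("user.home"));
        System.out.println(config + " holds " + load(config).size() + " entries");
    }
}
```

Reading `user.home` from the system properties is what makes the location depend on the operating system.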
Last edited by halvar; May 10th, 2020 at 08:54 AM.. Reason: updated link to 0.0.18 |
March 31st, 2018, 07:47 AM | #175 |
Blocked!
|
0.0.7 with pixhost.to support
The new version 0.0.7 was released. It contains:
* support for pixhost.to
* pixhost.to links are converted to pixhost.to when downloading. This should not be necessary anymore since the links were fixed on the forum
* a new report to show downloads of the last n days, with optional filtering for failed downloads.
Full install (including java runtime) https://1fichier.com/?d4gkjyjy94 (66MB)
Only the jar: https://1fichier.com/?lwymlqpmoa (7MB)
If you have a working installation just replace your forum-backup-0.0.x.jar with the new forum-backup-0.0.7.jar |
April 1st, 2018, 02:57 PM | #176 | |
Vintage Member
Join Date: Jun 2007
Location: England Town
Posts: 1,107
Thanks: 1,592
Thanked 19,884 Times in 984 Posts
|
Quote:
Last edited by The Old Hacker; April 1st, 2018 at 03:14 PM.. Reason: Edited text. |
|
April 1st, 2018, 03:39 PM | #177 | |
Blocked!
|
Quote:
Also, I did not want to have the documentation, with screenshots containing VEF, in a public repo, even though issue tracking and a wiki would be nice.
The source is included in the zip download. It is a Java project using Gradle. If you have a JDK installed, you can build it with the Gradle wrapper, e.g. './gradlew build'.
Please drop me a line if you have suggestions for improvements or have found errors. (I know about the text errors on the settings page.)
Sadly, I cannot make this my official side project, since "Downloader for erotic material" does not look that good on a resume. So I have another official side project on which I spend most of my free programming time. |
|
April 15th, 2018, 06:15 PM | #178 |
Blocked!
|
forum-backup 0.0.8 with improved handling of deleted posts/pages
The new version 0.0.8 was released. It contains:
handling of deleted posts/pages
* situation: A thread had pages from 1 to 10 and those were previously downloaded. Now multiple posts are deleted and only pages 1 to 7 exist. A new post is added to page 7.
* previous behavior: Page 7 is not downloaded because locally page 10 exists. The new post is missed.
* new behavior: previously downloaded pages 8 to 10 are renamed by appending the date to the filename. Page 7 is downloaded.
There are still scenarios where new posts are missed, but this fixes a major one. I expect more deletions in the future (because of DMCA, cleanups and newly banned content)
Full install (including java runtime) https://1fichier.com/?lb6qavanj1 (66MB)
Only the jar: https://1fichier.com/?bpf29h2561 (7MB)
If you have a working installation just replace your forum-backup-0.0.x.jar with the new forum-backup-0.0.8.jar |
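The new behavior described above can be sketched as follows. This is an illustration, not the code shipped in 0.0.8: the class name, the page-<n>.html naming scheme, and the rule for where to resume downloading are assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of forum-backup.
class StalePageArchiver {

    /**
     * Renames local pages that no longer exist on the server by appending the date,
     * e.g. page-8.html becomes page-8-2018-04-15.html. Returns the renamed files.
     */
    static List<Path> archiveStalePages(Path threadDir, int localMaxPage,
                                        int serverMaxPage, LocalDate today) throws IOException {
        List<Path> renamed = new ArrayList<>();
        for (int page = serverMaxPage + 1; page <= localMaxPage; page++) {
            Path source = threadDir.resolve("page-" + page + ".html");
            if (Files.exists(source)) {
                // LocalDate.toString() yields the ISO form yyyy-MM-dd.
                Path target = threadDir.resolve("page-" + page + "-" + today + ".html");
                Files.move(source, target); // fails if the target already exists
                renamed.add(target);
            }
        }
        return renamed;
    }

    /** Assumed rule: resume at the last page that exists both locally and on the server. */
    static int firstPageToDownload(int localMaxPage, int serverMaxPage) {
        return Math.min(localMaxPage, serverMaxPage);
    }
}
```

With 10 local pages and 7 pages on the server, pages 8 to 10 are renamed and the download resumes at page 7, so the new post on that page is picked up.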
May 11th, 2018, 10:14 AM | #179 |
Sunny Mod
|
I haven't tested your new tool yet.
It seems that p&h changed the BB-code long ago. (The post where I noticed it is from 2008.) The images are still up, but the thumbnails don't work. The image URL has changed, but the old one redirects to the new URL. I can't show you the post, as the old BB-code has now been manually updated. But here is an example:
Old BB-code:
[URL=http://image.pimpandhost.com/guest/879324_x.html][IMG]http://pimpandhost.com/media/simple/1/thumbs/da498bdb2dda_1.jpg[/IMG][/URL]
New BB-code:
[URL=http://pimpandhost.com/image/879324][IMG]http://ist1-3.filesor.com/media/image/1/_/_/_/1/d/a/4/9/thumbs%2Fda498bdb2dda_0.jpg[/IMG][/URL]
My question is: does your tool download the images with the old BB-code too? IHG doesn't work on these.
|
May 11th, 2018, 12:07 PM | #180 |
Blocked!
|
Old or new should not matter. The HTML page behind that URL is downloaded, no matter what the URL looks like. The challenge is usually to find the image and the original filename on that page.
I just downloaded post and got 11 images. Is that the kind of post you are referring to? It seems to have old style URLs:
HTML Code:
<a href="http://image.pimpandhost.com/guest/498275_x.html" target="_blank"><img src="http://pimpandhost.com/media/simple/1/thumbs/7d156ac06bf7_1.jpg" border="0" alt="" onload="..." /></a>
HTML Code:
<host id="pimpandhost">
<urlpattern>^https?:\/\/(?:image\.|www\.)?pimpandhost\.com\/image\/.+?$</urlpattern>
<searchpattern>function(pageData, pageUrl) { var iUrl = pageData.match(/img class=('|")normal\1 src=('|")(https?:\/\/.+?)(_l)?(\.(gif|jpe?g|png|GIF|JPE?G|PNG))\2/); return iUrl ? {imgUrl: iUrl[3] + iUrl[5], status: "OK"} : {imgUrl: null, status: "ABORT"} }</searchpattern>
</host>
This urlpattern could work:
HTML Code:
^https?:\/\/(?:image\.|www\.)?pimpandhost\.com\/(image|guest)\/.+?$
Last edited by halvar; May 11th, 2018 at 12:26 PM.. Reason: grammar |
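To check the two urlpattern expressions above against the old and new link styles from the BB-code example, a small test like this can help. It is only a sketch: the class name is hypothetical, the `\/` escapes are dropped because `/` needs no escaping in a Java regex, and it says nothing about whether the searchpattern then finds the image on the page.

```java
import java.util.regex.Pattern;

// Hypothetical test class, not part of forum-backup or IHG.
class PimpandhostUrlPatterns {

    // urlpattern from the quoted host definition: accepts only /image/ links.
    static final Pattern ORIGINAL = Pattern.compile(
            "^https?://(?:image\\.|www\\.)?pimpandhost\\.com/image/.+?$");

    // Suggested urlpattern: also accepts the old /guest/ links.
    static final Pattern EXTENDED = Pattern.compile(
            "^https?://(?:image\\.|www\\.)?pimpandhost\\.com/(image|guest)/.+?$");

    static boolean matches(Pattern pattern, String url) {
        return pattern.matcher(url).matches();
    }

    public static void main(String[] args) {
        String oldUrl = "http://image.pimpandhost.com/guest/879324_x.html";
        String newUrl = "http://pimpandhost.com/image/879324";
        System.out.println(matches(ORIGINAL, oldUrl)); // false
        System.out.println(matches(EXTENDED, oldUrl)); // true
        System.out.println(matches(ORIGINAL, newUrl)); // true
        System.out.println(matches(EXTENDED, newUrl)); // true
    }
}
```

The old-style link fails only on the path segment, so adding `guest` as an alternative is enough for the URL to be recognized.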
|
|