Channel: newsbin.com

Db Gz to Db Failed debug messages...

by Quade (Posted Sun, 17 Nov 2013 23:56:34 GMT)
One is you believe that there are servers that report the same article # but with different content.


No. You request a range of posts, 1-1000, and you only get 990 posts because of a server sync issue. The next time you download headers, you think you have everything up to record 1000, so you download 1000-2000. The 10 that were missing before stay missing, even though they've shown up on the server since your last download. You can't know they've shown up since the last header download.

#2 is accurate. Gaps in articles aren't unheard of. Posts get removed. I don't care about how or why the servers might do this. All I care about is making Newsbin work. If that means an overlap of 10 or 1000, that's what I'll implement.
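To make the overlap idea concrete, here is a rough sketch of the scenario above. This is not Newsbin's code; `fetch_headers`, `simulate`, and the article numbers 995 and 998 are made up for illustration, and the "server" is just a set of article numbers standing in for a header range request.

```python
def fetch_headers(server, first, last):
    """Stand-in for a header range request: returns the article numbers present."""
    return {n for n in server if first <= n <= last}


def simulate(overlap):
    """Return the article numbers still missing after two header downloads."""
    late = {995, 998}  # hypothetical posts that had not synced yet

    # First download: ask for 1-1000, but the late posts are not there yet.
    server = set(range(1, 1001)) - late
    have = fetch_headers(server, 1, 1000)
    high_water = 1000  # we believe we have everything up to here

    # Second download: the late posts have shown up, plus 1001-2000.
    server = set(range(1, 2001))
    first = max(1, high_water + 1 - overlap)  # back up by the overlap
    have |= fetch_headers(server, first, 2000)

    return sorted(set(range(1, 2001)) - have)


print(simulate(0))   # no overlap: the late posts are never requested again
print(simulate(10))  # 10-post overlap: the second request starts at 991
```

With no overlap the two late posts stay missing forever. With an overlap of 10 the second request reaches back far enough to pick them up. An overlap only helps for posts that fall inside it, which is why the size (10 or 1000) is the thing to tune.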

Is there any way to detect #1 without creating a gz file? I.E. query the content of the last article # from newsbin's db, and then string compare with the result of an "XOVER [lastarticle#]-" ?


As for detecting it, I don't see the point. Overlap takes care of it. Newsbin isn't tracking record numbers in the DBs. Once the records are imported, the record number is discarded. It's not used during download. All the headers from all servers are combined together in the DB.


How about collecting up a bunch of 10-post overlap GZ files in some other folder? You should have a crapload. Collect up 1000. Then exit Newsbin, rename the import folder, start Newsbin, dump these 1000 GZ files into the empty import folder, and see how long it takes to process them.
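If copying 1000 files by hand is a pain, something like this could do the collecting step. It is only a sketch: `collect_gz` is a made-up helper, the folder paths are placeholders you would point at wherever your GZ files actually land, and it copies rather than moves so nothing is lost.

```python
import shutil
from pathlib import Path


def collect_gz(source_dir, dest_dir, limit=1000):
    """Copy up to `limit` .gz files from source_dir into dest_dir.

    Returns the list of copied file paths. Both directories are
    placeholders chosen by the caller, not known Newsbin locations.
    """
    source = Path(source_dir)
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for gz_file in sorted(source.glob("*.gz")):
        if len(copied) >= limit:
            break
        target = dest / gz_file.name
        shutil.copy2(gz_file, target)  # copy2 also keeps the timestamps
        copied.append(target)
    return copied
```

Run it while the GZ files are still around, then do the exit, rename, restart steps and drop the collected files into the empty import folder.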
