No matter how clever you are, the stupidest (simple) solutions outperform the "smart" (complicated) ones.

3ICE
Admin
Posts: 2629
Joined: Sat Mar 01, 2008 11:34 pm
Realm: Europe
Account: 3ICE
Clan: 3ICE
Location: Hungary

Unread post by 3ICE »

(I paused reading at the "Automating the Search" section to leave this quick comment.)
Looks interesting, with what I'm sure are many clever data dump ideas, but I lost some interest because the simplest, stupidest solution was not mentioned: splitting the search into "AIza-", "AIza0", "AIza1", "AIza2", ... "AIzaq" ... "AIzaZ".
If necessary, go two (or even three) levels deep, as in:
"AIza--" "AIza-0" "AIza-1" ... "AIza-Z"
"AIza0-" "AIza00" "AIza01" ... "AIza0Z"
...
"AIzaZ-" "AIzaZ0" "AIzaZ1" ... "AIzaZZ"
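
For what it's worth, generating that split takes a few lines. Here is a rough Python sketch, assuming the alphabet is exactly the characters listed above ("-", digits, lowercase, uppercase). The name `split_queries` is made up for illustration and is not from the article:

```python
from itertools import product
import string

# Assumed alphabet: only the characters enumerated in the post, in the same order.
ALPHABET = "-" + string.digits + string.ascii_lowercase + string.ascii_uppercase

def split_queries(base="AIza", depth=1, alphabet=ALPHABET):
    """Return every search prefix made by appending `depth` characters to `base`.

    Hypothetical helper for illustration only.
    """
    return [base + "".join(chars) for chars in product(alphabet, repeat=depth)]

one_deep = split_queries(depth=1)   # "AIza-", "AIza0", ..., "AIzaZ"
two_deep = split_queries(depth=2)   # "AIza--", "AIza-0", ..., "AIzaZZ"
print(len(one_deep), one_deep[:3], one_deep[-1])
print(len(two_deep), two_deep[0], two_deep[-1])
```

With this alphabet, one level gives 63 queries and two levels give 63 × 63 = 3969, so each extra level multiplies the number of searches by 63.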

No matter how clever you are, the stupidest (simple) solutions outperform the "smart" (complicated) ones.

Re: No matter how clever you are, the stupidest (simple) solutions outperform the "smart" (complicated) ones.

Unread post by 3ICE »

(Yes, I am still reading the article - very informative! But...)

And then you just had to go and parse the whole HTML source code (tokenizing it, etc) instead of simply lifting every
/AIza[a-zA-Z0-9\-]{35}/
string with regex... Waste of processing power.

Edit: So you DID use regex in the end, and an even better (simpler) one than mine:
/AIza.{35}/
Should have started with that, instead of constructing a tag hierarchy from HTML code, filtering it, etc. Tokenizing HTML is far more expensive and uses orders of magnitude more regex queries than a single search over plaintext would. Of course C code working with raw string comparisons would... but no, let's not go there.
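
To show what I mean by a single search over plaintext, here is a rough Python sketch. It uses my stricter pattern from above, and the sample page and key are made-up placeholders. The name `extract_keys` is hypothetical, and this is not the article's code:

```python
import re

# Character class copied from the post; whether it covers every real key is an assumption.
KEY_PATTERN = re.compile(r"AIza[a-zA-Z0-9\-]{35}")

def extract_keys(html):
    """Return unique matches in first-seen order, treating the HTML as plain text.

    Hypothetical helper: no tokenizing, no tag hierarchy, just one regex scan.
    """
    return list(dict.fromkeys(KEY_PATTERN.findall(html)))

fake_key = "AIza" + "x" * 35  # made-up placeholder, not a real key
sample_html = (
    f'<div class="code"><span>key = "{fake_key}"</span></div>'
    f"<p>{fake_key}</p>"
)
print(extract_keys(sample_html))
```

The same string appears twice in the sample, and the `dict.fromkeys` call collapses it to one entry while keeping the order in which matches were found.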

Edit: Finished reading, thank you for the write-up. I stopped picking apart the second half of the article because there was nothing to complain about. Harvesting, deduplication, search result unpagination, etc. are all top-notch. :)