~ Classrooms ~
Version June 2000
This is the fourth classroom. I have collated here a thread from my old messageboard
in which one of the most serious
~S~ I know of (Humphrey P.) attempts, with the help of Gregor Samsa and Iefaf, some
very sound 'search engine reversing'. Just read the
text below, where he goes to great lengths, with the help of various friends,
to decipher the meaning of some of AltaVista's codes. I'm sure you'll enjoy it.
This thread originally appeared on my old messageboard.
Spelunking altavista's acronyms
~SS~ Humphrey P., Gregor Samsa & Iefaf, June 2000
Thread slightly edited by
It began with this question by 184.108.40.206:
Anybody know how to set search parameters to www.raging.com through CGI string instead of
Humphrey P., first attempt
How to run it without cookies, someone with our highest numbered URL would like to know.
(by the way, who owns the highest numbered URL? the lowest? the one in the middle?
the median? yours? should we stone them?)
www.raging.com is the same as
ragingsearch.altavista.com is the same as www.altavista.com
ragingsearch.altavista.com/cgi-bin/query? is the same as www.altavista.com/cgi-bin/query?
Have you been collecting CGI strings?
Here's a bunch for:
& : separator
act=2007 : I have an
d0=1%2F1%2F99 : date from
d1=18%2F5%2F2000 : date to
hl=on : -?-
kl=cs : Czech
kl=XX : any language (Expect the rest of the languages to follow ISO
639, eg: http://babel.alis.com:8080/langues/iso639.en.htm ISO 639: 1988)
mmdo=16 : (on an
image search: stype=simage)
par=0 : parent equals zero? (I haven't proven this yet.)
: page is advanced query
pg=q : page is simple query (default main page at
q=this+AND+that+AND+these+AND+those+AND+NOT+them : my query is
q=the : my query is [the]
r=is : raise or so(r)t to the top the keyword(s)
sc=on : show one result per Web site (see:
http://doc.altavista.com/adv_search/ast_as_compress.shtml site compression)
-?- (you'd think so...)
search.x=32 : (starting pixel of ad filled page?)
(starting pixel of ad filled page?)
stype=simage : searchtype is s-image
searchtype is s-text
text=yes : don't send me so many ads
what=web : search what? The
pg=pref : customize
Will raging search run with some of those strings? AltaVista was allowing
only the last language of a set: eg. if you asked for &kl=es&kl=en you got &kl=en. So, were the
cookies a kludge around the CGI parameter limitations?
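The "only the last language of a set" behaviour can be illustrated with a minimal Python sketch. The query string below is assembled by hand from parameters quoted in this thread; it is an illustration, not a captured request, and nothing here claims to reproduce how AltaVista actually parsed its input.

```python
from urllib.parse import parse_qs

# Illustrative query string built from parameters listed in the thread.
query = "pg=q&q=the&kl=es&kl=en&d0=1%2F1%2F99&d1=18%2F5%2F2000&sc=on"

# parse_qs keeps every value of a repeated key, in order of appearance.
params = parse_qs(query)
languages = params["kl"]

# A "last one wins" parser, as described for AltaVista, would keep only this.
last_language = languages[-1]

# parse_qs also percent-decodes values, so %2F comes back as "/".
date_from = params["d0"][0]
date_to = params["d1"][0]
```

Here `languages` holds both codes, while `last_language` is the single value a last-one-wins server would act on.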
Gregor Samsa, first answer
I thought you'd take on here, Humphrey ;)
How can raging be run without cookies? I
guess the easiest way is to disable "Cookies" in your browser, firewall or whatever.
What happens if you deny the cookie? Raging doesn't remember your settings and you get the
defaults next time.
Now, Mr. Spaceproxy-Without-Name, although Humphrey said it was no
answer, in fact he gave one. Just the question has to change a bit. It looks as if raging behaves
like the altavista simple search interface, with some differences.
Look at this
"q=test" The thing you are looking for
Family Filter ? (Set it ON if you are under age ;)
Number of hits to show ?
"Translate=off" You get no link to babelfish
One of the
differences is that you can use several language parameters at once with raging, whereas
www.altavista.com/cgi-bin/query? only uses the last KL param given. HP showed it.
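For anyone who wants to set the parameters through the CGI string rather than a cookie, here is a hedged Python sketch of how such a URL could be put together. The host and path come from the thread; the exact combination of parameters is an assumption for illustration, and the service itself is long gone, so this only shows the string building.

```python
from urllib.parse import urlencode

# Host and path as quoted in the thread.
BASE = "http://ragingsearch.altavista.com/cgi-bin/query?"

# A list of pairs (not a dict) so that the KL key can be repeated.
pairs = [
    ("q", "test"),
    ("KL", "en"),
    ("KL", "fr"),
    ("KL", "de"),
    ("Translate", "off"),
]

query_string = urlencode(pairs)
url = BASE + query_string
```

Using a list of pairs keeps the order and the repeated key intact, which a plain dictionary would not.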
Humphrey P., second go
Yeah, gs. My little trip to the raging cookie pusher brought me back the same ones you
FFF=on : family filter [on]
wfmt=tau : (i wanted compact instead of complete page
nbq=20 : number of results to show per page 
prf=Submit : (something
about my profile - it was submitted? I accepted a cookie? I came from the profile page?)
KL=zh : language to search in (in this case, Zhongwen. You'd recognize it as
KL=en : English
KL=fr : French
KL=de : Deutsch
enc=big5 : language encoding
(in this case, for Zhongwen)
Translate=on : show the translate option
for a cookie
which looked like:
And don't forget
pg=pref : customize or preferences page
v=m : (on the way to the preferences
Some notes and questions.
Had you noticed the capital "KL=" ?
(It seems to me you were trying that out a few weeks ago?)
Perhaps KL=en is different from
kl=en -?- If you would use all caps KL=en&KL=fr&KL=de you would get all your languages even in
"AltaVista found no document matching your query."
pages found at advanced. But notice what it does to your list of languages! [my
So, what does simple
About 23 pages found.
word count: Bilbao: 239105; Adolf: 342273;
Gustav: 447835; II: 28663115
Hmmm. grumble grumble... I've got a lapse of thinking
attack. 23 pages... where have I seen that before?
Another thing... text=yes is
ignored where? ah, when you are off looking for images? (where else? where there's money to be
This from a while ago. Does it still work?:
One was a conundrum: (guy got altavista to
index its own error page)
Two was someone's saved query:
Web Pages 518,470 pages found.
Third, is new
What a nifty portal to the world!
http://world.altavista.com/r/x13/http://www.h2g2.com/A172685 Babel fish:
http://world.altavista.com/r/x8/http://www.h2g2.com/A172685 Origin of the
Translation Tools (an ad for
arch%2Fresults.htm&user=avworld&q=special+characters+in+html&stype=stext&x=37&y=13 Using World
Alphabets on Web
Internet Domains (a pretty long list from ISO 3166 (does have TV=Tuvalu))
Now, where on
this page does kl=en?
The search in the left frame panel is set up to default to search in
Hmmm. That's not it, though. On the backside, that looks like this:
rnet&kl=en&pg=q Sending faxes over the Internet
Finding 'query?' on the backside, I come
up with these AltaVista parameter
kl=en : language is English
pg=q : page is
AltaVista Main search
pg=aa : page is
sc=on : site compression is on
stype=stext : search type is s-text
user=avworld : (who else could be user? Are there
Does user: AltaVista World have special privileges?
What is it that x and y are placing? Or are they trying to document the origin of a
grid-mapping scheme to track your mouse position?
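One plausible reading of x and y (an assumption; nothing in the thread confirms it) is that they are the click coordinates a browser sends when a form is submitted through an image button, which would also fit search.x=32 in the first list. A minimal Python sketch that pulls the parameters out of the intact tail of the link quoted above:

```python
from urllib.parse import parse_qsl

# Only the undamaged tail of the quoted link is used; the start is truncated.
tail = "user=avworld&q=special+characters+in+html&stype=stext&x=37&y=13"

# parse_qsl decodes "+" to spaces and returns (key, value) pairs.
pairs = dict(parse_qsl(tail))

# Treated here as a pixel position, which is an interpretation, not a fact.
click = (int(pairs["x"]), int(pairs["y"]))
```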
Can you think of other ways to
get AltaVista to index its own generated listings and error pages? Bite its own tail? (oh, gee,
somebody had the name of that hoop snake a while back... I forget what it's called.)
Gregor Samsa, second go
Hi, Humphrey !
Want some more stuff? I think most of it is new. The AltaVista main
engine has a lot more holes than I thought. I didn't even get to your kl/KL question and the
tailbiting stuff. But see for yourself.
One more oddity at
I did not accept the cookie, but
entered the language parameters manually after the first results were shown:
After going on to page 2 of the hits I happened to see all these
language identifiers with only _one_ "KL": KL=enesfrde&
No differences in what (or how much)
it found, as far as I can tell.
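The collapsed KL=enesfrde form reads like the two-letter codes simply glued together. A small Python sketch under that assumption (the helper name is hypothetical, and whether raging really split the value this way is a guess):

```python
def split_language_codes(value: str, width: int = 2) -> list[str]:
    """Cut a glued string such as 'enesfrde' into fixed-width codes."""
    if len(value) % width != 0:
        raise ValueError("length is not a multiple of the code width")
    return [value[i:i + width] for i in range(0, len(value), width)]


codes = split_language_codes("enesfrde")

# The same languages written as separate parameters, as entered by hand.
expanded = "&".join(f"KL={code}" for code in codes)
```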
[word count: Pinkola: 4031; Clarissa:
121615; Estes: 304093]
"Raging Search found no document matching your
[word count: Pinkola:
4031; Clarissa: 121615; Estes: 304093]
"36 pages found."
The first one seems not to
work at all. The second one behaves like the old simple search in that it only looks up German
What the heck is going on here? Has that changed lately?
This seems to toggle the
appearance of the "Results from this site only" - link
I thought I had
found some differences in numbers at raging a few weeks ago, but our dirty old friend told me he
couldn't reproduce it. I haven't done any work on that since then, but I have scheduled it for the
coming weekend. (I first have to set up my machine from scratch. Right now it sucks: the usual
Windows problem after installing, uninstalling and playing around too much.) Don't tell me to use
Linux; I use it anyway. But I like Opera far too much, and the Linux version is not very decent yet.
I didn't even know that "text=yes" could be used with
the advanced search...(Thanx for the tip ;)
Watch that typo: "quer<" instead of "query". It is simply _ignored_. Do you see
any differences when using "query" compared to "quer<" ? Even
Obviously it does not matter. Every URL that aims for the cgi-bin gets processed ? Or
one I found by chance. I'm unsure what it might tell us.
Anyway, it tells you it
"found no document matching your query" AND gives you ten results, but with no links to more
results. It's somehow like the Hitchhiker's Guide: can you reverse this and tell me what the
question was?
I suppose "stype=ntext" is Usenet, whereas "stext" means the regular
text search. Let's switch back to
"About 75,816 pages found". And what are the first ten? Aha!
It seems that if we tell
AltaVista to do an advanced Usenet search, it gets confused. Fortunately I am
completely ignorant of what the parameters look like in a Usenet search. So I gave this one a
Again: "...found no documents..." and ten hits. But what? See the titles that are displayed in
your hit list:
www.oregongrounds.com - nothing but an imagemap. No "test1" in the source
of this page. But this might be chance; it could have changed recently. Looking through the
others, I see that they all have "test1" as their title.
I get the same results with
I switch back to
=> 3954 hits.
OK, a "real" Usenet search looks
"/cgi-bin/query" with "Top" seems to be a general feature. It works here as
I'll look into the differences between kl and KL
tomorrow, and try this tailbiting stuff as well. I need to read my old notes again. For now it
is time to go to bed. If I remember right, the guy who found the structure of the benzene
molecule is said to have dreamt of a snake that bit its own tail. I'll try to dream of CGI
params; perhaps that helps.
This is just a confirmation of what you've found.
Whatever you write before the question
mark is not taken by raging or
Out of curiosity I scanned from 220.127.116.11 to 18.104.22.168
and checked the reverse DNS.
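A scan of this kind could be sketched in Python roughly as follows. The function names are hypothetical, the addresses in the usage comment are documentation placeholders rather than the range actually scanned, and the lookup needs network access.

```python
import ipaddress
import socket


def ip_range(first: str, last: str):
    """Yield every IPv4 address from first to last, inclusive."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    for number in range(start, end + 1):
        yield str(ipaddress.IPv4Address(number))


def reverse_dns(ip: str):
    """Return the host name registered for ip, or None if there is none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        # Covers socket.herror and socket.gaierror (no PTR record, no network).
        return None


# Usage (placeholder range, performs real DNS lookups when run):
# for ip in ip_range("192.0.2.1", "192.0.2.5"):
#     print(ip, reverse_dns(ip))
```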
Next step is to view the raw websites and sniff around the
of course this is still in fieri...
(c) 2000: [fravia+], all rights