COMP: quirks of the archives


Back when we were talking about 'flushing' seeds, I tried to track down
all the posts in the Mallorn archives about stratifying TB seeds by
refrigerating or freezing them, but couldn't get the search engine to
cooperate. It would give me freezer & refrigerator posts, but wasn't
limiting the results to the posts about seeds.  A temporary puzzle for
Chris (our faithful archives keeper).  Here's what he figured out:
=================== Chris sez ==================
The software that builds an index of words keeps track of which words
are very common (like and, or, the).  If a word appears more than 500
times in one megabyte of data, the word goes into a 'stop list' and
isn't used in the queries anymore.

It turns out that 'seed' is a very common word in the iris-talk
archives, so it is excluded as a viable (no pun intended) search term.
==================  end =================
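
For anyone curious how a stop list like this might behave, here is a rough
sketch in Python. It is not the archive software's actual code, which I
haven't seen. The function names, the word-splitting rule, and the choice of
1,000,000 bytes as "one megabyte" are all assumptions made for illustration;
only the 500-per-megabyte threshold comes from Chris's note.

```python
import re
from collections import Counter

STOP_THRESHOLD_PER_MB = 500   # threshold quoted in Chris's explanation
BYTES_PER_MB = 1_000_000      # assumption: the real indexer may use 1,048,576


def build_stop_list(text, threshold_per_mb=STOP_THRESHOLD_PER_MB):
    """Return the set of words occurring more often than the threshold per megabyte.

    Hypothetical helper: words are lowercase runs of a-z, which is only a
    guess at how a real indexer tokenizes messages.
    """
    size_mb = len(text.encode("utf-8")) / BYTES_PER_MB
    if size_mb == 0:
        return set()
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    return {word for word, n in counts.items() if n / size_mb > threshold_per_mb}


def strip_stop_words(terms, stop_list):
    """Drop query terms that are on the stop list (one possible behavior)."""
    return [term for term in terms if term not in stop_list]


# A made-up one-megabyte "archive": 'seed' appears 600 times, 'freezer' 10 times.
sample = "seed " * 600 + "freezer " * 10 + "." * (1_000_000 - 3080)
stop_list = build_stop_list(sample)
remaining = strip_stop_words(["seed", "freezer"], stop_list)
```

In this toy example 'seed' lands on the stop list and only 'freezer' is left
in the query, which is consistent with getting freezer posts that aren't
limited to seeds. Note that a rate-based rule like this would flag almost
every word in a very small sample, so a real indexer presumably handles that
case differently.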

I thought others might be interested in knowing this. It probably doesn't
interfere with most searches, but could occasionally be a problem.  I
think it works to search for <seed> by itself, but not in a combined
query like the one I was using: <seed AND (refrig OR freez)>.  Also, I
think Chris was able to get it to work right when he searched one year
at a time, but not all years at once.

Also, the search engine in iris-photos is working right again now (some
of you had posted you couldn't find things).

Linda Mann


 
