Search Tools for IRF’s Legacy Document Collection and IRF’s Legacy Projects

This heritage website (IRF.ORG) is an attempt to provide electronic access to much of the information produced and collected by Island Resources from 1972 to 2015 (Legacy Collection Documents), and to supply information about the projects and programs (Legacy Projects) that were the basis for IRF’s reports, publications, and other products.

General Tips about the Search Tools

There are many ways to use the search tools on this page to find documents, references and links of use for everything from casual reading to in-depth research.

First, try a search by Subject — most easily selected by clicking on the most appropriate term in the column on the right side of this and every search page.

The second way to search is a full-text search using the most specific relevant terms (e.g., acropora rather than coral). This search is built into the WordPress platform that IRF.ORG uses, and it searches all textual material on the website. The rules for WordPress full-text search are presented at the bottom of this page and may help you avoid some unexpected results.

Third, especially if your interest is in information developed specifically by IRF, search by Category (the Categories are listed in the upper right corner of each page). These four Categories correspond to the major programmatic themes pursued by the Foundation, which are explained at length in the Legacy section of this website.

Fourth, in a related vein, since 1995 IRF has moderated dozens of e-mail discussion groups on the “Yahoogroups” platform. Each group has an archive that is text-searchable, if you register with that group. At this page, we have information about the subject matter of the main e-mail groups and how to register.


POSTSCRIPT:  An underlying premise for the establishment of the Foundation and all of IRF’s subsequent work was the conviction that for highly stressed small island environments, access to reliable information (both historic and current) needed to be emphasized as the basis for improved, less risky, decision-making about resource management. Details about the organizational dimensions and history of the environmental library first established by the Foundation in the 1970s can be found at Historical Development of the IRF Library Collection.


Rules for WordPress Full Text Search:

TERMINOLOGY

What the user puts in the search box is a list of “search terms”. Search terms are, loosely speaking, either “words” or “phrases”. These are separated by spaces, tabs or (slightly surprisingly) commas. Phrases are indicated by enclosing them in double quotes, but users have to know this syntax, so hardly any searches will use phrases (though more than will use the URL options mentioned at the end, because Google searches using quoted phrases are at least a little known).

When a search term “matches” a word or phrase in a page or post, the same sequence of characters as in the term is found in the article, and that article is included in the results. This is also sometimes called a “search hit”.
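The splitting rule described above can be illustrated with a short Python sketch. It is not WordPress's own code (WordPress is written in PHP); the function name split_search_terms is hypothetical, and the sketch only assumes the separators named in the text: spaces, tabs and commas, with double quotes marking phrases.

```python
import re

def split_search_terms(query: str) -> list[str]:
    """Split a raw query into search terms.

    Quoted phrases are kept whole; everything else is split on
    whitespace or commas. Illustrative only, not WordPress's code.
    """
    terms = []
    # First alternative: a double-quoted phrase.
    # Second alternative: a run of characters with no separator or quote.
    for phrase, word in re.findall(r'"([^"]*)"|([^\s,"]+)', query):
        term = phrase if phrase else word
        if term:
            terms.append(term)
    return terms

print(split_search_terms('coral, "sea turtle"\treef'))
# ['coral', 'sea turtle', 'reef']
```

Here the comma and the tab both act as separators, while the quoted phrase survives as a single term with its internal space.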

WHERE IT SEARCHES

Default WordPress searches only look at page or post titles and main content. They completely ignore excerpts, comments, tags, categories, custom fields and everything else.

[The IRF.ORG web site, however, has added the capability to search the text content of downloadable .PDF documents.]

Search terms can appear in either title or content independently. That is, a search for north east will match an article called The North West passage which contains starting from the east… in the content.
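As a rough illustration of this rule, the following Python sketch checks each term against the title and the content separately and requires every term to be found in at least one of them. The names term_in and post_matches are hypothetical, and the matching is a simple case-insensitive substring test, which is an assumption based on the rules described further down this page.

```python
def term_in(term: str, text: str) -> bool:
    # Case-insensitive substring test (an approximation of the search).
    return term.lower() in text.lower()

def post_matches(terms: list[str], title: str, content: str) -> bool:
    # Every term must be found, but each one may be found in either field.
    return all(term_in(t, title) or term_in(t, content) for t in terms)

title = "The North West passage"
content = "starting from the east"

print(post_matches(["north", "east"], title, content))   # True
print(post_matches(["north", "south"], title, content))  # False
```

In the first call, north is found only in the title and east only in the content, yet the article still counts as a hit.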

HOW IT DIFFERS FROM GOOGLE

Most people’s benchmark for searching is probably Google, so it may be helpful to understand the main differences.

  • WordPress searches aren’t backed by a thesaurus, so a search won’t find things which are synonymous, or different verb tenses or plurals (it may appear to find plurals sometimes when singular is asked for, but that’s for a different reason – see below)
  • It doesn’t match articles where only some of the search terms match; all of them must match.
  • Very confusingly for users, WordPress finds results where the search term appears inside other words, not just whole words. For example, love will find articles containing she was beloved by him, it was a lovely (all potentially helpful) and he was wearing gloves (distinctly unhelpful). Sometimes this helps, particularly when dealing with plurals or punctuation such as words with dashes (see below).
  • Punctuation is not ignored, except in some limited cases. In particular, apostrophes are a problem: Andrews won’t match Saint Andrew’s Church (but Andrew would, because partial words match). Dashes are a problem too: low-energy (one word to WordPress) won’t match low energy, though low energy would match low-energy because each of the two search words partially matches the single “word” low-energy in the article.
  • Accented characters are treated as different from their unaccented counterparts. Thus cafe won’t match café, whereas Google would.
  • If the PHP which WordPress runs in is installed with only US character handling functions (technically, without the multi-byte functions starting mb_…, which bizarrely is the default for PHP installs) case-insensitivity is limited to US characters, thus excluding accented characters and the like.
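The partial-word, punctuation and accent behaviour listed above can be tried out with a small Python sketch. It assumes that a single term matches when it appears as a case-insensitive substring of the text; the function name term_matches is hypothetical. Note that Python lower-cases accented letters correctly, so this sketch does not reproduce the PHP case-handling caveat in the last bullet.

```python
def term_matches(term: str, text: str) -> bool:
    # No stemming, no thesaurus, no accent folding: a plain substring test.
    return term.lower() in text.lower()

examples = [
    ("love", "he was wearing gloves"),       # partial word: matches
    ("Andrews", "Saint Andrew\u2019s Church"),  # apostrophe gets in the way
    ("Andrew", "Saint Andrew\u2019s Church"),   # partial word: matches
    ("low-energy", "low energy"),            # the dash is significant
    ("cafe", "caf\u00e9"),                   # accented e is a different character
]

for term, text in examples:
    print(f"{term!r} in {text!r}: {term_matches(term, text)}")
```

Searching for the two separate words low and energy would, by contrast, match an article containing low-energy, because each word is a substring of it.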

HOW IT SEARCHES

Search terms match an article if they appear anywhere in the content or title. If there is more than one term (word or phrase) the terms can be found in any order. Thus north west will match an article containing ten degrees west and forty degrees north.

Searches are case insensitive. That is, North matches north and vice versa. (However, as noted above, this only works properly if PHP is installed with multi-byte character support.)

Phrases (multi-word search terms) match exactly, other than in case, so any punctuation, multiple spaces, leading spaces etc will count.

Words must also match exactly including punctuation, but because spaces, tabs or commas separate search terms, these won’t be matched in single words.

Outside quotes, common words (“stop words”) are ignored completely; by default, these are:
about,an,are,as,at,be,by,com,for,from,how,in,is,it,of,on,or,that,the,
this,to,was,what,when,where,who,will,with,www
but blogs which have translation packages installed will provide their own list for the relevant language.

(Hence a search for the north street matches an article containing north street even if the word “the” appears nowhere in the article.)

Any search term that is a single letter (that is, a to z, not any single character) is ignored.

Also, rather arbitrarily, if you have more than 9 words (after stop words are removed), it treats them as if they were in quotes whether they are or not, I imagine because of the complexity of the database search it would otherwise generate.
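The three filtering rules above (stop words, single letters, and the 9-word limit) can be sketched in Python as follows. This is an illustration under stated assumptions, not WordPress's implementation: the function name effective_terms is hypothetical, only unquoted words are handled, and stop words are assumed to be compared case-insensitively.

```python
# Default English stop-word list as given above.
STOP_WORDS = set(
    "about,an,are,as,at,be,by,com,for,from,how,in,is,it,of,on,or,that,the,"
    "this,to,was,what,when,where,who,will,with,www".split(",")
)

def effective_terms(query: str) -> list[str]:
    """Return the terms that would actually be searched for."""
    kept = []
    for word in query.split():
        lowered = word.lower()
        if lowered in STOP_WORDS:
            continue  # common words are ignored
        if len(lowered) == 1 and "a" <= lowered <= "z":
            continue  # single letters are ignored; digits are kept
        kept.append(word)
    if len(kept) > 9:
        # Too many words: the whole query is treated as one phrase.
        return [query]
    return kept

print(effective_terms("the north street"))  # ['north', 'street']
print(effective_terms("a 7 reef"))          # ['7', 'reef']
```

A ten-word query with no stop words comes back as a single term, which is why long searches often return nothing unless the exact wording appears in an article.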

ABOUT PLURALS AND TENSES

It may sometimes appear that singular words match their plurals or present tenses match past tenses etc. However, this is purely a consequence of partial word matches. Hence rabbit will find rabbits, disorder will find disorders and disordered and child will find children. However rabbits won’t find rabbit (plural will never match singular – except for sheep!) and knife won’t find knives (singular is not a stem of plural).

Partial matches are a consequence of how the underlying search is done in the database. The terms simply have wildcards attached at either end and the database is told to search for that. In technical terms, this is the MySQL LIKE operator, used as content LIKE '%searchterm%'.
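The effect of wrapping each term in wildcards can be demonstrated with a minimal Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL (an assumption made so the example runs anywhere; both databases use % as the LIKE wildcard, and SQLite's LIKE is case-insensitive for unaccented letters by default). The sample rows and the search function are hypothetical; post_title and post_content are the column names of the standard WordPress posts table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_title TEXT, post_content TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [
        ("Rabbits", "Notes on rabbits and hares."),
        ("Knives", "Sharpening knives safely."),
        ("Gloves", "He was wearing gloves."),
    ],
)

def search(terms: list[str]) -> list[str]:
    """Return titles of posts where every term matches title or content."""
    if not terms:
        return []
    # One "(title LIKE ? OR content LIKE ?)" group per term, joined by AND.
    where = " AND ".join(
        "(post_title LIKE ? OR post_content LIKE ?)" for _ in terms
    )
    params = []
    for term in terms:
        pattern = f"%{term}%"  # wildcards attached at either end
        params += [pattern, pattern]
    rows = conn.execute(
        f"SELECT post_title FROM posts WHERE {where} ORDER BY post_title",
        params,
    )
    return [row[0] for row in rows]

print(search(["rabbit"]))  # ['Rabbits']
print(search(["knife"]))   # []
print(search(["love"]))    # ['Gloves']
```

The sketch does not escape % or _ characters inside a term, which a real implementation would need to do.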

These partial searches make WordPress search relatively forgiving, and therefore they hide many of the other limitations. However, they do mean you get completely unexpected false positives (like gloves for love) and, worse, many false negatives: users who realise that glove matches gloves, but not why, would expect knife to match knives and therefore not realise that pages are potentially missing from the results.

SPECIAL CASES

There are a couple of options (exact and sentence) that can be put in the URL to change the standard behaviour, but unless the search form has been modified to put these in, it’s unlikely they’d ever be used (and they aren’t very helpful either). ‘exact’ doesn’t include the wildcards in the search, so the entire title or content must match the search term, not just appear somewhere within. ‘sentence’ treats the whole search data as a single search term irrespective of quotes and separators (loosely speaking, as if the whole thing were in quotes; this is what the ‘more than 9 terms’ case above also does).
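For readers who want to try these options by hand, the Python sketch below builds a search URL. It assumes the standard WordPress search parameter s, and it assumes that a value of 1 is enough to switch each option on; the base address https://example.org/ and the function name search_url are placeholders.

```python
from urllib.parse import urlencode

def search_url(base: str, query: str, exact: bool = False,
               sentence: bool = False) -> str:
    """Build a search URL with the optional exact/sentence flags."""
    params = {"s": query}
    if exact:
        params["exact"] = "1"     # assumed value; drops the wildcards
    if sentence:
        params["sentence"] = "1"  # assumed value; whole query as one term
    return base + "?" + urlencode(params)

print(search_url("https://example.org/", "north street", sentence=True))
# https://example.org/?s=north+street&sentence=1
```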
