The World Wide Web by Email

by Odd de Presno


Sample text from the Online World Monitor newsletter ISSN: 0805-6315. August 1994. (C) by Odd de Presno, Norway. (Note: Links are not maintained!)
Most people only have email access to the Internet, and are therefore deprived of interactive access to the World Wide Web.

The good news is that most pages are available by email!

Request WWW pages by sending email to agora@www0.cern.ch. Put your retrieval commands in the BODY of the mail, like this:

   send <URL>
Example:

www http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html 
That's all. Lean back and wait. You will get a page filled with hints on how to use the WebCrawler service. The mail will look like this:

Example


Date: Mon, 15 Aug 1994 18:10:44 +0200
From: daemon@www0.cern.ch (The CERN WWW Team Administration)
Subject: Hints for Searching the WebCrawler Index (was:  )
 
This is a test version. Please mail any comments to www-request@info.cern.ch 
  
The document you requested, which URL is http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html, follows
  
  
              Hints for Searching the WebCrawler Index
   The WebCrawler knows about a lot of documents, so it pays to make precise
   queries.  Often, though, you can be too precise, so finding what you want
   may take a couple of queries.  Here are some suggestions about what to do
   when you don't get what you want, some examples to help you out, and
   detailed explanation of what happens to your query before it's run.
    
 WHAT TO DO WHEN...
   
   Your search produces no results.  Check your spelling!  If that looks OK,
   then try to be less specific in your query.  For instance, the query
   molecular biotechnology DNA sequencing genetics chromosome human genome
   project is too specific -- no one document contains all of those keywords.
   Something like molecular biotechnology DNA sequencing is more appropriate.
    
   Your search produces too many results.  Be more specific, and make sure you
   have the AND button checked.  Try to think of words that uniquely identify
   what you're looking for.  Some words are of little value, because they
   identify lots of documents in the WebCrawler's index.  For instance, the
   words information and university together identify nearly half the documents
   in the index, so they're not very useful in trying to narrow down the
   search.
    
   You get an error from the WebCrawler.  The WebCrawler will return an
   unfriendly error message if it's too busy, or if it chokes on your query.
   If it repeatedly has trouble with your query, please let me know, as I'm
   trying to eliminate these problems.  Thanks!
    
 Examples
 
    Most specific queries work quite well.  For instance, if you're looking
    for information on the music group They Might Be Giants, search for They
    Might Be Giants, or just TMBG.
       
    Some keywords are found in many places.  For example, instead of
    searching for kermit, use something more descriptive like kermit columbia
    or kermit source code communication.  Make sure the "AND" button is
    checked.
       
    To find references to the New York Times, try the query New York Times.
    To be more specific, try something like New York Times online newspaper.
      
 How a query works
 
    The query is parsed in to keywords on space and punctuation boundaries.
      
    Each word is folded to lower case, and any endings are stripped (NeXT
    Computers becomes next computer).
      
    Each word is checked against a stop list, to see if it's too common to
    worry about (to be or not to be is a null query!).
      
    Each word is fed to the index, and the resulting lists of documents are
    combined.
       
                                                    bp@cs.washington.edu[1]
                                                                                
 
 *** References from this document ***
 [1] http://www.cs.washington.edu/homes/bp/bp.html

The last line of the report is interesting. The "[1]" refers to the following entry in the page's text:

                                              bp@cs.washington.edu[1]
Interactive WWW users can click on this reference to see the associated page. Those using email must send the URL listed at the bottom of the report back to the mail server, in a new retrieval command, to get it.
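If you save such reports to disk, the follow-up requests can be prepared by a small program. What follows is only a rough sketch in Python, assuming the report always marks its reference list with the "*** References from this document ***" line and lists one "[n] URL" entry per line, as in the sample above. The function names are made up for this illustration and are not part of the CERN service.

```python
import re

# Marker line that precedes the reference list in the sample report.
MARKER = "*** References from this document ***"

# Matches entries such as "[1] http://www.cs.washington.edu/homes/bp/bp.html".
REFERENCE_LINE = re.compile(r"^\s*\[(\d+)\]\s+(\S+)\s*$")


def extract_references(report: str) -> dict[int, str]:
    """Return {reference number: URL} taken from the report's reference list."""
    _, found, tail = report.partition(MARKER)
    if not found:
        return {}  # no reference list in this report
    refs = {}
    for line in tail.splitlines():
        match = REFERENCE_LINE.match(line)
        if match:
            refs[int(match.group(1))] = match.group(2)
    return refs


def follow_up_commands(report: str, command: str = "www") -> list[str]:
    """Build one retrieval command per referenced URL, in reference order."""
    refs = extract_references(report)
    return [f"{command} {refs[number]}" for number in sorted(refs)]


# Tail end of the sample report shown above.
sample = """\
                                                    bp@cs.washington.edu[1]

 *** References from this document ***
 [1] http://www.cs.washington.edu/homes/bp/bp.html
"""

for line in follow_up_commands(sample):
    print(line)
```

Each printed line can then go into the body of a new request mail.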

There is also a WWWmail command called "deep". It returns the page you asked for together with the documents that page refers to. If you replace the "www" line above with

 deep http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html 
you will get both the "Hints" page and the one giving more information about bp@cs.washington.edu.

    Note: If the requested document is too large, you'll only get the
    first 5,000 lines.
There may be many such reference pointers in the text, as illustrated by this page at URL: http://web2.xerox.com/digitrad

Example:


Date: Mon, 15 Aug 1994 14:03:10 +0200
From: daemon@www0.cern.ch (The CERN WWW Team Administration)
Subject: Digital Tradition Folk Song Full Text Search (was: )

This is a test version. Please mail any comments to www-request@info.cern.ch

The document you requested, which URL is http://web2.xerox.com/digitrad, follows


              Digital Tradition Folk Song Full Text Search

 DIGITAL TRADITION FOLK SONG DATABASE

    This is a searchable index of the Digital Tradition Folk Song Database
    (April 1994 version). Please read About The Digital Tradition[1] and
    Searching Digital Tradition[2].

 Full Text Search

    You may enter a Search Pattern to select songs from the database.

    Options: search titles[3] or search full text; show matching text or
    list titles only[4]; list first 50 or list more (100)[5]; default
    settings.

 Contents

    Keywords List[6]
    Titles List[7]
    Tunes List[8]

    (DT of April 1994)

 *** References from this document ***
 [1] http://web2.xerox.com/docs/DigiTrad/AboutDigiTrad.html
 [2] http://web2.xerox.com/docs/DigiTrad/DigiTradSearch.html
 [3] http://web2.xerox.com/digitrad/titles
 [4] http://web2.xerox.com/digitrad/short
 [5] http://web2.xerox.com/digitrad/list=100
 [6] http://web2.xerox.com/docs/DigiTrad/DigiTradKeywords.html
 [7] http://web2.xerox.com/docs/DigiTrad/DigiTradTitles.html
 [8] http://web2.xerox.com/docs/DigiTrad/DigiTradTunes.html

For more information about this WWW by mail service, send the word "help" to agora@www0.cern.ch.
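For readers who like to script their requests, here is a minimal sketch in Python of how such a request mail could be put together. It only builds the message; handing it to a mail system is left out. The sender address reader@example.org and the function name build_request are made up for the illustration, and the server address is the one given in this text, which may no longer be in service.

```python
from email.message import EmailMessage

# Server address as given in this article (1994); it may no longer work.
AGORA_ADDRESS = "agora@www0.cern.ch"


def build_request(sender: str, command: str) -> EmailMessage:
    """Build a request mail carrying one retrieval command in its BODY."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = AGORA_ADDRESS
    # The command goes in the body of the mail, not in the Subject line.
    message.set_content(command + "\n")
    return message


url = "http://www.biotech.washington.edu/WebCrawler/WebCrawlerExamples.html"

# The three kinds of request described above: a single page, a page plus
# the documents it refers to, and the help text.
page_request = build_request("reader@example.org", f"www {url}")
deep_request = build_request("reader@example.org", f"deep {url}")
help_request = build_request("reader@example.org", "help")

print(page_request)
```

Printing the message shows the headers followed by the one-line body, which is a handy check before anything is actually sent.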

                        --- end ---
Note: Another service delivers WWW pages by email at webmail@www.ucc.ie. See http://www.ucc.ie/webmail/ for instructions.

The Online World Monitor newsletter

is a bi-monthly online product. Initially meant as a free, optional offering for supporters of The Online World resources handbook, it is also open for subscription by others.

The newsletter and the book are companions. While the book describes the online world as it is, the newsletter tracks changes. It can more freely focus on selected offerings or phenomena than can be done within the strict framework of the book.

For more about the newsletter, try the following URL:

      monitor.html
To give copies of this sample to others, send a message to LISTSERV@LISTSERV.NODAK.EDU with the following command in the TEXT of your mail:
    GIVE TOW SAMPLE1 TO <email address>
Replace <email address> with the recipient's email address.

Thanks,

Odd de Presno

(publisher/author)

Email: presno@eunet.no 
Web page: http://home.eunet.no/~presno/presno.html
KIDLINK (Global Dialog for Kids 10 - 15) on URL: http://www.kidlink.org

Feel free to redistribute as long as the text remains intact as it appears here (including this paragraph). Permission to quote/excerpt/reference in other media is hereby granted, so long as cited material is identified as coming from The Online World Monitor newsletter. For any other use, contact the author for permission.