phelps.net

Class Crawler

public class Crawler extends Observable implements Runnable

Under development. Crawls a network or file system and reports to clients. Given a start page, the crawler follows links within a specified scope (this page only, subtree only, or site only) and down to a specified number of levels deep. It is multithreaded, avoids cycles, respects ROBOTS.TXT, ... To take action after each page is parsed, add yourself as an observer.
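As a rough sketch of the observer hook described above, the example below attaches a java.util.Observer and records each notification. This page does not document what Crawler passes to update(), so the example uses a hypothetical stand-in class, StubCrawler, that mirrors only the documented shape (extends Observable, implements Runnable). It assumes the notification argument is the page URL as a String. ObserverSketch and collect are illustrative names, not part of this package.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Observable;
import java.util.Observer;

/** Demo entry point: attach an observer, run the stand-in crawler, report what was seen. */
@SuppressWarnings("deprecation") // Observable/Observer are deprecated since Java 9 but still available
class ObserverSketch {

    /** Runs the stand-in crawler over the given pages and returns what the observer received. */
    static List<String> collect(List<String> pages) {
        List<String> seen = new ArrayList<>();
        StubCrawler crawler = new StubCrawler(pages);
        // Assumption: the notification argument is the page URL as a String.
        Observer recorder = (source, arg) -> seen.add((String) arg);
        crawler.addObserver(recorder);
        crawler.run(); // runs on the calling thread; new Thread(crawler).start() is the other option
        return seen;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("http://example.org/", "http://example.org/a.html");
        System.out.println(collect(pages));
    }
}

/** Hypothetical stand-in for Crawler: "parses" a fixed list of pages and notifies observers. */
@SuppressWarnings("deprecation")
class StubCrawler extends Observable implements Runnable {
    private final List<String> pages;

    StubCrawler(List<String> pages) {
        this.pages = pages;
    }

    @Override
    public void run() {
        for (String page : pages) {
            setChanged();          // notifyObservers does nothing unless the changed flag is set
            notifyObservers(page); // one notification per "parsed" page
        }
    }
}
```

With the real class the pattern would presumably be the same: construct the Crawler, call addObserver, then run it. Check the actual argument type before casting in update().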

Version: $Revision: 1.2 $ $Date: 2003/06/01 08:13:48 $

See Also: RobustHyperlink

Field Summary

static int ANY
boolean fTrace
static int PAGE
static int SITE
static int SUBTREE

Constructor Summary

Crawler(URL start)

Method Summary

int getMaxDepth()
int getMaxThreads()
int getScope()
static void main(String[] argv)
void run()
void setMaxDepth(int maxDepth)
void setMaxThreads(int max)
void setScope(int scope)

Field Detail

ANY

public static final int ANY

fTrace

public boolean fTrace

PAGE

public static final int PAGE

SITE

public static final int SITE

SUBTREE

public static final int SUBTREE
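One possible reading of the four scope constants is sketched below. This is an interpretation, not the class's actual logic: the constant values, the ScopeSketch class, and the inScope, path, and directoryOf helpers are all hypothetical, and the rules (PAGE matches the same path, SUBTREE matches anything under the start page's directory, SITE matches the same host, ANY matches everything) are assumptions drawn from the class description. The sketch uses java.net.URI rather than URL to avoid the host lookups that URL.equals can perform, and it ignores scheme and port for brevity.

```java
import java.net.URI;

/** Hypothetical scope check; the real constants' values are not shown on this page. */
class ScopeSketch {
    static final int PAGE = 0;
    static final int SUBTREE = 1;
    static final int SITE = 2;
    static final int ANY = 3;

    /** Returns true if candidate falls within the given scope relative to start. */
    static boolean inScope(URI start, URI candidate, int scope) {
        if (scope == ANY) {
            return true;
        }
        // Every remaining scope requires the same host (compared case-insensitively).
        String host = start.getHost();
        if (host == null || !host.equalsIgnoreCase(candidate.getHost())) {
            return false;
        }
        switch (scope) {
            case SITE:
                return true;
            case SUBTREE:
                return path(candidate).startsWith(directoryOf(path(start)));
            case PAGE:
                return path(candidate).equals(path(start));
            default:
                throw new IllegalArgumentException("unknown scope: " + scope);
        }
    }

    /** Path of the URI, treating a missing path as the root. */
    static String path(URI u) {
        String p = u.getPath();
        return (p == null || p.isEmpty()) ? "/" : p;
    }

    /** Directory part of a path, e.g. "/docs/index.html" gives "/docs/". */
    static String directoryOf(String path) {
        return path.substring(0, path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        URI start = URI.create("http://example.org/docs/index.html");
        URI link = URI.create("http://example.org/docs/api/Crawler.html");
        System.out.println("in subtree: " + inScope(start, link, SUBTREE));
    }
}
```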

Constructor Detail

Crawler

public Crawler(URL start)

Method Detail

getMaxDepth

public int getMaxDepth()

getMaxThreads

public int getMaxThreads()

getScope

public int getScope()

main

public static void main(String[] argv)

run

public void run()

setMaxDepth

public void setMaxDepth(int maxDepth)
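The "levels deep" limit and the "no cycles" behavior from the class description can be pictured as a breadth-first traversal with a visited set. The sketch below is not this class's implementation. It is single-threaded, it replaces fetching and parsing with a hypothetical in-memory link map, and it assumes that a depth of 0 means the start page only. DepthSketch and crawl are illustrative names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical depth-limited, cycle-free traversal over an in-memory link graph. */
class DepthSketch {

    /** Returns pages in visit order, going at most maxDepth links away from start. */
    static List<String> crawl(Map<String, List<String>> links, String start, int maxDepth) {
        List<String> order = new ArrayList<>();
        Set<String> visited = new HashSet<>(); // pages already queued, so cycles are skipped
        Deque<String> frontier = new ArrayDeque<>();
        frontier.add(start);
        visited.add(start);

        // Process one level per iteration; depth 0 is the start page itself.
        for (int depth = 0; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            Deque<String> next = new ArrayDeque<>();
            for (String page : frontier) {
                order.add(page);
                for (String target : links.getOrDefault(page, Collections.<String>emptyList())) {
                    if (visited.add(target)) { // add returns false if the page was seen before
                        next.add(target);
                    }
                }
            }
            frontier = next;
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> links = new HashMap<>();
        links.put("a", List.of("b", "c"));
        links.put("b", List.of("a", "d")); // b links back to a, forming a cycle
        System.out.println(crawl(links, "a", 1));
    }
}
```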

setMaxThreads

public void setMaxThreads(int max)

setScope

public void setScope(int scope)