com.faa.dom.tutorial
Class DOMImageCrawler

java.lang.Object
  |
  +--java.lang.Thread
        |
        +--com.faa.dom.tutorial.DOMImageCrawler

public class DOMImageCrawler
extends java.lang.Thread
implements Observable

A DOM visitor that finds every image matching one of the given keywords and saves it to disk. This visitor only works with XML documents that conform to the following DTD:

<!ENTITY % metadata "author,date,keywords*">

<!-- An article is composed of metadata and paragraphs with images -->
<!ELEMENT article  (%metadata;,para+)>
<!ELEMENT para     (#PCDATA|image|a)*>
<!ELEMENT a        (#PCDATA)>
<!ELEMENT author   (#PCDATA)>
<!ELEMENT date     (#PCDATA)>
<!ELEMENT keywords (#PCDATA)>
<!ATTLIST a href      CDATA   #REQUIRED> <!--simple links to ABSOLUTE URLs -->
<!ELEMENT image    (%metadata;)>
<!ATTLIST image    src    CDATA   #REQUIRED>    


Fields inherited from class java.lang.Thread
MAX_PRIORITY, MIN_PRIORITY, NORM_PRIORITY
 
Constructor Summary
DOMImageCrawler(int newDepth, java.lang.String initialURL, java.util.Vector newKeywords, java.lang.String newTargetDir)
          Construct a new DOMImageCrawler, starting with the given parameters.
 
Method Summary
 void addObserver(Observer observer)
          Add an observer
 java.lang.String getCurrentDocument()
           
 int getCurrentLevel()
           
 int getLoadedImagesCount()
           
 java.util.Hashtable getToVisit()
           
 java.util.Hashtable getVisited()
           
 void removeObserver(Observer observer)
          Remove an object from the list of observers (if it is not there, do nothing)
 void run()
          Iterate through the nodes to visit, passing each new document to the visitor method, so it can be scanned
 void saveImage(org.w3c.dom.Element n)
          Extract the name of the image from the DOM node and save the image to disk
 void setDepth(int newDepth)
           
 void setKeywords(java.util.Vector newKeywords)
           
 void visitNode(org.w3c.dom.Node n)
          Visit the tree, watching for image nodes with the appropriate keywords.
 
Methods inherited from class java.lang.Thread
activeCount, checkAccess, countStackFrames, currentThread, destroy, dumpStack, enumerate, getContextClassLoader, getName, getPriority, getThreadGroup, interrupt, interrupted, isAlive, isDaemon, isInterrupted, join, join, join, resume, setContextClassLoader, setDaemon, setName, setPriority, sleep, sleep, start, stop, stop, suspend, toString, yield
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

DOMImageCrawler

public DOMImageCrawler(int newDepth,
                       java.lang.String initialURL,
                       java.util.Vector newKeywords,
                       java.lang.String newTargetDir)
Construct a new DOMImageCrawler, starting with the given parameters.
Parameters:
newDepth - the depth of the search
initialURL - the URL at which the search starts
newKeywords - the keywords that images must match
newTargetDir - the directory in which matching images are saved

Method Detail

setDepth

public void setDepth(int newDepth)

setKeywords

public void setKeywords(java.util.Vector newKeywords)

visitNode

public void visitNode(org.w3c.dom.Node n)
               throws java.net.MalformedURLException,
                      java.io.IOException
Visit the tree, watching for image nodes with the appropriate keywords.
Parameters:
n - the current node
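
The traversal described above can be sketched as a recursive walk over the DOM tree. This is a minimal illustration, not the code of DOMImageCrawler: the class ImageVisitorSketch and its methods are hypothetical, and it assumes that an image matches when one of its direct keywords children is exactly equal to a wanted keyword. The sample document follows the DTD at the top of this page.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of com.faa.dom.tutorial.
final class ImageVisitorSketch {

    // Walk the tree depth-first and collect the src attribute of every
    // <image> element that carries one of the wanted keywords.
    static void visit(Node n, List<String> wanted, List<String> matches) {
        if (n.getNodeType() == Node.ELEMENT_NODE && "image".equals(n.getNodeName())) {
            Element image = (Element) n;
            if (hasKeyword(image, wanted)) {
                matches.add(image.getAttribute("src"));
            }
        }
        NodeList children = n.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            visit(children.item(i), wanted, matches);
        }
    }

    // Assumption: only direct <keywords> children of <image> are compared,
    // and the comparison is exact equality after trimming whitespace.
    static boolean hasKeyword(Element image, List<String> wanted) {
        NodeList children = image.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE
                    && "keywords".equals(child.getNodeName())
                    && wanted.contains(child.getTextContent().trim())) {
                return true;
            }
        }
        return false;
    }

    // Parse an XML string and return the src values of the matching images.
    static List<String> findImages(String xml, List<String> wanted) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        List<String> matches = new ArrayList<>();
        visit(doc.getDocumentElement(), wanted, matches);
        return matches;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<article><author>A</author><date>2001</date>"
                + "<para>text <image src=\"cat.gif\"><author>B</author><date>2001</date>"
                + "<keywords>cats</keywords></image>"
                + "<image src=\"dog.gif\"><author>B</author><date>2001</date>"
                + "<keywords>dogs</keywords></image></para></article>";
        System.out.println(findImages(xml, Arrays.asList("cats"))); // prints [cat.gif]
    }
}
```

The real visitNode may also follow the a elements to queue further documents; that part is left out of this sketch.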

saveImage

public void saveImage(org.w3c.dom.Element n)
Extract the name of the image from the DOM node and save the image to disk
Parameters:
n - the paragraph object containing the image
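
One way to derive a file name and write the image could look like the sketch below. It is an assumption about how such a method might work, not the actual implementation: SaveImageSketch, fileNameOf and save are hypothetical names, the file name is taken as the text after the last '/' of the src attribute, and the image bytes come from an in-memory stream instead of a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper for illustration; not part of com.faa.dom.tutorial.
final class SaveImageSketch {

    // Assumption: the file name is whatever follows the last '/' in src.
    static String fileNameOf(String src) {
        int slash = src.lastIndexOf('/');
        String name = slash >= 0 ? src.substring(slash + 1) : src;
        if (name.isEmpty()) {
            throw new IllegalArgumentException("src has no file name: " + src);
        }
        return name;
    }

    // Copy the image bytes into targetDir, replacing any existing file.
    static Path save(InputStream data, Path targetDir, String src) throws IOException {
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(fileNameOf(src));
        Files.copy(data, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("images");
        byte[] fakeImage = {1, 2, 3}; // stand-in for downloaded image bytes
        Path saved = save(new ByteArrayInputStream(fakeImage), dir,
                "http://example.com/pics/cat.gif");
        System.out.println(saved.getFileName() + " " + Files.size(saved)); // prints cat.gif 3
    }
}
```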

run

public void run()
Iterate through the nodes to visit, passing each new document to the visitor method, so it can be scanned
Overrides:
run in class java.lang.Thread
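
The iteration over documents can be pictured as a depth-limited, level-by-level crawl. The sketch below is illustrative only: CrawlLoopSketch is a hypothetical class, the links map stands in for fetching a document and reading its a href targets, and it assumes that level 0 is the initial document. The class itself exposes Hashtable objects through getVisited() and getToVisit(); a LinkedHashMap is used here only so that the result has a predictable order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of com.faa.dom.tutorial.
final class CrawlLoopSketch {

    // Returns each visited document mapped to the level at which it was reached.
    // links: document URL -> URLs it links to (stand-in for real fetching).
    static Map<String, Integer> crawl(String start, int maxDepth,
                                      Map<String, List<String>> links) {
        Map<String, Integer> visited = new LinkedHashMap<>();
        List<String> toVisit = new ArrayList<>(Collections.singletonList(start));
        for (int level = 0; level <= maxDepth && !toVisit.isEmpty(); level++) {
            List<String> next = new ArrayList<>();
            for (String url : toVisit) {
                if (visited.containsKey(url)) {
                    continue; // already scanned, possibly via another link
                }
                visited.put(url, level);
                // Here a real crawler would parse the document and visit its nodes.
                for (String target : links.getOrDefault(url, Collections.emptyList())) {
                    if (!visited.containsKey(target)) {
                        next.add(target);
                    }
                }
            }
            toVisit = next; // documents for the next level
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> links = new HashMap<>();
        links.put("a", Arrays.asList("b", "c"));
        links.put("b", Arrays.asList("a", "d"));
        links.put("d", Arrays.asList("e"));
        System.out.println(crawl("a", 1, links)); // prints {a=0, b=1, c=1}
    }
}
```

Tracking visited documents keeps cyclic links, such as b pointing back to a, from being scanned twice.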

addObserver

public void addObserver(Observer observer)
Add an observer
Specified by:
addObserver in interface Observable
Parameters:
observer - the observer object to be added.

removeObserver

public void removeObserver(Observer observer)
Remove an object from the list of observers (if it is not there, do nothing)
Specified by:
removeObserver in interface Observable
Parameters:
observer - the observer object to be removed.
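
The add and remove behaviour described for observers can be sketched as follows. The Observer and Observable interfaces used by this class are not shown on this page, so SketchObserver, its update method and ObserverListSketch are hypothetical stand-ins, and the notification method is an assumption about how observers would be informed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback interface; the real Observer may look different.
interface SketchObserver {
    void update(String event);
}

// Hypothetical helper for illustration; not part of com.faa.dom.tutorial.
final class ObserverListSketch {
    private final List<SketchObserver> observers = new ArrayList<>();

    // Register an observer once; adding it again has no effect.
    synchronized void addObserver(SketchObserver observer) {
        if (!observers.contains(observer)) {
            observers.add(observer);
        }
    }

    // List.remove returns false for an absent element, so nothing happens.
    synchronized void removeObserver(SketchObserver observer) {
        observers.remove(observer);
    }

    // Notify a snapshot so observers may add or remove themselves while notified.
    void notifyObservers(String event) {
        List<SketchObserver> snapshot;
        synchronized (this) {
            snapshot = new ArrayList<>(observers);
        }
        for (SketchObserver observer : snapshot) {
            observer.update(event);
        }
    }

    public static void main(String[] args) {
        ObserverListSketch crawler = new ObserverListSketch();
        List<String> seen = new ArrayList<>();
        SketchObserver logger = seen::add;
        crawler.addObserver(logger);
        crawler.notifyObservers("image saved");
        crawler.removeObserver(logger);
        crawler.removeObserver(logger); // not registered any more: no effect
        crawler.notifyObservers("document done");
        System.out.println(seen); // prints [image saved]
    }
}
```

Because the crawler runs in its own thread, the sketch synchronizes access to the observer list; whether the real class does so is not stated here.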

getCurrentLevel

public int getCurrentLevel()

getCurrentDocument

public java.lang.String getCurrentDocument()

getVisited

public java.util.Hashtable getVisited()

getToVisit

public java.util.Hashtable getToVisit()

getLoadedImagesCount

public int getLoadedImagesCount()