public obeyRobotsTxt($mode, $robots_txt_uri = null)
|$mode||bool||Set to TRUE if you want the crawler to obey robots.txt-files.|
|$robots_txt_uri||string||Optional. The URL or path to the robots.txt-file to obey, given as a URI (like "http://mysite.com/path/myrobots.txt"). If not set (or set to null), the crawler uses the default robots.txt-location of the root-URL ("http://rooturl.com/robots.txt").|
If $mode is set to TRUE, the crawler looks for a robots.txt-file for the root-URL of the crawling-process at the default location
and - if present - parses it and obeys all contained directives applying to the
useragent-identification of the crawler ("PHPCrawl" by default, or the string set manually by calling setUserAgentString()).
The default-value is FALSE (for compatibility reasons).
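
A minimal usage sketch follows. It assumes that PHPCrawl is unpacked so that "libs/PHPCrawler.class.php" resolves from the script's directory; the class name MyCrawler and the host www.example.com are placeholders chosen for illustration.

```php
<?php
// Assumption: adjust this path to wherever PHPCrawl lives in your project.
require_once "libs/PHPCrawler.class.php";

// PHPCrawl is used by extending PHPCrawler and overriding handleDocumentInfo().
// MyCrawler is a hypothetical name.
class MyCrawler extends PHPCrawler
{
    public function handleDocumentInfo($DocInfo)
    {
        // Print every URL the crawler actually requested.
        echo $DocInfo->url . " (" . $DocInfo->http_status_code . ")" . PHP_EOL;
    }
}

$crawler = new MyCrawler();
$crawler->setURL("http://www.example.com/");

// Obey the robots.txt-file at the default location of the root-URL,
// here http://www.example.com/robots.txt.
$crawler->obeyRobotsTxt(true);

// Alternatively, obey a robots.txt-file stored somewhere else:
// $crawler->obeyRobotsTxt(true, "http://www.example.com/path/myrobots.txt");

$crawler->go();
```

Because the default is FALSE, the call to obeyRobotsTxt(true) has to be made explicitly before go() if robots.txt-directives should be respected.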
Please note that the directives found in a robots.txt-file have a higher priority than other settings made by the user.
If e.g. addFollowMatch("#http://foo\.com/path/file\.html#") was set, but a directive in the robots.txt-file of the host
foo.com says "Disallow: /path/", the URL http://foo.com/path/file.html will be ignored by the crawler anyway.
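
The priority rule above can be illustrated with a small self-contained sketch. This is not PHPCrawl's internal code: the function isUrlAllowed() is hypothetical, and it models "Disallow" directives as simple path prefixes only to show the order in which the checks are assumed to apply.

```php
<?php
/**
 * Hypothetical helper (not part of PHPCrawl): decides whether a URL would be followed.
 * Disallowed robots.txt paths are checked first, so they win over follow matches.
 */
function isUrlAllowed(string $url, array $followMatches, array $disallowedPaths): bool
{
    $path = parse_url($url, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = "/";
    }

    // 1. robots.txt directives have the highest priority.
    foreach ($disallowedPaths as $prefix) {
        if ($prefix !== "" && strpos($path, $prefix) === 0) {
            return false;
        }
    }

    // 2. Without any follow matches, every remaining URL is followed.
    if (count($followMatches) === 0) {
        return true;
    }

    // 3. Otherwise the URL has to match at least one follow rule.
    foreach ($followMatches as $regex) {
        if (preg_match($regex, $url) === 1) {
            return true;
        }
    }
    return false;
}

$follow   = ['#http://foo\.com/path/file\.html#'];
$disallow = ["/path/"];

// Follow match is set, but "Disallow: /path/" takes precedence.
var_dump(isUrlAllowed("http://foo.com/path/file.html", $follow, $disallow)); // bool(false)

// Same follow match without the robots.txt directive.
var_dump(isUrlAllowed("http://foo.com/path/file.html", $follow, []));        // bool(true)
```

With the real crawler the same effect occurs whenever obeyRobotsTxt(true) is combined with addFollowMatch(): a URL excluded by the robots.txt-file is skipped regardless of the follow rule.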