Author: | - | Version: | - |
Package: | phpcrawl | Category: | - |
Public Properties | ||
---|---|---|
URL-related information | ||
| The name of the requested page or file, e.g. "page.html". | |
| The host-part of the URL of the requested page or file, e.g. "www.foo.com". | |
| The path in the URL of the requested page or file, e.g. "/page/". | |
| The port of the URL the request was sent to, e.g. 80. | |
| The protocol-part of the URL of the page or file, e.g. "http://" | |
| The query-part of the URL of the requested page or file, e.g. "?x=y". | |
| The complete, fully qualified URL of the page or file, e.g. "http://www.foo.com/bar/page.html?x=y". | |
| The linking-depth of the URL relative to the entry-URL of the crawling-process. | |
Content-related information | ||
| The number of bytes the crawler received of the content of the document. | |
| The content of the requested document (html-sourcecode or content of file). | |
| The temporary file to which the content was received. | |
| The content-type of the page or file, e.g. "text/html" or "image/gif". | |
| Cookies sent by the server. | |
| The complete HTTP-header the webserver sent in response to the request for this page or file. | |
| The number of bytes the crawler received of the header of the document. | |
| The HTTP-statuscode the webserver returned for the request, e.g. 200 (OK) or 404 (file not found). | |
| All meta-tag attributes found in the source of the document. | |
| Flag indicating whether content was received from the page or file. | |
| Flag indicating whether content was completely received from the page or file. | |
| Will be true if the content was received into a temporary file. | |
| Will be true if the content was received into local memory. | |
| The complete HTTP-header the webserver sent in response to the request for this page or file, as a PHPCrawlerResponseHeader-object. | |
| Same as "content", the content of the requested document. | |
Information about found links | ||
| A numeric array containing information about all links that were found in the source of the page. | |
| A numeric array containing a PHPCrawlerURLDescriptor-object for every link that was found in the page. | |
Referer information | ||
| The complete URL of the page that contained the link to this document. | |
| Contains the raw link as it was found in the content of the referring URL (e.g. "../foo.html"). | |
| The html-sourcecode that contained the link to the current document. | |
| The linktext of the link that "linked" to this document. | |
Error-handling | ||
| The code of the error that may have occurred while requesting/receiving the document. (See PHPCrawlerRequestErrors::ERROR_... - constants) | |
| Indicates whether an error occurred while requesting/receiving the document. | |
| A human-readable string describing the error that may have occurred while requesting/receiving the document. | |
Benchmarks | ||
| The approximate data-transfer-rate for this document. | |
| The approximate time it took to receive the data of the document. | |
| The time it took to connect to the server. | |
| The server response time. | |
| The number of unbuffered bytes received. | |
Deprecated | ||
| Alias for received_completely; the property name was misspelled in previous versions of phpcrawl. (deprecated!) | |
Other | ||
| The complete HTTP-request-header the crawler sent to the server (debugging info). | |
| Indicates whether the traffic-limit set by the user was reached after downloading this document. | |
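
The URL-related properties above describe one URL split into its parts (protocol, host, port, path, file, query). The following is a minimal sketch of how such a split could look, using the example URL from the table. It is not phpcrawl's own implementation: the function name `split_url_parts` and the array keys are hypothetical, chosen for illustration only, and the default ports (80 for http, 443 for https) are an assumption of this sketch.

```php
<?php
// Hypothetical helper, not part of phpcrawl: splits an absolute URL into
// the parts described in the "URL-related information" rows of the table.
// The array keys are illustrative names, not confirmed property names.
function split_url_parts(string $url): array
{
    $p = parse_url($url);
    if ($p === false || !isset($p['scheme'], $p['host'])) {
        throw new InvalidArgumentException("Not an absolute URL: $url");
    }

    $scheme   = strtolower($p['scheme']);
    $fullPath = $p['path'] ?? '/';

    // Everything up to and including the last slash is the path,
    // the remainder (possibly empty) is the file name.
    $slash = strrpos($fullPath, '/');
    $path  = substr($fullPath, 0, $slash + 1);
    $file  = substr($fullPath, $slash + 1);

    return [
        'protocol' => $scheme . '://',
        'host'     => $p['host'],
        // Assumption: fall back to the scheme's usual port if none is given.
        'port'     => $p['port'] ?? ($scheme === 'https' ? 443 : 80),
        'path'     => $path,
        'file'     => $file,
        'query'    => isset($p['query']) ? '?' . $p['query'] : '',
    ];
}

$parts = split_url_parts('http://www.foo.com/bar/page.html?x=y');
echo $parts['host'], ' ', $parts['path'], ' ', $parts['file'], "\n";
```

Joining `protocol`, `host`, `path`, `file` and `query` from the returned array gives back the original URL for inputs like the one above, which is a quick way to check that no part was dropped.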