public resume($crawler_id)
Parameters:
$crawler_id | int | The crawler-ID of the crawling-process that should be resumed (see getCrawlerId()).

Returns:
No information.
If a crawling-process was aborted (for whatever reason), it is possible
to resume it by calling the resume() method before calling the go() or goMultiProcessed() method,
passing it the crawler-ID of the aborted process (as returned by getCrawlerId()).
To be able to resume a process, it must initially have been
started with resumption enabled (by calling the enableResumption() method).
This method throws an exception if resuming the crawling-process failed.
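Because resume() throws an exception on failure, the call can be wrapped in a try/catch block so the script can fall back to a fresh run. A minimal sketch, assuming $crawler is an already-configured crawler started with enableResumption() on its first run, and that the file path holding the stored ID is just an illustration:

```php
// Sketch: try to resume a previously aborted process, fall back to a fresh run.
// "/tmp/crawlerid.tmp" is an assumed location where the first run stored the ID.
$crawler_id = file_get_contents("/tmp/crawlerid.tmp");

try
{
  // Must be called BEFORE go()/goMultiProcessed()
  $crawler->resume($crawler_id);
}
catch (Exception $e)
{
  // Resuming failed (e.g. unknown or expired crawler-ID);
  // continue without resumption, i.e. start over.
  echo "Resume failed: ".$e->getMessage()."\n";
}

$crawler->go();
```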
Example of a resumable crawler-script:

// ...
$crawler = new MyCrawler();
$crawler->enableResumption();
$crawler->setURL("www.url123.com");
// If the process is being started for the first time:
// get the crawler-ID and store it somewhere so the process can be resumed later on
if (!file_exists("/tmp/crawlerid_for_url123.tmp"))
{
$crawler_id = $crawler->getCrawlerId();
file_put_contents("/tmp/crawlerid_for_url123.tmp", $crawler_id);
}
// If the script was started again (after a termination):
// read the stored crawler-ID and resume the process
else
{
$crawler_id = file_get_contents("/tmp/crawlerid_for_url123.tmp");
$crawler->resume($crawler_id);
}
// ...
// Start your crawling process
$crawler->goMultiProcessed(5);
// After the process has finished completely: delete the stored crawler-ID
unlink("/tmp/crawlerid_for_url123.tmp");