I am working on a simple PHP scraper. The problem is that some of the websites I need to scrape have a captcha I need to solve. I have used solving services before, but since this is a small project I'd like to solve the captchas manually.
Is there a library I could use to simplify this? The service I used had a library: I just sent the image to their server and they gave me back the solved captcha. Now I am looking for a library that does something similar, but it needs to show me the captcha, let me solve it manually, and then pass the answer back to my app.
I think it's actually trivial to do. On the PHP side of things, you just need to send the image or image URL to a separate process. exec() would usually not be an option, so I'd suggest an inetd process and fsockopen:
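$f = fsockopen("localhost", 55555, $errno, $errstr, 30) or die("$errstr ($errno)");
stream_set_timeout($f, 300); // allow up to 5 minutes for typing; the read timeout defaults to 60 s
fwrite($f, $IMAGE_URL . "\n"); // the newline marks the end of the URL for the helper
$captcha = trim(fgets($f, 100)); // blocks until the helper sends back the typed text
fclose($f);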
Register a script for port 55555 with inetd. On each connection it should read the URL from stdin, display the image in a window, wait for keyboard input, and write that input to stdout, which inetd connects to the socket. Don't forget set_time_limit, though, so the PHP script isn't killed while it waits for you to type.
I'd recommend a Tcl/Tk script, but am too lazy to write one.
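If you'd rather not write Tcl, here is a rough sketch of the helper in Python using tkinter (which is Tk underneath). Treat it as an illustration, not a finished tool: the function names and the `--serve` flag are made up for this example, and it assumes the PHP side ends the URL with a newline and that the captcha is a PNG or GIF (Tk's built-in image support; JPEG would need something like Pillow).

```python
import base64
import sys
import urllib.request


def read_url(stream):
    """Read the first line of the request and return it as the image URL."""
    return stream.readline().strip()


def fetch_image(url):
    """Download the captcha image and return its raw bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def ask_user(image_bytes):
    """Show the image in a Tk window and return whatever the user types."""
    import tkinter as tk  # imported here so the other helpers work without a display

    root = tk.Tk()
    root.title("Solve captcha")
    # PhotoImage accepts base64-encoded PNG/GIF data.
    photo = tk.PhotoImage(data=base64.b64encode(image_bytes))
    tk.Label(root, image=photo).pack()
    entry = tk.Entry(root)
    entry.pack()
    entry.focus_set()
    answer = []

    def submit(event=None):
        # Remember the typed text and close the window.
        answer.append(entry.get())
        root.destroy()

    entry.bind("<Return>", submit)
    root.mainloop()
    # Empty string if the window was closed without pressing Enter.
    return answer[0] if answer else ""


def main(stdin=sys.stdin, stdout=sys.stdout, fetch=fetch_image, ask=ask_user):
    """Handle one request: URL in on stdin, solved text out on stdout."""
    url = read_url(stdin)
    stdout.write(ask(fetch(url)) + "\n")
    stdout.flush()


# Only serve when explicitly asked, so the functions can be imported on their own.
if __name__ == "__main__" and "--serve" in sys.argv:
    main()
```

Under inetd, stdin and stdout are the socket, so the script only has to be started with `--serve` for each connection. The process also needs access to your desktop session to open a window, and how you arrange that depends on your setup.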