I'm stuck at the point where I need to crawl websites that sit behind a form POST. Nutch does not support this. How can I work around it so that I can crawl these websites with Nutch? Or is there a better solution?
Here's the simplest solution. The problem is that there is no single simple solution that works across a large number of websites: cookies expire, some sites use JavaScript during login, and so on. Search Nutch's JIRA; there have been many discussions about this.
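To make the cookie issue concrete, here is a minimal sketch of one possible workaround: submit the form yourself, outside the crawler, and capture the session cookie the server sets. This is not necessarily the solution referred to above, and it does not use any Nutch API. The URL and the field names (`username`, `password`) are hypothetical, and how you then hand the cookie to Nutch depends on your Nutch version and protocol plugin.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def build_login_request(login_url, form_fields):
    """Encode the form fields as a URL-encoded POST request."""
    body = urllib.parse.urlencode(form_fields).encode("utf-8")
    return urllib.request.Request(login_url, data=body, method="POST")


def fetch_session_cookies(login_url, form_fields, timeout=10):
    """Submit the form and return the cookies the server set,
    formatted as a Cookie header value ("name=value; name2=value2")."""
    jar = CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    request = build_login_request(login_url, form_fields)
    with opener.open(request, timeout=timeout) as response:
        response.read()  # drain the body; only the cookies matter here
    return "; ".join(f"{cookie.name}={cookie.value}" for cookie in jar)


if __name__ == "__main__":
    # Hypothetical site and field names; replace with the real form's values.
    cookies = fetch_session_cookies(
        "https://example.com/login",
        {"username": "alice", "password": "secret"},
    )
    print(cookies)
```

The caveats above still apply: the cookie obtained this way will eventually expire and has to be refreshed, and a login that depends on JavaScript will not work with a plain POST like this.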