Creating a generic fetcher using Alamofire, JSON, and HTML


I am trying to start a project based on web scraping. I already have the tools set up for the different formats: for JSON I use SwiftyJSON, and for raw HTML I use hpple. My problem is that I am trying to set up a generic class for the content and a generic class for the fetcher of that content, since every operation goes like this:

1. Log in. If there is a username or password, supply it. If there is a captcha, display it and use the result.
2. Fetch the data using Alamofire.
3. Scrape the data, either as JSON or as HTML.
4. Populate the content class.

I am wondering if there is a way to define some kind of protocol, enum, or generic template so that I can define those functions for each class. If I can't get this right, I will end up writing the same code over and over again. This is what I have come up with. I would appreciate any help setting this up properly.

enum Company:Int {
    case CNN
    case BBC
    case HN
    case SO 
    
    var captcha:Bool {
        switch self {
        case CNN:
            return false
        case BBC:
            return true
        case HN:
            return true
        case SO:
            return false
        }
    }
    var description:String {
        get {
            switch self {
            case CNN:
                return "CNN"
            case BBC:
                return "BBC"
            case HN:
                return "Hacker News"
            case SO:
                return "Stack Overflow"
            }
        }
    }
}

class Fetcher {
    var username:String?
    var password:String?
    var url:String
    var company:Company
    
    init(company: Company, url:String) {
        self.url = url
        self.company = company
    }
    
    init(company: Company, url:String,username:String,password:String) {
        self.url = url
        self.company = company
        self.username = username
        self.password = password
    }
    
    func login() {
        
        if username != nil {
           // login
        }
        if company.captcha {
            //show captcha
        }
    }
    
    func fetch(){
        
    }
    
    func populate() {
        
    }
}

class CNN: Fetcher {
    
    
}

Solution

  • Okay, this was a fun exercise...

    You really just need to build out your Company enumeration further to make your Fetcher more abstract. Here's an approach that only slightly modifies your own and should get you much closer to what you are trying to achieve. This is based on a previous reply of mine to a different question of yours.

    Company

    enum Company: Printable, URLRequestConvertible {
        case CNN, BBC, HN, SO
    
        var captcha: Bool {
            switch self {
            case CNN:
                return false
            case BBC:
                return true
            case HN:
                return true
            case SO:
                return false
            }
        }
    
        var credentials: (username: String, password: String)? {
            switch self {
            case CNN:
                return ("cnn_username", "cnn_password")
            case BBC:
                return nil
            case HN:
                return ("hn_username", "hn_password")
            default:
                return nil
            }
        }
    
        var description: String {
            switch self {
            case CNN:
                return "CNN"
            case BBC:
                return "BBC"
            case HN:
                return "Hacker News"
            case SO:
                return "Stack Overflow"
            }
        }
    
        var loginURLRequest: NSURLRequest {
            var URLString: String?
    
            switch self {
            case CNN:
                URLString = "cnn_login_url"
            case BBC:
                URLString = "bbc_login_url"
            case HN:
                URLString = "hn_login_url"
            case SO:
                URLString = "so_login_url"
            }
    
            return NSURLRequest(URL: NSURL(string: URLString!)!)
        }
    
        var URLRequest: NSURLRequest {
            var URLString: String?
    
            switch self {
            case CNN:
                URLString = "cnn_url"
            case BBC:
                URLString = "bbc_url"
            case HN:
                URLString = "hn_url"
            case SO:
                URLString = "so_url"
            }
    
            return NSURLRequest(URL: NSURL(string: URLString!)!)
        }
    }
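
For readers on a current toolchain, here is a minimal sketch of how the same enum might look in Swift 5, assuming only Foundation. Printable has since been renamed CustomStringConvertible, and cases inside a switch need a leading dot. The lowercase case names, the `url` property, and the example.com addresses are placeholders of mine, not part of the answer above, and the Alamofire conformance is left out so the block stands alone.

```swift
import Foundation

// Swift 5 rendering of the Company enum. Case names are lowercased per
// current naming conventions (an editorial choice, not from the answer).
enum Company: CustomStringConvertible {
    case cnn, bbc, hn, so

    // Whether the site shows a captcha at login (values copied from the answer).
    var captcha: Bool {
        switch self {
        case .bbc, .hn: return true
        case .cnn, .so: return false
        }
    }

    var description: String {
        switch self {
        case .cnn: return "CNN"
        case .bbc: return "BBC"
        case .hn: return "Hacker News"
        case .so: return "Stack Overflow"
        }
    }

    // Hypothetical placeholder URLs. Returning an optional avoids the
    // force-unwraps used in the original snippet.
    var url: URL? {
        let urlString: String
        switch self {
        case .cnn: urlString = "https://example.com/cnn"
        case .bbc: urlString = "https://example.com/bbc"
        case .hn: urlString = "https://example.com/hn"
        case .so: urlString = "https://example.com/so"
        }
        return URL(string: urlString)
    }
}

print(Company.hn)
```

Because every case is listed in each switch, the compiler will flag any property that is not updated when a new company is added.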
    

    News

    struct News {
        let title: String
        let content: String
        let date: NSDate
        let author: String
    }
    

    Fetcher

    class Fetcher {
    
        typealias FetchNewsSuccessHandler = [News] -> Void
        typealias FetchNewsFailureHandler = (NSHTTPURLResponse?, AnyObject?, NSError?) -> Void
    
        // MARK: - Fetch News Methods
    
        class func fetchNewsFromCompany(company: Company, success: FetchNewsSuccessHandler, failure: FetchNewsFailureHandler) {
            login(
                company: company,
                success: { apiKey in
                    Fetcher.fetch(
                        company: company,
                        apiKey: apiKey,
                        success: { news in
                            success(news)
                        },
                        failure: { response, json, error in
                            failure(response, json, error)
                        }
                    )
                },
                failure: { response, json, error in
                    failure(response, json, error)
                }
            )
        }
    
        // MARK: - Private - Helper Methods
    
        private class func login(
            #company: Company,
            success: (String) -> Void,
            failure: (NSHTTPURLResponse?, AnyObject?, NSError?) -> Void)
        {
            if company.captcha {
                // You'll need to figure this part out on your own. First off, I'm not really sure how you
                // would do it, and secondly, I think there may be legal implications of doing this.
            }
    
            let request = Alamofire.request(company.loginURLRequest)
    
            if let credentials = company.credentials {
                request.authenticate(user: credentials.username, password: credentials.password)
            }
    
            request.responseJSON { _, response, json, error in
                if let error = error {
                    failure(response, json, error)
                } else {
                    // NOTE: You'll need to parse here...I would suggest using SwiftyJSON
                    let apiKey = "12345678"
                    success(apiKey)
                }
            }
        }
    
        private class func fetch(
            #company: Company,
            apiKey: String,
            success: FetchNewsSuccessHandler,
            failure: FetchNewsFailureHandler)
        {
            let request = Alamofire.request(company.URLRequest)
            request.responseJSON { _, response, json, error in
                if let error = error {
                    failure(response, json, error)
                } else {
                    // NOTE: You'll need to parse here...I would suggest using SwiftyJSON
                    let news = [News]()
                    success(news)
                }
            }
        }
    }
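
The two "parse here" notes in the Fetcher are left open. As one possible way to fill in the news step, here is a minimal sketch that uses Foundation's JSONDecoder in place of SwiftyJSON. The payload shape (an array of objects with "title", "content", "date" as seconds since 1970, and "author") and the `parseNews` helper are assumptions made for illustration; a real feed will need its own mapping.

```swift
import Foundation

// Same fields as the News struct above, with Date instead of NSDate
// and Decodable conformance so JSONDecoder can populate it.
struct News: Decodable {
    let title: String
    let content: String
    let date: Date
    let author: String
}

// Hypothetical helper: decodes an array of news items from raw response data.
// Throws if the payload does not match the assumed shape.
func parseNews(from data: Data) throws -> [News] {
    let decoder = JSONDecoder()
    // Assumes "date" is a Unix timestamp in seconds.
    decoder.dateDecodingStrategy = .secondsSince1970
    return try decoder.decode([News].self, from: data)
}

// Made-up sample payload, used only to exercise the helper.
let sampleJSON = """
[{"title": "Hello", "content": "Body", "date": 0, "author": "Ann"}]
"""
let sampleNews = (try? parseNews(from: Data(sampleJSON.utf8))) ?? []
for item in sampleNews {
    print("\(item.title) - \(item.author)")
}
```

A decoding error here could be forwarded to the failure handler in the same way the network error is.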
    

    Example ViewController Calling Fetcher

    class SomeViewController: UIViewController {
    
        override func viewDidLoad() {
            super.viewDidLoad()
    
            Fetcher.fetchNewsFromCompany(
                Company.CNN,
                success: { newsList in
                    for news in newsList {
                        println("\(news.title) - \(news.date)")
                    }
                },
                failure: { response, data, error in
                    println("\(response) \(error)")
                }
            )
        }
    }
    

    By allowing the Company object to flow through your Fetcher, you should never have to track state for a company in your Fetcher. It can all be stored directly inside the Enum.
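
The same idea could plausibly cover the JSON versus HTML split from the question. The sketch below assumes a hypothetical `ContentFormat` enum, a `format` property on Company, and a `titles(in:from:)` helper, none of which appear in the code above. The mapping of companies to formats is invented for illustration, and both parsing branches are deliberately naive stand-ins for SwiftyJSON and hpple.

```swift
import Foundation

// Hypothetical: the kind of payload a site returns.
enum ContentFormat {
    case json, html
}

// Trimmed-down Company with only the property this sketch needs.
enum Company {
    case cnn, hn

    // Assumed mapping, for illustration only.
    var format: ContentFormat {
        switch self {
        case .cnn: return .json
        case .hn: return .html
        }
    }
}

// Extracts headline strings from a response body, choosing the parser
// from the company's format so the caller keeps no per-company state.
func titles(in body: String, from company: Company) -> [String] {
    switch company.format {
    case .json:
        // Stand-in for SwiftyJSON: expects a JSON array of strings.
        return (try? JSONDecoder().decode([String].self, from: Data(body.utf8))) ?? []
    case .html:
        // Stand-in for hpple: naive scan for <h1>...</h1> pairs.
        return body.components(separatedBy: "<h1>").dropFirst().compactMap {
            $0.components(separatedBy: "</h1>").first
        }
    }
}

print(titles(in: "[\"a\", \"b\"]", from: .cnn))
print(titles(in: "<h1>A</h1><p>x</p><h1>B</h1>", from: .hn))
```

The fetch step would then pass the response body and the company to one function, and adding a site means adding a case rather than a new Fetcher subclass.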

    Hope that helps. Cheers.