While working on a project, I looked for a multi-purpose HTTP request class that could use different network interfaces or IP addresses and retry a request when the connection is slow or the server does not respond.
Here is a small wrapper built on top of the Ruby Curb gem, implemented as a module:
require 'curb'

module ApiRequest
  USER_AGENTS = [
    'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3',
    'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 2.0.50727)',
    'Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.3) Gecko/20100423 Ubuntu/10.04 (lucid) Firefox/3.6.3',
    'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_3; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.70 Safari/533.4',
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.2) Gecko/20100323 Namoroka/3.6.2',
    'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100401 Ubuntu/9.10 (karmic) Firefox/3.5.9'
  ]
  CONNECTION_TIMEOUT = 10 # seconds

  @@interfaces = []

  # get a random user-agent string (sample picks from the whole array)
  def random_agent
    USER_AGENTS.sample
  end

  # get a random IP/network interface from @@interfaces (nil if the list is empty)
  def random_interface
    @@interfaces.sample
  end

  # perform a request; assign_to optionally specifies the network interface/IP
  def perform(url, assign_to = nil)
    puts url
    interface = assign_to || random_interface
    req = Curl::Easy.new(url)
    req.timeout = CONNECTION_TIMEOUT
    req.interface = interface unless interface.nil?
    req.headers['User-Agent'] = random_agent
    begin
      req.perform
      return nil unless req.response_code == 200
      req.downloaded_bytes > 0 ? req.body_str : nil
    rescue StandardError
      # any connection or timeout error counts as a failed attempt
      nil
    end
  end

  # perform the request, retrying up to the given number of attempts
  def fetch(url, attempts = 3)
    result = nil
    attempts.times do
      result = perform(url)
      break unless result.nil?
    end
    result
  end
end
And sample usage:
class TestRequest
  include ApiRequest

  def foo
    body = fetch('http://google.com')
  end
end
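If the module variable “@@interfaces” holds an array of IP addresses or network interface names, one of them is selected at random for each request; if it is empty, no interface is set on the request. The “fetch” function takes an “attempts” parameter, which defaults to 3. The request is tried up to that many times, stopping as soon as a result is downloaded from the URL. An attempt counts as failed when the request raises an error, the response code is not 200, or the body is empty. If every attempt fails, “fetch” returns nil.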
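To make the retry behaviour concrete, here is a minimal sketch that does not need Curb or a network connection. “RetryDemo” is a trimmed stand-in for the module above that keeps only the retry loop, and “FlakyClient” with its stubbed “perform” is a hypothetical helper invented for this illustration, not part of the original module.

```ruby
# Trimmed stand-in for ApiRequest: only the retry loop, no Curb dependency.
module RetryDemo
  def fetch(url, attempts = 3)
    result = nil
    attempts.times do
      result = perform(url)
      break unless result.nil?
    end
    result
  end
end

# Hypothetical client whose perform fails a fixed number of times
# before it starts returning a body.
class FlakyClient
  include RetryDemo

  attr_reader :calls

  def initialize(failures)
    @failures = failures
    @calls = 0
  end

  # Stub replacing the real HTTP call: nil means "attempt failed".
  def perform(_url)
    @calls += 1
    @calls > @failures ? 'body' : nil
  end
end

recovering = FlakyClient.new(2)
p recovering.fetch('http://example.com') # two failures, then success
p recovering.calls

unreachable = FlakyClient.new(5)
p unreachable.fetch('http://example.com') # gives up after 3 attempts
p unreachable.calls
```

The first client gets its body on the third attempt. The second one uses up all three attempts and gets nil back, which is the only failure signal the caller sees.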
The “perform” function has an “assign_to” parameter (not used by “fetch”) that binds the request to a specific interface. This is useful when some workers are each bound to a fixed interface while others pick a random IP for every request. The ApiRequest module also keeps a list of user agents and picks one at random for each request.
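The two kinds of workers can be sketched as follows. This is only an illustration of the selection logic, so it runs without Curb: “InterfaceDemo” is a stand-in for the module, “pick_interface” mirrors the first line of “perform”, and the “interfaces=” setter, both worker classes and the interface names are assumptions made up for the example (the original module does not say how “@@interfaces” gets filled).

```ruby
# Stand-in for ApiRequest that keeps only the interface selection.
module InterfaceDemo
  @@interfaces = []

  # Hypothetical setter; the original module exposes no way to fill the list.
  def self.interfaces=(list)
    @@interfaces = list
  end

  def random_interface
    @@interfaces.sample
  end

  # Same rule perform uses: an explicit assign_to wins, otherwise pick at random.
  def pick_interface(assign_to = nil)
    assign_to || random_interface
  end
end

# Worker that always sends from one fixed interface.
class PinnedWorker
  include InterfaceDemo

  def initialize(interface)
    @interface = interface
  end

  def interface_for_request
    pick_interface(@interface)
  end
end

# Worker that lets the module choose an interface for every request.
class RoamingWorker
  include InterfaceDemo

  def interface_for_request
    pick_interface
  end
end

InterfaceDemo.interfaces = ['eth0', '192.168.1.10']

p PinnedWorker.new('eth1').interface_for_request # always the pinned interface
p RoamingWorker.new.interface_for_request        # one of the configured entries
```

With the real module, a pinned worker would call “perform(url, interface)” directly, because “fetch” does not pass “assign_to” through.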