root/trunk/lib/mongrel.rb

Revision 988, 14.4 kB (checked in by normalperson, 1 year ago)

mongrel: avoid needless syscall when num_processors limit is reached

Since we're going to close the socket immediately after the
num_processors limit is reached, there's no point in calling
setsockopt(2) to enable TCP_CORK on it. Instead, only enable
TCP_CORK for connections we are able to handle.

Additionally, avoid calling worker_list.length twice in the
connection-rejected case and instead just set num_workers to
@workers.list.length once. I'm assuming the original
caching of worker_list = @workers.list was meant to avoid having
the log message display a different number due to a race
condition, and this preserves that functionality.
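
The accept-path decision described above can be sketched in isolation. This is only an illustration under stated assumptions, not the code from the commit: FakeClient, CORK_OPTS and admit are hypothetical names, and the option triple is a placeholder for the [Socket::SOL_TCP, 3, 1] value the file uses on Linux.

```ruby
# Hypothetical stand-in for an accepted socket; it only records the calls made on it.
class FakeClient
  attr_reader :calls

  def initialize
    @calls = []
  end

  def setsockopt(*opts)
    @calls << [:setsockopt, *opts]
  end

  def close
    @calls << [:close]
  end
end

# Placeholder option triple (assumption); mongrel.rb uses [Socket::SOL_TCP, 3, 1] on Linux.
CORK_OPTS = [:sol_tcp, 3, 1].freeze

# Decide what to do with a freshly accepted client.
# num_workers is sampled once by the caller, so the comparison and the
# log message always report the same number.
def admit(client, num_workers, num_processors, log)
  if num_workers >= num_processors
    log << "Server overloaded with #{num_workers} processors (#{num_processors} max). Dropping connection."
    client.close # no setsockopt on a socket that is about to be closed
    :rejected
  else
    client.setsockopt(*CORK_OPTS) # cork only the connections that will be served
    :accepted
  end
end

log = []
overloaded = FakeClient.new
served = FakeClient.new
admit(overloaded, 950, 950, log)
admit(served, 3, 950, log)
p overloaded.calls
p served.calls
```

In this sketch the rejected client sees a single close call and no setsockopt, which is the syscall the commit aims to save.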

# Ruby
require 'socket'
require 'tempfile'
require 'yaml'
require 'time'
require 'etc'
require 'uri'
require 'stringio'

# Ensure working require
require 'mongrel/gems'

# TODO: Only require these for RUBY_VERSION <= 1.8.6
#       and only for platforms that require it, exclusive matching
if !RUBY_PLATFORM.match(/java|mswin/) && !RUBY_VERSION.match(/1\.8\.\d/)
  Mongrel::Gems.require 'cgi_multipart_eof_fix'
  Mongrel::Gems.require 'fastthread'
end
require 'thread'

require 'http11'
require 'mongrel/logger'
require 'mongrel/cgi'
require 'mongrel/handlers'
require 'mongrel/command'
require 'mongrel/tcphack'
require 'mongrel/configurator'
require 'mongrel/uri_classifier'
require 'mongrel/const'
require 'mongrel/http_request'
require 'mongrel/header_out'
require 'mongrel/http_response'

# Mongrel module containing all of the classes (including C extensions) for running
# a Mongrel web server.  It contains a minimalist HTTP server with just enough
# functionality to service web application requests as fast as possible.
module Mongrel

  # Used to stop the HttpServer via Thread.raise.
  class StopServer < Exception; end

  # Thrown at a thread when it is timed out.
  class TimeoutError < Exception; end

  # A Hash with one extra parameter for the HTTP body, used internally.
  class HttpParams < Hash
    attr_accessor :http_body
  end

  # This is the main driver of Mongrel, while the Mongrel::HttpParser and Mongrel::URIClassifier
  # make up the majority of how the server functions.  It's a very simple class that just
  # has a thread accepting connections and a simple HttpServer.process_client function
  # to do the heavy lifting with the IO and Ruby.
  #
  # You use it by doing the following:
  #
  #   server = HttpServer.new("0.0.0.0", 3000)
  #   server.register("/stuff", MyNiftyHandler.new)
  #   server.run.join
  #
  # The last line can be just server.run if you don't want to join the thread used.
  # If you don't, though, Ruby will mysteriously just exit on you.
  #
  # Ruby's thread implementation is "interesting" to say the least.  Experiments with
  # *many* different types of IO processing simply cannot make a dent in it.  Future
  # releases of Mongrel will find other creative ways to make threads faster, but don't
  # hold your breath until Ruby 1.9 is actually finally useful.
  class HttpServer
    attr_reader :acceptor
    attr_reader :workers
    attr_reader :classifier
    attr_reader :host
    attr_reader :port
    attr_reader :throttle
    attr_reader :timeout
    attr_reader :num_processors

    attr_accessor :logger

    # Creates a working server on host:port (strange things happen if port isn't a Number).
    # Use HttpServer::run to start the server and HttpServer.acceptor.join to
    # join the thread that's processing incoming requests on the socket.
    #
    # The num_processors optional argument is the maximum number of concurrent
    # processors to accept, anything over this is closed immediately to maintain
    # server processing performance.  This may seem mean but it is the most efficient
    # way to deal with overload.  Other schemes involve still parsing the client's request
    # which defeats the point of an overload handling system.
    #
    # The throttle parameter is a sleep timeout (in hundredths of a second) that is placed between
    # socket.accept calls in order to give the server a cheap throttle time.  It defaults to 0 and
    # actually if it is 0 then the sleep is not done at all.
    def initialize(host, port, num_processors=950, throttle=0, timeout=60, log=nil, log_level=:debug)

      tries = 0
      @socket = TCPServer.new(host, port)

      @classifier = URIClassifier.new
      @host = host
      @port = port
      @workers = ThreadGroup.new
      @throttle = throttle
      @num_processors = num_processors
      @timeout = timeout
      @logger = Mongrel::Log.new(log || "log/mongrel-#{host}-#{port}.log", log_level)
    end

    # Does the majority of the IO processing.  It has been written in Ruby using
    # about 7 different IO processing strategies and no matter how it's done
    # the performance just does not improve.  It is currently carefully constructed
    # to make sure that it gets the best possible performance, but anyone who
    # thinks they can make it faster is more than welcome to take a crack at it.
    def process_client(client)
      begin
        parser = HttpParser.new
        params = HttpParams.new
        request = nil
        data = client.readpartial(Const::CHUNK_SIZE)
        nparsed = 0

        # Assumption: nparsed will always be less since data will get filled with more
        # after each parsing.  If it doesn't get more then there was a problem
        # with the read operation on the client socket.  Effect is to stop processing when the
        # socket can't fill the buffer for further parsing.
        while nparsed < data.length
          nparsed = parser.execute(params, data, nparsed)

          if parser.finished?
            if not params[Const::REQUEST_PATH]
              # it might be a dumbass full host request header
              uri = URI.parse(params[Const::REQUEST_URI])
              params[Const::REQUEST_PATH] = uri.path
            end

            raise "No REQUEST PATH" if not params[Const::REQUEST_PATH]

            script_name, path_info, handlers = @classifier.resolve(params[Const::REQUEST_PATH])

            if handlers
              params[Const::PATH_INFO] = path_info
              params[Const::SCRIPT_NAME] = script_name

              # From http://www.ietf.org/rfc/rfc3875 :
              # "Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST
              #  meta-variables (see sections 4.1.8 and 4.1.9) may not identify the
              #  ultimate source of the request.  They identify the client for the
              #  immediate request to the server; that client may be a proxy, gateway,
              #  or other intermediary acting on behalf of the actual source client."
              params[Const::REMOTE_ADDR] = client.peeraddr.last

              # select handlers that want more detailed request notification
              notifiers = handlers.select { |h| h.request_notify }
              request = HttpRequest.new(params, client, notifiers)

              # in the case of large file uploads the user could close the socket, so skip those requests
              break if request.body == nil  # nil signals from HttpRequest::initialize that the request was aborted

              # request is good so far, continue processing the response
              response = HttpResponse.new(client)

              # Process each handler in registered order until we run out or one finalizes the response.
              handlers.each do |handler|
                handler.process(request, response)
                break if response.done or client.closed?
              end

              # And finally, if nobody closed the response off, we finalize it.
              unless response.done or client.closed?
                response.finished
              end
            else
              # Didn't find it, return a stock 404 response.
              client.write(Const::ERROR_404_RESPONSE)
            end

            break #done
          else
            # Parser is not done, queue up more data to read and continue parsing
            chunk = client.readpartial(Const::CHUNK_SIZE)
            break if !chunk or chunk.length == 0  # read failed, stop processing

            data << chunk
            if data.length >= Const::MAX_HEADER
              raise HttpParserError.new("HEADER is longer than allowed, aborting client early.")
            end
          end
        end
      rescue EOFError,Errno::ECONNRESET,Errno::EPIPE,Errno::EINVAL,Errno::EBADF
        client.close rescue nil
      rescue HttpParserError => e
        Mongrel.log(:error, "#{Time.now.httpdate}: HTTP parse error, malformed request (#{params[Const::HTTP_X_FORWARDED_FOR] || client.peeraddr.last}): #{e.inspect}")
        Mongrel.log(:error, "#{Time.now.httpdate}: REQUEST DATA: #{data.inspect}\n---\nPARAMS: #{params.inspect}\n---\n")
        # http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.4
        client.write(Const::ERROR_400_RESPONSE)
      rescue Errno::EMFILE
        reap_dead_workers('too many files')
      rescue Object => e
        Mongrel.log(:error, "#{Time.now.httpdate}: Read error: #{e.inspect}")
        Mongrel.log(:error, e.backtrace.join("\n"))
      ensure
        begin
          client.close
        rescue IOError
          # Already closed
        rescue Object => e
          Mongrel.log(:error, "#{Time.now.httpdate}: Client error: #{e.inspect}")
          Mongrel.log(:error, e.backtrace.join("\n"))
        end
        request.body.delete if request and request.body.class == Tempfile
      end
    end

    # Used internally to kill off any worker threads that have taken too long
    # to complete processing.  Only called if there are too many processors
    # currently servicing.  It returns the count of workers still active
    # after the reap is done.  It only runs if there are workers to reap.
    def reap_dead_workers(reason='unknown')
      if @workers.list.length > 0
        Mongrel.log(:error, "#{Time.now.httpdate}: Reaping #{@workers.list.length} threads for slow workers because of '#{reason}'")
        error_msg = "#{Time.now.httpdate}: Mongrel timed out this thread: #{reason}"
        mark = Time.now
        @workers.list.each do |worker|
          worker[:started_on] = Time.now if not worker[:started_on]

          if mark - worker[:started_on] > @timeout + @throttle
            Mongrel.log(:error, "#{Time.now.httpdate}: Thread #{worker.inspect} is too old, killing.")
            worker.raise(TimeoutError.new(error_msg))
          end
        end
      end

      return @workers.list.length
    end

    # Performs a wait on all the currently running threads and kills any that take
    # too long.  It waits by @timeout seconds, which can be set in .initialize or
    # via mongrel_rails. The @throttle setting does extend this waiting period by
    # that much longer.
    def graceful_shutdown
      while reap_dead_workers("shutdown") > 0
        Mongrel.log(:error, "#{Time.now.httpdate}: Waiting for #{@workers.list.length} requests to finish, could take #{@timeout + @throttle} seconds.")
        sleep @timeout / 10
      end
    end

    # Sets the platform-specific socket option globals that run applies to
    # the listening socket and to accepted clients.
    def configure_socket_options
      case RUBY_PLATFORM
      when /linux/
        # 9 is currently TCP_DEFER_ACCEPT
        $tcp_defer_accept_opts = [Socket::SOL_TCP, 9, 1]
        # 3 is currently TCP_CORK
        $tcp_cork_opts = [Socket::SOL_TCP, 3, 1]
      when /freebsd(([1-4]\..{1,2})|5\.[0-4])/
        # Do nothing, just closing a bug when freebsd <= 5.4
      when /freebsd/
        # Use the HTTP accept filter if available.
        # The struct made by pack() is defined in /usr/include/sys/socket.h as accept_filter_arg
        unless `/sbin/sysctl -nq net.inet.accf.http`.empty?
          $tcp_defer_accept_opts = [Socket::SOL_SOCKET, Socket::SO_ACCEPTFILTER, ['httpready', nil].pack('a16a240')]
        end
      end
    end

    # Runs the thing.  It returns the thread used so you can "join" it.  You can also
    # access the HttpServer::acceptor attribute to get the thread later.
    def run
      BasicSocket.do_not_reverse_lookup=true

      configure_socket_options

      if defined?($tcp_defer_accept_opts) and $tcp_defer_accept_opts
        @socket.setsockopt(*$tcp_defer_accept_opts) rescue nil
      end

      @acceptor = Thread.new do
        begin
          while true
            begin
              client = @socket.accept

              num_workers = @workers.list.length
              if num_workers >= @num_processors
                Mongrel.log(:error, "#{Time.now.httpdate}: Server overloaded with #{num_workers} processors (#@num_processors max). Dropping connection.")
                client.close rescue nil
                reap_dead_workers("max processors")
              else
                if defined?($tcp_cork_opts) and $tcp_cork_opts
                  client.setsockopt(*$tcp_cork_opts) rescue nil
                end
                thread = Thread.new(client) {|c| process_client(c) }
                thread[:started_on] = Time.now
                @workers.add(thread)

                sleep @throttle/100.0 if @throttle > 0
              end
            rescue StopServer
              break
            rescue Errno::EMFILE
              reap_dead_workers("too many open files")
              sleep 0.5
            rescue Errno::ECONNABORTED
              # client closed the socket even before accept
              client.close rescue nil
            rescue Object => e
              Mongrel.log(:error, "#{Time.now.httpdate}: Unhandled listen loop exception #{e.inspect}.")
              Mongrel.log(:error, e.backtrace.join("\n"))
            end
          end
          graceful_shutdown
        ensure
          @socket.close
          # Mongrel.log(:error, "#{Time.now.httpdate}: Closed socket.")
        end
      end

      return @acceptor
    end

    # Simply registers a handler with the internal URIClassifier.  When the URI is
    # found in the prefix of a request then your handler's HttpHandler::process method
    # is called.  See Mongrel::URIClassifier#register for more information.
    #
    # If you set in_front=true then the passed in handler will be put in the front of the list
    # for that particular URI. Otherwise it's placed at the end of the list.
    def register(uri, handler, in_front=false)
      begin
        @classifier.register(uri, [handler])
      rescue URIClassifier::RegistrationError
        handlers = @classifier.resolve(uri)[2]
        method_name = in_front ? 'unshift' : 'push'
        handlers.send(method_name, handler)
      end
      handler.listener = self
    end

    # Removes any handlers registered at the given URI.  See Mongrel::URIClassifier#unregister
    # for more information.  Remember this removes them *all* so the entire
    # processing chain goes away.
    def unregister(uri)
      @classifier.unregister(uri)
    end

    # Stops the acceptor thread and then causes the worker threads to finish
    # off the request queue before finally exiting.
    def stop(synchronous=false)
      @acceptor.raise(StopServer.new)

      if synchronous
        sleep(0.5) while @acceptor.alive?
      end
    end

  end
end

# Load experimental library, if present. We put it here so it can override anything
# in regular Mongrel.

$LOAD_PATH.unshift 'projects/mongrel_experimental/lib/'
Mongrel::Gems.require 'mongrel_experimental', ">=#{Mongrel::Const::MONGREL_VERSION}"
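
The reaping approach used by reap_dead_workers (tag each worker thread with a start time, then raise a timeout exception in any thread that has run too long) can be tried outside Mongrel with plain threads. The following is a minimal sketch, not the file's implementation: WorkerTimeout, reap_old_workers and spawn_worker are hypothetical names, the @throttle term is left out, and the helper returns the signalled threads rather than the remaining worker count.

```ruby
require 'thread'

# Hypothetical error class for this sketch; mongrel.rb raises Mongrel::TimeoutError.
class WorkerTimeout < StandardError; end

# Raise WorkerTimeout in every worker older than `timeout` seconds and
# return the workers that were signalled.
def reap_old_workers(group, timeout, now = Time.now)
  group.list.select do |worker|
    worker[:started_on] ||= now
    next false unless now - worker[:started_on] > timeout
    worker.raise(WorkerTimeout, 'timed out')
    true
  end
end

# Start a worker that blocks until it is released or timed out.
def spawn_worker(group, release, started_on)
  ready = Queue.new
  thread = Thread.new do
    begin
      ready << true
      release.pop
      :finished
    rescue WorkerTimeout
      :timed_out
    end
  end
  thread[:started_on] = started_on
  group.add(thread)
  ready.pop # wait until the worker is inside its begin block
  thread
end

group = ThreadGroup.new
release = Queue.new
old = spawn_worker(group, release, Time.now - 120) # pretend it started 2 minutes ago
fresh = spawn_worker(group, release, Time.now)

reap_old_workers(group, 60)
p old.value   # the old worker was interrupted
release << :go
p fresh.value # the fresh worker completes normally
```

Only threads still running appear in ThreadGroup#list, which is why the server can use @workers.list.length as its count of active processors.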