Thursday, April 12, 2012

Lighttpd Websocket Plugin - Mod_FD_Transfer




Lighttpd - Web-sockets Plugin

Using Linux File Descriptor Transfer


Link to Source Code on Github

Introduction


Lighttpd is a fast web server with a low memory footprint, designed for high performance. It supports protocols such as CGI and FastCGI for delegating tasks, like the generation of dynamic pages, to back-end services. These protocols have a few drawbacks. The new module Mod_fd_transfer aims to address those drawbacks and provide a scalable solution. It is based on the open file descriptor transfer mechanism of Linux-based systems, and it fits into Lighttpd's plugin architecture.

Current Systems

Lighttpd



Lighttpd is a secure, fast, standards-compliant and very flexible web server that has been optimized for high-performance environments. It has a very low memory footprint and keeps CPU load low. Its advanced feature set includes FastCGI, CGI, Auth and more. Two of these, CGI and FastCGI, are directly relevant to Mod_fd_transfer, so we consider them in detail.

Common Gateway Interface



The Common Gateway Interface (CGI) is a standard that defines how web server software can delegate the dynamic generation of web pages to a stand-alone application or an executable file. Such applications are known as CGI scripts. A web server that supports CGI can be configured to interpret a URL that it serves as a reference to a CGI script. CGI scripts or executables can be written in any programming language, so CGI is language independent. They run in separate processes isolated from the core web server, which provides greater security. The request parameters are passed to the CGI program as environment variables. The CGI script is launched every time a request is sent, and the output it generates is the response for that request. CGI has disadvantages: every request starts a new process for the CGI script, and process creation can take much more time and memory than the actual work of generating the output. Even after launch, a CGI program that is a script rather than an executable still needs to be interpreted or compiled. The overhead involved in process creation can be reduced by solutions such as FastCGI.
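As a minimal sketch of the CGI model described above, the script below reads the request from environment variables and writes the response to its output stream. REQUEST_METHOD and QUERY_STRING are standard CGI variables; the function name handle_cgi_request and the "name" query parameter are made up for illustration.

```python
import os
import sys
from urllib.parse import parse_qs


def handle_cgi_request(environ, out):
    """Build a CGI response from the environment variables set by the web server."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical query parameter used only for this example.
    name = query.get("name", ["world"])[0]
    # A CGI response is a header block, a blank line, then the body.
    out.write("Content-Type: text/plain\r\n\r\n")
    out.write(f"{method} request: hello, {name}\n")


if __name__ == "__main__":
    # The web server starts one such process per request.
    handle_cgi_request(os.environ, sys.stdout)
```

Because the whole process is started for each request and exits afterwards, the interpreter start-up cost is paid every time, which is the overhead discussed above.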

FastCGI



FastCGI is a protocol for interfacing interactive programs with a web server. FastCGI is a variation on the earlier Common Gateway Interface (CGI). FastCGI's main aim is to reduce the overhead associated with interfacing the web server and CGI programs, allowing a server to handle more web page requests at once. Instead of creating a new process for each request, FastCGI uses persistent processes to handle a series of requests. To service an incoming request, the web server sends environment information and the page request itself to a FastCGI process over a socket. Responses are returned from the process to the web server over the same connection, and the web server subsequently delivers that response to the end user. The connection may be closed at the end of a response, but both the web server and the FastCGI service processes persist. Each individual FastCGI process can handle many requests over its lifetime, thereby avoiding the overhead of per-request process creation and termination. Processing of multiple requests simultaneously can be achieved in several ways: by using a single connection with internal multiplexing (i.e. multiple requests over a single connection); by using multiple connections; or by a combination of these techniques.
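The multiplexing mentioned above relies on every FastCGI record carrying a request id in its 8-byte header. The following is a rough sketch of encoding and decoding such records, based on the record layout in the FastCGI specification (version, type, request id, content length, padding length, reserved). The helper names encode_record and decode_records are ours, and padding is always left at zero here.

```python
import struct

FCGI_VERSION_1 = 1
FCGI_STDIN = 5    # record type carrying request body data
FCGI_STDOUT = 6   # record type carrying response data
HEADER_FORMAT = "!BBHHBB"  # version, type, requestId, contentLength, paddingLength, reserved
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


def encode_record(record_type, request_id, content):
    """Wrap content (at most 65535 bytes) in one FastCGI record without padding."""
    header = struct.pack(HEADER_FORMAT, FCGI_VERSION_1, record_type,
                         request_id, len(content), 0, 0)
    return header + content


def decode_records(stream):
    """Split a byte string into (type, request_id, content) tuples."""
    records = []
    offset = 0
    while offset < len(stream):
        _version, rtype, request_id, clen, plen, _reserved = struct.unpack_from(
            HEADER_FORMAT, stream, offset)
        offset += HEADER_SIZE
        records.append((rtype, request_id, stream[offset:offset + clen]))
        offset += clen + plen  # skip content and any padding
    return records
```

Records for request 1 and request 2 can be interleaved on one connection, and the receiver sorts them out by request id.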

In context of Lighttpd



Lighttpd provides an interface to external programs that support FastCGI through its mod_fastcgi plugin. On the service side there are libraries such as libfcgi and libfcgi++ for writing C and C++ back-ends respectively. Briefly, FastCGI works in Lighttpd as follows. Lighttpd maintains a socket connection to each FastCGI service. When Lighttpd receives a request, it parses it and identifies the FastCGI service it needs to be sent to. If that service is not running, Lighttpd starts it (fork and exec) and the socket connection is established. The request parameters and environment variables are then passed as a message over that socket to the service. On the service side, the FastCGI library libfcgi attached to the service sets up the socket connection with the Lighttpd server and overrides the stdin and stdout of the service process with the input and output of that socket. This makes it easy for the service process to read the request message from stdin and to send the response by writing directly to stdout.
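The stdin/stdout override can be illustrated with dup2. This is only a sketch of the general technique, not libfcgi's actual code: a socket pair stands in for the connection between web server and service, and the names run_backend and demo are hypothetical.

```python
import os
import socket


def run_backend(conn):
    """Child side: make the socket the process's stdin and stdout, then serve one request."""
    os.dup2(conn.fileno(), 0)   # fd 0 (stdin) now reads from the socket
    os.dup2(conn.fileno(), 1)   # fd 1 (stdout) now writes to the socket
    conn.close()                # the original descriptor is no longer needed
    request = os.read(0, 4096)
    os.write(1, b"echo: " + request)


def demo():
    server_side, backend_side = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child process plays the role of the back-end service.
        server_side.close()
        try:
            run_backend(backend_side)
        finally:
            os._exit(0)
    # Parent process plays the role of the web server.
    backend_side.close()
    server_side.sendall(b"GET /hello")
    chunks = []
    while True:
        chunk = server_side.recv(4096)
        if not chunk:           # child exited and closed its end
            break
        chunks.append(chunk)
    os.waitpid(pid, 0)
    server_side.close()
    return b"".join(chunks)
```

From the service's point of view it only reads fd 0 and writes fd 1, exactly as a plain CGI program would.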

Drawbacks of FastCGI with Lighttpd


  1. It does not support processing multiple requests in parallel. When a request arrives and is directed to back-end service X, another request for the same back-end X is queued at mod_fastcgi until the first request is completed.
  2. As multiple requests cannot be processed at the same time, keep-alive style requests such as web-socket/server-push-events cannot be supported in parallel with normal HTTP requests.
  3. Under the FastCGI protocol, every request is first parsed completely, including the POST body, in the Lighttpd web-server process. It is then wrapped in a FastCGI message and passed to the corresponding back-end service, where the same request details are parsed again according to the FastCGI protocol. The same details are thus parsed twice.
  4. The response path of every request also goes through Lighttpd, which adds an extra Inter Process Communication (IPC) hop.
  5. As each back-end can process just one request at a time, if one request results in a time-consuming asynchronous network or I/O call, the response time of subsequent requests increases accordingly.

Goal


Our goal is to support processing multiple requests in parallel and to support web-socket/server-push-events. In more detail, parallel request processing means that when the processing of one request requires the back-end to make some other asynchronous request, the back-end can process the next request in the meantime. Web-socket/server-push-events require the back-end to keep multiple requests alive at the same time and to respond to them out of order. Besides solving these drawbacks of the existing FastCGI setup, our goals also include reducing the server's response time, reducing the time every request spends inside the web server's code, and improving the throughput of the web server.

We think that requests with large data and complex URLs are the burden of the corresponding back-end service and not of the web server. The web server should therefore be freed as soon as it has determined the back-end service to which the request needs to be forwarded. Also, once the request is forwarded, we should be able to bypass the web server completely on the response path, since the entire response is computed by the back-end service.




Concepts behind

Fundamentals of open file descriptor transfer in Linux-based systems


In Linux you can transfer an open file descriptor from one process to another. Every file opened by any process has an entry in the system-wide file table. In addition, every process has its own per-process file table, whose entries point to entries in the system-wide table. Passing an open file descriptor means that there is still a single entry in the system-wide file table, while each process has its own entry for that file in its per-process table.


Technically, we are passing a pointer to an open file table entry from one process to another. This pointer is assigned the first available descriptor in the receiving process. Having two processes share an open file table entry is exactly what happens after a fork. What normally happens when a descriptor is passed from one process to another is that the sending process closes the descriptor after passing it. Closing the descriptor in the sender doesn't really close the file or device, since the descriptor is still considered open by the receiving process (even if the receiver hasn't specifically received the descriptor yet).
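On Linux this transfer is done by sending an SCM_RIGHTS control message over a Unix domain socket. The sketch below shows the mechanism with Python's sendmsg/recvmsg; the helper names send_fd and recv_fd are ours, and for brevity both ends live in one process, with a pipe standing in for the open file. It also shows that the sender may close its descriptor before the receiver has picked up the message.

```python
import array
import os
import socket


def send_fd(channel, fd):
    """Send one open descriptor over a Unix domain socket as SCM_RIGHTS ancillary data."""
    fds = array.array("i", [fd])
    channel.sendmsg([b"F"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(channel):
    """Receive one descriptor; the kernel assigns a free descriptor number in this process."""
    fds = array.array("i")
    _data, ancdata, _flags, _addr = channel.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            fds.frombytes(cdata[:fds.itemsize])
            return fds[0]
    raise RuntimeError("no descriptor received")


def demo():
    sender, receiver = socket.socketpair()   # AF_UNIX pair used as the transfer channel
    read_end, write_end = os.pipe()          # stands in for any open file or socket
    os.write(write_end, b"hello")
    os.close(write_end)
    send_fd(sender, read_end)
    os.close(read_end)                       # sender closes its copy right after passing it
    new_fd = recv_fd(receiver)               # the open file is still alive for the receiver
    data = os.read(new_fd, 5)
    os.close(new_fd)
    sender.close()
    receiver.close()
    return data
```

In a real deployment the two ends of the channel would belong to different processes, for example a web server and a back-end service.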

How this concept can be used in Lighttpd service back-end



We just saw how an open file descriptor can be transferred from one process to another. In the client-server model, the server does socket->bind->listen and then waits in accept until a connection is made, while the client does socket->connect. When a client connects, a new TCP connection to the server is created, and on the server side the accept call returns a new file descriptor for that connection. Every connection is represented as a file in Linux and is identified by this socket file descriptor. Server and client can then read from and write to it as if it were a normal file. Remember that every accepted connection gets its own file descriptor. Using this, when a request arrives at a server we can identify the file descriptor of its connection and use open file descriptor transfer to hand the entire connection over to a different process. This means the server can off-load requests to a back-end service process, or distribute requests across different back-end processes based on some parameters. A back-end process can then read the content of the request directly from the transferred FD and respond by writing to the same FD. The web server that transferred the connection is no longer involved at all.
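The hand-off described above can be sketched end to end. This is an illustration under simplifying assumptions, not the plugin's code: the front server, back-end and client all run sequentially in one process, the function names are hypothetical, and it uses socket.send_fds and socket.recv_fds, which need Python 3.9 or later.

```python
import socket


def forward_connection(conn, backend_channel):
    """Front server: pass the accepted connection to the back-end, then drop our copy."""
    socket.send_fds(backend_channel, [b"C"], [conn.fileno()])
    conn.close()


def backend_serve_one(channel):
    """Back-end: receive a connection fd, read the request, answer on the same fd."""
    _msg, fds, _flags, _addr = socket.recv_fds(channel, 1, 1)
    conn = socket.socket(fileno=fds[0])  # wrap the received descriptor
    request = conn.recv(4096)
    conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nserved by backend\n")
    conn.close()
    return request


def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # any free port on loopback
    listener.listen(1)
    to_backend, backend_end = socket.socketpair()

    client = socket.create_connection(listener.getsockname())
    client.sendall(b"GET / HTTP/1.0\r\n\r\n")

    conn, _peer = listener.accept()
    forward_connection(conn, to_backend)        # front server is done with this request
    request = backend_serve_one(backend_end)    # back-end talks to the client directly

    chunks = []
    while True:
        chunk = client.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    for sock in (client, listener, to_backend, backend_end):
        sock.close()
    return request, b"".join(chunks)
```

Note that the front server never reads the request or touches the response; both travel only between the client and the back-end.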


Concept of web-sockets



WebSocket is a technology that provides bi-directional, full-duplex communication channels over a single Transmission Control Protocol (TCP) socket. To establish a WebSocket connection, the client sends a WebSocket handshake request, and the server sends a WebSocket handshake response. Once a web-socket connection is established, it behaves much like a plain TCP connection: both sides can send and receive data over it at any time.
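As a small illustration of the handshake, the server's response must contain a Sec-WebSocket-Accept value derived from the Sec-WebSocket-Key sent by the client. The sketch below follows the derivation in RFC 6455 (SHA-1 of the key concatenated with a fixed GUID, then base64). The function names are ours, and the response shown is the bare minimum without optional headers.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(key):
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(key):
    """Build a minimal server handshake response for the given client key."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(key)}\r\n"
        "\r\n"
    )
```

After this response is written, the same TCP connection carries WebSocket traffic in both directions, which is why a back-end that owns the connection fd can keep it open and push data at will.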




Internals of Lighttpd and its Plugin architecture



To understand how Lighttpd works you need to understand its two major components, the state machine of the requests in Lighttpd and the plugin architecture of Lighttpd.

State Machine

The state machine of Lighttpd describes which states every request passes through and what tasks are performed on the request in each state. The state machine currently consists of 11 states, which each connection walks through. Some of them are specific to a special operation and some may never be hit at all.
:connect:
waiting for a connection
:reqstart:
init the read-idle timer
:read:
read http-request-header from network
:reqend:
parse request
:readpost:
read http-request-content from network
:handlereq:
handle the request internally (might result in sub-requests)
:respstart:
prepare response header
:write:
write response-header + content to network
:respend:
cleanup environment, log request
:error:
reset connection (incl. close())
:close:
close connection (handle lingering close)
More details about this state machine can be found in the lighttpd/doc/state.txt file in the lighttpd source code.
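To make the list concrete, here is a toy model of the states as an enumeration, together with the path a successful request would take. The ordering in request_path is our simplified assumption based on the order of the list above; the real transitions are in state.txt and the lighttpd source.

```python
from enum import Enum


class ConnState(Enum):
    CONNECT = "connect"
    REQSTART = "reqstart"
    READ = "read"
    REQEND = "reqend"
    READPOST = "readpost"
    HANDLEREQ = "handlereq"
    RESPSTART = "respstart"
    WRITE = "write"
    RESPEND = "respend"
    ERROR = "error"
    CLOSE = "close"


def request_path(has_body):
    """Assumed happy path of one request; readpost is only visited when there is a body."""
    path = [ConnState.CONNECT, ConnState.REQSTART, ConnState.READ, ConnState.REQEND]
    if has_body:
        path.append(ConnState.READPOST)
    path += [ConnState.HANDLEREQ, ConnState.RESPSTART, ConnState.WRITE, ConnState.RESPEND]
    return path
```

In this model a GET request skips readpost, while a POST request passes through it before the request is handled.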


Plugin Architecture



Lighttpd has a plugin architecture in which each plugin can process the request in a different way. Examples of plugins are mod_fastcgi, mod_access, mod_dirlisting and mod_cgi. Each of these modules is a .so library that lighttpd loads at run time based on the configuration file. Each plugin has a very specific interface to lighttpd: it can provide up to 16 entry points, or hooks, which lighttpd calls in different states of the execution of a request.


Server-wide hooks



:init_:
called when the plugin is loaded
:cleanup_:
called when the plugin is unloaded
:set_defaults_:
called when the configuration has to be processed
:handle_trigger_:
called once a second
:handle_sighup_:
called when the server received a SIGHUP


Connection-wide hooks



Most of these hooks are called in ``http_response_prepare()`` after some fields in the connection structure are set.
:handle_uri_raw_:
called after uri.path_raw, uri.authority and uri.scheme are set
:handle_uri_clean_:
called after uri.path (a clean URI without .. and %20) is set
:handle_docroot_:
called at the end of the logical path handle to get a docroot
:handle_subrequest_start_:
called if the physical path is set up and checked
:handle_subrequest_:
called at the end of ``http_response_prepare()``
:handle_physical_path_:
called after the physical path is created and no other handler is
found for this request
:handle_request_done_:
called when the request is done
:handle_connection_close_:
called if the connection has to be closed
:handle_joblist_:
called after the connection_state_engine is left again and plugin
internal handles have to be called
:connection_reset_:
called if the connection structure has to be cleaned up
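The way hooks are dispatched can be modeled roughly as follows. This is a toy model in Python, not lighttpd's C interface: the classes, the dictionary used as a connection, and the call_hook function are hypothetical, and the two return codes are only named after lighttpd's HANDLER_GO_ON and HANDLER_FINISHED as we understand them.

```python
HANDLER_GO_ON = "go_on"        # this plugin did not handle the request, try the next one
HANDLER_FINISHED = "finished"  # this plugin handled the request


class AccessPlugin:
    """Hypothetical plugin that denies paths ending in '~'."""
    name = "mod_access"

    def handle_uri_clean(self, con):
        if con["path"].endswith("~"):
            con["status"] = 403
            return HANDLER_FINISHED
        return HANDLER_GO_ON


class LoggingPlugin:
    """Hypothetical plugin that only implements handle_request_done."""
    name = "mod_log"

    def handle_request_done(self, con):
        con.setdefault("log", []).append(con["path"])
        return HANDLER_GO_ON


def call_hook(plugins, hook_name, con):
    """Call hook_name on each plugin that implements it until one takes the request."""
    for plugin in plugins:
        hook = getattr(plugin, hook_name, None)
        if hook is None:
            continue               # plugins implement only the hooks they need
        result = hook(con)
        if result != HANDLER_GO_ON:
            return plugin.name, result
    return None, HANDLER_GO_ON
```

The point of the model is that a plugin only fills in the hooks it cares about, and the server walks the loaded plugins in order at each hook.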


Architecture of mod_fd_transfer



mod_fd_transfer is a plugin like the ones described above. Its task is to forward the fd of a request to the appropriate back-end service using the file descriptor transfer mechanism described earlier. Consider a few back-end services running as individual processes, each ready to accept an open FD. When a request arrives at lighttpd, it is passed to mod_fd_transfer, which checks the URL of the request to see whether a back-end is registered for it. If it finds an appropriate back-end service, it identifies the fd of the request and forwards it to that back-end. On receiving the new FD, the back-end reads the remaining request from the FD and then writes the response to the same FD.

By default, lighttpd parses the entire HTTP header and then also reads the entire POST body of the request and keeps it cached (in memory or in a temp file, depending on the size of the data). Only then does it call the plugin hooks. If we transferred the request FD to the back-end at that point, there would be nothing left to read from the FD; only the response would be written to it. Because of this pre-caching by lighttpd we would get only half of the advantage of fd transfer. So we modified the state machine of lighttpd and its plugin architecture to fit our use case without breaking any existing lighttpd plugin.

We defined a new hook named handle_request_raw, which is called immediately after the request header is parsed but before the request body is read, and mod_fd_transfer implements a function for this hook. So instead of waiting for the body, we get the hook as soon as the header is parsed and are ready to transfer the file descriptor. The default state-machine behavior is that when a request is passed to a plugin and the plugin returns no result, lighttpd creates a dummy response. In our case we do not want lighttpd to create any response, because the response is generated by the back-end process and lighttpd should not send any data at all. We therefore also added a new state named CONN_FORWARD to the state machine. This state does nothing except assume that the connection fd has been transferred to another process, so lighttpd simply closes its own copy of the connection. With this we achieved exactly what we wanted.

mod_fd_transfer also takes care of exec-ing the back-end process when the first request for that back-end arrives.
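The decision taken in the handle_request_raw hook can be sketched as below. This is a simplified model in Python, not the plugin's C code: the routing table keyed by URL prefix, the longest-prefix rule, the send_fd callback and the string return values are all assumptions made for illustration.

```python
def find_backend(routes, url_path):
    """Return the back-end registered for the longest matching URL prefix, or None."""
    best_prefix = None
    for prefix in routes:
        if url_path.startswith(prefix):
            if best_prefix is None or len(prefix) > len(best_prefix):
                best_prefix = prefix
    return routes[best_prefix] if best_prefix is not None else None


def handle_request_raw(routes, url_path, conn_fd, send_fd):
    """Called after the header is parsed and before the body is read.

    Returns "CONN_FORWARD" when the connection was handed to a back-end, so the
    server only has to close its own copy, or "GO_ON" to continue normal handling.
    """
    backend = find_backend(routes, url_path)
    if backend is None:
        return "GO_ON"
    send_fd(backend, conn_fd)   # e.g. an SCM_RIGHTS transfer over a Unix socket
    return "CONN_FORWARD"
```

A request for a URL with no registered back-end falls through to lighttpd's normal processing, so existing plugins keep working unchanged.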





Contributors:

Aalap Shah - aalap.shh@gmail.com
Shakti Ashirvad - shakti.ashirvad@gmail.com