MUGI
„My Uncommon Gateway Interface”
Why MUGI
When I started to explore the world of internet programming, I did some research into who uses which tool, language or technique to produce what kind of content. I explored PHP and Java as well as (My)SQL and the most common database backends. I wasn't all that happy with the performance or the features of those things. So I took a look at the interfaces commonly provided for connecting your own programs to a server. CGI was easy to program for, but speed-wise a disaster. I was also convinced that the usual feature-overloaded servers were unnecessary overkill that wasted my resources. I considered writing a server module, but decided against that as well. So I decided to write the server for the interface I wanted to have. What I wanted was simple: a way to connect other server-like applications to the internet via the web server, if need be from another computer in the LAN.
Then I thought about what such an interface should provide. I decided to go for a minimal version and leave it to the target application to implement whatever else it needs. Nothing is easier than building up a library of useful functions over time to keep this covered for later projects. After that windy preamble you should be better able to understand how this thing is intended to work.


What it is/isn't
It is more flexible than a server module and somewhat safer, but it is also slower and might need more system resources. It is a lot faster than CGI, but it provides much less prepared data to work with. A MUGI can do more than a CGI, but that comes with a higher security risk, especially if you use code or programs from unknown sources; under those circumstances it is as problematic as a module. It is best seen as a hybrid of those two extremes.


Registered MUGIs
On the server side a MUGI has to be registered in the configuration file; see the manual for the detailed syntax. After that the server has a name to look out for as well as the host address and the port of the application. The name of a registered mugi is handled basically just like the "cgi/" prefix that marks a CGI script call. Let's take a look at an example:

You've created a news page mugi that provides the commands add_msg, del_msg and show_current. It runs on the same computer as the web server itself, so to prevent connections from the outside you just use the loopback address "127.0.0.1" and assign port 7500 to it. You registered the mugi in the virtual server entry for "www.myserver.com" under the name "news". So the URLs you'd have to use to reach those functions would look like this:
http://www.myserver.com/news/show_current
http://www.myserver.com/news/add_msg
http://www.myserver.com/news/del_msg

Actually the server just looks for the leading "news/" and then passes the whole request on to the given address "127.0.0.1:7500". Be aware that a directory "news" that may exist in your document root will be ignored from then on. If you want to conceal the fact that a page is built dynamically by a mugi, you can also rename "show_current" to "show_current.html"; you just have to make sure that your mugi understands that properly. It also gives the browser a hint about what to expect.
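The lookup described above can be pictured with a small Python sketch. It is not the server's actual code: the table `REGISTERED` and the function `route` are hypothetical names, and the sketch assumes that only the first path segment is compared and that the path is handed on unchanged.

```python
from urllib.parse import urlsplit

# Hypothetical registration table: (virtual server, mugi name) -> (host, port).
REGISTERED = {("www.myserver.com", "news"): ("127.0.0.1", 7500)}

def route(url, registered=REGISTERED):
    """Return (target, path): the mugi address or None, plus the untouched path."""
    parts = urlsplit(url)
    # Only a leading "name/" counts, so split off the first path segment.
    name, sep, _rest = parts.path.lstrip("/").partition("/")
    target = registered.get((parts.hostname, name)) if sep else None
    # The whole path is passed on; the "news/" prefix is not stripped.
    return target, parts.path
```

With the example registration, `route("http://www.myserver.com/news/show_current.html")` gives the target `("127.0.0.1", 7500)` and the path `/news/show_current.html`, so the mugi itself has to recognise the ".html" variant.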


A MUGI as target for a VirtualServer
Instead of registering a mugi you can also define it as the primary target of a virtual server. If you take the above example, create a VS with the name "news.myserver.com" and define "127.0.0.1:7500" as its target, the calls could look like this:
http://news.myserver.com/show_current
http://news.myserver.com/add_msg
http://news.myserver.com/del_msg

or, if you really took the same program without changing its internal workings:
http://news.myserver.com/news/show_current
http://news.myserver.com/news/add_msg
http://news.myserver.com/news/del_msg

However, the server first checks for a registered MUGI, then for CGI and only then for a MUGI target. Static content won't be served at all unless you pass an appropriate command back, but more on that later.
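The order of those checks can be written down as a short Python sketch. The function `dispatch` and its return labels are made up for illustration, and treating "cgi/" as a plain first path segment is an assumption; the real server may decide differently in the details.

```python
def dispatch(path, registered, mugi_target=None):
    """Decide who handles a request path, in the order described above.

    registered  -- hypothetical dict: mugi name -> (host, port)
    mugi_target -- (host, port) if the virtual server has a MUGI target
    """
    name, sep, _rest = path.lstrip("/").partition("/")
    if sep and name in registered:
        return ("registered-mugi", registered[name])  # checked first
    if sep and name == "cgi":
        return ("cgi", None)                          # checked second
    if mugi_target is not None:
        return ("mugi-target", mugi_target)           # checked last
    return ("static", None)  # only reached without a MUGI target
```

For a virtual server with a MUGI target the last line is never reached, which matches the remark that static content is not served there unless the mugi asks for it.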


How does it work
It works fine for me of course... ;) Seriously, whatever you do, you have to make sure your mugi can gracefully handle the connection to the server being closed suddenly. You should also consider handling or ignoring the "broken pipe" system signal (SIGPIPE), otherwise the mugi might be terminated by it. The program connected via MUGI has to wait for incoming connections on its listen port, so it has to run as a resident daemon or something comparable.
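A minimal Python sketch of such a resident program could look like the following. All function names are illustrative, and the address and port are simply the ones from the news example. Python already ignores SIGPIPE on startup and raises `BrokenPipeError` instead, so `ignore_broken_pipe` mainly shows what a program in another language would have to do itself.

```python
import signal
import socket

def ignore_broken_pipe():
    """Ignore SIGPIPE so a vanished server cannot terminate the mugi."""
    if hasattr(signal, "SIGPIPE"):  # the signal does not exist on every platform
        signal.signal(signal.SIGPIPE, signal.SIG_IGN)

def open_listener(host="127.0.0.1", port=7500):
    """Open the listen port the web server will connect to."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    sock.listen()
    return sock

def serve_once(listener, handle):
    """Accept one connection; return False if the server dropped it mid-request."""
    conn, _addr = listener.accept()
    with conn:
        try:
            handle(conn)
            return True
        except (BrokenPipeError, ConnectionResetError, EOFError):
            return False  # sudden close: give up on this request only

def serve_forever(listener, handle):
    """Stay resident and keep waiting for the next connection."""
    while True:
        serve_once(listener, handle)
```

The point of the design is that a failed request never ends the loop; the mugi just goes back to waiting.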
  1. When a request with a MUGI target comes in, the server opens a connection to the registered address/port. It then sends the following header:

    size      what
    30 bytes  IP address of the client as a string, like 127.0.0.1
    4 bytes   the port number of the server that received the request
    4 bytes   size of the request header
    variable  the original request header

    The port number and the size are 32-bit integers in network byte order. Usually you'd want to read those 38 bytes first. In case you can't (or don't want to) create a buffer for the rest, your program can just close the connection. The server will then report an error 500 with the text "Internal communication error while sending request to mugi" back to the browser. You can also send a message of your own, but for that you will have to read the header first.
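Reading this header might be sketched in Python as follows. The names `recv_exact` and `read_request` are hypothetical, and two details are assumptions not stated above: that the 30-byte address field is padded with NUL bytes (or blanks), and that the port and the size can be read as unsigned values.

```python
import socket
import struct

# 30-byte address string, 4-byte port, 4-byte header size: 38 bytes in total.
FIXED = struct.Struct("!30sII")  # "!" selects network byte order, no padding

def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise EOFError if the server closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise EOFError("server closed the connection")
        buf += chunk
    return bytes(buf)

def read_request(conn: socket.socket):
    """Return (client_ip, server_port, raw_request_header)."""
    raw_ip, port, size = FIXED.unpack(recv_exact(conn, FIXED.size))
    # Assumption: the unused part of the address field is NUL or blank padding.
    ip = raw_ip.split(b"\0", 1)[0].strip().decode("ascii")
    header = recv_exact(conn, size)
    return ip, port, header
```

Reading the fixed 38 bytes first tells the mugi how large the request header is before it commits to a buffer for it.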

  2. After that the server waits for a single byte that tells it how to proceed. The byte may have the following values:
    • 0, skip the body, if there is one, and await the response. The body will be ignored completely. A loopback, however, might still read it as its own body later.
    • 1, send the body, if there is one, in the form
       4 bytes length in network byte order | body as it comes in
    All other values will produce an error and the connection is closed.
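On the mugi side this step could be sketched in Python like this. The helper names are illustrative. What the server sends after a 1 when the request has no body is not described above, so `request_body` should only be called when the request header indicates a body.

```python
import socket
import struct

SKIP_BODY = b"\x00"  # value 0: the body is ignored
SEND_BODY = b"\x01"  # value 1: the body is sent with a 4-byte length prefix

def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes or raise EOFError if the server closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise EOFError("server closed the connection")
        buf += chunk
    return bytes(buf)

def skip_body(conn: socket.socket) -> None:
    """Tell the server to skip the body and wait for our response."""
    conn.sendall(SKIP_BODY)

def request_body(conn: socket.socket) -> bytes:
    """Ask for the body and read it: 4-byte length, then that many bytes."""
    conn.sendall(SEND_BODY)
    (length,) = struct.unpack("!I", recv_exact(conn, 4))
    return recv_exact(conn, length)
```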

  3. Now the server waits again, this time to receive a 32-bit integer value in network byte order.
    The values:
    • -4, the MUGI orders the server to send a static file with additional header lines. This is used, for example, to pass an ETag back to the server so that it can be checked against the one in the original request and, where useful, produce a 206 (byte range) or a 304 (not modified) answer. It can also be used to add a "connection: close" to force the connection to be closed afterwards; in a pipelined HTTP/1.1 request situation all following requests will be discarded. This has the following basic syntax:

       the -4 | 4 bytes length of path string | 4 bytes length of additional header | 4 bytes command | path | header

      • The 4-byte length values are expected in network byte order.
      • The additional header lines are limited to ones that are not already generated automatically. So ETag or Cookie would be valid, for example, but not Last-Modified or Date. The server won't check this, and the browser would receive both versions. Basically the length is up to you, but as the internal buffer is limited to 2000 bytes, you might cause the server to reject the loopback and send an error instead.
      • The file path is considered invalid if it is longer than 999 characters. It has to be either absolute or relative to the server's working directory. Yes, that can be a rather nasty thing if you are not careful. It also means that you don't want to use a mugi from an unknown source. However, you can always do a chroot before starting the server to prevent the worst.
      • The command can have a value of either 1 for GET or 3 for HEAD. Other values provoke an error.
    • -3, basically the same as -4, but without the additional header part.
       the -3 | 4 bytes path length | 4 bytes command | path

    • -2, the real loopback. The original request command and path are replaced.
       the -2 | 4 bytes command | 4 bytes path length | path
      No, this is no mistake, the command really comes before the length this time.
      • The command can have values from -1 to 7: -1 keep the original one, 0 OPTIONS, 1 GET, 2 POST, 3 HEAD, 4 PUT, 5 DELETE, 6 TRACE, 7 CONNECT. Please be aware that unsupported commands will produce a corresponding error, and that a missing body in the loopback will also produce an error; the original length information is still there, after all.
      • The path directly replaces the one in the original header.
      After that the server loops back to right behind the "read request header" stage, hence the name loopback...

    • -1, this creates an error 500 with the text "mugi reports unspecified error".

    • 0, the response follows, but the size is unknown. This will cause a byte-range request to fail. If the connection is closed afterwards for some reason (for example due to "connection: close" in the response header), or if it was an HTTP/1.0 request, the body is sent normally and the closing of the connection defines the body size. For a continued persistent HTTP/1.1 connection the body is sent using the chunked transfer coding. The server will, however, look for a content-length entry in the response header to avoid that.

    • >0, the value is taken as the overall size of the response (header + body). The server calculates the size of the body by subtracting the header size from the given overall size. It will also add a content-length: header field if necessary.
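The negative control values of step 3 can be packed with Python's `struct` module, as in the sketch below. The function names are hypothetical. Two points are assumptions: that the path is encoded as UTF-8 and limited to 999 bytes, and that the additional header lines are passed in already formatted, since their exact line termination is not specified above.

```python
import struct

GET, HEAD = 1, 3     # the only commands allowed for -4 and -3
KEEP_ORIGINAL = -1   # loopback: keep the original request command

def static_file_reply(path, command=GET, extra_header=b""):
    """Build a -3 reply, or a -4 reply when extra header lines are given."""
    raw = path.encode("utf-8")
    if len(raw) > 999:
        raise ValueError("path longer than 999 characters is invalid")
    if command not in (GET, HEAD):
        raise ValueError("command must be 1 (GET) or 3 (HEAD)")
    if extra_header:
        # -4 | path length | header length | command | path | header
        fixed = struct.pack("!iIIi", -4, len(raw), len(extra_header), command)
        return fixed + raw + extra_header
    # -3 | path length | command | path
    return struct.pack("!iIi", -3, len(raw), command) + raw

def loopback_reply(path, command=KEEP_ORIGINAL):
    """Build a -2 reply; here the command comes before the path length."""
    if not -1 <= command <= 7:
        raise ValueError("command must be between -1 and 7")
    raw = path.encode("utf-8")
    # -2 | command | path length | path
    return struct.pack("!iiI", -2, command, len(raw)) + raw

ERROR_REPLY = struct.pack("!i", -1)  # "mugi reports unspecified error"
```

A mugi would send one of these with `conn.sendall(...)` instead of a response of its own, for example `loopback_reply("/news/show_current", GET)`.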

  4. After the server has received the response header, it analyses it before doing anything else. It also applies several changes to it. For example, the header fields needed for keep-alive are added. The connection might be closed because the maximum number of allowed persistent requests has been reached. If missing, a date: field is added. Finally, if the response was a "200" (OK), the header is checked for byte ranges as well as timestamps/ETags that might cause a change to a "304" or "206" response. In this case the server might close the connection to the mugi prematurely. If the response header has a "connection: close" field, the server takes that as an order to close the connection after this response has been sent, while a "cache: no-cache", "cache: private" or a "pragma:" with either of those values will prevent a change to a 304 response.
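A response with a known size could be assembled as in the following Python sketch. The function `build_response` is a made-up helper. It assumes that the response header is an ordinary HTTP response header (status line, fields, blank line, all with CRLF line ends), which is not spelled out above. The leading 4-byte value is the overall size of header plus body, as described for values greater than 0.

```python
import struct

def build_response(status_line, fields, body=b"", close=False):
    """Return the 4-byte overall size followed by header and body."""
    lines = [status_line] + [f"{name}: {value}" for name, value in fields]
    if close:
        # Orders the server to close the connection after this response.
        lines.append("connection: close")
    header = ("\r\n".join(lines) + "\r\n\r\n").encode("latin-1")
    total = len(header) + len(body)  # overall size: header + body
    return struct.pack("!i", total) + header + body
```

A call such as `build_response("HTTP/1.1 200 OK", [("content-type", "text/html")], b"<p>news</p>")` leaves date: and content-length: to the server, which adds them where needed.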