What is Common Gateway Interface (CGI)?

The CGI is specified in RFC 3875, though that is a later "official" codification of the original NCSA document. Basically, CGI defines a protocol to pass data about a HTTP request from a webserver to a program to process - any program, in any language. At the time the spec was written (1993), most web servers contained only static pages, "web apps" were a rare and new thing, so it seemed natural to keep them apart from the "normal" static content, such as in a cgi-bin directory apart from the static content, and having them end in .cgi.

At this time, here also were no dedicated "web programming languages" like PHP, and C was the dominating portable programming language - so many people wrote their CGI scripts in C. But Perl quickly turned out to be a better fit for this kind of thing, and CGI became almost synonymous with Perl for a while. Then there came Java Servlets, PHP and a bunch of others and took over large parts of Perl's market share.


CGI is an interface specification between a web server (HTTP server) and an executable program of some type that is to handle a particular request.

It describes how certain properties of that request should be communicated to the environment of that program and how the program should communicate the response back to the server and how the server should 'complete' the response to form a valid reply to the original HTTP request.

For a while CGI was an IETF Internet Draft and as such had an expiry date. It expired with no update so there was no CGI 'standard'. It is now an informational RFC, but as such documents common practice and isn't a standard itself. rfc3875.txt, rfc3875.html

Programs implementing a CGI interface can be written in any language runnable on the target machine. They must be able to access environment variables and usually standard input and they generate their output on standard output.

Compiled languages such as C were commonly used as were scripting languages such as perl, often using libraries to make accessing the CGI environment easier.

One of the big disadvantages of CGI is that a new program is spawned for each request so maintaining state between requests could be a major performance issue. The state might be handled in cookies or encoded in a URL, but if it gets to large it must be stored elsewhere and keyed from encoded url information or a cookie. Each CGI invocation would then have to reload the stored state from a store somewhere.

For this reason, and for a greatly simple interface to requests and sessions, better integrated environments between web servers and applications are much more popular. Environments like a modern php implementation with apache integrate the target language much better with web server and provide access to request and sessions objects that are needed to efficiently serve http requests. They offer a much easier and richer way to write 'programs' to handle HTTP requests.

Whether you wrote a CGI script rather depends on interpretation. It certainly did the job of one but it is much more usual to run php as a module where the interface between the script and the server isn't strictly a CGI interface.


CGI is an interface which tells the webserver how to pass data to and from an application. More specifically, it describes how request information is passed in environment variables (such as request type, remote IP address), how the request body is passed in via standard input, and how the response is passed out via standard output. You can refer to the CGI specification for details.

To use your image:

user (client) request for page ---> webserver ---[CGI]----> Server side Program ---> MySQL Server.

Most if not all, webservers can be configured to execute a program as a 'CGI'. This means that the webserver, upon receiving a request, will forward the data to a specific program, setting some environment variables and marshalling the parameters via standard input and standard output so the program can know where and what to look for.

The main benefit is that you can run ANY executable code from the web, given that both the webserver and the program know how CGI works. That's why you could write web programs in C or Bash with a regular CGI-enabled webserver. That, and that most programming environments can easily use standard input, standard output and environment variables.

In your case you most likely used another, specific for PHP, means of communication between your scripts and the webserver, this, as you well mention in your question, is an embedded interpreter called mod_php.

So, answering your questions:

What exactly is CGI?

See above.

Whats the big deal with /cgi-bin/*.cgi? Whats up with this? I don't know what is this cgi-bin directory on the server for. I don't know why they have *.cgi extensions.

That's the traditional place for cgi programs, many webservers come with this directory pre configured to execute all binaries there as CGI programs. The .cgi extension denotes an executable that is expected to work through the CGI.

Why does Perl always comes in the way. CGI & Perl (language). I also don't know whats up with these two. Almost all the time I keep hearing these two in combination "CGI & Perl". This book is another great example CGI Programming with Perl Why not "CGI Programming with PHP/JSP/ASP". I never saw such things.

Because Perl is ancient (older than PHP, JSP and ASP which all came to being when CGI was already old, Perl existed when CGI was new) and became fairly famous for being a very good language to serve dynamic webpages via the CGI. Nowadays there are other alternatives to run Perl in a webserver, mainly mod_perl.

CGI Programming in C this confuses me a lot. in C?? Seriously?? I don't know what to say. I"m just confused. "in C"?? This changes everything. Program needs to be compiled and executed. This entirely changes my view of web programming. When do I compile? How does the program gets executed (because it will be a machine code, so it must execute as a independent process). How does it communicate with the web server? IPC? and interfacing with all the servers (in my example MATLAB & MySQL) using socket programming? I'm lost!!

You compile the executable once, the webserver executes the program and passes the data in the request to the program and outputs the received response. CGI specifies that one program instance will be launched per each request. This is why CGI is inefficient and kind of obsolete nowadays.

They say that CGI is deprecated. Its no more in use. Is it so? What is its latest update?

CGI is still used when performance is not paramount and a simple means of executing code is required. It is inefficient for the previously stated reasons and there are more modern means of executing any program in a web enviroment. Currently the most famous is FastCGI.


What exactly is CGI?

A means for a web server to get its data from a program (instead of, for instance, a file).

Whats the big deal with /cgi-bin/*.cgi?

No big deal. It is just a convention.

I don't know what is this cgi-bin directory on the server for. I don't know why they have *.cgi extensions.

The server has to know what to do with the file (i.e. treat it as a program to execute instead of something to simply serve up). Having a .html extension tells it to use a text/html content type. Having a .cgi extension tells it to run it as a program.

Keeping executables in a separate directory gives some added protection against executing incorrect files and/or serving up CGI programs as raw data in case the server gets misconfigured.

Why does Perl always comes in the way.

It doesn't. Perl was just big and popular at the same time as CGI.

I haven't used Perl CGI for years. I was using mod_perl for a long time, and tend towards PSGI/Plack with FastCGI these days.

This book is another great example CGI Programming with Perl Why not "CGI Programming with PHP/JSP/ASP".

CGI isn't very efficient. Better methods for talking to programs from webservers came along at around the same time as PHP. JSP and ASP are different methods for talking to programs.

CGI Programming in C this confuses me a lot. in C?? Seriously??

It is a programming language, why not?

When do I compile?

  1. Write code
  2. Compile
  3. Access URL
  4. Webserver runs program

How does the program gets executed (because it will be a machine code, so it must execute as a independent process).

It doesn't have to execute as an independent process (you can write Apache modules in C), but the whole concept of CGI is that it launches an external process.

How does it communicate with the web server? IPC?

STDIN/STDOUT and environment variables — as defined in the CGI specification.

and interfacing with all the servers (in my example MATLAB & MySQL) using socket programming?

Using whatever methods you like and are supported.

They say that CGI is depreciated. Its no more in use. Is it so?

CGI is inefficient, slow and simple. It is rarely used, when it is used, it is because it is simple. If performance isn't a big deal, then simplicity is worth a lot.

What is its latest update?

1.1

Tags:

Cgi