The cgi module provides an object-oriented interface for writing CGI and
CGI-style programs. It provides an abstraction layer so that the same code can
be used with either standard CGI or replacement technologies such as FastCGI.
Code to handle a request must subclass the abstract class cgi.Handler. This
class has a single method, process, which receives a single parameter of type
cgi.Request; that parameter is used to retrieve information about the request
and to send the response. Instances of this subclass will be created to handle
requests.
When the standard CGI protocol is used, a new process is created to handle
each request, but with more complicated protocols such as FastCGI, a process
may handle more than one request simultaneously in multiple threads. However,
even in this situation, each instance of the cgi.Handler subclass will only be
used to process one request at a time. This means that the instance can use
self to store per-request data.
A subclass of cgi.Request is used to call the handler. Which subclass is used
depends on the protocol used to communicate with the web server. This module
provides cgi.CGIRequest, which implements the standard CGI protocol, and also
cgi.GZipCGIRequest, which is the same but uses zlib to compress the response
when the user's browser indicates it can accept this.
Example:
import jon.cgi as cgi

class Handler(cgi.Handler):
    def process(self, req):
        req.set_header("Content-Type", "text/plain")
        req.write("Hello, %s!\n" % req.params.get("greet", "world"))

cgi.CGIRequest(Handler).process()
Note: by default, output from the handler is buffered. If the output from
the script is going to be large (for example, if the output is not an HTML
file), then buffering should be disabled using set_buffering.
The base class for all exceptions defined by the cgi
module.
SequencingError is an exception class which is raised when cgi object methods
are called out of order.
Request
objects provide information about a CGI request, as
well as methods to return a response. This class is not used directly, but
is subclassed depending on what protocol is being used to talk to the web
server.
params
The params map contains the CGI form variables recovered from the
QUERY_STRING and, in the case of POST requests, from stdin. For each key, the
name is a string and the type of the value depends on whether or not the name
has one of a number of special suffixes.
If the key has no special suffix, then the value is a string, or None. (None
occurs when a URL-encoded string contains a name without a corresponding
equals sign and value. If the string contains a name and an equals sign but no
value then this is represented as an empty string.) If the key ends with the
string "*" then the value is a sequence containing one or more values, each of
which is either a string or None (to support multiple values with the same
name, e.g. HTML <select> input fields). If the key ends with the string "!"
then the value is a mime.Entity object (to support file uploads). If the key
ends with the string "!*" then the value is a sequence of one or more
mime.Entity objects.
If a form variable is found with a name ending in "!" or "!*" but it did not
arrive in the form of a MIME section then it is ignored and is not placed into
the map. If more than one value with the same name is found and the name does
not end in "*" or "!*" then only one of the values will be entered into the
map, and the others will be discarded. This means that, even in the face of
malicious input, the types of the values are guaranteed to match the type
indicated by their key's suffix.
Note that the suffixes must be present in the CGI variables themselves. The
programmer does not indicate to the cgi module which CGI variables to expect.
Example:
<select multiple name="types*">
<option>gif</option><option>jpg</option><option>png</option>
</select>
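The suffix rules above can be modelled with a short, self-contained sketch. The function parse_params below is hypothetical and is not part of the cgi module. It is written for Python 3, handles only URL-encoded input, and assumes that the first of several duplicate values is the one kept (the text above only says that one of them is).

```python
from urllib.parse import unquote_plus


def parse_params(query):
    """Hypothetical model of the suffix rules for URL-encoded form data."""
    params = {}
    for part in query.split("&"):
        if not part:
            continue
        if "=" in part:
            name, value = part.split("=", 1)
            value = unquote_plus(value)  # "name=" yields an empty string
        else:
            name, value = part, None  # a bare "name" yields None
        name = unquote_plus(name)
        if name.endswith("!") or name.endswith("!*"):
            # Upload keys are only honoured for MIME sections, so skip them.
            continue
        if name.endswith("*"):
            params.setdefault(name, []).append(value)
        else:
            # Assumption: keep the first value and discard later duplicates.
            params.setdefault(name, value)
    return params


print(parse_params("types*=gif&types*=png&a=1&a=2&flag&empty=&photo!=x"))
```

Because the value type follows from the key suffix alone, a handler written against this model can index params["types*"] as a list without first checking its type.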
stdin
A file or file-like object which represents the "standard input stream" for
the request. For example, for a genuine CGI request this will be a reference to
sys.stdin
.
Note: the first time you access the params variable, this stream may be read
to retrieve the form variables. Therefore, you must not access both params and
stdin during the same request.
cookies
The cookies variable is a Cookie.SimpleCookie object which contains cookies
passed to the server by the client.
environ
The environ
map contains the environment variables associated
with the request. All keys and values in the map are strings.
aborted
If the aborted
variable references a true value then the
request has been aborted (usually because the client has gone away). If the
request is aborted then all further output using the
write
method will be discarded.
The programmer may inspect the aborted
variable occasionally and
exit if the request has been aborted, but it is not necessary to do so.
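As an illustration of checking aborted during a long response, here is a minimal sketch. FakeRequest is a hypothetical stand-in that exposes only an aborted attribute and a write method, and send_rows is an invented helper name; neither comes from the cgi module.

```python
class FakeRequest:
    """Hypothetical stand-in exposing only `aborted` and `write`."""

    def __init__(self, abort_after):
        self.aborted = False
        self.sent = []
        self._abort_after = abort_after

    def write(self, s):
        self.sent.append(s)
        if len(self.sent) >= self._abort_after:
            self.aborted = True  # simulate the client going away


def send_rows(req, rows):
    """Write rows until done or until the request is aborted."""
    count = 0
    for row in rows:
        if req.aborted:
            break  # further output would be discarded anyway
        req.write(row)
        count += 1
    return count


req = FakeRequest(abort_after=2)
print(send_rows(req, ["a\n", "b\n", "c\n", "d\n"]))
```

Stopping early only saves work on the server; as noted above, continuing to write after an abort is harmless.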
__init__(self, handler_type)
handler_type
:
cgi.Handler
subclass
Create new Request
instance. Instances of
handler_type
will be created to handle requests.
The array of HTTP headers will be initialised to contain a
Content-Type
header with the value text/html;
charset=iso-8859-1
. If this is not appropriate then the content type
should be overridden by specifying a new one with the
set_header
method.
output_headers(self)
Output the accumulated array of HTTP headers. If the headers have already
been output then a
cgi.SequencingError
exception is
raised.
clear_headers(self)
Clear the accumulated array of HTTP headers. If the headers have already
been output then a
cgi.SequencingError
exception
is raised.
add_header(self, hdr, val)
hdr
: string
val
: string
Add a header to the array of HTTP headers. If the headers have already
been output then a
cgi.SequencingError
exception is
raised.
Example:
req.add_header("Set-Cookie", "foo=bar; path=/")
get_header(self, hdr, index=0)
hdr
: string
index
: integer
Retrieves a header from the array of HTTP headers (this is the array
of output headers the handler will be returning to the user agent,
not the input headers from the user agent). If there is more than
one header with the same name, the index
parameter is used to
specify which one is required. If the named header was not found, or there
were not enough occurrences of it to satisfy the index requirement,
None
is returned. Header names are matched case-insensitively.
set_header(self, hdr, val)
hdr
: string
val
: string
Add a header to the array of HTTP headers. If a header or headers of the
same name already exist in the array, then they are deleted before the new
header is added. If the headers have already been output then a
cgi.SequencingError
exception is
raised. Header names are matched case-insensitively.
Example:
req.set_header("Content-Type", "image/jpeg")
del_header(self, hdr)
hdr
: string
Remove all headers with the name hdr
from the array of HTTP
headers. If the headers have already been output then a
cgi.SequencingError
exception is
raised. Header names are matched case-insensitively.
append_header_value(self, hdr, val)
hdr
: string
val
: string
Add a value to a header that contains a comma-separated list of values
(e.g. Content-Encoding
, Vary
, etc). If the header
does not already exist, it is set to val
. If the header does
exist, and val
is not already in the list of values, it is added
to the list. If the headers have already been output then a
cgi.SequencingError
exception is
raised. Header names and values are matched case-insensitively.
Example:
req.append_header_value("Vary", "Accept-Language")
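The header methods described above can be pictured with a small model. HeaderList is a hypothetical class written for this illustration, not the library's implementation. It leaves out the SequencingError checks and assumes that comma-separated values are joined with ", ".

```python
class HeaderList:
    """Hypothetical model of the header array; not the library's code."""

    def __init__(self):
        # The documented initial state of the array.
        self._headers = [("Content-Type", "text/html; charset=iso-8859-1")]

    def add(self, hdr, val):
        self._headers.append((hdr, val))

    def get(self, hdr, index=0):
        matches = [v for h, v in self._headers if h.lower() == hdr.lower()]
        return matches[index] if index < len(matches) else None

    def delete(self, hdr):
        self._headers = [(h, v) for h, v in self._headers
                         if h.lower() != hdr.lower()]

    def set(self, hdr, val):
        self.delete(hdr)  # drop every header of that name first
        self.add(hdr, val)

    def append_value(self, hdr, val):
        current = self.get(hdr)
        if current is None:
            self.set(hdr, val)
            return
        values = [v.strip() for v in current.split(",")]
        if val.lower() not in [v.lower() for v in values]:
            # Assumption: values are joined with ", ".
            self.set(hdr, ", ".join(values + [val]))


headers = HeaderList()
headers.set("content-type", "image/jpeg")
headers.add("Set-Cookie", "foo=bar; path=/")
headers.add("Set-Cookie", "baz=qux; path=/")
headers.append_value("Vary", "Accept-Language")
headers.append_value("vary", "accept-language")  # already present, ignored
headers.append_value("Vary", "Accept-Encoding")
print(headers.get("Content-Type"))
print(headers.get("set-cookie", 1))
print(headers.get("Vary"))
```

The model shows why add_header suits repeatable headers such as Set-Cookie, while set_header suits headers that should appear only once.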
set_buffering(self, f)
f
: true or false value
Specify whether or not client output sent using
write
will be buffered. If buffering
is disabled when output has already been buffered then the existing buffer
will be flushed immediately. At the start of a new request, buffering defaults
to 'on'.
flush(self)
Flushes any buffered output to the client. If the HTTP headers array has not
already been sent then it will be sent before any other output. Generally
speaking, you do not need to call flush
, even if buffering is
enabled, because it is automatically called when the
Handler.process
method exits.
close(self)
Calls flush and then closes the output stream. It is essential that this
method is called when the request is complete; however, in general you do not
need to call it manually because it is automatically called when the
Handler.process method exits.
clear_output(self)
Discards any output that has been buffered. If output buffering is not
enabled then a cgi.SequencingError
exception is raised.
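The interaction between set_buffering, write, flush and clear_output can be sketched with a small model. BufferedOutput and the local SequencingError below are hypothetical stand-ins written for this illustration, and the "<headers>" marker simply represents the header array being sent.

```python
class SequencingError(Exception):
    """Local stand-in for cgi.SequencingError."""


class BufferedOutput:
    """Hypothetical model of write/flush/clear_output; not the library's code."""

    def __init__(self):
        self.sent = []            # what the client has received so far
        self._buffer = []
        self._buffering = True    # buffering defaults to on
        self._headers_sent = False

    def set_buffering(self, f):
        if self._buffering and not f and self._buffer:
            self.flush()          # turning buffering off flushes pending output
        self._buffering = bool(f)

    def write(self, s):
        if self._buffering:
            self._buffer.append(s)
        else:
            self._send(s)

    def flush(self):
        self._send("".join(self._buffer))
        self._buffer = []

    def clear_output(self):
        if not self._buffering:
            raise SequencingError("output buffering is not enabled")
        self._buffer = []

    def _send(self, s):
        if not self._headers_sent:
            self.sent.append("<headers>")  # headers always precede the body
            self._headers_sent = True
        if s:
            self.sent.append(s)


out = BufferedOutput()
out.write("draft")
out.clear_output()   # discard "draft" before anything is sent
out.write("final")
out.flush()
print(out.sent)      # ['<headers>', 'final']
```

In this model, buffering is what makes it possible to throw away a partial page and send something else, for example when an error occurs halfway through.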
error(self, s)
s
: string
This is a placeholder method that must be overridden by a subclass of the
Request class. It should log the string parameter s somewhere on the server
(e.g. in the error_log). The string must not be output to the client.
set_encoding(self, encoding, [inputencoding])
encoding
: string or None
inputencoding
: string or None
Sets the character encoding used for the response. The default encoding
is None
, which means that no encoding is performed (in which case
you cannot send unicode
objects to
write
and normal strings are output
unchanged). If you specify an encoding other than None
then you
can send unicode
objects to
write
and they will be encoded
correctly. Remember in this case you will probably want to call
set_header
to update the
Content-Type
header to indicate the character encoding you are
using.
inputencoding
is only used if encoding
is not
None
. Normally if you pass a non-unicode
object to
write
then it will be assumed to
be in Python's default character encoding. If you specify a
non-None
inputencoding
then it will be assumed to
be in that character encoding instead.
Example:
req.set_encoding("utf-8", "iso-8859-1")
req.set_header("Content-Type", "text/plain; charset=utf-8")
req.write("hello \xa1\n") # iso-8859-1 assumed due to inputencoding specified above
req.write(unicode("hello \xa1\n", "cp850"))
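To make the byte-level effect of the example concrete, here is a Python 3 model of the conversion. The function encode_for_response is hypothetical. In it, bytes plays the role of a normal Python 2 string and str plays the role of a unicode object, and the fallback to ASCII when inputencoding is None is an assumption standing in for "Python's default character encoding".

```python
def encode_for_response(s, encoding, inputencoding=None):
    """Hypothetical Python 3 model of what write() sends for a given encoding."""
    if encoding is None:
        if not isinstance(s, bytes):
            raise TypeError("text cannot be sent when no encoding is set")
        return s  # byte strings pass through unchanged
    if isinstance(s, bytes):
        # Byte strings are first decoded using inputencoding.
        # Assumption: fall back to ASCII when inputencoding is None.
        s = s.decode(inputencoding or "ascii")
    return s.encode(encoding)


# Byte 0xA1 read as iso-8859-1 is an inverted exclamation mark.
print(encode_for_response(b"hello \xa1\n", "utf-8", "iso-8859-1"))
# The same byte read as cp850 is a lower-case i with an acute accent.
print(encode_for_response(b"hello \xa1\n".decode("cp850"), "utf-8"))
```

The two calls send different UTF-8 bytes for the same input byte, which is why inputencoding matters whenever byte strings are mixed with a response encoding.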
set_form_encoding(self, encoding)
encoding
: string or None
Sets the character encoding used when reading form data from the browser.
It defaults to None
, but if set to the name of a character
encoding, the keys and values in the params
mapping will be Unicode strings instead of normal strings. Note that browsers
will generally send form data using the encoding used by the HTML of the
submitting page.
get_encoding(self)
Returns the character encoding being used for the response, or
None
if no encoding is being used.
get_form_encoding(self)
Returns the character encoding being used for form data, or
None
if no encoding is being used.
write(self, s)
s
: string
Sends the string parameter s
to the client. If buffering has
been enabled using
set_buffering
then the string
will not be sent to the client immediately but will be buffered in memory.
If buffering has not been enabled and the HTTP headers array has not already
been sent then it will be sent before any other output.
If you wish to be able to output unicode objects using this function, then
you should first call
set_encoding
to specify the
output character encoding.
traceback(self)
Calls traceback
to send a traceback
to the error log, and outputs a generic error page to the browser.
_handler_type
The _handler_type
variable is initialised by the
handler_type
parameter to the
__init__
method.
_init(self)
Initialises the instance ready for a new request.
_write(self, s)
s
: string
This is a placeholder method that must be overridden by a subclass of the
Request class. It should output the string parameter s to the client as part
of the response.
_flush(self)
This is a placeholder method that may be overridden by a subclass of the
Request class. If whatever mechanism the subclass's implementation of _write
uses can result in data being buffered then this method should ensure that the
data is flushed to the client.
_mergevars(self, encoded)
encoded
: string
This is a utility method for the use of subclasses of the
Request
class. It parses the URL-encoded
string parameter encoded
and merges the key/value pairs found
into the self.params
mapping.
_mergemime(self, contenttype, encoded)
contenttype
: string
encoded
: file-like object
This is a utility method for the use of subclasses of the Request class. The
parameter encoded must provide a file-like read method, which is then used to
parse a MIME-encoded input stream. contenttype should contain the value of the
Content-Type header for the stream (which should presumably always indicate
the multipart/form-data type). MIME sections found with
Content-Disposition: form-data are merged into the self.params mapping.
_read_cgi_data(self, environ, inf)
environ
: map
inf
: file-like object
This is a utility method for the use of subclasses of the Request class. It
examines the environment strings contained in the map parameter environ as per
the standard CGI protocol. If the environ variable QUERY_STRING is available
then it is parsed using the _mergevars method. If the environ variable
REQUEST_METHOD is POST then the inf parameter (which must provide a file-like
read method) is used to read an input stream which is passed to either the
_mergevars method or the _mergemime method depending on the environ variable
CONTENT_TYPE. Finally, if the environ variable HTTP_COOKIE is available then
it is parsed into the self.cookies instance variable.
GZipMixIn
is a class that can be mixed-in to a sub-class of the
Request
class to enable gzip compression
of responses to user agents that indicate they can accept it. Make sure you
specify the GZipMixIn
class before the transport class
on the class line.
Example:
class GZipCGIRequest(cgi.GZipMixIn, cgi.CGIRequest):
    pass
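The ordering requirement follows from Python's method resolution order, which a self-contained sketch can show. Transport and MarkingMixIn below are hypothetical stand-ins, and the choice of _write as the overridden method is an assumption made for illustration, not a statement about how GZipMixIn is implemented.

```python
class Transport:
    """Stand-in for a transport class such as CGIRequest."""

    def __init__(self):
        self.sent = []

    def _write(self, s):
        self.sent.append(s)


class MarkingMixIn:
    """Stand-in mix-in that transforms output before the transport sees it."""

    def _write(self, s):
        super()._write("[" + s + "]")


class MixInFirst(MarkingMixIn, Transport):
    pass


class TransportFirst(Transport, MarkingMixIn):
    pass


a = MixInFirst()
a._write("x")   # resolved to MarkingMixIn._write, which wraps the output
b = TransportFirst()
b._write("x")   # resolved to Transport._write, so the mix-in is bypassed
print(a.sent, b.sent)
```

With the mix-in listed first its override takes effect; with the transport listed first the mix-in's method is never reached.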
gzip_level(self, level=6)
level
: integer
Specifies the compression level used by gzip for this request. The default
level if you do not call this method is 6. This method can also be used to
disable compression for a particular request by setting level to 0, for
example if the handler is returning an image file to the user, since images
are already compressed. If the headers have already been output then a
cgi.SequencingError exception is raised.
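The effect of the level value can be seen with the standard zlib module directly. This sketch does not call gzip_level; it only illustrates what levels 0 and 6 mean for a repetitive HTML-like payload, which is invented sample data.

```python
import zlib

text = b"<p>Hello, world!</p>\n" * 200
stored = zlib.compress(text, 0)   # level 0: no compression, framing only
packed = zlib.compress(text, 6)   # level 6: the documented default level

print(len(text), len(stored), len(packed))
```

At level 0 the output is slightly larger than the input because of the framing overhead, while level 6 shrinks repetitive text considerably.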
CGIRequest
subclasses the
Request
class to implement the standard
CGI protocol. Environment variables are read from os.environ
,
input is read from sys.stdin
, output goes to
sys.stdout
and errors go to sys.stderr
.
process(self)
Initialises the instance ready for a new request by calling the
_init
method, then reads the CGI
input and sets up the various instance variables. A
Handler
object of the type passed to the
CGIRequest.__init__
method is
then instantiated and its process
method is called. If an exception is thrown by this method then the
traceback
method is called to
display it.
Example:
cgi.CGIRequest(Handler).process()
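The dispatch step described above can be sketched as follows. SketchRequest, Handler and Failing are hypothetical stand-ins written for this illustration, and the sketch assumes that the hook invoked on failure is the handler's own traceback method.

```python
class Handler:
    """Stand-in for cgi.Handler with the two documented hooks."""

    def process(self, req):
        raise NotImplementedError

    def traceback(self, req):
        req.log.append("generic error page")


class Failing(Handler):
    def process(self, req):
        raise ValueError("boom")


class SketchRequest:
    """Hypothetical request that only models the dispatch step."""

    def __init__(self, handler_type):
        self._handler_type = handler_type
        self.log = []

    def process(self):
        handler = self._handler_type()   # a fresh instance per request
        try:
            handler.process(self)
        except Exception:
            handler.traceback(self)      # the handler decides how to report it


req = SketchRequest(Failing)
req.process()
print(req.log)
```

Creating a new handler instance for each request is what makes it safe for a handler to keep per-request state on self.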
For convenience, the GZipCGIRequest class provides the standard CGIRequest
class with the GZipMixIn class already mixed in.
Example:
cgi.GZipCGIRequest(Handler).process()
Handler is an abstract class which should be subclassed by the programmer to provide the code which handles a request.
process(self, req)
req
: object of type
cgi.Request
This method must be overridden by subclasses. It is called to process a
request. The req
parameter references the
Request
object (actually, an instance of a
subclass of Request
) which should be used
to inspect the request and to send the response.
Note that even in multithreaded situations such as
FastCGI, any individual instance of a
Handler
subclass will only have one process
method
executing at once.
traceback(self, req)
req
: object of type
cgi.Request
This method may be overridden by subclasses. It is called to handle an
exception thrown by the process
method. The default implementation calls
Request.traceback
to send a
traceback to the error log and output a generic error page to the browser.
The DebugHandlerMixIn mix-in class provides a traceback method that outputs
debug information to the browser as well as to the error log. This class may be
used during development to aid debugging but should never be used in a
production environment since it will leak private information to the browser.
Example:
class Handler(cgi.DebugHandlerMixIn, wt.Handler):
    pass
traceback(self, req)
req
: object of type
cgi.Request
This method may be overridden by subclasses. It is called to handle an
exception thrown by the process
method. The default implementation calls
traceback
to send a traceback to both the
error log and the browser.
For convenience, the DebugHandler class provides the standard Handler class
with the DebugHandlerMixIn class already mixed in.
Example:
class Handler(cgi.DebugHandler):
    def process(self, req):
        req.set_header("Content-Type", "text/plain")
        req.write("Hello, world!\n")
html_encode(raw)
raw
: any
Returns: string
HTML-encodes (using entities) characters that are special in HTML.
Specifically, at least all of & < > " and ' are guaranteed to be encoded.
raw is passed to str(), so almost any type can be passed in to this parameter.
Example:
>>> cgi.html_encode("<foo>")
'&lt;foo&gt;'
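For readers without the module to hand, the behaviour can be approximated with the Python 3 standard library. The function html_encode_sketch is a hypothetical stand-in built on html.escape; the exact entities that cgi.html_encode emits for quote characters may differ from the ones html.escape chooses.

```python
import html


def html_encode_sketch(raw):
    """Stand-in built on the standard library, not the cgi module's code."""
    # quote=True also encodes both double and single quote characters.
    return html.escape(str(raw), quote=True)


print(html_encode_sketch("<foo>"))
print(html_encode_sketch(42))
```

Passing the value through str() first is what allows non-string arguments such as integers.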
url_encode(raw)
raw
: any
Returns: string
URL-encodes (using %-escapes) characters that are special in URLs.
Characters that are special in HTML are guaranteed to be escaped, so the output
of this function is safe to embed directly in HTML without the need for a
further call to html_encode
.
raw
is passed to str()
, so almost any type can be
passed in to this parameter.
Example:
>>> cgi.url_encode("<foo>")
'%3Cfoo%3E'
url_decode(enc)
enc
: string
Returns: string
Converts +
to space characters in the enc
string
and then decodes URL %-escapes.
Example:
>>> cgi.url_decode("%3Cfoo%3E")
'<foo>'
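The encode and decode behaviour described above can be approximated with the Python 3 standard library. Both functions below are hypothetical stand-ins; in particular, the set of characters that cgi.url_encode leaves unescaped may differ from what urllib.parse.quote leaves unescaped with safe="".

```python
from urllib.parse import quote, unquote_plus


def url_encode_sketch(raw):
    """Stand-in: percent-escape everything except letters, digits and _.-~"""
    return quote(str(raw), safe="")


def url_decode_sketch(enc):
    """Stand-in: turn + into spaces, then decode %-escapes."""
    return unquote_plus(enc)


print(url_encode_sketch("<foo>"))
print(url_decode_sketch("%3Cfoo%3E"))
print(url_decode_sketch("a+b%2Bc"))
```

The last call shows the order of operations: the literal + becomes a space, while the escaped %2B decodes to a plus sign.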
traceback(req, html=0)
req
: object of type
cgi.Request
html
: true or false value
This function should only be called while an exception is being handled
(i.e. in an except
section). It emits a detailed traceback about
the exception to the server's error log. If html
references a
true value then the traceback is also sent as HTML to the browser. If
html
is false then the browser output is not altered in any way,
so it is up to the caller to arrange for suitable output to be sent.
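The calling pattern can be sketched with the standard library alone. The function report_exception is a hypothetical stand-in for the behaviour described above: it takes a log callable and a write callable instead of a Request object, and the "<pre>" wrapper is an invented output format.

```python
import html
import traceback


def report_exception(log, write, as_html=False):
    """Hypothetical sketch: call only from inside an except block."""
    text = traceback.format_exc()
    log(text)  # the traceback always goes to the error log
    if as_html:
        # Escape the text so that it is safe to embed in a page.
        write("<pre>" + html.escape(text) + "</pre>")


logged, sent = [], []
try:
    1 / 0
except ZeroDivisionError:
    report_exception(logged.append, sent.append, as_html=True)

print(logged[0].splitlines()[-1])
```

When as_html is false nothing is written to the client, so the caller remains responsible for sending a suitable error page.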
$Id: cgi.html,v 0416d65875b7 2014/03/05 17:37:06 jon $