httpd
An implementation of an HTTP 1.1 compliant Web server, as defined in RFC 2616.
Documents the HTTP server start options, some administrative functions, and the Erlang Web server callback API.
COMMON DATA TYPES
Type definitions that are used more than once in this module:
boolean() = true | false
string() = list of ASCII characters
path() = string() - representing a file or directory path.
ip_address() = {N1,N2,N3,N4} % IPv4
| {K1,K2,K3,K4,K5,K6,K7,K8} % IPv6
hostname() = string() - representing a host, for example "foo.bar.com"
property() = atom()
ERLANG HTTP SERVER SERVICE START/STOP
A web server can be configured to start when the inets application
starts, or it can be started dynamically at runtime by calling the
Inets application API inets:start(httpd, ServiceConfig) or
inets:start(httpd, ServiceConfig, How); see inets(3). The available
configuration options, also called properties, are described below.
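As a rough illustration of the dynamic start described above, the following sketch starts one server instance from a property list and stops it again. The module name and all values are placeholders. document_root is included on the assumption that the server needs a document directory, as the URL aliasing properties imply.

```erlang
-module(httpd_start_demo).
-export([start/0, stop/1]).

%% Start inets (if it is not running yet) and one httpd instance.
%% All property values are placeholders chosen for illustration.
start() ->
    case inets:start() of
        ok -> ok;
        {error, {already_started, inets}} -> ok
    end,
    %% Port 0 asks for an arbitrary available port.
    inets:start(httpd, [{port, 0},
                        {server_name, "localhost"},
                        {server_root, "/tmp"},
                        {document_root, "/tmp"}]).

%% Stop the instance using the pid returned by start/0.
stop(Pid) ->
    inets:stop(httpd, Pid).
```

A caller would match the result of start/0 against {ok, Pid} and later pass Pid to stop/1.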
File properties
When the web server is started at application start time, the
properties should be fetched from a configuration file. The file can
consist of a regular Erlang property list, i.e. [{Option, Value}]
where Option = property() and Value = term(), followed by a full
stop, or, for backwards compatibility, it can be an Apache-like
configuration file. If the web server is started dynamically at
runtime you may still specify a file, but you can also just specify
the complete property list.
{proplist_file, path()}
If this property is defined, inets expects to find all other properties defined in this file. Note that the file must include all properties listed under mandatory properties.
{file, path()}
If this property is defined, inets expects to find all other properties defined in this file, which uses an Apache-like syntax. Note that the file must include all properties listed under mandatory properties. In the Apache-like syntax, the property name is written as one word in which each new word begins with a capital letter, followed by white space, followed by the value, followed by a new line. For example:
{server_root, "/usr/local/www"} -> ServerRoot /usr/local/www
There are a few exceptions, documented for each property that behaves differently, and the special cases {directory, {path(), PropertyList}} and {security_directory, {Dir, PropertyList}}, which are represented as:
<Directory Dir>
<Properties handled as described above>
</Directory>
Note!
The properties proplist_file and file are mutually exclusive.
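A property-list configuration file might look like the fragment below. The file location, host name, and paths are hypothetical. document_root is included on the assumption that the server needs a document directory.

```erlang
%% Contents of a hypothetical file /var/tmp/httpd/conf/httpd.proplist.
%% The file holds a single Erlang term terminated by a full stop.
[{port, 8080},
 {server_name, "www.example.org"},
 {server_root, "/var/tmp/httpd"},
 {document_root, "/var/tmp/httpd/htdocs"}].
```

The server could then be started with inets:start(httpd, [{proplist_file, "/var/tmp/httpd/conf/httpd.proplist"}]).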
Mandatory properties
{port, integer()}
The port that the HTTP server shall listen on. If zero is specified as port, an arbitrary available port will be picked and you can use the httpd:info/2 function to find out which port was picked.
{server_name, string()}
The name of your server, normally a fully qualified domain name.
{server_root, path()}
Defines the server's home directory where log files etc. can be stored. Relative paths specified in other properties refer to this directory.
Communication properties
{bind_address, ip_address() | hostname() | any}
Defaults to any. Note that any is denoted * in the Apache-like configuration file.
{socket_type, ip_comm | {essl, Config::proplist()}}
For ssl configuration options see ssl:listen/2. Defaults to ip_comm.
{ipfamily, inet | inet6 | inet6fb4}
Defaults to inet6fb4. Note that this option is only used when the option socket_type has the value ip_comm.
{minimum_bytes_per_second, integer()}
If given, sets a minimum bytes per second value for connections. If the value is not reached, the socket for that connection is closed. The option helps reduce the risk of "slow DoS" attacks.
Erlang Web server API modules
{modules, [atom()]}
Defines which modules the HTTP server will use to handle
requests. Defaults to: [mod_alias, mod_auth, mod_esi,
mod_actions, mod_cgi, mod_dir, mod_get, mod_head, mod_log,
mod_disk_log]. Note that some mod-modules are dependent on
others, so the order cannot be entirely arbitrary. See the
Inets Web server Modules in the User's Guide for more information.
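Because the order matters, adding a callback module usually means spelling out the whole list. The fragment below is a sketch that assumes the property is named modules and that a hypothetical module mod_hello should get the chance to answer before the modules that serve directories and files. Whether that position is right depends on what the module does.

```erlang
%% Fragment of a ServiceConfig property list.
%% mod_hello is a hypothetical callback module; the rest is the default list.
{modules, [mod_alias, mod_auth, mod_esi, mod_actions, mod_cgi,
           mod_hello,
           mod_dir, mod_get, mod_head, mod_log, mod_disk_log]}
```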
Limit properties
{disable_chunked_transfer_encoding_send, boolean()}
This property allows you to disable chunked transfer-encoding when sending a response to an HTTP/1.1 client. By default this is false.
{keep_alive, boolean()}
Instructs the server whether or not to use persistent connections when the client claims to be HTTP/1.1 compliant. Default is true.
{keep_alive_timeout, integer()}
The number of seconds the server will wait for a subsequent request from the client before closing the connection. Default is 150.
{max_body_size, integer()}
Limits the size of the message body of an HTTP request. By default there is no limit.
{max_clients, integer()}
Limits the number of simultaneous requests that can be supported. Defaults to 150.
{max_header_size, integer()}
Limits the size of the message header of an HTTP request. Defaults to 10240.
{max_uri_size, integer()}
Limits the size of the HTTP request URI. By default there is no limit.
{max_keep_alive_request, integer()}
The number of requests that a client can make on one connection. When the server has responded to the number of requests defined by this property, the server closes the connection. The server will close it even if there are queued requests. Defaults to no limit.
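A few of these limits could be combined as in the fragment below. The property names are the ones inets is understood to use for the limits described above, so check them against the installed version. The numbers are arbitrary examples, not recommendations.

```erlang
%% Fragment of a ServiceConfig property list; values are arbitrary examples.
{keep_alive, true},
{keep_alive_timeout, 60},   % seconds to wait for a subsequent request
{max_clients, 100},         % simultaneous requests
{max_header_size, 10240},
{max_body_size, 1048576},
{max_uri_size, 2048}
```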
Administrative properties
{mime_types, [{MimeType, Extension}] | path()}
Where MimeType = string() and Extension = string(). Files delivered to the client are MIME typed according to RFC 1590. File suffixes are mapped to MIME types before file delivery. The mapping between file suffixes and MIME types can be specified as an Apache-like file as well as directly in the property list. Such a file may look like:
# MIME type     Extension
text/html       html htm
text/plain      asc txt
Defaults to [{"html","text/html"},{"htm","text/html"}].
{mime_type, string()}
When the server is asked to provide a document type that cannot be determined by the MIME type settings, the server uses this default type.
{server_admin, string()}
ServerAdmin defines the email address of the server administrator, to be included in any error messages returned by the server.
{server_tokens, prod | major | minor | minimal | os | full | {private, string()}}
ServerTokens defines how the value of the server header should look.
Example: Assuming the version of inets is 5.8.1, here is what the server header string could look like for the different values of server_tokens:
prod                      "inets"
major                     "inets/5"
minor                     "inets/5.8"
minimal                   "inets/5.8.1"
os                        "inets/5.8.1 (unix)"
full                      "inets/5.8.1 (unix/linux) OTP/R15B"
{private, "foo/bar"}      "foo/bar"
By default, the value is as before, which is minimal.
{log_format, common | combined}
Defines if access logs should be written according to the common
log format or to the extended common log format.
The common format is one line that looks like this:
remotehost rfc931 authuser [date] "request" status bytes
remotehost      The remote host.
rfc931          The client's remote username (RFC 931).
authuser        The username with which the user authenticated himself.
[date]          Date and time of the request (RFC 1123).
"request"       The request line exactly as it came from the client (RFC 1945).
status          The HTTP status code returned to the client (RFC 1945).
bytes           The content-length of the document transferred.
The combined format is one line that looks like this:
remotehost rfc931 authuser [date] "request" status bytes "referer" "user_agent"
"referer"       The URL the client was on before requesting your URL. (If it could not be determined, a minus sign is placed in this field.)
"user_agent"    The software the client claims to be using. (If it could not be determined, a minus sign is placed in this field.)
This affects the access logs written by mod_log and mod_disk_log.
{error_log_format, pretty | compact}
Defaults to pretty. If the error log is meant to be read
directly by a human, pretty will be the best option.
pretty has the format corresponding to:
io:format("[~s] ~s, reason: ~n ~p ~n~n", [Date, Msg, Reason]).
compact has the format corresponding to:
io:format("[~s] ~s, reason: ~w ~n", [Date, Msg, Reason]).
This affects the error logs written by mod_log and mod_disk_log.
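To see how the two layouts differ, the format strings quoted above can be tried outside the server. The sketch below only reuses those format strings with io_lib:format/2. The module and function names are made up for the illustration and are not part of httpd.

```erlang
-module(error_log_format_demo).
-export([pretty/3, compact/3]).

%% Render an entry with the "pretty" format string: the reason is
%% printed with ~p on a line of its own.
pretty(Date, Msg, Reason) ->
    lists:flatten(
      io_lib:format("[~s] ~s, reason: ~n ~p ~n~n", [Date, Msg, Reason])).

%% Render an entry with the "compact" format string: the reason is
%% printed with ~w on the same line.
compact(Date, Msg, Reason) ->
    lists:flatten(
      io_lib:format("[~s] ~s, reason: ~w ~n", [Date, Msg, Reason])).
```

With a large nested term as Reason, ~p may break the term over several indented lines, while ~w always keeps it on one line, which is easier for tools to parse.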
URL aliasing properties - requires mod_alias
{alias, {Alias, RealName}}
Where Alias = string() and RealName = string().
The Alias property allows documents to be stored in the local file
system instead of the document_root location. URLs with a path that
begins with Alias are mapped to local files that begin with
RealName, for example:
{alias, {"/image", "/ftp/pub/image"}}
and an access to http://your.server.org/image/foo.gif would refer to
the file /ftp/pub/image/foo.gif.
{re_write, {Re, Replacement}}
Where Re = string() and Replacement = string().
The ReWrite property allows documents to be stored in the local file
system instead of the document_root location. URLs are rewritten
by re:replace/3 to produce a path in the local file system.
For example:
{re_write, {"^/[~]([^/]+)(.*)$", "/home/\\1/public\\2"}}
and an access to http://your.server.org/~bob/foo.gif would refer to
the file /home/bob/public/foo.gif.
In an Apache-like configuration file the Re is separated
from Replacement with one single space, and as expected
backslashes do not need to be backslash escaped, so the
same example would become:
ReWrite ^/[~]([^/]+)(.*)$ /home/\1/public\2
Beware that any trailing space in Replacement will be used.
If you must have a space in Re, use for example the character
encoding \040; see re(3).
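The rewrite in the example can be checked directly with the re module. This is a small sketch, not the code path httpd itself uses. It calls re:replace/4 with the {return, list} option so that the result is a plain string, and the module name is hypothetical.

```erlang
-module(re_write_demo).
-export([rewrite/3]).

%% Apply one {Re, Replacement} pair to a URL path and return the
%% rewritten path as a string.
rewrite(Path, Re, Replacement) ->
    re:replace(Path, Re, Replacement, [{return, list}]).
```

Calling rewrite("/~bob/foo.gif", "^/[~]([^/]+)(.*)$", "/home/\\1/public\\2") gives "/home/bob/public/foo.gif", matching the mapping described above.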
{directory_index, [string()]}
DirectoryIndex specifies a list of resources to look for
if a client requests a directory using a / at the end of the
directory name. file depicts the name of a file in the
directory. Several files may be given, in which case the server
will return the first it finds, for example:
{directory_index, ["index.html", "welcome.html"]}
and access to http://your.server.org/docs/ would return
http://your.server.org/docs/index.html or
http://your.server.org/docs/welcome.html if index.html does not
exist.
CGI properties - requires mod_cgi
{script_alias, {Alias, RealName}}
Where Alias = string() and RealName = string().
Has the same behavior as the Alias property, except that
it also marks the target directory as containing CGI
scripts. URLs with a path beginning with Alias are mapped to
scripts beginning with RealName, for example:
{script_alias, {"/cgi-bin/", "/web/cgi-bin/"}}
and an access to http://your.server.org/cgi-bin/foo would cause
the server to run the script /web/cgi-bin/foo.
{script_re_write, {Re, Replacement}}
Where Re = string() and Replacement = string().
Has the same behavior as the ReWrite property, except that
it also marks the target directory as containing CGI
scripts. URLs are rewritten using Re and Replacement to produce
the path of the script, for example:
{script_re_write, {"^/cgi-bin/(\\d+)/", "/web/\\1/cgi-bin/"}}
and an access to http://your.server.org/cgi-bin/17/foo would cause
the server to run the script /web/17/cgi-bin/foo.
{script_nocache, boolean()}
If ScriptNoCache is set to true the HTTP server will by default add the header fields necessary to prevent proxies from caching the page. Generally this is something you want. Defaults to false.
{script_timeout, integer()}
The time in seconds the web server will wait between each chunk of data from the script. If the CGI script does not deliver any data before the timeout, the connection to the client will be closed. Defaults to 15.
{action, {MimeType, CgiScript}}
Where MimeType = string() and CgiScript = string().
Action adds an action, which will activate a CGI script
whenever a file of a certain MIME type is requested. It
propagates the URL and file path of the requested document using
the standard CGI PATH_INFO and PATH_TRANSLATED environment
variables. For example:
{action, {"text/plain", "/cgi-bin/log_and_deliver_text"}}
{script, {Method, CgiScript}}
Where Method = string() and CgiScript = string().
Script adds an action, which will activate a CGI script
whenever a file is requested using a certain HTTP method. The
method is either GET or POST as defined in RFC 1945. It
propagates the URL and file path of the requested document using
the standard CGI PATH_INFO and PATH_TRANSLATED environment
variables. For example:
{script, {"PUT", "/cgi-bin/put"}}
ESI properties - requires mod_esi
{erl_script_alias, {URLPath, [AllowedModule]}}
Where URLPath = string() and AllowedModule = atom().
erl_script_alias marks all URLs matching URLPath as erl
scheme scripts. A matching URL is mapped into a specific module
and function. For example:
{erl_script_alias, {"/cgi-bin/example", [httpd_example]}}
and a request to
http://your.server.org/cgi-bin/example/httpd_example:yahoo
would refer to httpd_example:yahoo/3 or, if that did not exist,
httpd_example:yahoo/2, and
http://your.server.org/cgi-bin/example/other:yahoo would
not be allowed to execute.
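A callback of arity 3 for the erl scheme might look like the sketch below. The module esi_hello and its function names are hypothetical. The header fields are sent in the first call to mod_esi:deliver/2 as a flat string, on the assumption that the server expects them there.

```erlang
-module(esi_hello).
-export([greet/3, header/0, body/1]).

%% Arity-3 callback for the erl scheme. SessionID identifies the
%% request, Env is the request environment, and Input is the query
%% string or request body supplied by the client.
greet(SessionID, _Env, Input) ->
    %% The first chunk carries the HTTP header fields as a flat string.
    ok = mod_esi:deliver(SessionID, header()),
    %% Later chunks carry the body.
    mod_esi:deliver(SessionID, body(Input)).

header() ->
    "Content-Type: text/plain\r\n\r\n".

body(Input) ->
    ["Hello, you sent: ", Input, "\n"].
```

With {erl_script_alias, {"/cgi-bin/example", [esi_hello]}} in the configuration, a request to http://your.server.org/cgi-bin/example/esi_hello:greet?name=bob would be routed to greet/3.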
{erl_script_nocache, boolean()}
If erl_script_nocache is set to true the server will add HTTP header fields that prevent proxies from caching the page. This is generally a good idea for dynamic content, since the content often varies between requests. Defaults to false.
{erl_script_timeout, integer()}
erl_script_timeout sets the time in seconds the server will wait between each chunk of data to be delivered through mod_esi:deliver/2. Defaults to 15. This is only relevant for scripts that use the erl scheme.
{eval_script_alias, {URLPath, [AllowedModule]}}
Where URLPath = string() and AllowedModule = atom(). Same as erl_script_alias but for scripts using the eval scheme. Note that this is only supported for backwards compatibility. The eval scheme is deprecated.
Log properties - requires mod_log
{error_log, path()}
Defines the filename of the error log file to be used to log server errors. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
{security_log, path()}
Defines the filename of the access log file to be used to log security events. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
{transfer_log, path()}
Defines the filename of the access log file to be used to log incoming requests. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
Disk Log properties - requires mod_disk_log
{disk_log_format, internal | external}
Defines the file format of the log files; see disk_log(3) for more information. If the internal file format is used, the log file will be repaired after a crash. When a log file is repaired, data might get lost. When the external file format is used, httpd will not start if the log file is broken. Defaults to external.
{error_disk_log, path()}
Defines the filename of the (disk_log(3)) error log file to be used to log server errors. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
{error_disk_log_size, {MaxBytes, MaxFiles}}
Where MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the (disk_log(3)) error log file. The disk_log(3) error log file is of type wrap log: MaxBytes will be written to each file and MaxFiles files will be used before the first file is truncated and reused.
{security_disk_log, path()}
Defines the filename of the (disk_log(3)) access log file which logs incoming security events, i.e. authenticated requests. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
{security_disk_log_size, {MaxBytes, MaxFiles}}
Where MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the disk_log(3) access log file. The disk_log(3) access log file is of type wrap log: MaxBytes will be written to each file and MaxFiles files will be used before the first file is truncated and reused.
{transfer_disk_log, path()}
Defines the filename of the (disk_log(3)) access log file which logs incoming requests. If the filename does not begin with a slash (/) it is assumed to be relative to the server_root.
{transfer_disk_log_size, {MaxBytes, MaxFiles}}
Where MaxBytes = integer() and MaxFiles = integer(). Defines the properties of the disk_log(3) access log file. The disk_log(3) access log file is of type wrap log: MaxBytes will be written to each file and MaxFiles files will be used before the first file is truncated and reused.
Authentication properties - requires mod_auth
{directory, {path(), [{property(), term()}]}}
Here follows the valid properties for directories:
{allow_from, all | [RegxpHostString]}
Defines a set of hosts which should be granted access to a
given directory. For example:
{allow_from, ["123.34.56.11", "150.100.23"]}
The host 123.34.56.11 and all machines on the 150.100.23
subnet are allowed access.
{deny_from, all | [RegxpHostString]}
Defines a set of hosts which should be denied access to a
given directory. For example:
{deny_from, ["123.34.56.11", "150.100.23"]}
The host 123.34.56.11 and all machines on the 150.100.23
subnet are not allowed access.
{auth_type, plain | dets | mnesia}
Sets the type of authentication database that is used for the directory. The key difference between the different methods is that dynamic data can be saved when Mnesia and Dets are used. This property is called AuthDbType in the Apache-like configuration files.
{auth_user_file, path()}
Sets the name of a file which contains the list of users and
passwords for user authentication. The filename can be either
absolute or relative to the server_root. If using the
plain storage method, this file is a plain text file, where
each line contains a user name followed by a colon, followed
by the non-encrypted password. If user names are duplicated,
the behavior is undefined. For example:
ragnar:s7Xxv7
edward:wwjau8
If using the dets storage method, the user database is
maintained by dets and should not be edited by hand. Use the
API functions in the mod_auth module to create or edit the user
database. This directive is ignored if using the mnesia
storage method. For security reasons, make sure that the
auth_user_file is stored outside the document tree of the Web
server. If it is placed in the directory which it protects,
clients will be able to download it.
{auth_group_file, path()}
Sets the name of a file which contains the list of user
groups for user authentication. The filename can be either
absolute or relative to the server_root. If you use the plain
storage method, the group file is a plain text file, where
each line contains a group name followed by a colon, followed
by the member user names separated by spaces. For example:
group1: bob joe ante
If using the dets storage method, the group database is
maintained by dets and should not be edited by hand. Use the
API of the mod_auth module to create or edit the group database.
This directive is ignored if using the mnesia storage method.
For security reasons, make sure that the auth_group_file is
stored outside the document tree of the Web server. If it is
placed in the directory which it protects, clients will be
able to download it.
{auth_name, string()}
Sets the name of the authorization realm (auth-domain) for a directory. This string informs the client about which user name and password to use.
{auth_access_password, string()}
If set to other than "NoPassword", the password is required for all API calls. If the password is set to "DummyPassword", the password must be changed before any other API calls. To secure the authentication data, the password must be changed after the web server is started, since it is otherwise written in clear text in the configuration file.
{require_user, [string()]}
Defines users which should be granted access to a given directory using a secret password.
{require_group, [string()]}
Defines groups whose member users should be granted access to a given directory using a secret password.
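Put together, a protected directory using the plain storage method might be configured as in the fragment below. The paths, realm, and user names are placeholders. The property names auth_type, auth_name, and require_user are the ones inets is understood to use, so verify them against the installed version. Note that the user and group files are kept outside the document tree, as advised above.

```erlang
%% Fragment of a ServiceConfig property list; all values are placeholders.
{directory, {"/var/tmp/httpd/htdocs/secret",
             [{auth_type, plain},
              {auth_user_file, "/var/tmp/httpd/auth/passwd"},
              {auth_group_file, "/var/tmp/httpd/auth/group"},
              {auth_name, "Secret area"},
              {require_user, ["ragnar", "edward"]},
              {allow_from, ["123.34.56.11", "150.100.23"]}]}}
```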
Htaccess authentication properties - requires mod_htaccess
{access_files, [path()]}
Specifies which filenames are used for access files. When a request comes in, every directory in the path to the requested asset is searched for files with the names specified by this parameter. If such a file is found, the file is parsed and the restrictions specified in it are applied to the request.
Security properties - requires mod_security
{security_directory, {path(), [{property(), term()}]}}
Here follows the valid properties for security directories:
{data_file, path()}
Name of the security data file. The filename can be either absolute or relative to the server_root. This file is used to store persistent data for the mod_security module.
{max_retries, integer()}
Specifies the maximum number of attempts a user has to authenticate before being blocked. If a user successfully authenticates while blocked, the user will receive a 403 (Forbidden) response from the server. If the user makes a failed attempt while blocked, the server will return 401 (Unauthorized), for security reasons. Defaults to 3. May also be set to infinity.
{block_time, integer()}
Specifies the number of minutes a user is blocked. After this amount of time, the user automatically regains access. Defaults to 60.
{fail_expire_time, integer()}
Specifies the number of minutes a failed user authentication is remembered. If a user authenticates after this amount of time, the previous failed authentications are forgotten. Defaults to 30.
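A security directory that spells out the defaults described above could look like this fragment. The paths are placeholders, and the property names data_file, max_retries, block_time, and fail_expire_time are the ones inets is understood to use, so treat them as assumptions to verify.

```erlang
%% Fragment of a ServiceConfig property list; paths are placeholders.
{security_directory, {"/var/tmp/httpd/htdocs/secret",
                      [{data_file, "/var/tmp/httpd/security_data"},
                       {max_retries, 3},         % failed attempts before blocking
                       {block_time, 60},         % minutes
                       {fail_expire_time, 30}]}} % minutes
```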
Functions
info(Pid) ->
info(Pid, Properties) -> [{Option, Value}]
Properties = [property()]
Option = property()
Value = term()
Fetches information about the HTTP server. When called with only the pid, all properties are fetched; when called with a list of specific properties, only those are fetched. Available properties are the same as the server's start options.
Note!
Pid is the pid returned from inets:start/[2,3]. It can also be retrieved from inets:services/0 or inets:services_info/0; see inets(3).
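One use of info/2 is finding out which port was picked when the server was started with port 0. The sketch below starts a throwaway instance, asks for the port property, and stops the instance again. The module name and property values are placeholders, and document_root is included on the assumption that the server needs a document directory.

```erlang
-module(httpd_info_demo).
-export([picked_port/0]).

%% Start a temporary server on an arbitrary port and return that port.
picked_port() ->
    case inets:start() of
        ok -> ok;
        {error, {already_started, inets}} -> ok
    end,
    {ok, Pid} = inets:start(httpd, [{port, 0},
                                    {server_name, "localhost"},
                                    {server_root, "/tmp"},
                                    {document_root, "/tmp"}]),
    %% Ask only for the port property; the result is a property list.
    Info = httpd:info(Pid, [port]),
    Port = proplists:get_value(port, Info),
    ok = inets:stop(httpd, Pid),
    Port.
```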
info(Address, Port) ->
info(Address, Port, Properties) -> [{Option, Value}]
Address = ip_address()
Port = integer()
Properties = [property()]
Option = property()
Value = term()
Fetches information about the HTTP server. When called with only the Address and Port, all properties are fetched; when called with a list of specific properties, only those are fetched. Available properties are the same as the server's start options.
Note!
Address has to be the IP address and cannot be the hostname.
reload_config(Config, Mode) -> ok | {error, Reason}
Config = path() | [{Option, Value}]
Option = property()
Value = term()
Mode = non_disturbing | disturbing
Reloads the HTTP server configuration without restarting the server. Incoming requests will be answered with a temporary down message during the time it takes to reload.
Note!
Available properties are the same as the server's start options, although the properties bind_address and port cannot be changed.
If Mode is disturbing, the server is blocked forcefully, all ongoing requests are terminated, and the reload starts immediately. If Mode is non_disturbing, no new connections are accepted, but the ongoing requests are allowed to complete before the reload is done.
ERLANG WEB SERVER API DATA TYPES
ModData = #mod{}
-record(mod, {
    data = [],
    socket_type = ip_comm,
    socket,
    config_db,
    method,
    absolute_uri,
    request_uri,
    http_version,
    request_line,
    parsed_header = [],
    entity_body,
    connection
}).
To access the record in your callback module use
-include_lib("inets/include/httpd.hrl").
The fields of the mod record have the following meaning:
data
[{InteractionKey,InteractionValue}] is used to propagate data between modules. Depicted interaction_data() in function type declarations.
socket_type
socket_type(). Indicates whether it is an ip socket or an ssl socket.
socket
The socket, in ip_comm or ssl format depending on the socket_type.
config_db
Depicted config_db() in function type declarations.
method
"GET" | "POST" | "HEAD" | "TRACE", that is, the HTTP method.
absolute_uri
For example "http://ServerName:Port/cgi-bin/find.pl?person=jocke".
request_uri
The Request-URI as defined in RFC 1945, for example "/cgi-bin/find.pl?person=jocke".
http_version
The HTTP version of the request, that is "HTTP/0.9", "HTTP/1.0", or "HTTP/1.1".
request_line
The Request-Line as defined in RFC 1945, for example "GET /cgi-bin/find.pl?person=jocke HTTP/1.0".
parsed_header
[{HeaderKey,HeaderValue}]. parsed_header contains all HTTP header fields from the HTTP request stored in a list as key-value tuples. See RFC 2616 for a listing of all header fields. For example the date field would be stored as: {"date","Wed, 15 Oct 1997 14:35:17 GMT"}. RFC 2616 defines that HTTP is a case insensitive protocol and the header fields may be in lower case or upper case. Httpd will ensure that all header field names are in lower case.
entity_body
The Entity-Body as defined in RFC 2616, for example data sent from a CGI script using the POST method.
connection
true | false. If set to true, the connection to the client is a persistent connection and will not be closed when the request is served.
ERLANG WEB SERVER API CALLBACK FUNCTIONS
Functions
Module:do(ModData)-> {proceed, OldData} | {proceed, NewData} | {break, NewData} | done
OldData = list()
NewData = [{response,{StatusCode,Body}}] | [{response,{response,Head,Body}}] | [{response,{already_sent,StatusCode,Size}}]
StatusCode = integer()
Body = io_list() | nobody | {Fun, Arg}
Head = [HeaderOption]
HeaderOption = {Option, Value} | {code, StatusCode}
Option = accept_ranges | allow | cache_control | content_MD5 | content_encoding | content_language | content_length | content_location | content_range | content_type | date | etag | expires | last_modified | location | pragma | retry_after | server | trailer | transfer_encoding
Value = string()
Fun = fun(Arg) -> sent | close | Body
Arg = [term()]
When a valid request reaches httpd, it calls do/1 in
each module defined by the Modules configuration
option. The function may generate data for other modules
or a response that can be sent back to the client.
The field data in ModData is a list. This list will be
the list returned from the last call to do/1.
Body is the body of the HTTP response that will be
sent back to the client; an appropriate header will be
appended to the message. StatusCode will be the
status code of the response; see RFC 2616 for the appropriate
values.
Head is a key-value list of HTTP header fields. The
server will construct an HTTP header from this data. See RFC
2616 for the appropriate value for each header field. If the
client is an HTTP/1.0 client, the server will filter the
list so that only HTTP/1.0 header fields will be sent back
to the client.
If Body is returned and equal to {Fun,Arg},
the Web server will try apply/2 on Fun with
Arg as argument and expect that the fun either
returns a list (Body) that is an HTTP response or the
atom sent if the HTTP response is sent back to the
client. If close is returned from the fun, something has gone
wrong and the server will signal this to the client by
closing the connection.
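A minimal do/1 callback could look like the sketch below. The module mod_hello and the "/hello" path are hypothetical, and the check for an existing response entry in data is an assumed convention for cooperating with earlier modules, not something this page prescribes. self_check/0 exists only so that the behavior can be tried without a running server.

```erlang
-module(mod_hello).
-export([do/1, self_check/0]).
-include_lib("inets/include/httpd.hrl").

%% Answer GET /hello with a fixed body unless an earlier module has
%% already produced a response. All other requests pass through.
do(#mod{method = "GET", request_uri = "/hello", data = Data}) ->
    case proplists:is_defined(response, Data) of
        true  -> {proceed, Data};
        false -> {proceed, [{response, {200, "Hello from mod_hello\n"}} | Data]}
    end;
do(#mod{data = Data}) ->
    {proceed, Data}.

%% Exercise both clauses with hand-built #mod{} records.
self_check() ->
    {proceed, [{response, {200, _}}]} =
        do(#mod{method = "GET", request_uri = "/hello"}),
    {proceed, []} =
        do(#mod{method = "GET", request_uri = "/other"}),
    ok.
```

To be called by the server, the module has to be listed in the modules used to handle requests.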
Module:load(Line, AccIn)-> eof | ok | {ok, AccOut} | {ok, AccOut, {Option, Value}} | {ok, AccOut, [{Option, Value}]} | {error, Reason}
Line = string()
AccIn = [{Option, Value}]
AccOut = [{Option, Value}]
Option = property()
Value = term()
Reason = term()
Load is used to convert a line in an Apache-like
configuration file to an {Option, Value} tuple. Some
more complex configuration options, such as directory
and security_directory, will create an
accumulator. This function only needs clauses for the
options implemented by this particular callback module.
Module:store({Option, Value}, Config)-> {ok, {Option, NewValue}} | {error, Reason}
Line = string()
Option = property()
Config = [{Option, Value}]
Value = term()
Reason = term()
This function is used to check the validity of the configuration options before saving them in the internal database. This function may also have a side effect, i.e. setting up necessary extra resources implied by the configuration option. It can also resolve possible dependencies among configuration options by changing the value of the option. This function only needs clauses for the options implemented by this particular callback module.
Module:remove(ConfigDB) -> ok | {error, Reason}
ConfigDB = ets_table()
Reason = term()
When httpd is shut down, it will try to execute remove/1
in each Erlang web server callback module. The
programmer may use this function to clean up resources
that may have been created in the store function.
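The three configuration callbacks above can be sketched together for a single made-up option. The module name, the Apache-like directive HelloGreeting, and the option hello_greeting are all hypothetical. string:trim/1 assumes a reasonably recent OTP release.

```erlang
-module(mod_hello_conf).
-export([load/2, store/2, remove/1]).

%% Convert the hypothetical line "HelloGreeting <text>" into an
%% {Option, Value} tuple. The accumulator is passed through untouched.
%% Only the option implemented by this module has a clause.
load("HelloGreeting " ++ Text, AccIn) ->
    {ok, AccIn, {hello_greeting, string:trim(Text)}}.

%% Check the option before it is saved: it must be a non-empty string.
store({hello_greeting, Text}, _Config) when is_list(Text), Text =/= [] ->
    {ok, {hello_greeting, Text}};
store({hello_greeting, Text}, _Config) ->
    {error, {bad_hello_greeting, Text}}.

%% store/2 allocated no extra resources, so nothing needs cleaning up.
remove(_ConfigDB) ->
    ok.
```

A module that opened files or created tables in store/2 would release them in remove/1.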
ERLANG WEB SERVER API HELP FUNCTIONS
Functions
parse_query(QueryString) -> [{Key,Value}]
QueryString = string()
Key = string()
Value = string()
parse_query/1 parses incoming data to erl and eval
scripts (see mod_esi(3)) as defined in the standard
URL format, that is, '+' becomes 'space' and
hexadecimal characters (%xx) are decoded.
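A short sketch of a call, with a made-up query string that contains both a '+' and a %xx escape. The wrapper module is hypothetical.

```erlang
-module(parse_query_demo).
-export([run/0]).

%% Parse a sample query string into a list of {Key, Value} tuples.
run() ->
    httpd:parse_query("person=jocke&city=new+york&note=50%25").
```

Individual values can then be looked up with proplists:get_value/2, for example proplists:get_value("person", parse_query_demo:run()).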