NaviServer - programmable web server

ns_http(n) 5.0.0a naviserver "NaviServer Built-in Commands"

Name

ns_http - HTTP client functionality

Description

The command ns_http provides HTTP and HTTPS client functionality. It can be used to create, dispatch, wait on and/or cancel requests and process replies from web servers, where NaviServer acts as a web client. This is useful, for example, when consuming web services and REST interfaces.

COMMANDS

ns_http cancel id

Cancels the queued HTTP/HTTPS request identified by id. The command returns an empty result. The purpose of this command is to terminate runaway requests or requests that have timed out. Even completed requests can be cancelled if nobody is interested in the request result.

id

ID of the HTTP request to cancel.

ns_http cleanup

Cancels all pending HTTP/HTTPS requests issued in the current interpreter. At this point, any Tcl channels that have been optionally assigned to a task are closed automatically.

ns_http keepalives

Returns the number and slot (socket) usages for active or recent ns_http requests. The following example shows how the socket opened to google.com is kept open for about 2 seconds. The slots for the sockets are allocated dynamically depending on the system load.

 % ns_http run  -keepalive 2s https://google.com/
 % ns_http keepalives
  {slot 0 state waiting expire 1.999983 peer google.com:443 sock 82} {slot 1 state free}

The returned information is useful for monitoring and debugging busy HTTP client operations.
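
As a rough illustration of such monitoring, the following sketch counts the slots per state. The proc name keepalive_summary is hypothetical, and the code assumes that every entry returned by ns_http keepalives is a dictionary with a state key, as in the output shown above.

```tcl
# Hypothetical helper: count keep-alive slots per state.
# Assumes each slot entry is a dict containing a "state" key.
proc keepalive_summary {slots} {
    set counts [dict create]
    foreach slot $slots {
        # dict incr creates the key with value 1 when it is missing
        dict incr counts [dict get $slot state]
    }
    return $counts
}
```

It could be called as keepalive_summary [ns_http keepalives]; for the two slots shown above the result would be a dictionary with waiting 1 and free 1.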

ns_http list ?id?

Lists running ns_http requests. When id is specified, returns a single list of the form: id url status. If id is not specified, a list of lists is returned, where each element has the described format. The value of the status can be one of done, running, or error.

id

Optional ID of the HTTP request to list.

ns_http queue ?-binary? ?-body value? ?-body_chan value? ?-body_file value? ?-body_size integer? ?-cafile value? ?-capath value? ?-cert value? ?-done_callback value? ?-expire time? ?-headers setId? ?-hostname value? ?-keep_host_header? ?-keepalive time? ?-method value? ?-outputchan value? ?-outputfile value? ?-partialresults? ?-proxy value? ?-raw? ?-response_header_callback value? ?-spoolsize memory-size? ?-timeout time? ?-unix_socket value? ?-verify? url

Opens a connection to the web server denoted in the url and returns (unless a -done_callback is specified) an id, which might be used later in ns_http wait or ns_http cancel to refer to this request. The command supports both HTTP and HTTPS URIs. The request is run in the default task queue in a dedicated per-queue thread.

The description of the available options is in the section OPTIONS below.
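
A typical reason for using ns_http queue is to issue several requests concurrently and to collect the results afterwards. The following is a minimal sketch of this pattern, not a definitive implementation. The proc name fetch_all and its arguments are hypothetical; only the options -timeout of ns_http queue and the status member of the result dictionary are taken from this page.

```tcl
# Hypothetical helper: queue all requests first, then wait for each.
# Returns a dict mapping every URL to its HTTP status code, or to an
# "error: ..." string when ns_http wait raised an error.
proc fetch_all {urls {timeout 5s}} {
    set ids {}
    foreach url $urls {
        lappend ids [ns_http queue -timeout $timeout $url]
    }
    set statuses [dict create]
    foreach url $urls id $ids {
        if {[catch {ns_http wait $id} r]} {
            dict set statuses $url "error: $r"
        } else {
            dict set statuses $url [dict get $r status]
        }
    }
    return $statuses
}
```

Since all requests are queued before the first ns_http wait, the total time is roughly bounded by the slowest request rather than by the sum of all requests.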

ns_http run ?-binary? ?-body value? ?-body_chan value? ?-body_file value? ?-body_size integer? ?-cafile value? ?-capath value? ?-cert value? ?-done_callback value? ?-expire time? ?-headers setId? ?-hostname value? ?-keep_host_header? ?-keepalive time? ?-method value? ?-outputchan value? ?-outputfile value? ?-partialresults? ?-proxy value? ?-raw? ?-response_header_callback value? ?-spoolsize memory-size? ?-timeout time? ?-unix_socket value? ?-verify? url

Sends an HTTP or HTTPS request and waits for the result. The command ns_http run is similar to ns_http queue followed by ns_http wait. The HTTP request is run in the same thread as the caller.

The description of the available options is in the section OPTIONS below. The result value is described in RETURN VALUE.

ns_http stats ?id?

Returns statistics about currently running requests in the form of Tcl dictionaries. If the optional id is specified, a single dictionary with details about the requested task is returned, or an empty result if the task cannot be found. Otherwise, a list of dictionaries is returned.

Each dictionary contains the following keys: task, url, requestlength, replylength, sent, received, sendbodysize, replybodysize, replysize. The member task contains the ID of the HTTP task, and url contains the URL of the given task. The member requestlength contains the length of the complete HTTP request, including the request line, all headers, and the optional request body. The member replylength contains the value of the content-length as returned by the remote. This can be zero if the length of the returned data is not known in advance.

The member sent contains the number of bytes sent to the remote, including the request line, all headers, and the optional request body. The member received contains the number of bytes received from the remote, including the status line, all headers, and the optional reply body. The member sendbodysize contains the number of bytes of the request body sent to the remote so far.

The member replybodysize contains the number of bytes of the reply body received from the remote so far. The member replysize also counts the body bytes received so far, but before the optional decompression step for compressed content, whereas replybodysize counts the bytes of the decompressed content. For uncompressed reply content, replysize and replybodysize have the same value.

id

Optional ID of the HTTP request to get statistics for.
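
The members replylength and replysize can be combined into a rough download progress indicator. The following sketch assumes a single statistics dictionary as described above; the proc name reply_progress is hypothetical.

```tcl
# Hypothetical helper: percentage of the reply body received so far.
# Returns an empty string when the content-length is not known
# (replylength is zero).
proc reply_progress {stats} {
    set total [dict get $stats replylength]
    if {$total <= 0} {
        return ""
    }
    # replysize counts body bytes as received on the wire, which is
    # the quantity the content-length refers to.
    return [expr {100.0 * [dict get $stats replysize] / $total}]
}
```

For a request with the ID http0 this could be called as reply_progress [ns_http stats http0], provided the task still exists.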

ns_http taskthreads

Returns a list of Tcl dictionaries containing information about the state of the ns_http task threads. Every dict in this list contains the keys name, running, and requests, where the name is the name of the task thread, running is the number of currently running ns_http tasks, and requests is the total number of requests processed so far by this task thread.

In the example below, there are two task queues defined.

 % ns_http taskthreads
 {name tclhttp.0 running 42 requests 21925} {name tclhttp.1 running 37 requests 25739}

The total number of task threads can be tailored via the configuration parameter nshttptaskthreads, as described in the section CONFIGURATION.
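
For monitoring purposes, the per-thread values can be aggregated. The following sketch sums the running member over all task threads; the proc name total_running_tasks is hypothetical.

```tcl
# Hypothetical helper: total number of currently running ns_http
# tasks over all task threads.
proc total_running_tasks {threads} {
    set total 0
    foreach thread $threads {
        incr total [dict get $thread running]
    }
    return $total
}
```

It could be called as total_running_tasks [ns_http taskthreads]; for the sample output above the result would be 79.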

ns_http wait ?-timeout time? id

Waits for the queued command specified by the id returned from ns_http queue to complete. The -timeout option specifies the maximum time to wait for the request to complete. The time can be specified in any supported ns_time format.

On success, ns_http wait returns the same dictionary as ns_http run. On error, it leaves a descriptive error message in the interpreter result. On timeout, it additionally sets the Tcl variable errorCode to NS_TIMEOUT. The result value is described in RETURN VALUE.

id

ID of the HTTP request to wait for.
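
The following sketch shows one way to tell a timeout apart from other errors. It assumes that the timeout is signaled via the Tcl error code NS_TIMEOUT; the proc name wait_or_default is hypothetical.

```tcl
# Hypothetical helper: wait for a queued request and return a default
# value on timeout. All other errors are passed on to the caller.
# Assumes the error code of a timed-out wait is NS_TIMEOUT.
proc wait_or_default {id timeout default} {
    if {[catch {ns_http wait -timeout $timeout $id} result opts]} {
        if {[lindex [dict get $opts -errorcode] 0] eq "NS_TIMEOUT"} {
            return $default
        }
        # Re-raise the original error with its original options
        return -options $opts $result
    }
    return $result
}
```

A call could look like wait_or_default $id 2s {status 0}, where the default value is whatever the application treats as "no answer".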

OPTIONS

-binary

transmit the content in binary form (as a Tcl byte-array) no matter what the content-type specifies.

-body value

transmit the content of the passed string as the request body. This option is mutually exclusive with -body_file and -body_chan. The implementation will try to guess the "content-type" of the body by checking the type of the passed body.

-body_chan value

transmit the content read from the specified Tcl channel, which must be opened for reading, as the request body. The channel must be in blocking mode and should be seekable (unless -body_size is specified). This option is mutually exclusive with -body and -body_file. The caller should put a "content-type" header into the passed -headers set, since the implementation cannot guess the correct value. If none is found, "application/octet-stream" is assumed. For the ns_http queue command, the -body_chan channel is dissociated from the current interpreter/thread and its ownership is transferred to the thread that runs the request. Upon ns_http wait, the channel is handed back to the running interpreter and can be manipulated again. It is the caller's responsibility to close the channel when it is no longer needed. The implementation will not do that (see ns_http cleanup for the exception).

-body_file value

transmit the file with the specified filename as the request body. This option is mutually exclusive with -body and -body_chan. The implementation will try to guess the "content-type" of the body by checking the extension of the passed-in filename.

-body_size integer

specifies the expected size of the data which will be sent as the HTTP request body in bytes. This option must be used when sending body data via Tcl channels, which are not capable of seeking. It is optional if sending body data from memory or from a named file.

-cafile value

used for HTTPS URIs to specify the locations, at which CA certificates for verification purposes are located. The certificates available via cafile and capath are trusted. The cafile points to a file of CA certificates in PEM format. The file can contain several CA certificates.

-capath value

allows for HTTPS URIs to specify the locations, at which CA certificates for verification purposes are located. capath points to a directory containing CA certificates in PEM format. The files each contain one CA certificate. For more details, see https://www.openssl.org/docs/manmaster/ssl/SSL_CTX_load_verify_locations.html

-cert value

used for HTTPS URIs to use the specified client certificate. The certificates must be in PEM format and must be sorted starting with the subject's certificate (actual client or server certificate), followed by intermediate CA certificates if applicable, and ending at the highest level (root) CA.

-connecttimeout time

time to wait for connection setup and socket readable/writable state. The time can be specified in any supported ns_time format. When a domain name is resolved against several IP addresses, the provided timeout span is used for every IP address. When -connecttimeout is not specified, the value of -timeout is used.

-done_callback value

this callback is executed as a Tcl script when the request is completed. Two arguments are appended to the provided command: a flag indicating a Tcl error and a data argument. When the flag has the value 0, the call was successful, and the same dict as returned by ns_http run is provided as data. When the flag is 1, something went wrong. When the call was started with the option -partialresults, the data argument is the result dict in this case as well. Otherwise, an error message is provided instead. When -done_callback is used, ns_http queue returns an empty result and the user has no further control (wait, cancel) over the task.
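
A minimal callback might look as follows. This is only a sketch: the proc name my_done_callback and the log messages are hypothetical, and the code assumes that the request was started without -partialresults, so that the data argument is an error message when the flag is 1.

```tcl
# Hypothetical -done_callback: receives the error flag and the data
# argument appended by ns_http.
proc my_done_callback {failed data} {
    if {$failed} {
        ns_log error "ns_http request failed: $data"
    } else {
        ns_log notice "ns_http request finished with status [dict get $data status]"
    }
}
```

The callback could be registered with a call such as ns_http queue -done_callback my_done_callback followed by the URL.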

-expire time

time to wait for the whole request to complete. Upon expiry of this timer, request processing is unconditionally stopped, regardless of whether the connection or some data to read/write is still pending. The time can be specified in any supported ns_time format.

-headers setId

the ID of the ns_set containing the additional headers to include in the request.

-hostname value

used for HTTPS URIs to specify the hostname for the server certificate. This option has to be used when the host supports virtual hosting, is configured with multiple certificates, and supports the SNI (Server Name Indication) extension of TLS.

-keep_host_header

allows the Host: header field for the request to be passed in via the -headers argument, otherwise it is overwritten.

-keepalive time

when specified, set the keep-alive timeout of the connection to the specified duration. The time can be specified in any supported ns_time format.

-method value

Standard HTTP/HTTPS request method such as GET, POST, HEAD, PUT etc.

-outputchan value

receive the response content into the specified Tcl or ns_connchan channel. When a Tcl channel is used, it must be opened for writing and must be in blocking mode. The option -outputchan is mutually exclusive with -outputfile. For the ns_http queue command, the -outputchan channel is dissociated from the current interpreter/thread and its ownership is transferred to the thread that runs the request. Upon ns_http wait, the channel is handed back to the running interpreter and can be manipulated again. It is the caller's responsibility to close the channel when it is no longer needed. The implementation will not do that (see ns_http cleanup for the exception).

-outputfile value

receive the response content into the specified filename. This option is mutually exclusive with -outputchan.

-partialresults

When specified, return also partial results. Additionally, no exception is raised and the error is returned in the dict member error. This option is useful to inspect partial results also when e.g. the server terminates the connection unexpectedly.
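
The following sketch shows how the error member might be checked when -partialresults is used. The proc name fetch_tolerant and the log message are hypothetical.

```tcl
# Hypothetical helper: run a request with -partialresults and log a
# warning when the result dict reports an error. The (possibly
# partial) result dict is returned in both cases.
proc fetch_tolerant {url} {
    set r [ns_http run -partialresults $url]
    if {[dict exists $r error]} {
        ns_log warning "request to $url failed: [dict get $r error]"
    }
    return $r
}
```

The caller can then decide whether the partial content is still usable, instead of having to catch an exception.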

-proxy value

Controls whether to handle HTTP/HTTPS requests over an intermediate proxy. The argument must be a valid Tcl dictionary with (at least) the following keys: host, port. Optionally, a tunnel boolean key may be specified. The host must hold the hostname or IP address of the proxy server. The port is the TCP port of the proxy server. If host is omitted, no other keys from the dictionary are evaluated and the proxy connection is suppressed; the request is handled as if the option was not specified. If, however, host is specified, it will require the presence of the port, otherwise an error is thrown. The optional tunnel controls when to use the HTTP-tunnel facility. Without it, or if set to false, the HTTP connections are handled over the caching-proxy and HTTPS connections over the HTTP-tunnel. With the tunnel set to true, the HTTP-tunneling is used for both HTTP and HTTPS connections. Currently, no proxy authentication is supported. This will be added later.
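
The dictionary for -proxy can be built with the standard dict command. The following sketch uses a hypothetical helper named proxy_spec; the key names host, port, and tunnel are the ones described above.

```tcl
# Hypothetical helper: build the dictionary expected by -proxy.
# The tunnel flag defaults to false.
proc proxy_spec {host port {tunnel false}} {
    return [dict create host $host port $port tunnel $tunnel]
}
```

A request over a proxy could then be started with ns_http run -proxy [proxy_spec proxy.example.com 3128] followed by the URL, where the proxy hostname and port are placeholders.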

-raw

delivers the content as-is (unmodified), regardless of the content encoding. This option is useful for tunneling modes.

-response_header_callback value

this callback will be executed when the response headers from the target of the ns_http request are received. The value is a Tcl command, which is called with an additional argument containing response data. The argument has the form of a Tcl dictionary with the keys status, phrase, headers, and outputchan. For example, in a reverse proxy configuration, an ns_connchan channel can be specified at the start of the request to a backend server via -outputchan, and when the response header fields from the backend server are received, these can be modified and sent via connchan back to the client.

 proc my_responseheaders_callback {d} {
   dict with d {
     set response "HTTP/1.1 $status $phrase\r\n"
     foreach {key value} [ns_set array $headers] {
        append response "$key: $value\r\n"
     }
     append response \r\n
     ns_connchan write $outputchan $response
   }
 }

-spoolsize memory-size

In case the result is larger than the given value, it will be spooled to a temporary file, a named file (see -outputfile) or the Tcl channel (see -outputchan). The value can be specified in memory units (kB, MB, GB, KiB, MiB, GiB).

-timeout time

time to wait for socket interactions (read/write). The value is used as well as a default value for -connecttimeout. The time can be specified in any supported ns_time format. When a domain name is resolved against several IP addresses, the provided timeout span is used for every IP address. The default timeout is 5s.

-unix_socket value

When specified, this parameter should contain the Unix Domain Socket (UDS) to connect to. For example, when a web server is listening on a Unix domain socket named /tmp/http.socket, ns_http can be used as follows. The URL is still used to determine the Host: request header field.

 % ns_http run -unix_socket /tmp/http.socket http://foo.org/

-verify

used for HTTPS URIs to specify that the server certificate should be verified. If the verification process fails, the TLS/SSL handshake is immediately terminated with an alert message containing the reason for the verification failure. If no server certificate is sent, because an anonymous cipher is used, this option is ignored.

RETURN VALUE

The commands ns_http run and ns_http wait return a dictionary containing the following elements:

status

time

headers

body

body_chan

error

file

https

outputchan

The first three members are always returned, the other elements are conditional. The status contains the HTTP/HTTPS status code (200, 201, 400, etc.). The time contains the elapsed request time in the ns_time format. The headers contains the name of the ns_set with the response headers.

When none of the output options are used, the result is received either in memory (member body) or in a spool file (member file). This decision depends on the value of -spoolsize. Both members are missing in the result dictionary when -outputchan was used.

For requests with -body_chan or -outputchan, the members body_chan and outputchan are added to the result dictionary as well. For HTTPS requests, the result contains the member https with some low-level TLS parameters in a Tcl dictionary format.
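
Since the reply content can end up either in the member body or in the file named by the member file, callers often need to handle both cases. The following sketch does so; the proc name reply_content is hypothetical, and reading the spool file in binary mode is an assumption that may need adjusting for text content.

```tcl
# Hypothetical helper: return the reply content from a result dict,
# no matter whether it was kept in memory or spooled to a file.
# Returns an empty string when neither member is present (e.g. when
# -outputchan was used).
proc reply_content {result} {
    if {[dict exists $result body]} {
        return [dict get $result body]
    }
    if {[dict exists $result file]} {
        # Read the spool file as raw bytes
        set f [open [dict get $result file] rb]
        set data [read $f]
        close $f
        return $data
    }
    return ""
}
```

It could be used as reply_content [ns_http run -spoolsize 1kB $url], with $url standing for the request URL.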

EXAMPLES

First, a minimal example to retrieve a page with the HTTP GET method:

 % ns_http run http://www.google.com
 status 200 time 0:174146 headers d0 body { ... }

Here is the same example, using separate ns_http queue and ns_http wait commands.

 % ns_http queue http://www.google.com
 http0
 
 % ns_http wait http0
 status 200 time 0:177653 headers d0 body { ... }

The second example is a code snippet making a request via HTTPS (note that HTTPS is supported only when NaviServer was compiled with OpenSSL support).

 % set result [ns_http run https://www.google.com]
 % dict get $result status
 302

If the returned data is too large to be retained in memory, you can use the -spoolsize to control when the content should be spooled to file. The spooled filename is contained in the resulting dict under the key file.

 % set result [ns_http run -spoolsize 1kB https://www.google.com]
 % dict get $result file
 /tmp/http.83Rfc5

For connecting to a server with virtual hosting that provides multiple certificates via SNI (Server Name Indication) the option -hostname is required.

The third example is a code snippet that makes a POST request via HTTPS and provides url-encoded POST data. The example sets a larger timeout on the request, provides request headers, and logs the reply headers.

 ##################################################
 # Construct POST data using
 # query variable "q" with value "NaviServer"
 ##################################################
 set post_data [join [lmap {key value} {
     q NaviServer
 } {
   set _ "[ns_urlencode $key]=[ns_urlencode $value]"
 }] &]
 
 ##################################################
 # Submit POST request with provided "content-type"
 # to the "httpbin.org" site using HTTPS
 ##################################################
 set requestHeaders [ns_set create headers "content-type" "application/x-www-form-urlencoded"]
 
 set r [ns_http run -method POST \
   -headers $requestHeaders \
   -timeout 10.0 \
   -body $post_data \
   https://httpbin.org/anything]
 
 ##################################################
 # Output results from the result dict "r"
 ##################################################
 ns_log notice "status [dict get $r status]"
 ns_log notice "reply [dict get $r body]"
 ns_log notice "headers [dict get $r headers]"

The fourth example is a code snippet that sets a larger timeout on the request, passes request headers in an ns_set, and spools the result to a file if it is larger than 1000 bytes.

 set requestHeaders [ns_set create headers Host localhost]
 
 set h [ns_http queue -timeout 10.0 -headers $requestHeaders -spoolsize 1kB http://www.google.com]
 set r [ns_http wait $h]
 
 if {[dict exists $r file]} {
   set F [dict get $r file]
   ns_log notice "Spooled [file size $F] bytes to $F"
   file delete -- $F
 } else {
   ns_log notice "Got [string length [dict get $r body]] bytes"
 }

The next example is for downloading a file from the web into a named file or passed Tcl channel. Note the -spoolsize of zero, which will redirect all received data into the file/channel. Without the -spoolsize set, all the data would be otherwise stored in memory.

 % ns_http run -outputfile /tmp/reply.html -spoolsize 0 http://www.google.com
 status 302 time 0:132577 headers d2 file /tmp/reply.html
 % set chan [open /tmp/file.dat w]
 % ns_http run -outputchan $chan -spoolsize 0 http://www.google.com
 status 302 time 0:132577 headers d2 outputchan file22
 
 % close $chan

CONFIGURATION

The behavior of ns_http can be influenced by optional settings in the NaviServer configuration file. Some settings apply globally (to all servers), others per server definition. Globally, the number of ns_http task threads can be configured. These threads are used when a request is started with ns_http queue or when -done_callback is used.

 #---------------------------------------------------------------------
 # Global NaviServer parameters
 #---------------------------------------------------------------------
 ns_section ns/parameters {
    # ...
    # Configure the number of task threads for HTTP client requests
    # via ns_http. Per task thread, a separate queue is defined. For
    # common (Internet) usage, the default value of 1 is fully
    # sufficient.  For high-speed file uploads/downloads (10/100G
    # networks, fast I/O) the performance might be increased by
    # defining multiple task threads.
    #
    #ns_param    nshttptaskthreads  2     ;# default: 1
    # ...
 }

On the per-server configuration level, one can specify the default keep-alive timeout for outgoing HTTP requests, and the logging behavior. When logging is activated, the log file will contain information similar to the access.log of NaviServer (see nslog module), but for HTTP client requests.

 #---------------------------------------------------------------------
 # HTTP client (ns_http) configuration
 #---------------------------------------------------------------------
 ns_section ns/server/$server/httpclient {
    #
    # Set default keep-alive timeout for outgoing ns_http requests
    #
    ns_param    keepalive       5s       ;# default: 0s
    #
    # Configure log file for outgoing ns_http requests
    #
    ns_param     logging          on       ;# default: off
    ns_param     logfile          ${logroot}/httpclient.log
    ns_param     logrollfmt       %Y-%m-%d ;# format appended to log filename
    #ns_param    logmaxbackup     100      ;# 10, max number of backup log files
    #ns_param    logroll          true     ;# true, should server roll log files automatically
    #ns_param    logrollonsignal  true     ;# false, perform roll on a sighup
    #ns_param    logrollhour      0        ;# 0, specify at which hour to roll
 }

See Also

ns_connchan, ns_httptime, ns_set, ns_time, ns_urlencode, nslog

Keywords

HTTP, HTTP-client, HTTPS, SNI, TLS, certificate, configuration, global built-in, nslog, nssock, spooling