NaviServer - programmable web server

ns_http(n) 5.0.0a naviserver "NaviServer Built-in Commands"

Name

ns_http - HTTP client functionality

Description

The command ns_http provides HTTP and HTTPS client functionality. It can be used to create, dispatch, wait on and/or cancel requests and to process replies from web servers, with NaviServer acting as a web client. This is useful, for example, for calling web services and REST interfaces.

COMMANDS

ns_http queue ?-binary? ?-body_size size? ?-body B? ?-body_file fn? ?-body_chan chan? ?-cafile CA? ?-capath CP? ?-cert C? ?-raw? ?-donecallback call? ?-headers ns_set? ?-hostname HOSTNAME? ?-keepalive T? ?-keep_host_header? ?-method M? ?-spoolsize int? ?-outputfile fn? ?-outputchan chan? ?-partialresults? ?-timeout T? ?-expire T? ?-proxy dict? ?-verify? url

The command ns_http queue opens a connection to the web server denoted by the url and returns (unless -donecallback is specified) an id, which can be used later in ns_http wait or ns_http cancel to refer to this request. The command supports both HTTP and HTTPS URIs. The request is run in the default task queue in a dedicated per-queue thread.

-binary

transmit the content in binary form (as a Tcl byte-array) no matter what the content-type specifies.

-body_size size

specifies the expected size of the data which will be sent as the HTTP request body. This option must be used when sending body data via Tcl channels, which are not capable of seeking. It is optional if sending body data from memory or from a named file.

-body body

transmit the content of the passed string as the request body. This option is mutually exclusive with -body_file and -body_chan. The implementation will try to guess the "Content-Type" of the body by checking the type of the passed body.

-body_file fn

transmit the file specified via fn as the request body. This option is mutually exclusive with -body and -body_chan. The implementation will try to guess the "Content-Type" of the body by checking the extension of the passed-in fn.

-body_chan chan

transmit the content read from the specified Tcl channel, which must be open for reading, as the request body. The channel must be in blocking mode and should be seekable (unless -body_size is specified). This option is mutually exclusive with -body and -body_file. The caller should put a "Content-Type" header into the passed -headers set, since the implementation cannot guess the correct value. If none is found, "application/octet-stream" is assumed. For the ns_http queue command, the -body_chan channel is dissociated from the current interpreter/thread and its ownership is transferred to the thread that runs the request. Upon ns_http wait, the channel is handed back to the calling interpreter and can be manipulated again. It is the caller's responsibility to close the channel when it is no longer needed. The implementation will not do that (see ns_http cleanup for the exception).

-cafile CA

used for HTTPS URIs to specify the location of the CA certificates used for verification. The certificates available via cafile and capath are trusted. The cafile points to a file of CA certificates in PEM format. The file can contain several CA certificates.

-capath CP

used for HTTPS URIs to specify the location of the CA certificates used for verification. The capath points to a directory containing CA certificates in PEM format. Each file contains one CA certificate. For more details, see https://www.openssl.org/docs/manmaster/ssl/SSL_CTX_load_verify_locations.html

-cert C

used for HTTPS URIs to use the specified client certificate. The certificates must be in PEM format and must be sorted starting with the subject's certificate (actual client or server certificate), followed by intermediate CA certificates if applicable, and ending at the highest level (root) CA.

-raw

delivers the content as-is (unmodified), regardless of the content encoding.

-donecallback call

this callback is executed as a Tcl script when the request has completed. Two arguments are appended to the provided call: a flag indicating whether a Tcl error occurred (integer value 1 or 0), and the dictionary as returned by ns_http run. When this option is used, ns_http queue returns an empty result and the caller has no further control (wait, cancel) over the task.

-headers ns_set

the ID of the ns_set containing the additional headers to include in the request.

-hostname HOSTNAME

used for HTTPS URIs to specify the hostname for the server certificate. This option has to be used when the host supports virtual hosting, is configured with multiple certificates and supports the SNI (Server Name Indication) extension of TLS.

-keepalive T

when specified, set the keep-alive timeout of the connection to the specified duration. The time can be specified in any supported ns_time format.

-keep_host_header

allows the Host: header field for the request to be passed in via the -headers argument, otherwise it is overwritten.

-method method

Standard HTTP/HTTPS request method such as GET, POST, HEAD, PUT etc.

-spoolsize int

In case the result is larger than the given value, it will be spooled to a temporary file, a named file (see -outputfile) or the Tcl channel (see -outputchan). The value can be specified in memory units (kB, MB, GB, KiB, MiB, GiB).

-outputfile fn

receive the response content into the specified filename. This option is mutually exclusive with -outputchan.

-outputchan chan

receive the response content into the specified Tcl channel, which must be open for writing. The channel must be in blocking mode. This option is mutually exclusive with -outputfile. For the ns_http queue command, the -outputchan channel is dissociated from the current interpreter/thread and its ownership is transferred to the thread that runs the request. Upon ns_http wait, the channel is handed back to the calling interpreter and can be manipulated again. It is the caller's responsibility to close the channel when it is no longer needed. The implementation will not do that (see ns_http cleanup for the exception).

-partialresults

When specified, partial results are returned as well. Additionally, no exception is raised; instead, the error is returned in the dict member error. This option is useful for inspecting partial results, e.g. when the server terminates the connection unexpectedly.

-expire T

time to wait for the whole request to complete. Upon expiry of this timer, request processing is unconditionally stopped, regardless of whether the connection or some data to read/write is still pending. The time can be specified in any supported ns_time format.

-timeout T

time to wait for connection setup and socket readable/writable state. The time can be specified in any supported ns_time format. When a domain name is resolved against several IP addresses, the provided timeout span is used for every IP address. The default timeout is 5s.

-verify

used for HTTPS URIs to specify that the server certificate should be verified. If the verification process fails, the TLS/SSL handshake is immediately terminated with an alert message containing the reason for the verification failure. If no server certificate is sent, because an anonymous cipher is used, this option is ignored.

-proxy dict

Controls whether HTTP/HTTPS requests are sent via an intermediate proxy. The argument must be a valid Tcl dictionary with (at least) the keys host and port. Optionally, a boolean key tunnel may be specified. The host holds the hostname or IP address of the proxy server, and port is the TCP port of the proxy server. If host is omitted, no other keys from the dictionary are evaluated and no proxy connection is made; the request is handled as if the option had not been specified. If host is specified, port must be present as well, otherwise an error is thrown. The optional tunnel key controls when the HTTP-tunnel facility is used. Without it, or when it is set to false, HTTP connections are handled over the caching proxy and HTTPS connections over the HTTP tunnel. When tunnel is set to true, HTTP tunneling is used for both HTTP and HTTPS connections. Proxy authentication is currently not supported; it will be added later.
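The -proxy dictionary described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the helper make_proxy_dict, the proxy address proxy.example.com:3128 and the target URL are hypothetical placeholders. The request itself is only issued when the script runs inside NaviServer.

```tcl
# Hypothetical helper (not part of NaviServer): build the dictionary
# that is passed to the -proxy option.
proc make_proxy_dict {host port {tunnel false}} {
    # host and port are the keys named as required above; tunnel is optional.
    return [dict create host $host port $port tunnel $tunnel]
}

# proxy.example.com:3128 is a placeholder proxy address.
set proxy [make_proxy_dict proxy.example.com 3128]

# Only issue the request when running inside NaviServer.
if {[info commands ns_http] ne ""} {
    set id [ns_http queue -proxy $proxy -timeout 5s http://www.example.com/]
    set r  [ns_http wait $id]
    ns_log notice "status via proxy: [dict get $r status]"
}
```

Passing true as the third argument of the helper would request HTTP tunneling for plain HTTP URLs as well.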

ns_http run ?-binary? ?-body_size size? ?-body B? ?-body_file fn? ?-body_chan chan? ?-cafile CA? ?-capath CP? ?-cert C? ?-raw? ?-donecallback call? ?-headers ns_set? ?-hostname HOSTNAME? ?-keepalive T? ?-keep_host_header? ?-method M? ?-spoolsize int? ?-outputfile fn? ?-outputchan chan? ?-timeout T? ?-expire T? ?-verify? url

Send an HTTP request and wait for the result. The command ns_http run is similar to ns_http queue followed by ns_http wait. The HTTP request is run in the same thread as the caller.

ns_http wait ?-timeout T? id

Waits for the queued request specified by the id returned from ns_http queue to complete. Returns a Tcl dictionary with the following keys: status, time, headers. The status contains the HTTP/HTTPS status code (200, 201, 400, etc.). The time contains the elapsed request time in ns_time format. The headers contains the name of the ns_set with the response headers. If the request body was given via a Tcl channel, the key body_chan is added to the result dictionary. If the response was spooled neither to a file nor to a channel, the key body holds the response data. Otherwise, the result contains either file, if the response was spooled into a named (or temporary) file, or outputchan, if the response was spooled into a Tcl channel. For HTTPS requests, the key https is added with some low-level TLS parameters in Tcl dictionary format. The -timeout option specifies the maximum time to wait for the request to complete. The time can be specified in any supported ns_time format.

On success, ns_http wait returns the same dictionary as ns_http run. On error, it leaves a descriptive error message in the interpreter result. On timeout, it additionally sets the Tcl variable errorCode to NS_TIMEOUT.

id

ID of the HTTP request to wait for.
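Since the result dictionary carries the response under different keys depending on spooling, a caller usually has to check which key is present. The following is a minimal sketch under that reading of the description above; the helper response_location and the URL are hypothetical.

```tcl
# Hypothetical helper: report where the response content ended up,
# based on the keys described for the result dictionary.
proc response_location {result} {
    if {[dict exists $result body]} {
        # Response was kept in memory.
        return [list memory [string length [dict get $result body]]]
    } elseif {[dict exists $result file]} {
        # Response was spooled to a named or temporary file.
        return [list file [dict get $result file]]
    } elseif {[dict exists $result outputchan]} {
        # Response was written to a Tcl channel.
        return [list channel [dict get $result outputchan]]
    }
    return [list none ""]
}

# Only issue the request when running inside NaviServer.
if {[info commands ns_http] ne ""} {
    set id     [ns_http queue -spoolsize 1MB http://www.example.com/]
    set result [ns_http wait -timeout 10s $id]
    lassign [response_location $result] kind where
    ns_log notice "status [dict get $result status], content in $kind $where"
}
```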

ns_http cancel id

Cancels the queued HTTP/HTTPS request identified by the ID returned by the ns_http queue command. The command returns an empty result. The purpose of this command is to terminate runaway requests or requests that have timed out. Even completed requests can be cancelled if nobody is interested in the request result.

id

ID of the HTTP request to cancel.
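One possible use of ns_http cancel is to discard requests whose result is no longer of interest. The sketch below queues the same request against several mirrors, keeps the first reply with status 200, and cancels the requests that were never waited on. The helper fetch_first_ok and the mirror URLs are hypothetical.

```tcl
# Hypothetical helper: queue one request per URL, return the first
# result dictionary with status 200, and cancel the remaining requests.
proc fetch_first_ok {urls timeout} {
    set ids [lmap url $urls {ns_http queue -timeout $timeout $url}]
    set winner ""
    foreach id $ids {
        if {$winner ne ""} {
            # A usable reply exists already; nobody needs this one.
            ns_http cancel $id
            continue
        }
        # A failing wait leaves the error message in "result";
        # the status check is skipped in that case.
        if {![catch {ns_http wait $id} result]
            && [dict get $result status] == 200} {
            set winner $result
        }
    }
    return $winner
}

# Only issue the requests when running inside NaviServer.
if {[info commands ns_http] ne ""} {
    set r [fetch_first_ok {
        http://mirror1.example.com/file
        http://mirror2.example.com/file
    } 5s]
    if {$r ne ""} {
        ns_log notice "got [string length [dict get $r body]] bytes"
    }
}
```

The requests are waited on in queue order, so a slow first mirror delays the decision; this keeps the sketch short at the cost of latency.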

ns_http cleanup

Cancels all pending HTTP/HTTPS requests issued in the current interpreter. At this point, any Tcl channels that have been assigned to a task are closed automatically.

ns_http list ?id?

If id is specified, returns a single list in the form "id url status". If id is not specified, returns a list of N elements, where each element is itself a list in the form "id url status". The status can be one of: "done", "running", "error".

id

Optional ID of the HTTP request to list.
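The nested list returned by ns_http list (called without an id) can be filtered with plain Tcl list operations. A minimal sketch, in which the helper ids_with_status is hypothetical:

```tcl
# Hypothetical helper: pick the IDs with a given status out of the
# value returned by "ns_http list" when called without an id.
proc ids_with_status {requests status} {
    set ids {}
    foreach entry $requests {
        # Each entry has the form "id url status".
        lassign $entry id url state
        if {$state eq $status} {
            lappend ids $id
        }
    }
    return $ids
}

# Only query the task list when running inside NaviServer.
if {[info commands ns_http] ne ""} {
    foreach id [ids_with_status [ns_http list] running] {
        ns_log notice "request $id is still running"
    }
}
```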

ns_http stats ?id?

Returns statistics about the currently running requests in the form of Tcl dictionaries. If the optional id is specified, a single dictionary with details about the requested task is returned, or an empty result if the task cannot be found. Otherwise, a list of dictionaries is returned. Each dictionary contains the following keys: task, url, requestlength, replylength, sent, received, sendbodysize, replybodysize, replysize.

The task is the ID of the HTTP task, and url is the URL for the given task. The requestlength is the length of the complete HTTP request, including the request line, all headers and the optional request body. The replylength is the value of Content-Length as returned by the remote; it can be zero if the length of the returned data is not known in advance.

The member sent is the number of bytes sent to the remote, including the request line, all headers and the optional request body. The member received is the number of bytes received from the remote, including the status line, all headers and the optional reply body. The member sendbodysize is the number of bytes of the request body sent to the remote so far.

The members replysize and replybodysize both count bytes of the reply body received from the remote so far. The replysize counts the body bytes before the optional decompression step for compressed content, whereas replybodysize counts the body bytes of the decompressed content. For uncompressed reply content, both have the same value.

id

Optional ID of the HTTP request to get statistics for.
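The statistics can be used, for example, to estimate download progress. The sketch below assumes that replylength (the Content-Length) is comparable to replysize (body bytes before decompression); the helper reply_progress is hypothetical.

```tcl
# Hypothetical helper: percentage of the reply body received so far,
# or -1 when the remote did not announce a Content-Length.
proc reply_progress {stats} {
    set expected [dict get $stats replylength]
    if {$expected <= 0} {
        return -1
    }
    # replysize counts body bytes before decompression, which is
    # assumed here to be what Content-Length refers to.
    return [expr {100.0 * [dict get $stats replysize] / $expected}]
}

# Only query the statistics when running inside NaviServer.
if {[info commands ns_http] ne ""} {
    foreach s [ns_http stats] {
        ns_log notice "[dict get $s task]: [reply_progress $s] percent received"
    }
}
```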

EXAMPLES

First, a minimal GET example:

 % ns_http queue http://www.google.com
 http0
 % ns_http wait http0
 status 302 time 0:97095 headers d0 body { ... }

The second example is a code snippet making a request via HTTPS (note that HTTPS is supported only when NaviServer was compiled with OpenSSL support).

 % set result [ns_http run https://www.google.com]
 % dict get $result status
 302

If the returned data is too large to be retained in memory, you can use the -spoolsize to control when the content should be spooled to file. The spooled filename is contained in the resulting dict under the key file.

 % set result [ns_http run -spoolsize 1kB https://www.google.com]
 % dict get $result file
 /tmp/http.83Rfc5

For connecting to a server with virtual hosting that provides multiple certificates via SNI (Server Name Indication) the option -hostname is required.
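A minimal sketch of such a request follows. The helper fetch_with_sni, the address 192.0.2.10 and the hostname www.example.com are hypothetical placeholders; only the -hostname and -timeout options are taken from the description above.

```tcl
# Hypothetical helper: request a URL while naming the virtual host
# explicitly via -hostname, and return the HTTP status code.
proc fetch_with_sni {url hostname} {
    set r [ns_http run -hostname $hostname -timeout 5s $url]
    return [dict get $r status]
}

# Only issue the request when running inside NaviServer.
if {[info commands ns_http] ne ""} {
    # 192.0.2.10 and www.example.com are placeholders.
    ns_log notice "status [fetch_with_sni https://192.0.2.10/ www.example.com]"
}
```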

The third example is a code snippet that makes a POST request via HTTPS and provides url-encoded POST data. The example sets a larger timeout on the request, provides request headers and logs the reply headers.

 #######################
 # construct POST data
 #######################
 set post_data [join [lmap {key value} {
     q NaviServer
 } {
   set _ "[ns_urlencode $key]=[ns_urlencode $value]"
 }] &]
 
 #######################
 # submit POST request
 #######################
 set requestHeaders [ns_set create headers "Content-type" "application/x-www-form-urlencoded"]
 
 # ns_http run waits for the reply and returns the result dictionary
 set r [ns_http run -method POST \
   -headers $requestHeaders \
   -timeout 10.0 \
   -body $post_data \
   https://httpbin.org/anything]
 
 #######################
 # output results
 #######################
 ns_log notice "status [dict get $r status]"
 ns_log notice "reply [dict get $r body]"
 ns_log notice "headers [ns_set array [dict get $r headers]]"

The fourth example is a code snippet that sets a larger timeout on the request, passes an ns_set with request headers, and spools the result to a file if it is larger than 1kB.

 set requestHeaders [ns_set create headers Host localhost]
 
 set h [ns_http queue -timeout 10.0 -headers $requestHeaders -spoolsize 1kB http://www.google.com]
 set r [ns_http wait $h]
 
 # the key "file" is present only when the reply was spooled
 if {[dict exists $r file]} {
   set F [dict get $r file]
   ns_log notice "Spooled [file size $F] bytes to $F"
   file delete -- $F
 } else {
   ns_log notice "Got [string length [dict get $r body]] bytes"
 }

The next example is for downloading a file from the web into a named file or passed Tcl channel. Note the -spoolsize of zero, which will redirect all received data into the file/channel. Without the -spoolsize set, all the data would be otherwise stored in memory.

 % ns_http run -outputfile /tmp/reply.html -spoolsize 0 http://www.google.com
 status 302 time 0:132577 headers d2 file /tmp/reply.html
 % set chan [open /tmp/file.dat w]
 % ns_http run -outputchan $chan -spoolsize 0 http://www.google.com
 status 302 time 0:132577 headers d2 outputchan file22
 
 % close $chan
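The opposite direction, sending a request body from a Tcl channel via -body_chan, can be sketched in a similar way. The helper upload_from_channel, the file path and the upload URL are hypothetical; the PUT method and the "application/octet-stream" content type are assumptions chosen for illustration. The channel is closed by the caller, as required for -body_chan.

```tcl
# Hypothetical helper: send the content of a file as request body
# through a Tcl channel and return the HTTP status code.
proc upload_from_channel {path url} {
    # Binary, blocking read channel (blocking is the Tcl default).
    set chan [open $path rb]
    try {
        set hdrs [ns_set create headers Content-Type application/octet-stream]
        set r [ns_http run -method PUT \
                   -headers $hdrs \
                   -body_chan $chan \
                   -body_size [file size $path] \
                   $url]
        return [dict get $r status]
    } finally {
        # The implementation does not close the channel for us.
        close $chan
    }
}

# Only issue the request when running inside NaviServer.
if {[info commands ns_http] ne ""} {
    # Path and URL are placeholders.
    ns_log notice "upload status [upload_from_channel /tmp/file.dat http://upload.example.com/file.dat]"
}
```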

CONFIGURATION

The behavior of ns_http can be influenced by optional settings in the NaviServer configuration file. Some settings apply globally (for all servers), others per server definition. Globally, the number of ns_http task threads can be configured.

 #---------------------------------------------------------------------
 # Global NaviServer parameters
 #---------------------------------------------------------------------
 ns_section ns/parameters {
    # ...
    # Configure the number of task threads for HTTP client requests
    # via ns_http. Per task thread, a separate queue is defined. For
    # common (Internet) usage, the default value of 1 is fully
    # sufficient.  For high-speed file uploads/downloads (10/100G
    # networks, fast I/O) the performance might be increased by
    # defining multiple task threads.
    #
    #ns_param    nshttptaskthreads  2     ;# default: 1
    # ...
 }

On the per-server configuration level, one can specify the default keep-alive timeout for outgoing HTTP requests, and the logging behavior. When logging is activated, the log file will contain information similar to the access.log of NaviServer (see nslog module), but for HTTP client requests.

 #---------------------------------------------------------------------
 # HTTP client (ns_http) configuration
 #---------------------------------------------------------------------
 ns_section ns/server/server1/httpclient {
    #
    # Set default keep-alive timeout for outgoing ns_http requests
    #
    ns_param	keepalive       5s       ;# default: 0s
    #
    # Configure log file for outgoing ns_http requests
    #
    ns_param	logging		on       ;# default: off
    ns_param	logfile		${logroot}/httpclient.log
    ns_param	logrollfmt	%Y-%m-%d ;# format appended to log filename
    #ns_param	logmaxbackup	100      ;# 10, max number of backup log files
    #ns_param	logroll		true     ;# true, should server log files automatically
    #ns_param	logrollonsignal	true     ;# false, perform roll on a sighup
    #ns_param	logrollhour	0        ;# 0, specify at which hour to roll
 }

See Also

ns_httptime, ns_set, ns_urlencode, nslog

Keywords

HTTP, HTTP-client, HTTPS, SNI, TLS, certificate, configuration, global built-in, nslog, nssock, spooling