ns_http - NaviServer Built-in Commands

Name

ns_http - HTTP client functionality

Table Of Contents
Synopsis
Description
COMMANDS
OPTIONS
RETURN VALUE
EXAMPLES
CONFIGURATION
See Also
Keywords

Synopsis

ns_http cancel id
ns_http cleanup
ns_http keepalives
ns_http list ?id?
ns_http queue ?-binary? ?-body value? ?-body_chan value? ?-body_file value? ?-body_size integer? ?-cafile value? ?-capath value? ?-cert value? ?-connecttimeout time? ?-done_callback value? ?-expire time? ?-headers setId? ?-hostname value? ?-insecure? ?-keep_host_header? ?-keepalive time? ?-maxresponse memory-size? ?-method value? ?-outputchan value? ?-outputfile value? ?-partialresults? ?-proxy value? ?-raw? ?-response_data_callback value? ?-response_header_callback value? ?-spoolsize memory-size? ?-timeout time? ?-unix_socket value? url
ns_http run ?-binary? ?-body value? ?-body_chan value? ?-body_file value? ?-body_size integer? ?-cafile value? ?-capath value? ?-cert value? ?-connecttimeout time? ?-done_callback value? ?-expire time? ?-headers setId? ?-hostname value? ?-insecure? ?-keep_host_header? ?-keepalive time? ?-maxresponse memory-size? ?-method value? ?-outputchan value? ?-outputfile value? ?-partialresults? ?-proxy value? ?-raw? ?-response_data_callback value? ?-response_header_callback value? ?-spoolsize memory-size? ?-timeout time? ?-unix_socket value? url
ns_http stats ?id?
ns_http taskthreads
ns_http wait ?-timeout time? id

Description

The command ns_http provides HTTP and HTTPS client functionality. It can be used to create, dispatch, wait on and/or cancel requests and to process replies from web servers, where NaviServer acts as a web client. This is important, for example, when using web services and REST interfaces.

COMMANDS

ns_http cancel id

Cancels the queued HTTP/HTTPS request identified by id. The command returns an empty result. Its purpose is to terminate runaway requests or requests that have timed out. Even completed requests can be cancelled if nobody is interested in the request result.

id: ID of the HTTP request to cancel.

ns_http cleanup

Cancels all pending HTTP/HTTPS requests issued in the current interpreter. At this point, any Tcl channels that were assigned to a task are closed automatically.

ns_http keepalives

Returns the number and slot (socket) usages for active or recent ns_http requests. The following example shows how the socket opened to google.com is kept open for about 2 seconds. The slots for the sockets are allocated dynamically depending on the system load.

 % ns_http run  -keepalive 2s https://google.com/
 % ns_http keepalives
  {slot 0 state waiting expire 1.999983 peer google.com:443 sock 82} {slot 1 state free}

The returned information is useful for monitoring and debugging busy HTTP client operations.

ns_http list ?id?

Lists running ns_http requests. When id is specified, returns a single list of the form: id url status. When id is not specified, returns a list of lists, where each element has the described format. The status is one of done, running, or error.

id: Optional ID of the HTTP request to list.

ns_http queue ?-binary? ?-body value? ?-body_chan value? ?-body_file value? ?-body_size integer? ?-cafile value? ?-capath value? ?-cert value? ?-connecttimeout time? ?-done_callback value? ?-expire time? ?-headers setId? ?-hostname value? ?-insecure? ?-keep_host_header? ?-keepalive time? ?-maxresponse memory-size? ?-method value? ?-outputchan value? ?-outputfile value? ?-partialresults? ?-proxy value? ?-raw? ?-response_data_callback value? ?-response_header_callback value? ?-spoolsize memory-size? ?-timeout time? ?-unix_socket value? url

Opens a connection to the web server denoted in the url and returns (unless a -done_callback is specified) an id, which might be used later in ns_http wait or ns_http cancel to refer to this request. The command supports both HTTP and HTTPS URIs. The request is run in the default task queue in a dedicated per-queue thread.

The description of the available options is in the section OPTIONS below.

ns_http run ?-binary? ?-body value? ?-body_chan value? ?-body_file value? ?-body_size integer? ?-cafile value? ?-capath value? ?-cert value? ?-connecttimeout time? ?-done_callback value? ?-expire time? ?-headers setId? ?-hostname value? ?-insecure? ?-keep_host_header? ?-keepalive time? ?-maxresponse memory-size? ?-method value? ?-outputchan value? ?-outputfile value? ?-partialresults? ?-proxy value? ?-raw? ?-response_data_callback value? ?-response_header_callback value? ?-spoolsize memory-size? ?-timeout time? ?-unix_socket value? url

Sends an HTTP or HTTPS request and waits for the result. The command ns_http run is similar to ns_http queue followed by ns_http wait. The HTTP request is run in the same thread as the caller.

The description of the available options is in the section OPTIONS below. The result value is described in RETURN VALUE.

ns_http stats ?id?

Returns statistics from the currently running request in the form of a list of Tcl dictionaries. If the optional id was specified, just one dictionary containing details about the requested task will be returned, or empty if the task cannot be found. Otherwise, a list of dictionaries will be returned. The returned dictionary contains the following keys:

task: The ID of the HTTP task.
url: The URL for the given task.
requestlength: The length of the complete HTTP request, including the header line, all the headers plus the optional request body.
replylength: The value of the content-length as returned by the remote. This can be zero if the length of the returned data is not known in advance.
sent: The number of bytes sent to the remote. This includes the header line, all the headers plus the optional request body.
received: The number of bytes received from the remote. This includes the status line, all the headers plus the optional reply body.
sendbodysize: The number of bytes of the request body sent to the remote so far.
replybodysize: The number of bytes of the reply body received from the remote so far.
replysize: The number of bytes of the body received from the remote so far.

The difference between replysize and replybodysize is that replysize tracks the number of body bytes prior to the optional deflate step for compressed contents, whereas replybodysize tracks the number of body bytes of the deflated contents. For uncompressed reply content, both replysize and replybodysize will have the same value.

id: Optional ID of the HTTP request to get statistics for.
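As a rough sketch of how these keys might be used for monitoring, the helper below derives a download percentage from one statistics dictionary. The proc name http_progress is hypothetical and not part of NaviServer; only the dictionary keys are taken from the description above.

```tcl
# Hypothetical helper (not part of NaviServer): compute the percentage of
# the reply body received so far from one "ns_http stats" dictionary.
proc http_progress {stats} {
    set expected [dict get $stats replylength]
    set received [dict get $stats replysize]
    # replylength can be zero when the remote did not announce a length
    if {$expected <= 0} {
        return unknown
    }
    return [expr {100.0 * $received / $expected}]
}

# Possible usage inside NaviServer, where $id was returned by ns_http queue:
#   set stats [ns_http stats $id]
#   if {$stats ne ""} { ns_log notice "progress: [http_progress $stats]" }
```

The sketch compares replysize rather than replybodysize against replylength, since the content-length refers to the bytes as transmitted.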

ns_http taskthreads

Returns a list of Tcl dictionaries containing information about the state of the ns_http task threads. Every dict in this list contains the keys name, running, and requests, where the name is the name of the task thread, running is the number of currently running ns_http tasks, and requests is the total number of requests processed so far by this task thread.

In the example below, there are two task queues defined.

 % ns_http taskthreads
 {name tclhttp.0 running 42 requests 21925} {name tclhttp.1 running 37 requests 25739}

The total number of task threads can be tailored via the configuration parameter nshttptaskthreads, as shown in the CONFIGURATION section.

ns_http wait ?-timeout time? id

Waits for the queued request identified by the id returned from ns_http queue to complete. The option -timeout specifies the maximum time to wait for the request to complete. The time can be specified in any supported ns_time format.

On success, ns_http wait returns the same dictionary as ns_http run. On error, it leaves a descriptive error message in the interpreter result. On timeout, it sets the Tcl variable errorCode to NS_TIMEOUT in addition to leaving the error message. The result value is described in RETURN VALUE.

id: ID of the HTTP request to wait for.
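The following sketch shows one possible way to separate a timeout from other errors when waiting for a queued request. It assumes that the timeout is reported via the Tcl errorCode NS_TIMEOUT; the proc name wait_or_log is hypothetical, and the URL passed in is up to the caller.

```tcl
# Hypothetical helper (not part of NaviServer): queue a request and wait
# at most $maxwait for it. Returns the result dict, or "" on timeout.
proc wait_or_log {url maxwait} {
    set id [ns_http queue $url]
    try {
        return [ns_http wait -timeout $maxwait $id]
    } trap {NS_TIMEOUT} {errorMsg} {
        # Assumption: a timeout sets errorCode to NS_TIMEOUT
        ns_log warning "request to $url timed out: $errorMsg"
        return ""
    }
}

# Possible usage:
#   set r [wait_or_log https://example.com/ 2s]
```

Errors other than timeouts are not trapped here and propagate to the caller.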

OPTIONS

-binary

transmit the content in binary form (as a Tcl byte-array) no matter what the content-type specifies.

-body value

transmit the content of the passed string as the request body. This option is mutually exclusive with -body_file and -body_chan. The implementation will try to guess the "content-type" of the body by checking the type of the passed body.

-body_chan value

transmit the content of the specified Tcl channel, which must be opened for reading, as the request body. The channel must be in blocking mode and should be seekable (unless -body_size is specified). This option is mutually exclusive with -body and -body_file. The caller should put a "content-type" header into the passed -headers set, since the implementation cannot guess the correct value. If none is found, "application/octet-stream" is assumed. For the ns_http queue command, the -body_chan channel is dissociated from the current interpreter/thread and its ownership is transferred to the thread that runs the request. Upon ns_http wait, the channel is handed back to the calling interpreter and can be manipulated again. It is the caller's responsibility to close the channel when it is no longer needed. The implementation will not do that (see ns_http cleanup for the exception).
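A minimal sketch of sending a file through a Tcl channel could look as follows. The proc name upload_from_channel and the choice of the PUT method are assumptions for illustration; a file channel opened with open is blocking and seekable by default, so -body_size is not needed here.

```tcl
# Hypothetical helper (not part of NaviServer): send the content of a
# local file as request body via a Tcl channel.
proc upload_from_channel {url path} {
    # Open in binary mode for reading
    set chan [open $path rb]
    try {
        # The content-type cannot be guessed for channels, so provide it
        set headers [ns_set create headers content-type application/octet-stream]
        return [ns_http run -method PUT -headers $headers -body_chan $chan $url]
    } finally {
        # Closing the channel is the caller's responsibility
        close $chan
    }
}
```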

-body_file value

transmit the file with the specified filename as the request body. This option is mutually exclusive with -body and -body_chan. The implementation will try to guess the "content-type" of the body by checking the extension of the passed-in filename.

-body_size integer

specifies the expected size of the data which will be sent as the HTTP request body in bytes. This option must be used when sending body data via Tcl channels, which are not capable of seeking. It is optional if sending body data from memory or from a named file.

-cafile value

specifies for HTTPS requests a PEM file containing certificates to validate the peer server (unless the option -insecure is used). All certificates in this file are trusted. Typically, the file has the name ca-bundle.crt and contains the top-level certificates. When the specified filename is not on an absolute location, the file is assumed to be in the home directory of NaviServer. The default value is ca-bundle.crt, but can be altered in the CONFIGURATION file. For more details, see SSL_CTX_load_verify_locations from the OpenSSL documentation.

-capath value

specifies for HTTPS requests a directory containing trusted certificates to validate the peer server (unless the option -insecure is used). Each file in this directory must be in PEM format and must contain exactly one certificate. When the specified directory is not an absolute path, it is assumed to be in the home directory of NaviServer. The default value is certificates, but can be altered in the CONFIGURATION file. For more details, see SSL_CTX_load_verify_locations from the OpenSSL documentation.

-cert value

specifies the client certificate to use for HTTPS URIs. The certificates must be in PEM format and must be sorted starting with the subject's certificate (actual client or server certificate), followed by intermediate CA certificates if applicable, and ending at the highest level (root) CA.

-connecttimeout time

time to wait for connection setup and socket readable/writable state. The time can be specified in any supported ns_time format. When a domain name is resolved against several IP addresses, the provided timeout span is used for every IP address. When -connecttimeout is not specified, the value of -timeout is used.

-done_callback value

this callback will be executed as a Tcl script when the request is completed. Two arguments are appended to the provided script: a flag indicating a Tcl error, and a data argument. When the flag has the value 0, the call was successful, and the data argument is the same dict as returned by ns_http run. When the flag is 1, something went wrong. In this case, when the call was started with the option -partialresults, the data argument is the result dict as well; otherwise, the data argument is an error message. When -done_callback is used, ns_http queue returns an empty result and the user has no further control (wait, cancel) over the task.
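A callback following this calling convention might be sketched as below. The proc name my_done_callback and the log messages are hypothetical; the sketch assumes that -partialresults is not used, so the data argument is a plain error message on failure.

```tcl
# Hypothetical callback: receives the error flag and the data argument
proc my_done_callback {failed data} {
    if {$failed} {
        # Without -partialresults, $data is an error message
        ns_log error "background request failed: $data"
    } else {
        # On success, $data is the same dict as returned by ns_http run
        ns_log notice "background request finished with status [dict get $data status]"
    }
}

# Possible usage; ns_http queue returns an empty result in this case:
#   ns_http queue -done_callback my_done_callback https://example.com/
```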

-expire time

time to wait for the whole request to complete. Upon expiry of this timer, request processing is unconditionally stopped, regardless of whether the connection or some data to read/write is still pending. The time can be specified in any supported ns_time format.

-headers setId

headers is the ns_set ID containing the additional headers to include in the request.

-hostname value

used for HTTPS URIs to specify the hostname for the server certificate. This option has to be used when the host supports virtual hosting, is configured with multiple certificates, and supports the SNI (Server Name Indication) extension of TLS.

-insecure

used for HTTPS URIs to specify that the server certificate should NOT be verified. By default, the identity of the peer server is checked for all HTTPS requests. If the verification process fails, the TLS/SSL handshake is terminated with an alert message containing the reason for the verification failure. The default for this parameter can be specified in the CONFIGURATION file.

-keep_host_header

allows the Host: header field for the request to be passed in via the -headers argument, otherwise it is overwritten.

-keepalive time

when specified, set the keep-alive timeout of the connection to the specified duration. The time can be specified in any supported ns_time format.

-method value

Standard HTTP/HTTPS request method such as GET, POST, HEAD, PUT etc.

-maxresponse memory-size

Specifies the maximum number of bytes allowed in a response. If the response exceeds this limit, an exception is triggered.

If the response includes a content-length header, the value is used for comparison, and the request is stopped after processing the header.
If no content-length header is present, the request is canceled once the number of received bytes exceeds the specified value.

The value can be specified using memory units such as kB, MB, GB, KiB, MiB, or GiB.

-outputchan value

receive the response content into the specified Tcl or ns_connchan channel. When a Tcl channel is used, it must be opened for writing and must be in blocking mode. The option -outputchan is mutually exclusive with -outputfile. For the ns_http queue command, the -outputchan channel is dissociated from the current interpreter/thread and its ownership is transferred to the thread that runs the request. Upon ns_http wait, the channel is handed back to the calling interpreter and can be manipulated again. It is the caller's responsibility to close the channel when it is no longer needed. The implementation will not do that (see ns_http cleanup for the exception).

-outputfile value

receive the response content into the specified filename. This option is mutually exclusive with -outputchan.

-partialresults

When specified, return also partial results. Additionally, no exception is raised and the error is returned in the dict member error. This option is useful to inspect partial results also when e.g. the server terminates the connection unexpectedly.
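As a small sketch of this behavior, the wrapper below inspects the error member instead of catching an exception. The proc name fetch_partial is hypothetical.

```tcl
# Hypothetical wrapper (not part of NaviServer): run a request with
# -partialresults and report a failure via the "error" member.
proc fetch_partial {url} {
    set r [ns_http run -partialresults $url]
    if {[dict exists $r error]} {
        ns_log warning "request to $url failed: [dict get $r error]"
    }
    # The dict may still contain whatever was received before the failure
    return $r
}
```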

-proxy value

Controls whether to handle HTTP/HTTPS requests over an intermediate proxy. The argument must be a valid Tcl dictionary with (at least) the following keys: host, port. Optionally, a tunnel boolean key may be specified. The host must hold the hostname or IP address of the proxy server. The port is the TCP port of the proxy server. If host is omitted, no other keys from the dictionary are evaluated and the proxy connection is suppressed; the request is handled as if the option was not specified. If, however, host is specified, it will require the presence of the port, otherwise an error is thrown. The optional tunnel controls when to use the HTTP-tunnel facility. Without it, or if set to false, the HTTP connections are handled over the caching-proxy and HTTPS connections over the HTTP-tunnel. With the tunnel set to true, the HTTP-tunneling is used for both HTTP and HTTPS connections. Currently, no proxy authentication is supported. This will be added later.
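The proxy dictionary could be built as in the following sketch. The hostname proxy.example.com and port 3128 are placeholder values, and the proc name fetch_via_proxy is hypothetical.

```tcl
# Placeholder proxy settings (assumed values, adjust to your environment)
set proxy [dict create host proxy.example.com port 3128 tunnel false]

# Hypothetical wrapper: send a request through the proxy described above
proc fetch_via_proxy {proxy url} {
    return [ns_http run -proxy $proxy $url]
}

# Possible usage:
#   set r [fetch_via_proxy $proxy http://example.com/]
```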

-raw

delivers the content as-is (unmodified), regardless of the content encoding. This option is useful for tunneling modes.

-response_data_callback value

Specifies a Tcl callback to be invoked whenever a block of response data is received during an HTTP client request. A single argument is appended to the provided command, which is a Tcl dictionary containing the following keys:

headers: ns_set containing the response header fields.
data: received data block (part of the response).
outputchan: (optional) - the name of the output channel, if provided.

The callback's return value is interpreted as follows:

Returning TCL_OK continues normal processing of the received data, respecting the output options.
Returning TCL_BREAK (e.g., via return -code break) flushes the response buffer. The received data block is appended neither to the spool file nor to the in-memory receive body string (important to avoid bloat for streaming requests).

This callback facilitates streaming response processing, allowing applications to process incremental results - such as those produced by generative AI services - immediately as they become available.

 proc my_response_data_callback {d} {
   dict with d {
     # Log messages containing the length of the received data chunks
     ns_log notice received [string length $data] bytes
   }
 }
 
 # Example invocation
 ns_http run -response_data_callback ::my_response_data_callback https://orf.at

-response_header_callback value

Specifies a Tcl callback to be invoked as soon as the response headers from the target of an ns_http request arrive. The callback is passed one argument in form of a Tcl dictionary with these keys:

status: The HTTP status code of the response.
phrase: The HTTP response status phrase.
headers: ns_set containing the response header fields.
outputchan: (optional) The output channel name, if specified.

If the callback raises an exception, the HTTP request is immediately aborted.

Here is a reverse-proxy pattern, where we capture, tweak, and forward headers from a detached connection (in the background) via ns_connchan:

 proc my_response_header_callback {d} {
   # Unpack the response dictionary
   dict with d {
     # Start with the status line
     set response "HTTP/1.1 $status $phrase\r\n"
     # Append each header field
     foreach {key value} [ns_set array $headers] {
        append response "$key: $value\r\n"
     }
     # End of headers
     append response \r\n
     # Send headers back to the client
     ns_connchan write $outputchan $response
   }
 }
 
 # Example ns_http call using the callback:
 set chan [ns_connchan detach]
 # ...
 ns_http run \
    -outputchan $chan \
    -response_header_callback my_response_header_callback \
    $url

-spoolsize memory-size

In case the result is larger than the given value, it will be spooled to a temporary file, a named file (see -outputfile) or the Tcl channel (see -outputchan). The value can be specified in memory units (kB, MB, GB, KiB, MiB, GiB).

-timeout time

time to wait for socket interactions (read/write). The value is used as well as a default value for -connecttimeout. The time can be specified in any supported ns_time format. When a domain name is resolved against several IP addresses, the provided timeout span is used for every IP address. The default timeout is 5s.

-unix_socket value

When specified, this parameter should contain the path of the Unix Domain Socket (UDS) to connect to. For example, when a web server is listening on a Unix domain socket named /tmp/http.socket, ns_http can be used as follows, where the URL is still used for determining the Host: request header field.

 % ns_http run -unix_socket /tmp/http.socket http://foo.org/

RETURN VALUE

The commands ns_http run and ns_http wait return a dictionary containing the following elements:

status: The HTTP status code of the response.
time: The elapsed request time in ns_time format.
headers: ns_set containing the response header fields.
body: (optional) The received body of the response in form of a string.
body_chan: (optional) The input channel name for the body, if specified.
error: (optional) Error message.
file: (optional) The received body of the response in form of a file.
https: (optional) TLS negotiation results for HTTPS requests
outputchan: (optional) The output channel name, if specified.

The first three members are always returned, the other elements are conditional. When none of the output options were used, the result is received either in memory (member body) or in a spool file (member file). This decision depends on the value of -spoolsize. Both members are missing in the result dictionary, when outputchan was used.
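Since the body can arrive either in the member body or in the member file, calling code may want a single accessor. The helper below is a sketch of that idea; the proc name http_result_body is hypothetical, and it reads the spool file in binary mode without any encoding conversion.

```tcl
# Hypothetical helper (not part of NaviServer): return the response body
# from a result dict, no matter whether it was kept in memory or spooled.
proc http_result_body {result} {
    if {[dict exists $result body]} {
        return [dict get $result body]
    }
    if {[dict exists $result file]} {
        # Read the spool file as binary data
        set f [open [dict get $result file] rb]
        try {
            return [read $f]
        } finally {
            close $f
        }
    }
    # Neither member is present, e.g. when -outputchan was used
    return ""
}

# Possible usage:
#   set body [http_result_body [ns_http run -spoolsize 1kB https://example.com/]]
```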

EXAMPLES

First, a minimal example to retrieve a page with the HTTP GET method:

 % ns_http run http://www.google.com
 status 200 time 0:174146 headers d0 body { ... }

Here is the same example, using separate ns_http queue and ns_http wait commands.

 % ns_http queue http://www.google.com
 http0
 
 % ns_http wait http0
 status 200 time 0:177653 headers d0 body { ... }

The second example is a code snippet making a request via HTTPS (note that HTTPS is supported only when NaviServer was compiled with OpenSSL support).

 % set result [ns_http run https://www.google.com]
 % dict get $result status
 302

If the returned data is too large to be retained in memory, you can use the -spoolsize to control when the content should be spooled to file. The spooled filename is contained in the resulting dict under the key file.

 % set result [ns_http run -spoolsize 1kB https://www.google.com]
 % dict get $result file
 /tmp/http.83Rfc5

For connecting to a server with virtual hosting that provides multiple certificates via SNI (Server Name Indication) the option -hostname is required.

The third example is a code snippet that makes a POST request via HTTPS and provides URL-encoded POST data. The example sets a larger timeout on the request, provides request headers, and logs the status, body, and headers of the reply.

 ##################################################
 # Construct POST data using
 # query variable "q" with value "NaviServer"
 ##################################################
 set post_data [join [lmap {key value} {
     q NaviServer
 } {
   set _ "[ns_urlencode $key]=[ns_urlencode $value]"
 }] &]
 
 ##################################################
 # Submit POST request with provided "content-type"
 # to the "httpbin.org" site using HTTPS
 ##################################################
 set requestHeaders [ns_set create headers "content-type" "application/x-www-form-urlencoded"]
 
 set r [ns_http run -method POST \
   -headers $requestHeaders \
   -timeout 10.0 \
   -body $post_data \
   https://httpbin.org/anything]
 
 ##################################################
 # Output results from the result dict "r"
 ##################################################
 ns_log notice "status [dict get $r status]"
 ns_log notice "reply [dict get $r body]"
 ns_log notice "headers [dict get $r headers]"

The fourth example is a code snippet that sets a larger timeout on the request, provides an ns_set with request headers, and spools the result to a file if it is larger than 1000 bytes.

 # Note: a Host field passed via -headers is overwritten
 # unless -keep_host_header is specified.
 set requestHeaders [ns_set create headers Host localhost]
 
 # All request options are passed to "ns_http queue";
 # "ns_http wait" returns the result dict.
 set h [ns_http queue -timeout 10.0 -headers $requestHeaders -spoolsize 1kB http://www.google.com]
 set r [ns_http wait $h]
 
 if {[dict exists $r file]} {
   # The reply was spooled to a temporary file
   set F [dict get $r file]
   ns_log notice "Spooled [file size $F] bytes to $F"
   file delete -- $F
 } else {
   ns_log notice "Got [string length [dict get $r body]] bytes"
 }

The next example is for downloading a file from the web into a named file or passed Tcl channel. Note the -spoolsize of zero, which will redirect all received data into the file/channel. Without the -spoolsize set, all the data would be otherwise stored in memory.

 % ns_http run -outputfile /tmp/reply.html -spoolsize 0 http://www.google.com
 status 302 time 0:132577 headers d2 file /tmp/reply.html

 % set chan [open /tmp/file.dat w]
 % ns_http run -outputchan $chan -spoolsize 0 http://www.google.com
 status 302 time 0:132577 headers d2 outputchan file22
 
 % close $chan

CONFIGURATION

The behavior of ns_http can be influenced by optional settings in the NaviServer configuration file. The behavior can be tailored partially globally (for all servers) and partially per server definition. Globally, the number of ns_http task threads can be configured. These threads are used when a request is started with ns_http queue or when -done_callback is used.

 #---------------------------------------------------------------------
 # Global NaviServer parameters
 #---------------------------------------------------------------------
 ns_section ns/parameters {
    # ...
    # Configure the number of task threads for HTTP client requests
    # via ns_http. Per task thread, a separate queue is defined. For
    # common (Internet) usage, the default value of 1 is fully
    # sufficient.  For high-speed file uploads/downloads (10/100G
    # networks, fast I/O) the performance might be increased by
    # defining multiple task threads.
    #
    #ns_param    nshttptaskthreads  2     ;# default: 1
    # ...
 }

On the per-server configuration level, one can specify the default keep-alive timeout for outgoing HTTP requests, security default for peer server certificate validation, and the logging behavior. When logging is activated, the log file will contain information similar to the access.log of NaviServer (see nslog module), but for HTTP client requests.

 #---------------------------------------------------------------------
 # HTTP client (ns_http, ns_connchan) configuration
 #---------------------------------------------------------------------
 ns_section ns/server/$server/httpclient {
   #
   # Set default keep-alive timeout for outgoing ns_http requests
   #
   ns_param    keepalive       5s       ;# default: 0s
 
   #
   # Security configuration:
   # See: HTTP client security configuration
   # ...
 
   #
   # Configure logging options for outgoing ns_http requests
   #
   ns_param     logging          on       ;# default: off
   ns_param     logfile          httpclient.log ;# default: [ns_info home]/logs/httpclient.log
   ns_param     logrollfmt       %Y-%m-%d ;# format appended to log filename
   #ns_param    logmaxbackup     100      ;# 10, max number of backup log files
   #ns_param    logroll          true     ;# true, should server log files automatically
   #ns_param    logrollonsignal  true     ;# false, perform log rotation on SIGHUP
   #ns_param    logrollhour      0        ;# 0, specify at which hour to roll
 }

Keywords

CAfile, CApath, HTTP, HTTP-client, HTTPS, PEM, SNI, TLS, certificate, configuration, global built-in, logging, nslog, nssock, spooling, streaming