textutil-cmds - NaviServer Built-in Commands

Name

textutil-cmds - Utility commands for processing text

Table Of Contents
Synopsis
Description
COMMANDS
EXAMPLES
See Also
Keywords

Synopsis

ns_parsehtml ?-noangle? ?-onlytags? ?--? html
ns_quotehtml html
ns_unquotehtml html
ns_striphtml html
ns_reflow_text ?-width integer? ?-offset integer? ?-prefix value? ?--? text
ns_trim ?-subst? ?-delimiter value? ?-prefix value? ?--? text

Description

These commands support common tasks of processing text chunks in NaviServer applications.

COMMANDS

ns_parsehtml ?-noangle? ?-onlytags? ?--? html

Parses the provided HTML text into a tagged list. Each list item starts with a type tag indicating the kind of content: comment, pi, tag, or text. For tag items, the type tag is followed by the tag string as it appears in the input and then by a list containing the tag name and a Tcl dict of the parsed HTML attributes. For closing tags, this last element is just the tag name prefixed with a slash.

When the option -noangle is specified, the angle brackets (less-than and greater-than signs) are removed from the result.

When the option -onlytags is specified, only tag elements are returned, without the leading type tag. This can be used for checking whether tags in an HTML snippet need to be closed.
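The tagged list can be consumed with ordinary Tcl list and dict commands. The following is a minimal sketch, not part of NaviServer: the proc name collect_hrefs is hypothetical, and the input is hard-coded in the result format documented for this command, so the snippet also runs in a plain Tcl shell. In a server you would pass the result of ns_parsehtml instead.

```tcl
# Hypothetical helper (not a NaviServer command): collect the href
# attributes of all <a> tags from a list in ns_parsehtml result format.
proc collect_hrefs {parsed} {
    set hrefs {}
    foreach item $parsed {
        # Each item is {type raw ?details?}; only "tag" items carry details.
        lassign $item type raw details
        if {$type ne "tag"} continue
        # For opening tags, details is {name attrdict}; for closing tags
        # it is just "/name", so attrs stays empty.
        lassign $details name attrs
        if {$name eq "a" && [dict exists $attrs href]} {
            lappend hrefs [dict get $attrs href]
        }
    }
    return $hrefs
}

# Hand-copied result list in the documented format.
set parsed {{text {hello }} {tag <b> {b {}}} {text foo} {tag </b> /b} {text { anchor }} {tag {<a href="/foo">} {a {href /foo}}} {text world} {tag </a> /a} {text .}}
puts [collect_hrefs $parsed]   ;# prints /foo
```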

 % ns_parsehtml {hello <b>foo</b> anchor <a href="/foo">world</a>.}
 {text {hello }} {tag <b> {b {}}} {text foo} {tag </b> /b} {text { anchor }} {tag {<a href="/foo">} {a {href /foo}}} {text world} {tag </a> /a} {text .}
 
 % ns_parsehtml -noangle  {hello <b>foo</b> anchor <a href="/foo">world</a>.}
 {text {hello }} {tag b {b {}}} {text foo} {tag /b /b} {text { anchor }} {tag {a href="/foo"} {a {href /foo}}} {text world} {tag /a /a} {text .}
 
 % ns_parsehtml -onlytags {hello <b>foo</b> anchor <a href="/foo">world</a>.}
 {b {}} /b {a {href /foo}} /a
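The check for unclosed tags mentioned above could be sketched as follows. This is an illustration under stated assumptions, not NaviServer code: the proc name unclosed_tags is hypothetical, the input lists are written by hand in the -onlytags format shown above, and void elements such as br are not treated specially, so they would be reported as unclosed.

```tcl
# Hypothetical helper (not a NaviServer command): given a list in the
# -onlytags format, return the names of tags opened but never closed.
proc unclosed_tags {tags} {
    set open {}
    foreach item $tags {
        set name [lindex $item 0]
        if {[string index $name 0] eq "/"} {
            # Closing tag: drop the most recently opened tag of that name.
            set wanted [string range $name 1 end]
            set idx [lindex [lsearch -all -exact $open $wanted] end]
            if {$idx ne ""} {
                set open [lreplace $open $idx $idx]
            }
        } else {
            # Opening tag: remember its name.
            lappend open $name
        }
    }
    return $open
}

# Balanced input taken from the example above: nothing is left open.
puts [unclosed_tags {{b {}} /b {a {href /foo}} /a}]
# Hand-written input with an unclosed <ul>.
puts [unclosed_tags {{ul {}} {li {}} /li}]   ;# prints ul
```

In a server context, the argument would be the result of ns_parsehtml -onlytags for the snippet to be checked.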

ns_quotehtml html

Returns the contents of HTML with certain characters that are special in HTML replaced with an escape code. The resulting text can be literally displayed in a webpage with an HTML renderer. Specifically:

: & becomes &amp;
: < becomes &lt;
: > becomes &gt;
: ' becomes &#39;
: " becomes &#34;

All other characters are unmodified in the output.
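For illustration, the replacements of ns_quotehtml can be mimicked in plain Tcl with string map. This is only a sketch of the documented mapping, not the server's implementation; the proc name quotehtml_sketch is hypothetical.

```tcl
# Hypothetical helper (not a NaviServer command): applies the five
# replacements documented for ns_quotehtml. string map does not rescan
# its own output, so the "&" inside "&lt;" is not quoted a second time.
proc quotehtml_sketch {html} {
    return [string map {& &amp; < &lt; > &gt; ' &#39; \" &#34;} $html]
}

puts [quotehtml_sketch {<span class="foo">}]
# prints &lt;span class=&#34;foo&#34;&gt;
```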

ns_unquotehtml html

This is essentially the inverse operation of ns_quotehtml and replaces the named and numeric entities in decimal or hexadecimal notation contained in the provided string by their native characters. ASCII control characters are omitted.

ns_striphtml html

Returns the contents of html with all HTML tags removed. The command also replaces all known HTML4 named entities and numeric entities in decimal or hexadecimal notation by their UTF-8 representations and removes HTML comments. ASCII control characters are omitted.

ns_reflow_text ?-width integer? ?-offset integer? ?-prefix value? ?--? text

Reflows a text to the specified width. The arguments width (default 80) and offset (default 0) are integers giving a number of characters. The prefix can be used to prefix every resulting line with a constant string.

ns_trim ?-subst? ?-delimiter value? ?-prefix value? ?--? text

Trims multi-line text, with an optional delimiter or prefix. The command is useful when text embedded in source code (such as SQL statements or HTML markup) is indented according to the nesting level of the code, and this indentation should not be carried over to the output.

When neither -delimiter nor -prefix is specified, all leading whitespace is stripped from the result. When -delimiter is specified, the delimiter is stripped as well. The specified delimiter has to be a single character.

When -prefix is used, the specified string is stripped from every line that starts exactly with this prefix (for example, use -prefix >> to strip the prefix >> from every line starting with it). This option is mutually exclusive with the option -delimiter.

Optionally, when -subst is specified, substitution is applied to the text before trimming (not strictly needed, but sometimes convenient).

EXAMPLES

 % ns_quotehtml "Hello World!"
 Hello World!
 
 % ns_quotehtml "The <STRONG> tag is used to indicate strongly emphasized text."
 The &lt;STRONG&gt; tag is used to indicate strongly emphasized text.
 
 % ns_quotehtml {<span class="foo">}
 &lt;span class=&#34;foo&#34;&gt;

 % ns_reflow_text -width 15 -prefix "> " "one two three four five six seven eight nine ten"
 > one two three
 > four five six
 > seven eight
 > nine ten

 % ns_striphtml "<MARQUEE direction='right'><BLINK>Hello World!</BLINK></MARQUEE>"
 Hello World!

 % ns_trim {
    SELECT object_id, object_name
    FROM   acs_objects
    WHERE  object_id > 10000
 }
 SELECT object_id, object_name
 FROM   acs_objects
 WHERE  object_id > 10000
 
 % ns_trim -delimiter | {
    | <ul>
    |   <li> one
    |   <li> two
    |   <li> three
    | </ul>
 }
  <ul>
    <li> one
    <li> two
    <li> three
  </ul>

 % ns_trim -prefix "> " {
 > line 1
 > line 2 
 }
  
 line 1
 line 2

Keywords

HTML, encoding, entity, quote, server built-in, text