World Playground Deceit.net

Adding keyword parameters to Tcl procs


Two things that really annoy me with some programming languages: the lack of keyword (optional, always named and order agnostic) parameters and when the language has a builtin feature not made available to the user.

Tcl hits the nail here, because a lot of its standard commands have UNIX-like options but it doesn't make any of this available in either proc or apply.

But since proc can be wrapped "easily" (more later), a few hours gave me this:

source .../util.tcl
namespace path {::util ::tcl::mathop ::tcl::mathfunc}

proc* p {-flag {-opt foo} x args} {
    puts [lmap v {flag opt x args} {string cat "$v=[set $v]"}]
}
p 1 a;                # => flag=0 opt=foo x=1 args=a
p -opt bar -flag 1 a; # => flag=1 opt=bar x=1 args=a
p -flag;              # => flag=0 opt=foo x=-flag args=

# Relatively constrained overhead
proc  p1 {x} {}
proc* p2 {-flag {-opt foo} x} {}
timerate {p1 1}; # => 0.127049 µs/# 7870986 # 7870986 #/sec 1000.000 net-ms
timerate {p2 1}; # => 0.794317 µs/# 1258943 # 1258943 #/sec 1000.000 net-ms

Pretty nice? As you can see in the third example, I even implemented that Tcl behaviour where mandatory positional parameter count is used to disambiguate them from options (which is how you can do puts $string without having to safeguard it with a -- marker).

How it's made §

Addendum about the aforementioned "easily" to show how the sausage is made; definitely not for the faint of heart.

proc proc* {name args body} {
    if {[llength $args] == 0 || ![string match "-*" [lindex $args 0]]} {
        proc $name $args $body
        return
    }
    while {[llength $args] && [string match "-*" [lindex $args 0]]} {
        set args [lassign $args[set args {}] arg]
        switch [llength $arg] {
            1 { # Flag
                lappend optargs [set var [string range $arg 1 end]]
                lappend optinit 0
                lappend optswitch $arg [list set $var 1]
            }
            2 { # Option
                lassign $arg opt init
                lappend optargs [set var [string range $opt 1 end]]
                lappend optinit $init
                lappend optswitch $opt "set [list $var] \[lindex \$args \[incr i]]"
            }
            default {error "$arg: unknown option format"}
        }
    }
    lappend optswitch default {error "$arg: unknown option"}
    # Count mandatory positional args to disambiguate with options
    set mcount 0
    foreach arg $args {
        if {[llength $arg] != 1 || $arg eq "args"} break
        incr mcount
    }
    tailcall proc $name args [quasiquote {
        lassign `$optinit `@$optargs
        for {set i 0; set end [expr {[llength $args] - `$mcount}]} {$i < $end} {incr i} {
            set arg [lindex $args $i]
            if {![string match "-*" $arg] || $arg eq "--"} break
            switch -- $arg `$optswitch
        }
        apply `[list [list {*}$optargs {*}$args] $body]` \
            `@[join [lmap o $optargs {string cat \$$o}]]` {*}[lrange $args $i end]
    }]
}

Not too bad, you might say. And I agree, some very basic preprocessing and code generation at play. No, the problem resides in that innocuous quasiquote command used to do the final generation.

You see, Tcl has a pretty poor metaprogramming story as it lacks a way to selectively subst within strings, so what everybody does is… roll their own string based templating system! In fact, I don't think a hacker can wield Tcl for some time without making something like it (cf the wiki).

In my case, here's the revolting but working result (ignore the ? and move helpers):

# CL-like quasiquoting, with `$ and `[...]` for unquoting and `@$ and `@[...]` for splicing
# NB: ` chosen for its lack of standard meaning in Tcl and arrays not supported with `$
#
# Example:
#     % set v {1 2}
#     % {*}[quasiquote {
#           puts "`$v `@$v `[list {*}$v 3]` `@[list {*}$v 3]`"
#       }]
#     {1 2} 1 2 {1 2 3} 1 2 3
proc quasiquote {script} {
    # Quote all subst sensitive characters then selectively "unquote" some regions
    set tmp [string map {\\ \\\\ $ \\$ \[ \\\[} $script]
    set tmp [string map {`@\\$ $} [move tmp]]
    set tmp [regsub -all {`\\\$([[:alnum:]_:]+|\{[^\}]+\})} [move tmp] {[list $\1]}]
    # Find region closer via [string first] instead of {`@?\\\[.*?\]} because Tcl's regexp
    # engine can't mix greediness (https://wiki.tcl-lang.org/page/Drawbacks+of+Tcl%27s+Regexps)
    for {set start 0} {[regexp -indices -start $start {`@?\\\[} $tmp ropener]} {} {
        lassign $ropener rstart ropener_end
        set rend [+ [string first {]`} $tmp [+ $ropener_end 1]] 1]
        set splice [== [- $ropener_end $rstart] 3]
        set repl [string map {\\\\ \\ \\$ $ \\\[ \[} \
                          [string range $tmp $ropener_end [- $rend 1]]]
        set tmp [string replace [move tmp] $rstart $rend [? {$splice} {$repl} {\[list $repl\]}]]
        set start [+ $rstart [string length $repl] 1]
    }
    tailcall subst $tmp
}

So horrible it breaks Emacs' (arguably fragile) tcl-mode indentation. I think there's only one valid reaction to the words "metaprogramming via regexp", especially coming from CL which inspired this: