Code Observation: Goal's fmt.tbl Function

What is the Goal programming language?

From Goal's README:

Goal is an embeddable array programming language with a bytecode interpreter, written in Go. The command line interpreter can execute scripts or run in interactive mode. Goal shines the most in common scripting tasks, like handling columnar data or text processing. It is also suitable for exploratory programming.

Created by anaseto, Goal is in the K language family of array programming languages originally developed by Arthur Whitney.

I've been using Goal for daily analysis, API clients, and general scripting, and as the foundation for an interactive programming environment I'm building for basic data analysis, database querying, and plotting.

This post, however, is another in the series of Code Observations, so let's get to it.

Environment details:

$ goal --version
go      go1.22.0
path    codeberg.org/anaseto/goal/cmd/goal
mod     codeberg.org/anaseto/goal       (devel)
build   -buildmode=exe
build   -compiler=gc
build   DefaultGODEBUG=httplaxcontentlength=1,httpmuxgo121=1,panicnil=1,tls10server=1,tlsrsakex=1,tlsunsafeekm=1
build   CGO_ENABLED=1
build   CGO_CFLAGS=
build   CGO_CPPFLAGS=
build   CGO_CXXFLAGS=
build   CGO_LDFLAGS=
build   GOARCH=arm64
build   GOOS=darwin
build   vcs=git
build   vcs.revision=c56e5a641072e98ad83ed5a521c1de2c455b6550
build   vcs.time=2024-05-29T15:42:01Z
build   vcs.modified=false

Walk-through of Goal's fmt.tbl function

Terminology Note: The terms "verb" and "noun" refer to syntactic constructs of Goal/K that roughly map to "function" and "data". The terms "monadic" and "dyadic" mean "one-argument" and "two-argument".

Goal does not define a fmt.tbl verb by default, but it is defined as a function in a lib folder in its main repository.

When invoked with a dictionary shaped as a table (string keys with flat arrays of equal length), it prints something like the following (retrieved from running the examples/dplyr.goal file in Goal's repository):

=== Table 87x4 ===
            name height  mass height+10
---------------- ------ ----- ---------
"Luke Skywalker"  172.0  77.0     182.0
         "C-3PO"  167.0  75.0     177.0
         "R2-D2"   96.0  32.0     106.0
   "Darth Vader"  202.0 136.0     212.0
   "Leia Organa"  150.0  49.0     160.0

For reference, here is the entire implementation of fmt.tbl:

/ tbl[t;r;c;f] outputs dict t as table, assuming string keys and flat columns,
/ and outputs at most r rows and c columns, using format string f for floating
/ point numbers. Example: tbl[t;5;8;"%.1f"].
tbl:{[t;r;c;f]
  (nr;nc):(#*t;#t); say"=== Table ${nr}x$nc ==="
  t:t[!r&nr;(c&nc)@!t] / keep up to r rows and c columns
  k:!t; v:(..?[(@x)¿"nN";p.f$x;$'x])'.t; w:(-1+""#k)|(|/-1+""#)'v
  (k;v):(-w)!'´(k;v); say"\n"/,/" "/(k;"-"*w;+v)
}

NB: If you're following along at the REPL, either import this definition (if you have the Goal repo checked out locally) or evaluate the above code and then define fmt.tbl as shown below:

  / Either import
  import "<location of Goal repo>/lib/fmt.goal"

  / Or eval the previous code listing and then make this variable
  / to follow along with the examples in this article
  fmt.tbl:tbl

What does the fmt.tbl function produce?

Let's construct a simple table of data on the world's longest rivers to make sure we know how to invoke fmt.tbl and the output we should expect before dissecting its implementation:

  / Data from Wikipedia: https://en.wikipedia.org/wiki/List_of_river_systems_by_length
  t:..[name:("Nile";"Amazon";"Yangtze";"Mississippi")
       length:(6650;6400;6300;6275)
       drainage:(3254555;7000000;1800000;2980000)]

The .. as used to define our t table (dictionary) is referred to as "dict fields". It's syntactic sugar that allows us to define the dictionary contents as key1:value1;key2:value2 instead of the more K-like keys!values.

Let's print the entire table:

 fmt.tbl[t;4;3;""]

Which produces:

=== Table 4x3 ===
         name length drainage
------------- ------ --------
       "Nile"   6650  3254555
     "Amazon"   6400  7000000
    "Yangtze"   6300  1800000
"Mississippi"   6275  2980000
1

So it prints a header with the table's dimensions, column names along the first row, a row of dashes, and then data rows with columnar alignment.

Let's see what happens if we ask for different numbers of rows and columns:

  fmt.tbl[t;100;100;""] / More rows and columns than present
=== Table 4x3 ===
         name length drainage
------------- ------ --------
       "Nile"   6650  3254555
     "Amazon"   6400  7000000
    "Yangtze"   6300  1800000
"Mississippi"   6275  2980000
1

  fmt.tbl[t;2;3;""] / 2 rows, 3 columns
=== Table 4x3 ===
    name length drainage
-------- ------ --------
  "Nile"   6650  3254555
"Amazon"   6400  7000000
1


  fmt.tbl[t;2;1;""] / 2 rows, 1 column
=== Table 4x3 ===
    name
--------
  "Nile"
"Amazon"
1

  fmt.tbl[t;0;1;""] / 0 rows, 1 column
=== Table 4x3 ===
name
----
1

  fmt.tbl[t;0;0;""] / 0 rows, 0 columns
=== Table 4x3 ===

1

  fmt.tbl[t;-2;-2;""] / -2 rows, -2 columns
=== Table 4x3 ===
length drainage
------ --------
  6300  1800000
  6275  2980000
1

We see that specifying more rows/columns than present prints the entire table; a positive number selects that many rows/columns from the beginning of their arrays; a negative number selects that many rows/columns from the end of their arrays.

Regardless of the number of rows/columns requested, the table header Table 4x3 remains constant, representing the total size of the underlying table.

While fmt.tbl prints to STDOUT, it returns the value 1.

Given this understanding, we're now ready to take fmt.tbl apart piece by piece.

How is the fmt.tbl function implemented?

For this entire section, assume we're invoking the fmt.tbl function like this, requesting 3 rows, 2 columns, and floats formatted to one decimal place:

  fmt.tbl[t;3;2;"%.1f"]
=== Table 4x3 ===
     name length
--------- ------
   "Nile"   6650
 "Amazon"   6400
"Yangtze"   6300
1

The code begins with code comments, which start with /:

/ tbl[t;r;c;f] outputs dict t as table, assuming string keys and flat columns,
/ and outputs at most r rows and c columns, using format string f for floating
/ point numbers. Example: tbl[t;5;8;"%.1f"].

The first line of its definition is:

tbl:{[t;r;c;f]

The name tbl is assigned the value found on the right of :. The { marks the beginning of lambda notation for defining your own functions. By default this lambda notation binds x, y, and z to the first, second, and third arguments of the function respectively. If, however, you want to use different names for the function's arguments, they must be supplied in square brackets. This lambda expression does that, specifying that it takes 4 arguments with names t for table, r for rows, c for columns, and f for the format string for floating point numbers.

NB: If you're evaluating these code examples at the Goal REPL (recommended), you should now define these as top-level variables so you can run the examples that follow:

  r:3; c:2; f:"%.1f"   / spaces optional

So if the function will be invocable by the name tbl, why does this article talk about fmt.tbl?

This definition is found in lib/fmt.goal. If you use Goal's import facility like import "../lib/fmt.goal", Goal prefixes the names of the variables found in the script with the file name fmt and a connecting ., so tbl becomes fmt.tbl. Only global variables can contain . in their name; anywhere else, . is read as the verb of that name.

With the function's signature defined, let's see how processing begins:

  (nr;nc):(#*t;#t); say"=== Table ${nr}x$nc ==="

We see another assignment with :, but this time the names on the left are provided as a list (in parentheses, separated by ;). This syntax provides destructuring assignment; we expect to find an array of as many values on the right as there are variable names on the left of :.

The (#*t;#t) expression for our example table produces:

  (#*t;#t)
4 3

We know that our table has 4 rows and 3 columns, but let's look at this expression more closely.

Aside: Array/List literals in Goal

Literal arrays can be written in a few different ways. For simple ones, you can just write adjacent data literals:

  2 4 6 8
2 4 6 8

  "two""four""six""eight"
"two" "four" "six" "eight"

  2"four"6"eight"
2 "four" 6 "eight"

In more complex syntactic situations, a pair of surrounding parentheses, with items separated by ; can be used to produce an array, which is how (#*t;#t) is written. It's a list of two items, which are determined by evaluating #*t and #t respectively.

What are parens used for?

From Goal's FAQ documentation:

Goal uses parens for two things: list creation and controlling operation precedence. List creation happens when one or more semicolons ; appear within parens. The semicolon is used as item separator, and items are evaluated left-to-right. Otherwise, the parens are used to control precedence of operations, as is usual in mathematics and most languages. Lists with a single item are created using the “enlist” monadic form `,x`.

The #*t expression consists of two verbs followed by a noun. In Goal, juxtaposition of verbs is equivalent to function composition (the B combinator).

How do tacit compositions work?

From Goal's FAQ documentation:

Goal’s tacit compositions are similar to other K dialects, but they are just sugar for a lambda or a lambda projection. A composition is formed from any kind of expression that ends in a verb, and simply produces an equivalent function with the implicit arguments added at the end. If the last verb is monadic, the function takes just one argument `x`. If it is dyadic, the function takes two arguments `x` and `y`. Dyadic built-in operators can be made monadic by appending a `:`. Dyadic keyword verbs can be made monadic by adding `::` at the end.
Most compositions translate easily into a lambda, but when compositions make use of non-constant expressions, they are represented as a lambda projection. In particular, compositions do not capture global variables: those get automatically passed as extra arguments.

Given there is just one noun argument in the expression #*t, this is a composition of monadic # called with the value of monadic * applied to the argument t. Study the following exploration at the Goal REPL to understand how these verbs compose, how using parentheses affects their interpretation, and equivalent ways to express this:

  #*
{#x*y}   / Expects * to be dyadic by default

  #*:    / Colon after a verb specifies that it is to be called monadically
{#*:x}

  *:t    / Monadic * is "first", in this case the values of the first dictionary entry
"Nile" "Amazon" "Yangtze" "Mississippi"

  #(*:t) / Monadic # gives the length of the array
4

  (#*:)t / Treat #*: as a syntactic unit producing a new composite function
4

  #*t    / Right-to-left evaluation, * is clearly monadic given noun on the right and verb on the left
4

Given that exploration, it should be clear that the second expression #t returns the length of the dictionary-as-table, i.e., the number of dictionary entries or columns, which is 3.
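If Python is more familiar territory, the same two counts can be sketched with a dict of equal-length lists. This is only a rough analogue: the dict t below is my Python stand-in for the Goal table, not something Goal provides.

```python
# Python stand-in for the Goal table t: column name -> list of values.
t = {
    "name": ["Nile", "Amazon", "Yangtze", "Mississippi"],
    "length": [6650, 6400, 6300, 6275],
    "drainage": [3254555, 7000000, 1800000, 2980000],
}

first_column = next(iter(t.values()))  # like monadic * ("first") on the dict
nr, nc = len(first_column), len(t)     # like (nr;nc):(#*t;#t)
print(nr, nc)  # prints: 4 3
```

The tuple unpacking on the left of = plays the same role as Goal's destructuring assignment.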

Back to the implementation of fmt.tbl: Trimming the dictionary-as-table

The expression we're analyzing in this section will produce a new dictionary based on the original t, but with only the requested number of rows and columns.

The variable nr stands for "number of rows" and is assigned 4; nc for "number of columns" and is assigned 3.

With those variables defined, the function immediately prints the table header using the say verb, which appends a newline to what is printed.

The contents of the table header are defined in a single string. This is possible because Goal provides powerful string manipulation facilities, including in this case string interpolation with ${nr}x$nc. String interpolation begins with $ and only requires surrounding names with {} if syntactic disambiguation is required.

With the table header printed, let's see how c and r are used to limit the columns and rows printed next:

  t:t[!r&nr;(c&nc)@!t] / keep up to r rows and c columns

  t
!["name" "length"
  ("Nile" "Amazon" "Yangtze";6650 6400 6300)]

Here t is re-assigned using :, so after this expression, the function body no longer has access to the original value of our table.

Square brackets in Goal are one way to invoke a verb/function (and as a specialization, a way to retrieve items out of arrays and dictionaries). In this case, we can think about invoking the table t as a function of its rows and columns. In that mindset, the t as a function is taking 2 arguments, separated by ;. Let's analyze each one separately.

The first expression in the square brackets is !r&nr.

Before we go further, remember that built-in verbs have special syntactic power (unlike user-defined functions): they can be invoked using infix notation (1+2 instead of having to do +[1;2]), and when juxtaposed they form function compositions. Keep these two characteristics in mind as we analyze the expressions that follow.

Let's identify the "parts of speech" of this expression first, v for "verb" and n for "noun":

! r & nr
v n v n

In the above expression, reading from right to left, the & verb has two noun arguments passed to it, r and nr. Dyadic & returns the minimum/lesser value of its arguments:

  r     / desired number of rows
3

  nr    / total number of rows
4

  r&nr
3       / 3 is less than 4

Replacing r&nr with what it evaluates to, we have !3 to evaluate. Monadic ! returns an enumeration. Here's how it behaves:

  !3      / ascending, starting at 0
0 1 2

  !0
!0        / empty numeric array

  !-3
-3 -2 -1  / ascending, starting at x

Rephrasing the expression with parentheses is another way to visualize the evaluation order:

  !(r&nr)
0 1 2

So the expression !r&nr can be read as "the enum of the lesser of the requested rows or total rows", which in this case is the list 0 1 2.
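As a rough Python model of the behavior shown above (the function name enum is my own, chosen for illustration), monadic ! on an integer can be sketched with range, and dyadic & on two numbers with min:

```python
def enum(n):
    """Rough model of Goal's monadic ! on an integer (hypothetical helper)."""
    # Non-negative: 0 up to n-1. Negative: n up to -1.
    return list(range(n)) if n >= 0 else list(range(n, 0))

r, nr = 3, 4
rows = enum(min(r, nr))  # like !r&nr
print(rows)      # prints: [0, 1, 2]
print(enum(0))   # prints: []
print(enum(-3))  # prints: [-3, -2, -1]
```

The negative case is what makes a request like -2 rows select from the end of each column, since the resulting indices count backward from the last item.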

As the first argument to invoking our table t like a function, the list 0 1 2 specifies that we want the 0th, 1st, and 2nd rows.

Now let's move on to the second argument which specifies the columns we want: (c&nc)@!t

Let's identify initial parts of speech again:

( c & nc ) @ ! t
  n v n    v v n

We can confidently simplify expressions within parentheses before proceeding. The expression (c&nc) calculates the lesser of c (desired columns) and nc (total columns):

  c
2
  nc
3
  c&nc
2

Which gives us this expression and parts of speech:

2 @ ! t
n v v n

What does this arrangement of nouns and verbs give us? Is ! applied monadically to t and then @ applied dyadically to 2 and the value of !t? Is ! applied dyadically to 2 and t and then @ applied monadically to 2!t? Does the verb-verb juxtaposition produce a new dyadic verb that accepts 2 and t as arguments? Let's explore to find out:

  2@!t             / the implementation's behavior
"name" "length"

  @!               / without surrounding nouns, interpreted as monadic @ and dyadic !
{@x!y}

  (@!)[2;t]        / nope
"d"

  @!:              / force monadic !
{@!:x}

  (@!:)[2;t] 
'ERROR lambda: too many arguments: got 2, expected 1

  @:!              / force monadic @
{@:x!y}

  (@:!)[2;t]       / nope again
"d"

  2@!              / lambda from partial application; expects 2 args, we only have 1 left
{2@x!y}

  2@!:             / partial application, force monadic !
{2@!:x}

  (2@!:)[t]        / found it!
"name" "length"

The above suggests that the right-most verb of a verb chain is assumed dyadic unless a noun right-argument is supplied; that the left-most verb will be invoked monadically by default, but dyadically if a left argument is supplied; and that all other verbs in between will be invoked monadically. Let's see if this generalizes:

  *@-!    / right-most verb dyadic, rest to the left monadic
{*@-x!y}

  2*@-!   / left-most verb dyadic, right-most dyadic, in-between ones monadic
{2*@-x!y}

  *@-!2   / all monadic
"I"

  2*@-!2  / left-most dyadic, rest to the right monadic
"II"

Put in perhaps simpler rules of thumb:

  • Start reading from right to left.
  • If it's a noun, it evaluates to itself.
  • If it's a verb and it has a right noun argument, it's monadic unless it also has an immediate left noun argument.
  • If it's a verb in final position with no noun arguments, it's dyadic.

This allows us to read (and write) expressions from right to left without having to keep a large evaluation context in our head.
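The rules of thumb above are mechanical enough to write down as a small Python sketch. It works on the same "parts of speech" strings used in this post (n for noun, v for verb), treats a parenthesized noun expression as a single n, and ignores adverbs entirely. The function name verb_arities is hypothetical, and this models only my reading of the rules, not Goal's actual parser.

```python
def verb_arities(parts):
    """Label each verb in a parts-of-speech string as monadic or dyadic.

    parts is a string such as "vnvn" (n = noun, v = verb). A verb is
    treated as dyadic when a noun sits immediately to its left, or when
    it is in final position with nothing to its right.
    """
    labels = []
    for i, part in enumerate(parts):
        if part != "v":
            continue
        has_left_noun = i > 0 and parts[i - 1] == "n"
        is_final = i == len(parts) - 1
        labels.append("dyadic" if has_left_noun or is_final else "monadic")
    return labels

print(verb_arities("vnvn"))  # !r&nr  -> ['monadic', 'dyadic']
print(verb_arities("nvvn"))  # 2@!t   -> ['dyadic', 'monadic']
print(verb_arities("vvvv"))  # *@-!   -> ['monadic', 'monadic', 'monadic', 'dyadic']
```

The labels come out in left-to-right order, one per verb, and agree with the lambdas the REPL printed for the same expressions.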

So in the expression (c&nc)@!t going from right to left, the verb ! is invoked monadically with t as its argument; the verb @ is invoked dyadically with the noun return value of (c&nc) on its left and the noun return value of !t on its right; and & is invoked dyadically with c and nc as arguments.

Monadic ! on a dictionary returns the dictionary's keys. Dyadic @ with numeric left argument i and array right argument y takes i items from y, padding if needed. Dyadic & as we already learned is min.

Given all of this, the expression (c&nc)@!t can be read as "take the lesser of the requested columns or total columns number of items from the list of keys found in t".

Substituting the evaluations of the two expressions we've analyzed in depth gives us this expression:

  t[0 1 2;"name" "length"]
!["name" "length"
  ("Nile" "Amazon" "Yangtze";6650 6400 6300)]

So for the remainder of this code observation t is now a dictionary of two keys, each with a 3-item array as its value.
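For comparison, here is a minimal Python sketch of the same trimming step, again using a dict of lists as a stand-in for the table. The function name trim is my own. It assumes a non-empty table and does not try to model what Goal does when a negative count reaches past the size of the table.

```python
def trim(t, r, c):
    """Keep up to r rows and c columns; negative counts select from the end."""
    nr, nc = len(next(iter(t.values()))), len(t)
    r, c = min(r, nr), min(c, nc)             # like r&nr and c&nc
    keys = list(t)                            # like !t
    keys = keys[:c] if c >= 0 else keys[c:]   # like (c&nc)@!t
    return {k: (t[k][:r] if r >= 0 else t[k][r:]) for k in keys}

t = {
    "name": ["Nile", "Amazon", "Yangtze", "Mississippi"],
    "length": [6650, 6400, 6300, 6275],
    "drainage": [3254555, 7000000, 1800000, 2980000],
}
print(trim(t, 3, 2))
# prints: {'name': ['Nile', 'Amazon', 'Yangtze'], 'length': [6650, 6400, 6300]}
```

Python needs separate slices for rows and columns, where Goal expresses both in a single indexing expression t[rows;columns].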

String formatting the smaller dictionary-as-table

Now that our table is sized correctly, we need to build up the string representation of the column headers and data rows.

Next up, we define a variable k with the column names (keys) of our dictionary-as-table:

  :k:!t
"name" "length"

Monadic ! when given a dictionary returns the dictionary's keys. This function expects the given dictionary to have string keys.

Next we produce string values for all the columns in our table and assign it to v:

  :v:(..?[(@x)¿"nN";p.f$x;$'x])'.t    / initial : makes this print the value assigned
("\"Nile\"" "\"Amazon\"" "\"Yangtze\""
 "6650" "6400" "6300")

The .. expression inside parentheses is syntactic sugar for a lambda expression. Simplified, this whole expression is:

{...}'.t

Which is "apply the lambda expression to each of the values in t", where "each value" is an array of values (a column in our table). Now let's see what the fields expression .. evaluates to:

  ..?[(@x)¿"nN";p.f$x;$'x]
{[p0;x]?[(@x)¿"nN";p0$x;$'x]}["%.1f";]

This is a lambda expression expecting 2 arguments, partially applied with one argument via the ["%.1f";] at the end. The supplied function argument is the value of f. Since Goal/K does not support functional closures but does support functional projections (partial application), this is a technique for capturing the value of an external variable. The .. expression knows to capture this value because of the p.f, where p. is specialized syntax for "projects the variable that follows" which in this case is f.

The .. syntax is especially powerful when working with dictionaries, but in this instance it is used solely for the ease of defining a function that needs to effectively close over the value of f.
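Python does have closures, so it would not normally need this technique, but functools.partial gives a close analogue of what the projection is doing: a two-argument function with its first argument fixed ahead of time. The names format_with and one_decimal below are mine, for illustration only.

```python
from functools import partial

def format_with(fmt, x):
    """Two-argument function: the format string first, the value second."""
    return fmt % x

# Fix the first argument now, leaving a one-argument function,
# much like the ["%.1f";] projection at the end of the Goal lambda.
one_decimal = partial(format_with, "%.1f")

print(one_decimal(6650.0))  # prints: 6650.0
print(one_decimal(96.0))    # prints: 96.0
```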

Now let's dig into the body of this function's definition, knowing that its primary responsibility is producing a string for every value in our dictionary t. Given that, this function's x will be each individual column's array of values from our table, which could be either an array of strings (one of the river names) or an array of integers (the river lengths or drainage areas).

The function's body consists of one large ?[...] expression, which is called cond and acts like an "if" statement in other languages. It contains three clauses, the first of which is a condition, where 0 represents false and 1 represents true. If the first clause is true, the second clause is evaluated; if false, the third clause is evaluated.

The condition of our cond is (@x)¿"nN". Dyadic ¿ (which has the alias in) indicates, for each item in its left argument, whether it is a member of its right argument. In this case, (@x) produces a single value, which is the type of x. So the in check here returns 1 if the type of x is either a floating-point number (type n) or an array of floating-point numbers (type N).

For our specific table, we know that it does not contain floating-point values at all, so only the third clause of this function's definition will be evaluated for our table. We can verify this by putting a debugging backslash into the definition temporarily:

  (..?[(\@x)¿"nN";p.f$x;$'x])'.t
"S"
"I"
("\"Nile\"" "\"Amazon\"" "\"Yangtze\""
 "6650" "6400" "6300")

This prints out "S" for the array of strings and "I" for the array of integers. If you're not convinced, put the backslash in other places as well to see how this expression evaluates.

Were one of our columns to have floating-point values, then our floating-point format argument f would be used and "%.1f"$x would be evaluated, formatting all items in x to a string representation of the number, rounded to one decimal place.

In our case, the simpler $'x clause is evaluated, which simply produces the default string value for each item in x. Note that while the dyadic $ for format is pervasive (no extra ' required), the monadic $ is not, since it would be unclear whether you wanted to form a string from an entire array or for each of its items.

Remember that fmt.tbl takes an argument f that is the format string that should be used to format floating-point numbers. In our case, we've specified %.1f which ensures floating-point numbers are printed to only one decimal place.
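The branching can be sketched in Python as follows. This is a simplified model with names of my own choosing: it treats a column as "floating-point" when every item is a Python float, and it wraps strings in double quotes without handling escapes, whereas Goal decides based on the array's type and produces a properly escaped string.

```python
def format_column(col, f):
    """Rough model of ?[(@x)¿"nN";p.f$x;$'x] for one column of values."""
    if all(isinstance(x, float) for x in col):
        return [f % x for x in col]  # like f$x with a format string
    # like $'x: default string form of each item (strings come out quoted)
    return ['"%s"' % x if isinstance(x, str) else str(x) for x in col]

print(format_column([172.0, 96.0], "%.1f"))       # prints: ['172.0', '96.0']
print(format_column(["Nile", "Amazon"], "%.1f"))  # prints: ['"Nile"', '"Amazon"']
print(format_column([6650, 6400], "%.1f"))        # prints: ['6650', '6400']
```

Only the first call uses the format string; the other two fall through to the default branch, just as our river table does.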

Finding the maximum width of each column

Now that we have all our values in the string format they'll be printed in, we can calculate the desired widths of each column and store that in w:

  :w:(-1+""#k)|(|/-1+""#)'v
9 6

If we look at our original invocation of fmt.tbl, we'll see that the first column is 9 characters wide (due to "Yangtze") and the second column is 6 characters wide (due to length). Now let's piece together how this is calculated.

To determine the width of a column, the number of characters of the widest item needs to be determined, considering both the column names (keys of t) and the column values (values of t). This expression, at its top level, does just that:

(-1+""#k)  |  (|/-1+""#)'v
           ^
           max verb

Let's look at the left and right arguments to | in turn.

  (-1+""#k)  / the implementation's behavior
4 6

/ Let's explore

  k
"name" "length"

  #k
2

  ""#k      / 1 greater than the length of the string
5 7

  #'k       / Oh right, in Goal strings are atoms, not arrays
1 1

  -1+""#k   / Subtract one to get actual length of each string
4 6

In (-1+""#k) starting right-to-left, we see first a dyadic use of # where the left argument is the empty string. For a left string argument to #, it finds the number of non-overlapping occurrences of that string within the right argument. For the empty string, the implementation considers both the beginning and the end of the word to match the empty string, so its length is 1 greater than the number of Unicode code points. This is a more succinct way to get the length of a string than converting the string to characters via "c"$y and then counting that with #.
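As it happens, Python's str.count behaves the same way for the empty string, which makes for a handy cross-check of the "one greater than the length" observation:

```python
k = ["name", "length"]

# Counting occurrences of the empty string yields len + 1, as ""#k does in Goal.
print([s.count("") for s in k])      # prints: [5, 7]

# Subtracting one recovers the actual lengths, like -1+""#k.
print([s.count("") - 1 for s in k])  # prints: [4, 6]
```

In Python you would simply call len, of course; the point is only that the counting rule for empty matches is not peculiar to Goal.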

Now let's dig into the right argument to |, starting with identifying the parts of speech. For this expression, we use a to denote adverbs:

( | / - 1 + "" # ) ' v
  v a v n v n  v   a n 

Whereas verbs are right-associative, adverbs are left-associative: they "cling to" and modify the form to their left. The adverb / modifies the max verb |, changing it from a verb that returns the max of two arguments to a verb that takes one argument and performs a fold or reduction using |, returning the maximum value of all values in its array argument:

  |/2 3 9 4 5
9

  |/0 1 0 0     / logical "or" 
1

If we look at the entire parenthesized expression (|/-1+""#), we notice that it's different than previous ones we've encountered so far. The expressions (-1+""#k) and earlier (@x) both end with a noun, and thus produce a noun value. The expression (|/-1+""#), however, ends with a verb and thus produces a function value:

  (-1+""#k)
4 6            / noun (array)

  @6650        / 6650 is an example value from our table t
"i"            / noun (string)

  (|/-1+""#)
{|/-1+""#x}    / function

Such function values can also be modified with adverbs, and so in this example we see the each adverb ', changing it from a function that would work on v as a whole to one that applies itself to each item within v, producing a new list of the return values of that function application.

NB: For functional programmers: / is roughly equivalent to reduce/fold, and ' is equivalent to map.

Now that we understand the overall structure of this expression, let's analyze it in depth at the Goal REPL:

  (|/-1+""#)'v    / the implementation's behavior
9 4

/ Let's explore

  v
("\"Nile\"" "\"Amazon\"" "\"Yangtze\""
 "6650" "6400" "6300")

  ""#v
(7 9 10
 5 5 5)

  -1+""#v
(6 8 9
 4 4 4)

  :xs:-1+""#v   / let's play with just the data as xs
(6 8 9
 4 4 4)

  */xs          / put * between the top-level items, which are themselves lists
24 32 36

  (6 8 9)*4 4 4
24 32 36

  */'xs         / put * between the items of each top-level item
432 64

  (6 * 8 * 9;4 * 4 * 4)
432 64

/ Now with our max function

  |/xs    / maximum width for each of our 3 rows
6 8 9

  |/'xs   / maximum width for each of our 2 columns
9 4

If you find the above use of * or | confusing, remember that many built-in verbs are pervasive. The * verb called with two numbers is simple multiplication, while * called with arrays does item-wise multiplication.
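The difference between folding across the top-level items and folding within each item can be sketched in Python with functools.reduce. The helper pervasive_max is my own stand-in for how dyadic | behaves when given two arrays; Python's built-in max is not item-wise on lists, so the helper spells that out.

```python
from functools import reduce

xs = [[6, 8, 9], [4, 4, 4]]

def pervasive_max(a, b):
    """Item-wise max of two equal-length lists, like dyadic | on arrays."""
    return [max(x, y) for x, y in zip(a, b)]

across = reduce(pervasive_max, xs)         # like |/xs
within = [reduce(max, col) for col in xs]  # like |/'xs

print(across)  # prints: [6, 8, 9]
print(within)  # prints: [9, 4]
```

The list comprehension in the second line is doing the job of the each adverb.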

We're now very close to understanding how w (the maximum width of each column) is defined. All that is left is replacing the expressions we've walked through with their values:

  (-1+""#k)|(|/-1+""#)'v   / whole expression for w
9 6

  (-1+""#k)                / left argument
4 6

  (|/-1+""#)'v             / right argument
9 4

  (4 6)|(9 4)              / simplified expression
9 6

  (4 | 9;6 | 4)            / pervasive item-wise max spelled out explicitly
9 6

We can visually confirm that our table, when printed, has a first column with width 9 and second column with width 6.

String padding for alignment

In order for our table to print with columnar alignment, string values either have to already be the maximum width of their respective column, or they have to be padded with spaces to meet that width.

Remember that k is our list of column names as strings, and v is our list of column lists with all values as strings.

The next expression in the implementation of fmt.tbl mutates these bindings to add the necessary padding to have all the values in each column be the width required of that column:

  (-w)!'´(k;v)
("     name" "length"
 ("   \"Nile\"" " \"Amazon\"" "\"Yangtze\"";"  6650" "  6400" "  6300"))

Once again, let's do parts of speech, simplify, and explore the evaluation:

( - w ) ! ' ´ ( k ; v )
  v n   v a a   n   n

Right-to-left, we first have a list with k and v as items:

  (k;v)
("name" "length"
 ("\"Nile\"" "\"Amazon\"" "\"Yangtze\"";"6650" "6400" "6300"))

We then have two adverbs and the verb !. The !' means apply ! to each item in its argument; that verb (not the original !, but the modified !') is then modified further with the eachright adverb ´. This fully modified !'´ verb expects two arguments; it takes its first (left) argument and partially applies it to the !' verb, and then applies that partially-applied verb to each item in its second (right) argument.

Before we can work out what behavior this dyadic !' will have, we need to evaluate its left argument, since the type and even sign of its left argument affect the behavior of the ! verb. Its left argument is (-w):

  -w
-9 -6

So the left argument is an array of numbers, and the right argument is an array of arrays of strings. When ! is invoked with a left numeric and right string argument, it performs string padding based on the numbers. If positive, padding is added to the right side of the string; if negative, to the left:

  9!"name"
"name     "

  -9!"name"
"     name"

  / dyadic each, k
  (-9 -6)!'"name" "length"    
"     name" "length"

  / dyadic each, v
  (-9 -6)!'("\"Nile\"" "\"Amazon\"" "\"Yangtze\"";"6650" "6400" "6300")
("   \"Nile\"" " \"Amazon\"" "\"Yangtze\""
 "  6650" "  6400" "  6300")

  / eachright spelled out explicitly:
  (((-9 -6)!'"name" "length");((-9 -6)!'("\"Nile\"" "\"Amazon\"" "\"Yangtze\"";"6650" "6400" "6300")))
("     name" "length"
 ("   \"Nile\"" " \"Amazon\"" "\"Yangtze\"";"  6650" "  6400" "  6300"))
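For a Python point of comparison, str.ljust and str.rjust cover the two padding directions. The helpers pad and pad_each below are hypothetical names of mine, and they model only the cases used here: a width paired with either a single string or a list of strings.

```python
def pad(width, s):
    """Rough model of Goal's i!s: positive pads on the right, negative on the left."""
    return s.ljust(width) if width >= 0 else s.rjust(-width)

def pad_each(widths, items):
    """Like the dyadic each in (-w)!'items: pair each width with one item."""
    return [
        [pad(n, s) for s in item] if isinstance(item, list) else pad(n, item)
        for n, item in zip(widths, items)
    ]

w = [9, 6]
k = ["name", "length"]
v = [['"Nile"', '"Amazon"', '"Yangtze"'], ["6650", "6400", "6300"]]

neg_w = [-n for n in w]
k, v = [pad_each(neg_w, side) for side in (k, v)]  # like (k;v):(-w)!'´(k;v)
print(k)     # prints: ['     name', 'length']
print(v[1])  # prints: ['  6650', '  6400', '  6300']
```

The outer comprehension over (k, v) stands in for eachright: the same widths are applied first to the column names and then to the column values.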

For those following along at the REPL, make sure to reassign k and v with these values:

  :(k;v):(-w)!'´(k;v)
("     name" "length"
 ("   \"Nile\"" " \"Amazon\"" "\"Yangtze\"";"  6650" "  6400" "  6300"))

Now that every value is the correct width relative to its column, it's time to stitch together our rows and columns with newlines and spaces before finally printing to STDOUT.

Stitch with newlines and spaces, then print!

When the left argument of the / adverb we've used for folding is a string instead of a verb, it functions as "join strings", using that left argument as the separator.

With that knowledge, let's take a look at the final expression of fmt.tbl:

  say"\n"/,/" "/(k;"-"*w;+v)
     name length
--------- ------
   "Nile"   6650
 "Amazon"   6400
"Yangtze"   6300
1

One last time, let's identify parts of speech:

say "\n" / , / " " / ( k ; "-" * w ; + v )
v   n    a v a n   a   n   n   v n   v n

Inside the parenthesized expression, we see two semicolons ;, meaning we have 3 items in this array. Our k list of column names remains unchanged; we have "-" repeated w times using dyadic *; and finally we flip (transpose) our arrays in v so that they are oriented row-wise for printing (instead of column-wise).

  k
"     name" "length"

  "-"*w
"---------" "------"

  +v
("   \"Nile\"" "  6650"
 " \"Amazon\"" "  6400"
 "\"Yangtze\"" "  6300")

  (k;"-"*w;+v)
("     name" "length"
 "---------" "------"
 ("   \"Nile\"" "  6650";" \"Amazon\"" "  6400";"\"Yangtze\"" "  6300"))

  #(k;"-"*w;+v)
3

  #'(k;"-"*w;+v)
2 2 3

The parenthesized expression is now a list of lists. Each inner list needs to be stitched together into a single string, joined with spaces. The " "/y expression will do just that:

  " "/(k;"-"*w;+v)
("     name length"
 "--------- ------"
 "   \"Nile\"   6650" " \"Amazon\"   6400" "\"Yangtze\"   6300")

  #" "/(k;"-"*w;+v)
3

  #'" "/(k;"-"*w;+v)
1 1 3

In the last expression, we see that each of the first two arrays has been joined into a single string, but the last one is still a list of 3 strings. All of these now need to be joined together with a newline to form the final table string that will be printed.

What happens if we try to join this 1 1 3 array with newline directly?

  say"\n"/" "/(k;"-"*w;+v)
     name length --------- ------    "Nile"   6650
 "Amazon"   6400
"Yangtze"   6300
1

The "\n"/y formulation tried to join the first two strings, but there was nothing to join. It then found the third array with 3 items and joined them with newlines. Instead, we want all of the lines to be joined, so we need to flatten this list.

We can use dyadic , (join) to concatenate arrays, and we can modify , with / to fold it over all the arrays in our argument, and then join that flattened array with newlines:

  say"\n"/,/" "/(k;"-"*w;+v)
     name length
--------- ------
   "Nile"   6650
 "Amazon"   6400
"Yangtze"   6300
1

This can be a little confusing, because / in this expression is used both for joining strings and for a fold/reduction. If the left side of / is a verb, it's a fold; if it's a string, it's string joining; see the Goal help for all other uses of /.
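The same three moves (join with spaces, flatten, join with newlines) can be sketched in Python, starting from the already-padded k and v. The variable names are mine; zip(*v) plays the role of the transpose +v.

```python
k = ["     name", "length"]
w = [9, 6]
v = [['   "Nile"', ' "Amazon"', '"Yangtze"'], ["  6650", "  6400", "  6300"]]

header = " ".join(k)                       # column names joined with spaces
rule = " ".join("-" * n for n in w)        # like "-"*w, then joined
rows = [" ".join(row) for row in zip(*v)]  # transpose, then join each row

lines = [header, rule] + rows              # flatten into one list, like ,/
table = "\n".join(lines)                   # like "\n"/
print(table)
```

Printing table produces the same five lines as the Goal version, minus the trailing 1, which is the return value of say rather than part of the table.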

Wrapping Up

One of the primary virtues of array programming languages is concision. Even while writing this post, I've copied and pasted the entire implementation multiple times to keep it within visual distance of what I'm writing in my editor.

At 4 lines and around 40 verb/function and adverb calls, fmt.tbl accomplishes quite a bit in a short syntactic space:

  • Calculates total rows and columns and prints a header with that data (nr, nc)
  • Narrows the table to the requested row x column size (t)
  • Formats the table values as strings (v)
  • Calculates the maximum width of each column (w)
  • Pads the column names and table values with spaces to meet those widths (k, v)
  • Builds and prints a single string of the table's column names and data

The definition of fmt.tbl also highlights a number of Goal/K language features:

  • Function literal syntax ({})
  • Named arguments and, unlike the APL language family, support for more than 2 arguments ([t;r;c;f])
  • Multiple assignment ((nr;nc):(#*t;#t))
  • Expression separation with ; for more than one expression per line
  • String interpolation ("=== Table ${nr}x$nc ===")
  • Indexing into a dictionary (t[...])
  • Verb composition (#*)
  • Field expression syntax (..) with variable projection (p.f$x)
  • Cond expression (?[...])
  • Verb usage for: # * say ! & @ ¿ $ . | - +
    • Including different behaviors supported by these verbs based on arity and types
  • Adverb usage for: ' / ´

I hope you've found this code observation helpful. Writing it has certainly helped solidify my understanding of Goal and given me good practice with some of its core features.

I want to thank anaseto for reviewing a draft of this post and providing great feedback and corrections. Any remaining mistakes are my own.


Tags: k-language code-observation array-programming goal-language

Copyright © 2024 Daniel Gregoire