Playing With Lua Undefined Behaviors

Lua is a scripting language suitable for embedded systems. In fact, its interpreter has a small memory footprint and an high speed (there is also a JIT compiler available on some architectures).

The expression "undefined behavior" describes

the result of executing computer code written in a programming language for which the language specification does not prescribe how that code should be handled. - Wikipedia

Now that you know what I'm talking about, I'm gonna show you an odd thing with Lua tables. But, wait! What is a table in Lua?

Brief Description Of Lua Tables

According to the official documentation:

Tables are the sole data structuring mechanism in Lua; they can be used to represent ordinary arrays, sequences, symbol tables, sets, records, graphs, trees, etc.

When using tables as arrays, there are some helps for common operations, such as the operator # that returns the number of elements and the function table.sort() for sorting. Unfortunately, if used with tables that are not array may behave in a strange way.

The Function

Take this simple function:

function print_sort_print(a)  
    print("Number of elements: "..#a)
    print("Access with ipairs")
    for k,v in ipairs(a) do
        print("\t"..k.."="..v[1]..v[2]..tostring(v[3]))
    end
    print("Access with pairs")
    for k,v in pairs(a) do
        print("\t"..k.."="..v[1]..v[2]..tostring(v[3]))
    end
    print("--------------------")
    table.sort(a,
        function (a,b)
            if (a == nil) then
                return false
            end
            if (b == nil) then
                return true
            end
            if (a[3] == b[3]) then
                return (a[1] <= b[1])
            end
            return (a[3] <= b[3])
        end)
    print("Number of elements: "..#a)
    print("Access with ipairs")
    for k,v in ipairs(a) do
        print("\t"..k.."="..v[1]..v[2]..tostring(v[3]))
    end
    print("Access with pairs")
    for k,v in pairs(a) do
        print("\t"..k.."="..v[1]..v[2]..tostring(v[3]))
    end
end  

It takes a matrix prints its items (with two functions ipairs() specific for arrays, and pairs() that is more general), than sorts the table and prints it again.

The custom sorting function, orders by the third element of each row of the matrix.

Oddity Number 1

Suppose I'm inserting some hard coded values into an array and, for a typo, I miss a number (the 8, in the following snippet).

a = {}  
a[1] = {'a', 'b', 5}  
a[2] = {'c', 'd', 1}  
a[3] = {'e', 'f', 2}  
a[4] = {'g', 'h', 9}  
a[5] = {'i', 'j', 3}  
a[6] = {'i', 'j', 2}  
a[7] = {'i', 'j', 8}  
a[9] = {'i', 'j', 7}

print_sort_print(a)  

The result is:

Number of elements: 7  
Access with ipairs  
    1=ab5
    2=cd1
    3=ef2
    4=gh9
    5=ij3
    6=ij2
    7=ij8
Access with pairs  
    1=ab5
    2=cd1
    3=ef2
    4=gh9
    5=ij3
    6=ij2
    7=ij8
    9=ij7
--------------------
Number of elements: 7  
Access with ipairs  
    1=cd1
    2=ef2
    3=ij2
    4=gh9
    5=ij3
    6=ab5
    7=ij8
Access with pairs  
    1=cd1
    2=ef2
    3=ij2
    4=gh9
    5=ij3
    6=ab5
    7=ij8
    9=ij7

As you can see, the item count ignores the last item, behaving like an array, but the sorting fails. Without the two initial ifs in the sort function, it would have raise an exception while trying to access the third element of a nil value.

Oddity Number 2

The above situation appears to happen with every missing position, except the third. Look what's the result now:

b = {}  
b[1] = {'a', 'b', 5}  
b[2] = {'c', 'd', 1}  
b[4] = {'e', 'f', 2}  
b[5] = {'g', 'h', 9}  
b[6] = {'i', 'j', 3}  
b[7] = {'i', 'j', 2}  
b[8] = {'i', 'j', 8}  
b[9] = {'i', 'j', 7}

print_sort_print(b)  
Number of elements: 9  
Access with ipairs  
    1=ab5
    2=cd1
Access with pairs  
    1=ab5
    2=cd1
    4=ef2
    5=gh9
    6=ij3
    7=ij2
    8=ij8
    9=ij7
--------------------
Number of elements: 8  
Access with ipairs  
    1=cd1
    2=ef2
    3=ab5
    4=ij2
    5=ij3
    6=ij7
    7=ij8
    8=gh9
Access with pairs  
    1=cd1
    2=ef2
    3=ab5
    4=ij2
    5=ij3
    6=ij7
    7=ij8
    8=gh9

The number of items is 9 (?) before the sorting and then (oddly enough) 8, but the array is still unsorted.

Conclusions

First of all, don't play with undefined behaviors. Even if you think you can find a pattern, nobody tells you that this will work in any situation. In addition, I suggest you to do really exhaustive tests for dynamically typed languages, because there's no enforcement on data.


Cover image taken from Wikimedia Commons (public domain).

Luca Sommacal

Luca Sommacal

Italian developer (mainly in C for embedded platforms), Linux learner, addicted to rock music, history, science and few other things. Follow me on Twitter

comments powered by Disqus