About

I am a software developer in Seattle, building a new AI software company.

Ads

May 2008

Sun Mon Tue Wed Thu Fri Sat
        1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

Recent Posts

Ads


« April 2007 | Main | June 2007 »

May 15, 2007

Coming Along

NStatic is not yet in beta. I have a couple of people on NStatic, and am still fixing a number of bugs.

I also have events going on in my life. (surprise) My parents and siblings flew in the first week of May to visit me, which I targeted mid-to-late date for closure, since these events are disruptive to my flow. I have also been unexpectedly invited to a Microsoft compiler lab event next week to attend, which will most likely be focused on recent announcements such as Silverlight, DLR and the Orcas beta.

It'll be beta when it's ready; stuff takes time to cook. There are no more major changes or additions; the existing feature set is good enough.

Symbolic Computing

I picked up from Slashdot that Mathematica 6, a program for doing computer algebra, was released. Among the features are equational theorem proving, which is similar to the work that I am doing.

More than any other product, Mathematica embodies symbolic computing, and this recent post on the Wolfram blog, Symbolic Programming Visualized hints why. The Mathematica language is based entirely on repeated transformations on symbolic expressions through pattern-matched rules.

I have never seen any mention of symbolic computing in anyone's dreams for future programming languages. Instead, I see a laundry list of incremental features for the next "big" programming language. Why can't the language be small, incorporating only the most expressive ideas.

Symbolic computing is the blind spot of the technology community. Even Turing thought of both brains and computers as manipulating symbols. Methinks, several years from now, we will probably see again the next revenge of LISP, which are operations on symbols.

According to Microsoft researchers in their roadmap  "towards 2020 science," symbolic computation will not be integrated into programming languages until 2012. (We do see a precursor in LINQ expression trees, being introduced in Visual Studio Orcas. There are also glimpses in some of the newer functional languages.) This gives me five years headstart.

If we want intelligent software, symbolic computing is the way to get there.

Quantification

Spec# and JML both support quantification operators like forall and exist as seen in lines 9 and 10 of this Spec# code sample:

   1:  class ArrayList { 
   2:  void Insert(int index , object value)
   3:      requires 0 <= index && index <= Count
   4:          otherwise ArgumentOutOfRangeException;
   5:      requires !IsReadOnly && !IsFixedSize
   6:          otherwise NotSupportedException;
   7:      ensures Count == old(Count) + 1;
   8:      ensures value == this[index ];
   9:      ensures Forall{int i in 0 : index ; old(this[i]) == this[i]};
  10:      ensures Forall{int i in index : old(Count); old(this[i]) == this[i + 1]};

I don’t actually directly support quantifiers in the reasoning system used by NStatic. However, because the system completely support functions, quantifiers are actually redundant; I can reduce quantifiers to equivalent unquantified functional expressions.

The most common use of quantifiers seems to be iterating over all elements of an array. In such cases, quantifier keywords can be imitated by C# language constructs.

For the expression Forall{int i in index : old(Count); old(this[i]) == this[i + 1]} from the above code.

     for (int i = index; i < oldCount; i++)
           Debug.Assert(oldArray[i] == newArray[i+1]);

Alternatively and equivalently, the same test could be rewritten as a function called inside of an Assert statement.

    bool TestArray(int[] oldArray, int[] newArray, int index, int oldCount)
    {
        for (int i = index; i < oldCount; i++)
            if (oldArray[i] != newArray[i+1])
                return false;
        return true;
    }

    Debug.Assert(TestArray(oldArray, newArray, index, oldCount);

In the first case, the Assert statement was inside the loop and, in the second case, it was left outside. In either case, they both perform the same operation and are, therefore, normalized to same internal representation.

This illustrates the basic operation of NStatic. Within the limits of the simplifier inside NStatic, straightforward, equivalent fragments of code are normalized to the same representation, regardless of constructs used.

We could define quantification operators more generally as higher-order functions:

[Conditional("CODEANALYSIS")]
bool ForAll(int start, int end, Func<int, bool> predicate)
{
     for (int i = start; i <=end; i++)
           if (!predicate(i))
                 return false;
     return true;
}

[Conditional("CODEANALYSIS")]
bool ThereExists(int start, int end, Func<int, bool> predicate)
{
     for (int i = start; i <=end; i++)
           if (predicate(i))
                 return true;
     return false;
}
Debug.Assert(ForAll(i, index, oldCount-1, delegate(int i) { return oldArray[i] == newArray[i+1]; });

Other types of quantifiers could also be created this way such an existential uniqueness operator. The initial release of NStatic though will probably not have enough simplifications to enable these other quantifiers, due more to lack of time rather than any implementation difficulties.

I inserted the "Conditional" attribute is there simply to avoid compilation into code. Otherwise, the C# compiler would happily compile the code and, if the difference between start and end happened to be a large value like int.MaxValue, the runtime execution of this code could hang for a long time.

NStatic doesn't execute recursive functions and loops, unless such expressions are determined to terminate in a short amount of time and space. Instead, NStatic evaluates loops and recursive functions symbolically by recognizing patterns in the loop and simplifying them, allowing its analysis of the same expression to proceed quickly.  The simplified representation of the loop is then used as its value.

For quantifying over non-indexable sets, we could have functions over collections or functions that walk through composite data structures. To quantify over all the objects in the heap, though, it would seem we may need a special quantifier keyword or class invariants, which I won't be adding until support for specifications in a future version.