Principle of Most Power

January 11, 2008

Turing Completeness

Yesterday was Donald Knuth's seventieth birthday. I used to own the three volumes of his Art of Computer Programming, and have even quoted the book once in my blog regarding coroutines. Although the content was valuable, I could not tolerate the MIX assembly language that Knuth used to implement his algorithms.

Donald Knuth also invented TeX, a markup language for typesetting. As proof of its greatness, Mark Chu-Carroll notes that TeX remains dominant thirty years after Knuth wrote the original version. It also happens to be Turing-complete. There are other widely used languages that are Turing-complete, such as PostScript for generating graphical images. The template facility in C++ is Turing-complete by accident, and it is used for metaprogramming and for performing all sorts of optimizations that used to be the compiler's job.

Principle of Least Power

A few years ago, I encountered Tim Berners-Lee's "Principles of Design," which lists a number of design principles for the Web architecture. One of the principles listed is the "Principle of Least Power," which states that we should use a low-power language that is easy to implement and use. Berners-Lee writes:

In choosing computer languages, there are classes of program which range from the plainly descriptive (...HTML) through logical languages of limited power ...which include limited propositional logic, through declarative languages which verge on the Turing Complete (PDF) through those which are in fact Turing Complete though one is led not to use them that way (XSLT, SQL) to those which are unashamedly procedural (Java, C).

The choice of language is a common design choice. The low power end of the scale is typically simpler to design, implement and use, but the high power end of the scale has all the attraction of being an open-ended hook into which anything can be placed: a door to uses bounded only by the imagination of the programmer.

Computer Science in the 1960s to 80s spent a lot of effort making languages which were as powerful as possible. Nowadays we have to appreciate the reasons for picking not the most powerful solution but the least powerful. The reason for this is that the less powerful the language, the more you can do with the data stored in that language. If you write it in a simple declarative form, anyone can write a program to analyze it in many ways. The Semantic Web is an attempt, largely, to map large quantities of existing data onto a common language so that the data can be analyzed in ways never dreamed of by its creators. If, for example, a web page with weather data has RDF describing that data, a user can retrieve it as a table, perhaps average it, plot it, deduce things from it in combination with other information. At the other end of the scale is the weather information portrayed by the cunning Java applet. While this might allow a very cool user interface, it cannot be analyzed at all. The search engine finding the page will have no idea of what the data is or what it is about. Thus the only way to find out what a Java applet means is to set it running in front of a person.

The principle makes sense for the Web, where a new device or a new application with unforeseeable constraints may need to interact with or browse the Internet. However, I do not think that we should automatically apply this principle to other domains, yet I have seen this particular design principle crop up in a few blog posts since then. There are widely used "standard" languages, like the PostScript mentioned earlier, that are Turing-complete. When I look at XAML, I see a language crying out for Turing power, because I see a lot of half-measures like resources, triggers, property expressions, templates, and so on.

Another View -- Principle of Most Power

In "Software Factories and the modeling problem," Steve Maine posits that a triangular tradeoff between efficiency, generality, and precision exists for any modeling language, for which I don't think is necessarily true. I think that his examples of languages favoring generality but sacrificing efficiency involve procedural languages like C# and Cw, which are more complicated that purely functional languages.

My own work over the years has focused on designing a language that is Turing-complete in some respects and provides many of the advantages of functional and logical languages. My concern was that some of the declarative languages used in academia were not practical or fast enough for building applications.

The primary language is C#, but programs in the secondary language are constructed and executed at runtime. Being functional, the language consists of immutable symbolic expressions, which are much easier to analyze than procedural programs. The language also incorporates various forms of logical reasoning. It is used for representing all types of data and knowledge, including documents, mathematical expressions, and natural language.
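To give a rough idea of what I mean by immutable symbolic expressions, here is a minimal sketch in C#; the names and representation are only illustrative, not the actual implementation. Because an expression never changes after it is built, analyzing it is a pure traversal over data.

    using System;
    using System.Collections.ObjectModel;
    using System.Linq;

    // Illustrative sketch only: an immutable symbolic expression is either an
    // atom (number, string, symbol name) or a head applied to a fixed list of
    // immutable arguments.
    public sealed class Expr
    {
        public string Head { get; private set; }
        public object Value { get; private set; }              // non-null for atoms
        public ReadOnlyCollection<Expr> Args { get; private set; }

        private Expr(string head, object value, Expr[] args)
        {
            Head = head;
            Value = value;
            Args = new ReadOnlyCollection<Expr>(args ?? new Expr[0]);
        }

        public static Expr Atom(object value)
        {
            return new Expr(value.GetType().Name, value, null);
        }

        public static Expr Apply(string head, params Expr[] args)
        {
            return new Expr(head, null, (Expr[])args.Clone());
        }

        // Analysis is just recursion over an unchanging tree.
        public int CountNodes()
        {
            return 1 + Args.Sum(a => a.CountNodes());
        }

        public override string ToString()
        {
            return Value != null
                ? Value.ToString()
                : Head + "(" + string.Join(", ", Args.Select(a => a.ToString()).ToArray()) + ")";
        }
    }

    class ExprDemo
    {
        static void Main()
        {
            // Plus(2, Times(x, 3)) is built from C# and then inspected like data.
            var e = Expr.Apply("Plus", Expr.Atom(2),
                               Expr.Apply("Times", Expr.Atom("x"), Expr.Atom(3)));
            Console.WriteLine(e);               // Plus(2, Times(x, 3))
            Console.WriteLine(e.CountNodes());  // 5
        }
    }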

My other concern was non-termination. Execution of the language proceeds by partial simplification, which always terminates in a reasonable amount of time, enabling its use within an interactive application. I satisfy myself with a notion of Turing-expressiveness in which the partially simplified expression is extensionally equivalent to its fully simplified value.
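Here is a sketch, again only illustrative, of what bounded simplification looks like: the rewriter carries a step budget, and when the budget runs out it returns whatever it has reduced so far. The partial result denotes the same value as the fully simplified one; only less rewriting work has been spent on it.

    using System;

    // Illustrative sketch of partial simplification with a step budget.
    abstract class Term { }

    sealed class Num : Term
    {
        public readonly int Value;
        public Num(int value) { Value = value; }
        public override string ToString() { return Value.ToString(); }
    }

    sealed class Add : Term
    {
        public readonly Term Left, Right;
        public Add(Term left, Term right) { Left = left; Right = right; }
        public override string ToString() { return "(" + Left + " + " + Right + ")"; }
    }

    static class Rewriter
    {
        // Rewrite until a normal form is reached or the budget is exhausted.
        public static Term Simplify(Term t, ref int budget)
        {
            if (budget <= 0) return t;            // out of steps: stop early
            var add = t as Add;
            if (add == null) return t;            // numbers are already values

            budget--;
            Term left = Simplify(add.Left, ref budget);
            Term right = Simplify(add.Right, ref budget);

            var ln = left as Num;
            var rn = right as Num;
            if (ln != null && rn != null && budget > 0)
            {
                budget--;
                return new Num(ln.Value + rn.Value);   // constant folding
            }
            return new Add(left, right);               // partially simplified
        }
    }

    class RewriterDemo
    {
        static void Main()
        {
            Term t = new Add(new Add(new Num(1), new Num(2)),
                             new Add(new Num(3), new Num(4)));

            int small = 3;
            Console.WriteLine(Rewriter.Simplify(t, ref small));  // (3 + (3 + 4))

            int large = 100;
            Console.WriteLine(Rewriter.Simplify(t, ref large));  // 10
        }
    }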

Many Different Languages

Usually, advocates of the "Principle of Least Power" also believe that, in the future, people will be using all sorts of languages, because no one language can serve every purpose.

I do not agree that we need a dozen different languages; I would prefer just one. With all the new LINQ features, I am seriously considering moving all my scripts from various dynamic languages to C# 3.0 with Dot Net Script or Code Runner.Net.
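The kind of script I have in mind looks like this in C# 3.0 with LINQ; the log file name and field position are made-up assumptions for the example, not anything from a real project.

    using System;
    using System.IO;
    using System.Linq;

    // A small script-like task: group the lines of a (hypothetical) web server
    // log by status code and print the counts, using a LINQ query.
    class LogSummary
    {
        static void Main()
        {
            var counts =
                from line in File.ReadAllLines("access.log")   // assumed file name
                let fields = line.Split(' ')
                where fields.Length > 8
                group line by fields[8] into g                  // assumed status-code field
                orderby g.Count() descending
                select new { Status = g.Key, Count = g.Count() };

            foreach (var entry in counts)
                Console.WriteLine("{0}: {1}", entry.Status, entry.Count);
        }
    }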

One of the original goals of C++ was to end the use of many specialized languages by supporting the creation of libraries for custom data types through templates, operator overloading, and object orientation. One problem with operator overloading in the C-based languages was that the operators really did get "overloaded," because there were very few standard operators offered in the language, so the same symbols end up carrying many unrelated meanings.
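Since C#, the primary language here, supports the same technique, a small example of a library type with an overloaded operator makes the point; the Complex type below is only an illustration. Notice that + in this one program already means integer addition, string concatenation, and now complex addition, which is exactly the symbol shortage described above.

    using System;

    // A custom data type packaged as a library, with arithmetic syntax
    // provided by operator overloading.
    struct Complex
    {
        public readonly double Re, Im;
        public Complex(double re, double im) { Re = re; Im = im; }

        public static Complex operator +(Complex a, Complex b)
        {
            return new Complex(a.Re + b.Re, a.Im + b.Im);
        }

        public override string ToString() { return Re + " + " + Im + "i"; }
    }

    class ComplexDemo
    {
        static void Main()
        {
            var sum = new Complex(1, 2) + new Complex(3, -1);
            Console.WriteLine(sum);   // 4 + 1i
        }
    }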

Emerging trends include metaprogramming and integrated support for DSLs within general-purpose languages, allowing these DSLs to use the symbols and rich computational capabilities of the host language.
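C# 3.0's expression trees are one small example of this trend: the host language hands an embedded DSL its own syntax as data, which the DSL can then inspect, translate, or execute. A minimal sketch:

    using System;
    using System.Linq.Expressions;

    // The lambda below is not compiled directly to IL; the C# compiler captures
    // it as an expression tree that a DSL implementation could inspect,
    // translate (say, to SQL), or simplify.
    class EmbeddedDslDemo
    {
        static void Main()
        {
            Expression<Func<double, double>> rule = x => x * x + 1;

            var body = (BinaryExpression)rule.Body;   // the "+" node
            Console.WriteLine(rule);                  // x => ((x * x) + 1)
            Console.WriteLine(body.NodeType);         // Add
            Console.WriteLine(body.Left);             // (x * x)

            // The same tree can also be compiled and run by the host language.
            Console.WriteLine(rule.Compile()(3));     // 10
        }
    }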

I think the various advantages and disadvantages of dynamic and static languages (and other language classifications) stem from existing implementations rather than from anything fundamental, and they should disappear over time.

This is not to say there won't be many languages in the future. We now have a common language runtime and a dynamic language runtime that raise the base level of new languages with automatic support for garbage collection and reflection. Microsoft is also working on the Phoenix project, which promises a common compiler infrastructure so that many different languages can benefit from advances in parsing, compilation, and optimization technologies at the same time.
