About

I am a software developer in Seattle, building a new AI software company.

Ads

May 2008

Sun Mon Tue Wed Thu Fri Sat
        1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

Recent Posts

Ads


« July 2007 | Main | September 2007 »

August 22, 2007

Source Code

Scott Hanselman's post on Weekly Source Code 2 reminded me of my desire to post some of my source code. I'll probably do just that, but, with me and my short-term memory loss, I often forget to follow up until after a few years go by.

A couple of my classes come to mind:

  • Rational number class. This is an implementation of a general-purpose 16-byte data type that can precisely represent rational numbers (up to a certain precision), long (both signed and unsigned), and doubles. The code is written in IL to take advantage of extended-precision floating point. More info...
  • Fast, persistent hash-tables and array. This is one of my two sets of persistent collection classes. Similar to persistent arrays used by the COQ theorem prover for fast, functional union-find data structures, these data types are as fast as their imperative cousins and are ideal for backtracking. Unfortunately, my implementation is not thread-safe. My second set of persistent data structures is fully "functional" and thread-safe, but is more connected to my code-base.

If I haven't posted any source code by the end of September, give me a ring.


I have been meaning to post some regular expression matching implementations from my post in December: one using closures and another less efficient implementation using iterators. Both are functional, but imperative implementations can be compact too. I won't get around to doing it until after I ship NStatic, though.

Several years ago, I wrote a fast pattern-matcher and a slower unification engine (bidirectional pattern-matching) for expressions. Both algorithms provided a superset of regular-expression capabilities including wildcards and disjunctions.  (For unification, one regular expression pattern might be matched against another pattern.) They also handled other phenomena such as associativity and commutativity. By going completely functional, I was able to merge both algorithms into one, fast algorithm and dramatically reduce the total number of lines down to about 500.

I had written a regular expression engine for a college class based on NDFAs, the code for which was substantial. I also know that Microsoft's Rotor implementation includes a large number of files, so I was surprised by the compactness and directness of my functional pattern matcher implementation.

Since then, I have been (re-)writing most of my code in the functional style with the same consistent results. Programming in a functional manner entails heavy use of closures (anonymous methods), which is only really practical in C# 2.0+, and greatly eliminates the need for temporary or accidental classes.

August 18, 2007

Scalability

I wrote my NStatic tool to perform quickly with the goal of scaling linearly with the size of the code base. Analysis of a single terminal function is done in a single pass and is basically linear with the size of a function. Performance is impacted more by the call depth of some of the functions than by the number of functions in the code base.  Even with deep calls, I found ways of keeping analysis tractable through lazy and cached evaluations.

The main issues I am currently grappling with are some scalability issues that I discovered while scanning Rotor (the Shared Source Common Language Implementation). I found some interesting errors in Rotors, which I'll eventually post in my blog.

I also found some gotchas as well. My tool zipped through hundreds of functions in a split second before stalling in a function, which wasn't all that complex. Some of my transformations, while seemingly simple, exhibit exponential characteristics. This occurs when a function application duplicates an argument.

f(x) => x op x

Repeated forward use of such transformations can lead to large expressions that take long to evaluate or even print.

I found a general, elegant solution that I have been implementing. Ideally, the normal form should be at least as compact as the original form [considering the fully expanded lambda expression representation of f], preferably more so. With constant and atomic arguments, this is the case. For more complex arguments, the reverse transformation is applied: The left hand side of the transform becomes the normal form. It's counterintuitive, but there's no reason why we can't continue to simplify this normal form as the two sides of the equation have the same value.

August 17, 2007

The Western Tradition

One of the many sets of books in my father's library include the "The Story of Civilization" collection by Will & Ariel Durant, parts of which I read. I did not have the patience to read through all eleven volumes, but I did go through his paperback, "Story of Philosophy". I was looking forward to taking a "History of Civilization" course in college, but unfortunately Harvard did not offer such a course.

About ten years ago, I located at the local Bellevue library a set of videos on the "Western Tradition," based on a series of college lectures by UCLA professor Eugen Weber, which I enjoyed. The videos contain thousands of images from the Metropolitan Museum of Art and include cool music. It's perfect for those people who rather watch the movie than read the book. I just discovered that the videos are available for free on-demand viewing at the Annenberg Foundation's Learner.org website.

Ribbon UI

Patrick Smacchia, developer of NDepend, asked me how he could improve the user interface of his product. I referred to my experience with the three main component suites that I use in NStatic and am satisfied with: These are DevComponent's DotNetBar, Actipro Software's SyntaxEditor, and Developer Express's DXExperience; they are all outstanding. I really care as much about form as function, and they all deliver both.

NDepend's user interface was somewhat unorthodox, and I felt that he could make a better impression with a standard user interface, which apparently he is moving to in version 2.4. Patrick just put up a post on "UI Matters: Menus and Toolbars vs Office 2007 Ribbons."

He raised the question of whether the Ribbon user interface was more appropriate that the traditional "command-bar" user interface used in Visual Studio and repeated a remark from Bob Powell, a UI guy.

"If developers are your users, the closer your UI to Visual Studio, the better."

NStatic, being a developer tool, Patrick suggested, should resemble Visual Studio rather than Office. He also mentioned that if one I integrated my product into VS the user experience impact would be minimal.

I have actually thought about the proximity of the user interface to Visual Studio as can be seen in my prior post on Toolbar Icons in 2004.

I do believe that an application has to match the Office user interface to be respected. First, most users are familiar with Microsoft applications. Users expect other applications to use the same shortcuts and operate the same way. They tend to be more receptive to an application that feels natural than to one that requires learning a new interaction model. Second, users are most likely to pay for a product that appears similar to another product that they have already paid for. Users often judge the quality of products through non-merit-based, surface-level heuristics, such as the conformance to a well-known, well-respected standard like the Office user interface.

While I have examined any data, met with any users, or conducted any usability studies, I thought the Ribbon interface was the best path for me to take.

Much of the VS UI functionality has been retained: I switched my text editor to Actipro Software because of its higher fidelity to Visual Studio. The window docking behavior is preserved and many of Visual Studio's task panes have equivalents in the new NStatic application. I am determined to reproduce many of the shortcuts and behavior as much as possible, so that learning costs are minimal. As many VS developers also use Office, learning the ribbon is not likely to be a problem.

The Ribbon UI, while apparently a dramatic improvement, can be seen as a logical evolution of the work on combined menus/toolbars (aka, "command bars") that Office group started in Office 97 and continued forward ever since. I was never a fan of the original command bar work (which merged menus/toolbars and flatten the look to make them more web-like) and emphasized rarely used features like detachable menus, custom toolbars, and vertical/bottom placement. The command bar interface always felt dated and sub par compared with what can be seen on the web and other applications.

The new ribbon is more discoverable, properly designed by actual industrial designers, makes better use of the space, and has a lot more useful properties such as hosting controls. It's also visually arresting and many readers have called the NStatic interface, stunningly beautiful--a compliment that I would not expect with the traditional command bar interface. Office 2007 is slick, and Office sales seem to reflect that.

The ribbon fits in with my strategy much better: My goals including building a number of applications, some of which are non-development, off of the same code base, and minimizing unnecessary differences between applications. I may attempt to release a ribbon-based notepad. One disadvantage with the ribbon, though, is that it requires more commands to fill up the user interface.

Symbolic Ray-Tracing

I thought of an approach to ray-tracing that could result in faster performance using symbolic evaluation. I mention it here to illustrate the benefits of thinking symbolically and also because I don't see myself writing a ray tracer anytime in the next decade.

Images typically contain stretched out areas that correspond to the surfaces of individual objects. While each pixel in an area may have a unique color value, all of the pixels may easily computed by a single, simple function, even for textured surfaces. In theory, the applicable area of a function could be extended to the full dimensions of the image by using conditionals, but such a function would be unwieldy. 

Instead of calculating a color value for each ray we shoot out from each screen pixel, we could produce a partially evaluated function that returns a color. (This could work recursively in the event of reflected surfaces.) The computation for a single pixel would suffer, but the function could be reused to fill out the area occupied by adjacent related pixels, which also share the same function. Search and anti-aliasing costs for these pixels would be eliminated.

One approach to area detection is to produce a second function, which returns a boolean value indicating whether the first function is still valid for a given point, based on the information gathered from intersection tests of the original pixel. The second function basically serves as the boundary test for the flood fill operation. With this approach, we have the choice of making our first function a compiled function constructed from closures, a partially evaluated function represented as data, or a combination of both. Compiled functions might be faster, but data representations, being symbolic, could produce exact results.

Since this approach reduces work for typical images, I suspect that high quality images could one day be produced in the same amount of time that lowest quality images currently take without special hardware. 

August 09, 2007

LISP

Randal Munroe posted some "XKCD" comics on LISP, which I thought were  especially relevant to my situation.

This one below drawn a while back is called ""LISP" and captures my fascination with functional programming and its remarkable ability to express simply and elegantly everything about the world.

This more recent one called "LISP Cycles" conveys my attempt to use the light side of the "force," functional programming, to overcome the dark side, imperative programming (and --shhh!-- singlehandedly defeat the evil empire in the process.)