About

I am a software developer in Seattle, building a new AI software company.

Ads

May 2008

Sun Mon Tue Wed Thu Fri Sat
        1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

Recent Posts

Ads


« Anonymous Recursion | Main | Parameterized Unit Tests »

March 16, 2007

Code and Data

Usually, when I mention functional programming to other developers, they will unconsciously close their arms and legs—a sure sign of resistance. These developers are naturally suspicious of another new, better “programming” paradigm, so I have learned to zip my mouth. They don’t see the possibility of code becoming more declarative and looking exactly like specification. My blog traffic has diminished after I have started talked about programming in a more functional way.

In the MVP summit, I talked to a developer, who was a known functional programming advocate, about how I merge the notions of code and data.

I mentioned my static analysis tool, NStatic, and how I use everywhere an “expressions” data structure, which is like an immutable abstract syntax tree, but also a functional language in which I can evaluate any expression to its normal form. Expressions are used to represent such things are traditional algebraic expressions, code, natural language, and all sorts of other documents.

Document Operations & Transforms

In the case of a wordprocessor that I am also developing, documents are represented as expressions. In addition, document operations are constructed as functions in my expression language which are then applied to the document expression to return an entirely new expression.

The advantage of this approach is that I can support application-wide functionality by applying a transform to all document operations in my application. Transforms are functions that convert any function into a new function by recursively walking through an expression and applying changes based on pattern-matching (very  much like a derivative or Fourier transform). Control-flow operators like lambdas and conditionals themselves have higher-order transformations that work with any function, and thus any transform.

Transforms can only easily be implemented over code written in a functional way. Code in other paradigms would need to be converted to a functional representation first, as I have done with NStatic, and then converted back (or retained) after the transformation. A common refrain among FP enthusiasts is that functional code is easier to prove, but what the real benefit is that functional code is easier for a computer to analyze and manipulate--ie, easier to transform.

Have you ever written an application in which the same transformation needed to be manually coded into a number of different classes? The transform is stored inside our head and then mentally applied to an each method.

Such transformations couldn’t be previously automated because functions were opaque and the language mechanisms like virtual functions are too coarse. With transforms, new application-wide functionality can be written just once, not interleaved within code in several other unrelated classes. I can also easily integrate thirty-party functionality or externalize features this way, because the transformations are not burned in at compile-time.

In Microsoft Secrets, Michael Cusumano wrote about Excel's "Am I Done Yet" list in page 319, a list of concerns that must be considered for every new application feature added:

Microsoft projects have also been compiling metric-based checklists to help determine feature and product completion. For example, Excel has a six-page list of criteria entitled "Am I Done Yet?" This groups completion criteria into twenty-six categories, such as menu commands, printing, interaction with the operating system, and application interoperability (see Table 5.4). A program manager, developer, or test uses the checklist to help evaluate whether a feature is complete.

I wonder how many of these items in the entire "Am I Done Yet" list would just disappear, if program features were written using transforms. Probably a lot. Transforms would make things like revision marking, selection tracking, simultaneously editing a lot easier to write. The application developer could just focus on writing the essential feature and be rest assured that transforms would take care of much of the rest. 

Code & Data And Lisp

The developer whom I spoke to in the summit then brought up that Lisp supports treating code as data, simply by quoting code and performing explicit evaluation using the “EVAL” operator. Having coded in the language in the late ‘80s, I was already familiar with Lisp.

The problem with LISP is that it is not as tight and clean as it could be. Unfortunately, the way it mixes code and data is seen as the correct approach, and thus people look no further at alternative cleaner approaches.  LISP still distinguishes between code and data: Functions are opaque (their definitions are inaccessible) just as in imperative programming languages. Also, code stored as data can only be executed once all free symbols are removed. For example, (+ x 1 2) would evaluate to undefined symbol error or, if the symbol x were quoted, a type-mismatch error rather than the correct approach (+ x 3).

However, if the standard LISP functions could already operate on symbols, then both quoting and calling the EVAL function, would become unnecessary. In addition, if execution proceeded lazily, LISP macros would also become unnecessary as every function would also be a macro.

TrackBack

TrackBack URL for this entry:
http://www.typepad.com/t/trackback/5190/16993282

Listed below are links to weblogs that reference Code and Data:

Comments

I am quite interested by your posts on practical functional programming, but I find examples that are actually comprehensible by mortals are quite few and far between. Could you recommend any source code that you've run across that's instructive?

Exposing an operation as a non-opaque function also makes it easier for the rest of the application to reason about the history of the document.

It also leads to compact form of undo when manipulation documents in a functional way. Our undo trail simply records all the functions applied to the document, rather than or in addition to the actual changes applied.

Wes, you may have seen this before, but Kanren (http://kanren.sourceforge.net/) is an implementation of prolog-style logic programming in Scheme. It brings some of what you want to a Lisp language.

Would you say your expression construct is equivalent to the Expression class used in Linq to represent lambda expressions?

Can you give some examples of the kinds of transforms you might use?

I'm totally bought into your passion for FP - but I find it hard to visualize. I understand the raw language constructs (mainly via my experience with Javascript, but also through the new features in C#2). And I understand how purely functional data structures work - they remind me of the copy-on-write memory in WinNT. But I don't understand the code side. I don't understand how a "model" can notify several "views" of changes in a purely functional way. I don't understand the bigger coding picture.

I would love to read through some of your code to try and understand.

And with the massive push into dynamic languages, which tend to also be very functional, I _need_ to understand.

Wes, like other commenters, I am excited by a lot of what you say in your blog, but am not able to fully understand it.

I remember being on a Lisp course at university in the early 80s and getting a real instinctive buzz from the idea of self-modifying code (before I'd been taught that it was evil), and your posts are reigniting that for me.

Also what you are saying reminds me of what I understand of the goals of AOP, which I also find very interesting although so far I haven't "got it" and feel like I'm looking at a round peg and a square hole.

It would be great if you could put together a reading list to help an FP neophyte get to grips with your ideas.

Anthony, I'll look at Kanren, so far my initial preview looks interesting.

Damien, Expression trees are very similar to my "expressions". However, the current interface makes it very hard to walk through an interface, because the arguments of each expression is not indexable. Also, LINQ expression trees are not extensible.

I'll mention some kinds of transforms in a later post.

Adding to what the others say, your posts on functional programming are the main reason I read the blog.

If you haven't already, take a look at REBOL (www.rebol.com), just for fun. There is no code, only data that may get evaluated at some point.

When you get deeper into it, the flexibility is fantastic; but it can throw people getting started, as the code/data duality in other langs does.

Fun stuff and, whatever language you use, really makes you think differently.

FWIW, I love reading your posts on functional programming. Very informative, however, I don't always understand everything.

Wes, I think you need to post some code. This last post was very intesting, but can only be illuminated with actual code.

I always get some interesting new links to look at: I'll check out REBOL and Kanren.

Ian, I'll talk more about history of FP in some near future post.

Damien, as per your suggestion, I'll include more code in later posts.

Wesner, thank you for your postings on FP.
Not all developers are suspicious of new programming paradigms. I have been programming in C, C++, and Java for 10+ years and started thinking about FP recently.

I am interested in doing FP with java (and perhaps c#) since "real" functional languages are not in mainstream now and I like to add FP to my everyday programming in easy and natural way.

Keep blogging on FP, please

You might want to take a gander at Lua... I have recently started using this language as a way to directly express/store/manipulate "objects" - complete with inherited methods - completely inside of a traditional database.

Post a comment

If you have a TypeKey or TypePad account, please Sign In