240 posts categorized ".NET"

December 28, 2013

C# for Systems Programming

By way of the article “A glimpse into a new general purpose programming language under development at Microsoft” from the Lambda the Ultimate weblog, I came across Joe Duffy’s new post on C# for Systems Programming.

Microsoft may be developing a native version of C# with additional extensions for systems programming that would compete with languages like C++, and D. Duffy’s work on immutability and isolation in C# is directly related to this native version of C# and mostly likely would not be part of CLR C#.  While I use the term “native C#,” “systems C#” might be more accurate as garbage collection will likely still have a diminished role in this new world.

C++ is constrained by compatibility, leading to obsolete, complex and inelegant syntax. C++ is also not type-safe. A new version of C# for systems programming is advantageous in a number of ways. It would leverage past efforts with CLR C# and allow Microsoft to evolve the language faster than any standards body and in a way that best meets its needs. Porting code between systems and CLR C# might be a trivial recompilation at least in the CLR-to-native direction. This effort appears to involve extensive collaboration with or may even be entirely based inside the research group, which should mean the introduction of many new and advanced ideas from academia.

Joe lists six major categories of features being introduced into systems C#:

  1. Lifetime understanding (with stack allocation and deterministic destruction)
  2. Side-effects understanding. (Immutability and isolation)
  3. Async programming at scale.
  4. Typesafe systems programming.
  5. Modern error model (with code contracts).
  6. Modern frameworks.

A prior post of Duffy’s hinted at the development of a new framework in which he was calling out for interested new hires. This new framework might be yet another framework built entirely to support systems C#. Duffy is rumored to be working on Midori, a new distributed, concurrent OS. A deleted job post for a Midori-based programming language called M#. Midori may be the testbed for systems C# and its supporting libraries.

December 10, 2013

Mads on the Future of C# (6.0)

Mads Torgersen, head program manager of C#, presented the “Future of C#” at the NDC London conference on Friday morning 12/6. Most of the “probable” new features announced moved C# to a more succinct, functional programming style.

  1. Import type members into namespace.
  2. Succinct syntax for primary constructors.
  3. Readonly properties
  4. Property expressions (property lambdas)
  5. Method expressions
  6. Parameter arrays for IEnumerable interfaces
  7. Succinct null checking
  8. Multiple return values
  9. Constructor type inference

Mads provided a code demonstration of each new feature with before and after illustrations of how the code would be implement today versus the future. The code example below is taken from the presentation.

Embedded image permalink

using System;
using System.Text;
using System.Threading.Task;
// import types into namespace
using System.Math;

// primary constructor
public class Point(int x, int y)
    // readonly property
    public int X { get; } = x;
    public int Y { get; } = y;

    // property expressions
    public double Dist => Sqrt( X * X, Y * Y );

    // method expressions
    public Point Move(int dx, int dy) => new Point(X + dx, Y + dy);
    // parameter arrays for IEnumerable
    public static Point Avg(params IEnumerable<Point> points)
        // monadic null checking
        if (points?.FirstOrDefault()?.X ?? -1 )

    public void GetCoordinates(out int x, out int y) { x = X; y = Y; }

class Program
    static void Main(string[] args)
        var p = new Point(1, 5);

        // multiple return values
        p.GetCoordinates(out var x, out var y);

// constructor type parameter inference - 
var t = new Tuple(1,2); // infers Tuple<T1, T2>

The multiple return values features appears very tentative as some in the audience would prefer better syntax support for tuples. Mads took additional questions from the audience. Structural typing is “on the radar.” Mads would like to “get rid of FXCop” being a post-build process and “build in code contracts” instead of requiring a IL-rewriting step.



December 09, 2013

Immutable & Isolated Types Likely in Future C#

Based on several pieces of evidence, immutable and isolated type designators are likely in a future version of C# and the common language runtime (CLR). These features will likely debut on .NET 5.0 in the next major iteration of Visual Studio, Visual Studio 2014.

  1. Immutable types in imperative language, US patent application filed in June 2008.
  2. Type system support for memory isolation permissions, US patent grant filed in Apr 2009.
  3. Unique and Reference Immutability for Parallelism, Microsoft Research October 2012
  4. Imperative + Functional == Smile.
  5. Joe Duffy on Uniqueness and Reference Immutability for Parallelism.
  6. Perspectives on Concurrency and Parallelism. Channel 9 Expert to Expert interview with Joe Duffy and Eric Meijer
  7. Mentions by Mad Torgersen and other C# team members about potential syntax support for immutability

The first patent application on immutability describes a new class modifier keyword “immutable” in C# that requires all fields in the class be “readonly” and themselves immutable.

public immutable class String {
     public readonly int Length;
     public readonly LinkedList<char> Chars;
class ImmutableList<T> where T is immutable {. . . }

The second patent includes additional information about type system support for three different isolation permission modifiers. The three permissions are immutable permission, read permission and write permission. Some pseudocode is included in the patent with tentative syntax that is likely to change.

class C {
 private T:w m_f;
 public C( ) { m_f = new T( ); }
 public isolated:r T:r Get ( ) { return m_f; }
 public isolated:w void Set(T:w obj) { m_f = obj; }

The key developer for this feature is Joe Duffy, who is referenced in all these links. Anders Hejlsberg is also a contributor to both patents, signifying likely inclusion into a future version of C#. The immutable types patent, in fact, lists only these two persons as inventors.

The theory behind immutability and isolation is expounded on in the Microsoft Research paper. The synopsis for the paper is:

A key challenge for concurrent programming is that side-effects (memory operations) in one thread can affect the behavior of another thread. In this paper, we present a type system to restrict the updates to memory to prevent these unintended side-effects. We provide a novel combination of immutable and unique (isolated) types that ensures safe parallelism (race freedom and deterministic execution). The type system includes support for polymorphism over type qualifiers, and can easily create cycles of immutable objects. Key to the system's flexibility is the ability to recover immutable or externally unique references after violating uniqueness without any explicit alias tracking. Our type system models a prototype extension to C# that is in active use by a Microsoft team. We describe their experiences building large systems with this extension. We prove the soundness of the type system by an embedding into a program logic.

The proposed extensions to C# uses more plausible syntax with four new keywords: writable, readable, isolated, and immutable.

immutable IntBox freeze(isolated IntBox b) {
  // implicitly convert b to readable
  // implicitly recover immutability;
  // the input context was all isolated
  return b;
int countElements(readable ElementList lst)

Joe Duffy elaborates more on the idea in a Channel 9 interview, a blog post, and another interview with a journalist.

While it may appear that these features are motivated by parallelism, the primary focus is not implicit parallelism according to the blog post “Imperative + Functional == :- )”. Joe summarizes the goals of the new keywords and concepts with four points, one of which includes marrying the best of functional and and imperative programming styles.

  1. Create a single language that incorporates the best of functional and imperative programming. Both offer distinct advantages, and we envisioned marrying the two.
  2. Codify and statically enforce common shared-memory state patterns, such as immutability and isolation, with minimal runtime overhead (i.e., virtually none).
  3. Provide a language model atop which can be built provably safe data, task, and actor-oriented concurrency abstractions. Both implicit and explicit. This includes, but is not limtied to, parallelism.
  4. Do all of this while still offering industry-leading code quality and performance, rivaling that of systems-level C programs. Yet still with the safety implied by the abovementioned goals.


November 27, 2013

“In” Parameters Proposal for C# 6.0

I propose a new feature that will make manipulation of value types more efficient, produce more readable code, and encourage greater use of the functional programming style. I disclaim any ownership to this idea.

Currently, there are two C# keywords that allow parameters to be passed by reference, “ref” and “out.” I will confine the discussion to “ref” parameters.

The “ref” keyword has several disadvantages:

  1. For large value type structures, copying by value is wasteful. It may be more more efficient to passed a structure by reference. One such example of a real world data structure is the Matrix3D data type in WPF.
  2. The variable passed by reference may be modified by the function.
  3. The ref keyword must precede all arguments passed to a ref parameter resulting in ugly syntax.
  4. Only L-values may be passed by reference. L-value are variables and other expressions  that may appear in left-hand side of an assignment. These include including locals, field accesses, pointer dereferences, and array element accesses. L-values do not include property accesses, which do not have an identifiable machine address, nor do they include read-only fields, which are restricted by the CLR.
  5. Ref parameters can not be the target of an extension method. Extension methods calls are expensive for value types, because the “this” parameter is copied by value. This also means that value types can’t have mutable extension methods. This is in direct contrast to instance methods, which pass the value type “this” parameter by reference.
  6. Ref parameters can not be used in operator overloading functions.

The implementation of an “in” parameter would be very similar to a “ref” parameter.

  1. The signature of a function with an “in” parameter would be identical to that of an “ref” parameter in MSIL, just as “out” parameters are today. An InAttribute would be applied to an “in” parameter in a manner similar to how the OutAttribute is currently applied to “out” parameters.
  2. The “in” keyword would not be required at the calling site as “ref” and “out” are currently required today.
  3. An L-value passed to an “in” parameter would be prevented by the compiler from being modified by the function called.
  4. R-values could be passed by reference to an “in” parameter. This is one of the benefits of using const references in C#. For instance, one could write Determinant(matrix1 * matrix2 + matrix3)) and the three temporary values would be passed by reference in addition to the three variables.

Other languages such as Ada have an “in” parameter, though I do not know if they behave similarly.

UPDATE: The keyword “in” is very must like “const” references in C++, but it is actually much easier to introduce constness behavior in .NET without the cascading mess of introducing const to other parts of the codebase and libraries.

An “in” variable is not modified if it is passed to another method through an “in” parameter or a regular copy-by-value parameter. It may be modified by an instance method or passage through “ref” and “out” parameter in another method. It’s also modified if any of fields are directly modified in method, possibly through inlining of short methods. After just-in-time compilation, short methods and property accesses are inlined. Simple property getter will become direct field accesses and it would be possible to ascertain whether modifications are made to the data structure.

If there are still potential modifications through instance method calls, the “in” parameter can be copied to a local variable by the compiler before being used. The compiler can prevent modifications to public fields or, alternatively, prevent modifications from propagating to the caller by using a local copy.

An “in” parameter isn’t strictly necessary as the compiler and runtime can already implement copy-by-value this way for large structures. However, the difference in behavior is visible in multithreaded code and through closures containing writable captured variables, if the value referenced is not stored on the stack.

July 14, 2013

Anders On C# 6.0 at BUILD 2013

Anders Hejlsberg and Charles Torre talked about a future version of C#, version 6.0, at Windows Build 2013.  http://channel9.msdn.com/Events/Build/2013/9-006 (34:30)

Visual Studio 2013 is the minor release version in an annual major/minor release cycle. The version of .NET included in VS  2013 is 4.5.1 which is an in-place upgrade of version 4.5. Major changes are expected for Visual Studio 2014 and .NET 5.0.

C# 6.0 will feature an all new C#-based compiler based on Roslyn, which will be released with VS 2014. Roslyn is currently available as a community technology preview, which opens up new compiler services for third parties. The C# team aims to make Roslyn as close to the performance of the native compiler as possible (within a factor of 2 during development and better on release), and, in some cases, Roslyn already outperforms the native compiler due to a more modern immutable AST architecture that supports parallel compilation. The immutable AST architecture is detailed in a recently filed patent.

Roslyn will enable C# to innovate faster due to a better designed codebase in a more productive language. This is good news, because languages like Scala, F#, D, and arguably C++ are currently innovating at a faster pace than C# and Java.

Anders stated that a number of frequently requested features are already being introduced into the new compiler. One of these features includes succinct syntax for class definitions, such as those found in Scala and F#. This feature may have already been introduced in TypeScript, with which C# has been exchanging design ideas. In TypeScript, properties can be declared directly from the constructor.

class Engine implements IEngine {
    constructor(public horsePower: number, public engineType: string) { }

TypeScript/Javascript will not be replacing C# and is not a substantial diversion from Anders’s “day job,” which remains C#. A transcribed excerpt of the video interview is included here:

Charles: We noticed that there were a lot of questions about C# and Roslyn. So now Anders has kindly agreed to hangout for a little while longer and we will get to some of your questions. Right off the bat, Max wants to know if there will be another version of C# in Visual Studio 2013.

Anders: We switched to a much faster cadence. So that release will be... And we are targeting for Roslyn the next one after. We are going to start previewing Roslyn as soon as we can, because we are actually now at a point where we can. The Roslyn compilers are done and we are compiling pretty much all the managed code we can find in-house and externally. Literally we code-scraped CodePlex and Github. We keep finding little issues.

We have a very high level of stability at this point and we are working on the language services and its all finally getting there.

Charles: So, Al nice to see you again. As Anders just said, we are not going to ship in 2013, but you will probably see a preview in some point in time before 2015.

Anders: The sooner the better.

Charles: Shaggy wants to know of any new updates of Roslyn. Is there a new CTP soon?

Anders: We talked about CTPS and I think ... The way we are thinking about it right now is that we need to get VS 2013 out. We have done prepwork in VS2013 such that we can just drop the Roslyn compiler VSIX extensions, which you can enable/disable to test them. All of that infrastructure is done... so once we 2013 out, we will be able to start previewing Roslyn.

Charles: Just to be clear, the goal is that Roslyn will be the compiler for C# at some point.

Anders: Yes.. and VB and the new language services in the IDE. It's all written on top of Roslyn. So there's a huge amount of work there. It's taken longer than we thought and I will readily admit that. The thing that you can't ignore is that there are millions and millions and millions of ... and I venture to say ... probably even billions of lines of code of C# out there and we got to compile all of them exactly. We can't just go and say, oh that doesn't work anymore. This is about obtaining the elusive 100% and backwards compatibility. There are things in the old compiler that are clearly wrong. We do the right thing for this but then in this particular case, we do the wrong thing. Now should we keep the bug, because if we fixed the bug then that breaks. Is it a new warning? We are spending a lot of time getting that just right..

Charles: Now one of the questions ... The  old compiler is written in native code C/and C++. So are you noticing an increase in compiler time. So what are the performance characteristics of Roslyn..

Anders: Right from the get go of the Roslyn project, we set some tough goals. At no point during development do we want to be more the 2X of the old compiler. That was just during development and that we wanted to at least reach the same speeds as the native compiler in the final release. The way we get there is different.

One of things about the Roslyn compiler is that it's a much more modern codebase written using more modern functional algorithms and that means a lot of the data structures in the compiler are immutable like the entire ASTs. You can work on the entire API for that and it's like one big immutable data structure.

What immutability gives you is sharing for free. And that means this compiler unlike the older compiler works beautifully on multicore systems. If you have a multicore machine, we can farm out a lot of compilation work on separate cores... Yes, if you have 4 separate compilations, we can completely spin out 4 separate processes. That's a given. But even for a single compilation, we can internally parallelize IL generation for example... Each method is individually independent of each other, so we can get IL.. the emit phase can go 3-4 times faster because of the ability to take advantage of the all the latent parallelism. So in the aggregate if you look at how the new compiler is performing against the old compiler... it's actually in many cases beating the old compiler, particularly for large projects. It scales a lot better and scalability ... You wouldn't believe the size of some of these projects that people build in C#. They're enormous.

Typing responsiveness is vastly better with Roslyn. There are a lot of advantages we get out of this.

Charles: You still work on C#?

Anders: C# is my day job. I'm been pitching in on TypeScript... There was some downtown in the Roslyn project since it took longer than we thought...

Charles: Because you spend a lot of your time on TypeScript, have you learned anything from that work that you could apply back to C#?

Anders: Yeah... that's a good question. I am coming away from it with a new appreciation for the value of the dynamism or dynamic typing being completely effortless. We have support for dynamic typing in C# but sometimes it's not as smooth as how I would like it to be. For example, making it easy to dynamically dot into a JSON document... That's actually not as easy as it should be today.

Those are things that we are actually looking at right now. From a language design, it goes both ways. We actually ... We have been thinking about ways of making C# classes more succinct by looking at some things we can do with F# where you can put parameters in classes and capture them in methods. And then you can get into a very succinct style of writing code and we used some of that early on in a model of classes that we implemented in typescript.. Then the Ecmascript committee went in a different direction and we align with that instead, but that was interesting to apply work that hadn't been seen yet in C# in Typescript. We may revisit that once we get done with Roslyn.. We actually have now in earnest... design meetings for C# 6.0. So there's a bunch of new features there and we looked at all the new features that people have logged over time in C#. It's time to finally fix this issue.. We have a hitlist of features that we should now just do in the Roslyn codebase, because now that we have a codebase that is so much easier to do in. There was a big cost in the old codebase and that cost has gone down dramatically, so we should be able to move a lot faster.

Charles: The world has changed... Developers are writing mobile applications.
There are many ways it impacts.... For example, dynamic data is getting more and more important. The ability to talk to web services and so forth in a completely frictionless manner. In mobile devices and small form factors, battery consumption is important so native code is becoming more important. It costs battery performance to jit an app. So delivering native code to deices is getting more important.

Anders: Third parties out there like Xamarin like ...

Charles: Is the effort on TypeScript slowing down C#?

Anders: Actually, strangely enough, no because getting to completion to the Roslyn compiler has taken longer...

January 25, 2012

Microsoft AI Initiatives

Several computer science classes focus on algorithms. These include classes in data structures, artificial intelligence, computer graphics and numerical computing.

Some of these data structures are quite involved and I have felt that they should be incorporated inside system libraries. Many of the classical data structures have in the 1990s become a staple of standard libraries such as the Standard Template Library of C++ and with the frameworks included with the Java and .NET runtimes. However, libraries for numerical computing (manipulating matrices and performing statistics), handling artificial intelligence, or doing computationally geometry have still not found themselves as full-class citizens in modern APIs, although 3D graphics do have some presence.

There have been some recent activity in developing consumable AI libraries in the past few years at Microsoft.

With SQL Server 2005, Microsoft incorporated various AI and data mining packages: decision trees, association rules, naïve Bayes, sequence clustering, time series, neural nets, and text mining. A few years ago, Microsoft developed the Windows Solver Foundation libraries that include optimization, solvers, and latent term-rewriting functionality. A Technical Computing Initiative was launched, but some of the players involved have left the company and the output from the initiative remains to be seen. It's also not clear the goals of this initiative.

Microsoft had for a long while made available a Speech API, but its recognition capabilities are somewhat weak and frustrating. There is still no general purpose Natural Language API; this is somewhat complicated by the need to support multiple languages.

Recent developer events have introduced new libraries from research:

  • Infer.NET supports probabilistic inference. The application of this library though is quite limited.
  • A more promising library called Semantic Engine includes a range of technologies from Machine Learning, Computer Vision, Natural Language and others.


There are some downsides to most of these new libraries. They are based on managed code and currently have restrictions that prohibit non-internal commercial use.

November 11, 2007

The Awkwardness of Functional Programming

Both Reddit’s main page and programming subreddit includes a popular post “Admitting that functional programming can be awkward.” Each of these subreddits have elicited numerous interesting responses.

In it, James Hague recounts how a semi-successful Mac game he wrote called Bumbler is trivial to write in C, but that a purely functional approach appeared to be barely doable and may well be the wrong paradigm for these types of problems:

What's interesting is that it would be trivial to write this in C. Some incrementing, some conditions, direct calls to sound playing routines and insect spawning functions, reading and writing from a pool of global counters and state variables. For a purely functional approach, I'm sure the data flow could be puzzled out...assuming that everything was all perfectly planned and all the behaviors were defined ahead of time. It's much more difficult to take a pure movement function and say "okay, what I'd like is for this object to gravitationally influence other objects once it has bounced off of the screen edges three times." Doable, yes. As directly implementable as the C equivalent? No way.

That's one option: to admit that functional programming is the wrong paradigm for some types of problems. Fair enough. I'd put money on that. But it also may be that almost no one has been thinking about problems like this, that functional programming attracts purists and enamored students. In the game example above, some of the issues are solvable, they just need different approaches. Other issues I don't know how to solve, or at least I don't have solutions that are as straightforward as writing sequential C. And there you go...I'm admitting that functional programming is awkward in some cases. It's also extremely useful in others.

It’s not always obvious, perhaps because people often continue to think imperatively when finding an functional approach. Rather than thinking that functional programming is not very compatible with game development, there might be a radically different way of thinking about a problem. I have encountered one computer engineering thesis called “Functional Programming and 3D Games” which specifically deals with the topic.

I have been in a similar situation a few times, but I have always been able to come up with an elegant, often unexpected, solution after thinking about the problem from a high level.

In a functional pearl paper on the Zipper data structure, Gerard Huet writes about the apparent unsuitability of functional approach to many data structures, but then he proposed a new functional data structure called the Zipper to rectify these problems:

The main drawback to the purely applicative paradigm of programming is that many efficient algorithms use destructive operations in data structures such as bit vectors or character arrays or other mutable hierarchical classification structures, which are not immediately modeled as purely applicative data structures. A well known solution to this problem is called functional arrays (Paulson, 1991). For trees, this amounts to modifying an occurrence in a tree non-destructively by copying its path from the root of the tree. This is considered tolerable when the data structure is just an object local to some algorithm, the cost being logarithmic compared to the naive solution which copies all the tree. But when the data structure represents some global context, such as the buffer of a text editor, or the database of axioms and lemmas in a proof system, this technique is prohibitive. In this note, we explain a simple solution where tree editing is completely local, the handle on the data not being the original root of the tree, but rather the current position in the tree.

One of his examples of a tricky data structure—the database of axioms and lemmas in a proof system—hit close to home. I was trying to move my proof system to a more functional approach, which would offer me numerous benefits including immutability, automatic support for backtracking and shared copies as well as easy inclusion inside any other data structure. I did come up with a solution by utilizing a binary tree implementation with filters at each node, aggregated from each of the child nodes. These filters allow me to perform an arbitrary search in an efficient divide-and-conquer manner, whereby a subtree in which no child contains a desired property is left unexplored. It was faster and more light-weight than my previous imperative attempts, and also simpler, because I no longer needed to maintain and synchronize a separate data structure for indexing.

October 08, 2007

Immutable Data Structures in C#

In the post on Path Finding using A*, Eric Lippert, programmer in the Visual C# team, writes that immutable data structures are the way of the future in C#. As immutable data structures have been a frequent posting topic of mine, I am really happy about this direction towards immutable data and functional programming that C# is taking. My only quibble is that the last two words, "in C#," are not really necessary.

The frequent complaint of functional programming is that it is slower than procedural programming. However, I discovered something else: As you know, the immutable data structures for maps may be slower for some operations than their imperative cousins, but the running times for all the operations are typically acceptable and well-balanced. (Even then, I did bring up in a recent post a version of persistent hash-tables and array used by COQ that offers nearly identical performance). Plus, these data structures are very flexible--able to be used freely anywhere with no constraints.

Most of my data structures in NStatic and elsewhere are immutable. In my data structures, the equality and the copy operation are constant-time, even for large object trees. The immutability and balanced performance allows many otherwise exponential algorithms to be done efficiently. A good example is the use of dynamic programming in which function values are "memoized" (sic); this technique is absolutely essential for natural language parsing.

Wes Dyer, another developer in the C# team, wrote early this year that "imperative programming is sometimes reminiscent of a Rube Goldberg machine. Both require meticulous thought to ensure that a process works correctly despite a myriad of state transitions and interdependencies. It is amazing these complicated programs work at all." Hmm... could he be referring to any popular application software that we know of?

That Microsoft and the rest of the world programs imperatively and suffers the consequence of their decisions is the main reason that I can even consider developing the software that may indirectly compete with their own.

October 05, 2007

Type Names

I picked up a lot of interesting knowledge about C# in the course of developing and testing a parser for the language. I'll try to post a few tidbits as they come to mind again.

One nice tidbit is how to create new short type names without having to specify the full name of every non-primitive type, especially inside of generic type names.

Normally, using "using" to create a new type name, even in the presence of the appropriate namespace import, one has to fully qualify every type name that is not a keyword.

using System.Collections.Generic;
using T = System.Collections.Generic.List<string>

However, if the "using" is inside a namespace, then it will utilize any imports located in an outer namespace.

using System.Collections.Generic;

namespace MyCompany
    using T = List<string>;

August 22, 2007

Source Code

Scott Hanselman's post on Weekly Source Code 2 reminded me of my desire to post some of my source code. I'll probably do just that, but, with me and my short-term memory loss, I often forget to follow up until after a few years go by.

A couple of my classes come to mind:

  • Rational number class. This is an implementation of a general-purpose 16-byte data type that can precisely represent rational numbers (up to a certain precision), long (both signed and unsigned), and doubles. The code is written in IL to take advantage of extended-precision floating point. More info...
  • Fast, persistent hash-tables and array. This is one of my two sets of persistent collection classes. Similar to persistent arrays used by the COQ theorem prover for fast, functional union-find data structures, these data types are as fast as their imperative cousins and are ideal for backtracking. Unfortunately, my implementation is not thread-safe. My second set of persistent data structures is fully "functional" and thread-safe, but is more connected to my code-base.

If I haven't posted any source code by the end of September, give me a ring.

I have been meaning to post some regular expression matching implementations from my post in December: one using closures and another less efficient implementation using iterators. Both are functional, but imperative implementations can be compact too. I won't get around to doing it until after I ship NStatic, though.

Several years ago, I wrote a fast pattern-matcher and a slower unification engine (bidirectional pattern-matching) for expressions. Both algorithms provided a superset of regular-expression capabilities including wildcards and disjunctions.  (For unification, one regular expression pattern might be matched against another pattern.) They also handled other phenomena such as associativity and commutativity. By going completely functional, I was able to merge both algorithms into one, fast algorithm and dramatically reduce the total number of lines down to about 500.

I had written a regular expression engine for a college class based on NDFAs, the code for which was substantial. I also know that Microsoft's Rotor implementation includes a large number of files, so I was surprised by the compactness and directness of my functional pattern matcher implementation.

Since then, I have been (re-)writing most of my code in the functional style with the same consistent results. Programming in a functional manner entails heavy use of closures (anonymous methods), which is only really practical in C# 2.0+, and greatly eliminates the need for temporary or accidental classes.