The next iteration of C# is poised to become multi-paradigmatic, addressing numerous issues in programming. Most discussion so far has focused on SQL and XML data integration and on concurrency, but new features described in a recent journal submission suggest that an assault on dynamic languages is in preparation.
Erik Meijer and Peter Drayton recently submitted “Static Typing Where Possible, Dynamic Typing When Needed: The End of the Cold War Between Programming Languages” for a journal on “Revival of Dynamic Languages.” Both authors worked at Microsoft on research projects experimenting with new language extensions on top of C#. Their article explains how dynamic features can be retrofitted cleanly into a statically typed language, using examples based on C# extensions. Most of these have been published previously in articles on COmega and Spec#. The effect is pseudo-dynamic typing: dynamic-style programming on top of a statically typed language. The introduction of pseudo-dynamic typing in C# raises the possibility that it may become pervasive as a scripting language, stealing much thunder away from Python, Perl and Ruby.
Is it a coincidence that the timing of this publication is so close to the upcoming PDC 2005 announcement of C# 3.0? I think not, yet at the same time I am not sure all of the features mentioned will make it in. C# 2.0, which introduced four major features, is essentially a light release in preparation for the mega-release C# 3.0, which was co-developed at the same time. (Whidbey was originally version 1.2.) Chris Brumme, CLR architect, previously mentioned that the Whidbey release was about product maturity, focusing on fundamentals such as performance, reliability, and security, but that the following Orcas release would embark on a lot of crazy new ideas, such as those from functional and dynamic programming languages. As an indication of this, Jim Hugunin, author of IronPython, a .NET dynamic language, was recently hired into the CLR team. I’m just glad that Orcas will be such a feature-filled release, because the next version, Hawaii, will probably not arrive until 2009 or 2010.
I have outlined below some of the major solutions that the paper mentions. Some of them have already been incorporated into C# or introduced in one of the research languages, Spec# and COmega; in those cases, I have noted the source in square brackets.
- Type Inference. Type inferencing is already offered in a number of languages like Haskell. Currently, type information must be specified explicitly in the declaration. With type inferencing, a variable is assigned the most general type that will successfully compile within a block. It requires an advanced AI technique called unification, already used to some degree in generic type inference in C# 2.0. Erik noted that static typing with type inferencing could result in less verbose code than dynamic typing: within object literals, for example, constructors can be invoked without specifying the type to construct.
- Contracts. [Spec#] Contracts extend the type system to support invariants, preconditions, and postconditions. They allow stronger compile-time and run-time checks, and they open up the possibility of long-running theorem-proving tools that examine the IL after compilation and establish program correctness.
- Coercive subtyping
- type lifting across collections, null types, discriminated unions and tuples. [COmega] This is essentially generalized member access, which I mentioned in an earlier post. I would like to see a systematic, orthogonal approach that enables lifting over any type, not just the specific ones mentioned above.
- late-binding. [VB] C# may support late-binding for object variables in the same fashion VB does today.
- (patterns). This isn’t in the paper, but a C# developer mentioned the possibility that generic types may support patterns, in addition to interfaces, to allow calling member functions by “name” like C++ templates do. I suspect that, if this is implemented, patterns will be implemented as interfaces that are dynamically painted onto an object at runtime rather than explicitly implemented at compile time.
- Dynamic scoping. A statically typed compiler emulates dynamic scoping by passing hidden arguments into any function that uses a dynamically scoped variable. Any function that calls such a function will in turn require a hidden implicit argument of its own.
- Covariance and Contravariance in Generics. [IL 2.0] Covariant and contravariant generic type parameters are already supported in IL in Whidbey, but have not yet been integrated into any of the mainstream languages.
- Ad hoc relationships and prototype inheritance
- Expando properties. C# could offer special syntax support for untyped objects, essentially variables of type Dictionary<string, object>, mimicking the behavior of Perl and Python. This support would also require reflection through the IExpando interface.
- External link tables. Link tables could help C# bridge the mismatch between relational keys and object references.
- Lazy evaluation. Erik mentions streams of objects, which I think is partially addressed by iterators in C# 2.0, with more advanced support coming in Orcas.
- Eval. The article also looked at the eval capabilities of dynamic languages. Some of the more common uses of eval can be solved through closures or standard methods for deserializing data. The more advanced uses (partial evaluation, multi-stage programming and meta-programming) would rely on support for code literals (programs within programs). No specific details were provided on this point, so it’s the least likely to be addressed within Orcas.
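To make the dynamic scoping item above more concrete, here is a minimal sketch in C# of what the hidden-argument translation might look like once written out by hand. It is only an illustration, not an example from the paper: the names DynamicScopeSketch, Line, Paragraph, Section and indent are all hypothetical, and in a real implementation the compiler would add the extra parameter itself rather than making the programmer type it.

```csharp
using System;

public static class DynamicScopeSketch
{
    // Line actually reads the "dynamically scoped" variable indent,
    // so it receives it as an extra (normally hidden) parameter.
    public static string Line(string text, int indent)
    {
        return new string(' ', indent) + text;
    }

    // Paragraph never looks at indent itself, but it calls Line,
    // so it needs the hidden parameter too, just to pass it along.
    public static string Paragraph(string first, string second, int indent)
    {
        return Line(first, indent) + "\n" + Line(second, indent);
    }

    // Section rebinds the variable for the duration of one call:
    // everything Paragraph calls sees indent + 2, and the caller's
    // own value is untouched when the call returns.
    public static string Section(string title, string first, string second, int indent)
    {
        return Line(title, indent) + "\n" + Paragraph(first, second, indent + 2);
    }

    public static void Main()
    {
        Console.WriteLine(Section("Title", "a", "b", 0));
    }
}
```

The point of the sketch is the cost: every method on the call path between the binding (Section) and the use (Line) has to carry the extra parameter, even a method like Paragraph that has no interest in it.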
Will these features carry over to Visual Basic? Perhaps, but recent statements by various designers on the C# and Visual Basic teams suggest a growing divergence in language approaches, reflecting differences in the underlying philosophies of each. While both VB and C# are adding data features to the language, it appears that each will take a separate approach mirroring their respective Mort and Elvis personas. These emerging differences will force developers in the future to reconsider the substitutability of C# and VB and take sides depending on the priorities of their application development: rapid development or code quality. (Microsoft maintains that VB, C# and C++ emphasize RAD, language innovation, and power, respectively.)
Visual Basic is all about rapid application development. It is more focused on accessibility and productivity and is probably more likely to trade off runtime performance for the immediacy of quick background compilation and interpreter-like interactivity. This is because VB applications are more likely to be ad hoc, internal business applications that need to be churned out quickly and inexpensively. These applications are more likely to rely on off-the-shelf components to speed development. Since they are used in a fixed manner in a fixed environment, these applications have limited testing needs.
Development in C/C++ is usually too costly for routine IT work; those languages are more cost-effective for applications developed for sale. As a mostly psychological descendant of the C family of languages, C# is more focused on software engineering, and hence emphasizes explicit code and error detection at compile time at the expense of programmer interactivity; it also competes with C++/CLI in offering easier access to the platform through unmanaged pointers, which are important for managing memory-mapped files, processing images and manipulating low-level system data structures. C# is used more often in commercial applications and libraries with large external customers. The higher cost of testing makes it more worthwhile for C# to trade off compilation time for time-consuming code analysis and inferencing.
These aren’t hard guidelines, but they reflect the different emphasis each language takes. The new code-reduction features are a good example. Those in Visual Basic, such as the My classes, Handles and WithEvents, seem intended to promote accessibility. In contrast, those in C# involve more advanced concepts (closures and iterators) and serve to spare the programmer from entering tedious, error-prone lines.
I have seen some COmega stuff before but it looks like I should really read this paper.
On differentiation of VB and C#, I hope Microsoft announces it explicitly. If C# is for innovation where VB.NET is for real RAD, this could help people choose their languages more easily. (Of course, you would still be able to write quite similar code in both.) I believe that the typical business applications developer shouldn't be bored with closures, contravariants and all similar magic terms. If VB.NET language can hide all these but still can manage to provide the same power C# can (e.g. via COmega data features), then it would be perfect. No VB.NET developer would have to learn about closures.
Posted by: teddy | July 12, 2005 at 09:37 AM
I wouldn't draw too much 'inference' from the topics covered by Erik's and Peter's paper.
Posted by: Matt Warren | July 18, 2005 at 08:27 PM
Interesting post Wesner.
I may be misunderstanding your meaning, and if I am please tell me, but Type Inference doesn't require any advanced AI technique since every C# expression has a type so something like
var x = expression;
is unambiguous.
Injecting (Design?) Patterns at compile time is somewhat meaningless for any multi class pattern, you might be able to inject Singleton but that would be about it.
Also, how would you implement Eval using closures and object deserializing?
-- Jan
Posted by: Jan Bannister | August 11, 2005 at 01:40 AM
I think that the type inferencing is not just limited to the var keyword; it is more along the lines of Haskell.
Even with just the var keyword, the compiler could still attempt to resolve a variable to a type that would be compatible with all uses of it.
Posted by: Wesner Moise | August 11, 2005 at 02:08 AM
Closures solve most of the uses of eval in which you need to pass a code block to another function along with local variables.
Eval is also used for parsing values such as numbers and strings, but it's sufficient to just use Parse methods for that purpose.
The other option is to have code literals in which an abstract syntax tree is created by the compiler and then later compiled and executed.
If you look at Perl, it has the notion of eval blocks, which are syntactically checked. In Perl, you never really need to have a string, except if you are reading code from external programs.
Posted by: Wesner Moise | August 11, 2005 at 02:16 AM