I mentioned earlier in my post “Get Your Butt Outta Bed and Build Something” that I would be shipping subproducts before my main product.
My main product (which shall remain secret) is a general-purpose desktop application and may attract an audience that requires more hand-holding and is less tolerant of the inevitable bugs in a first-version release. Quality fears due to the size of the product are currently driving my decision to ship different portions of my codebase earlier, as smaller subproducts.
This is my current plan:
In my first subproduct, a static analysis tool, I will focus on testing my AI backend. Subsequent subproducts will focus on my other technology, such as natural language processing (e.g., a kickass grammar checker) and document editing (e.g., graphical code editors). These subproducts will each have short release cycles; I will release a free lite version and a commercial pro version. In contrast to my main product, the subproducts will be targeted at developers, experienced computer users and software businesses, who require less hand-holding and of whom I am also representative. After I have tested and released the various components of my technology, I will incorporate the feedback into my main product.
As I mentioned earlier, I am working on a static analysis tool for .NET languages comparable to Spec#, Microsoft’s PREfix/PREfast, Java’s FindBugs, and PC-lint. I tentatively call it NStatic. I’m looking at a beta release just before Christmas (one month from now).
I could also add possible support for dynamic languages like Ruby and Boo, depending on how inexpensive such extensions are to add and how valuable such support would be to my own build process, which does incorporate dynamic languages. Building a parser for a new language is trivial for me; it’s the additional language services such as a complicated generic type system and method disambiguation that are time killers. On the other hand, adding more language support is a distraction from my main product.
Here’s how static analysis relates to my main product. Programming languages are much simpler to parse, analyze and work with than natural language, and they provide a natural intermediate step toward the real thing. Programs in different languages are parsed into a common universal expression language, which also makes it trivial to convert code from one language to another, a la CodeDom. I also capture syntactical elements such as comments and preprocessing instructions. This is the same representation that will be used for my natural language expressions.
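As a rough illustration of the "common expression language" idea, here is a toy sketch in Python. It is not the representation used in my tool; every name in it (`Num`, `Var`, `BinOp`, `Assign`, `to_csharp`, `to_vb`) is hypothetical and chosen only for this example. It shows one tree, with a comment attached, being rendered into two surface syntaxes.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Var, BinOp]


@dataclass(frozen=True)
class Assign:
    target: str
    value: Expr
    comment: str = ""  # comments are kept in the tree, not thrown away


def render_expr(e: Expr) -> str:
    """Render an expression; the infix form happens to be shared by both targets."""
    if isinstance(e, Num):
        return str(e.value)
    if isinstance(e, Var):
        return e.name
    if isinstance(e, BinOp):
        return f"({render_expr(e.left)} {e.op} {render_expr(e.right)})"
    raise TypeError(f"unknown node: {e!r}")


def to_csharp(stmt: Assign) -> str:
    """Emit a C#-style statement with a // comment."""
    line = f"{stmt.target} = {render_expr(stmt.value)};"
    return f"{line} // {stmt.comment}" if stmt.comment else line


def to_vb(stmt: Assign) -> str:
    """Emit a VB-style statement with a ' comment."""
    line = f"{stmt.target} = {render_expr(stmt.value)}"
    return f"{line} ' {stmt.comment}" if stmt.comment else line


stmt = Assign("total", BinOp("+", Var("price"), Num(2)), "add shipping")
print(to_csharp(stmt))
print(to_vb(stmt))
```

Once everything lives in one tree, each additional target language is only another small renderer, which is why conversion between languages becomes cheap.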
The tool tests my AI backend, and the time investment in building this tool is low on top of my existing codebase. I also have prior experience from writing another much simpler static analysis tool for C++/C, CStatic, several years ago, based on symbolic pattern matching. I submitted that command line tool to the Larkware Contest after a simple recompile to Visual Studio 2005, to see if I could win a prize by default—unfortunately, no such luck; I think the judges had a bias for graphical interfaces. (Larkware is a good example of a tastefully done commercial blog.)
My upcoming NStatic tool is different from most other static analysis tools in several ways:
- Static execution through multiple paths in a flowgraph, locating infinite loops, dead code, condition violations, and exception-only code paths based on the dynamic types and values of variables.
  - This is a time-consuming step, since the number of code paths grows exponentially with the size of a function.
  - There’s a blurring of compile time and run time because, in some cases, code may actually be run, and many Framework functions are recognized intrinsically.
  - Spec# appears to transform procedural code to a functional representation, and this would be a much better long-term approach. However, since static analysis is not my main focus, it’s not a strategy I will invest in.
- Symbolic computation. Symbols and functions, not just numeric values, can be manipulated algebraically.
- Interprocedural analysis. Theorems are extrapolated from function bodies, so errors straddling function boundaries are caught. API calls in the .NET Framework are pre-analyzed, and parameters are validated statically.
- High-level, declarative rule language for specifying both syntactical and semantic constraints and providing some support for specifications as in Spec#. The language is designed to closely match human intent.
These are typically hard to do but, with the right backend, can be straightforward to implement. Some features could be dropped before beta.
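To make the first item in that list concrete, here is a toy Python sketch of static execution over a tiny made-up statement language. It is an assumption-laden illustration, not NStatic's engine: the node types `Assign` and `IfPositive`, the labels, and the function names are all hypothetical. A value of `None` stands for "unknown at analysis time". Known values let the walker prune a branch (exposing dead code), while unknown values force it to fork, which is where the exponential growth in paths comes from.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Assign:
    label: str
    var: str
    value: Optional[int]  # None means the value is unknown statically


@dataclass
class IfPositive:
    label: str
    var: str
    then: list = field(default_factory=list)
    orelse: list = field(default_factory=list)


def explore_paths(stmts, env, reached):
    """Return one final environment per feasible path through stmts."""
    envs = [env]
    for stmt in stmts:
        next_envs = []
        for e in envs:
            reached.add(stmt.label)
            if isinstance(stmt, Assign):
                next_envs.append({**e, stmt.var: stmt.value})
            else:
                value = e.get(stmt.var)
                if value is None:        # unknown: both branches are feasible
                    branches = [stmt.then, stmt.orelse]
                elif value > 0:          # known: only one branch can run
                    branches = [stmt.then]
                else:
                    branches = [stmt.orelse]
                for branch in branches:
                    next_envs.extend(explore_paths(branch, e, reached))
        envs = next_envs
    return envs


def all_labels(stmts):
    """Collect the label of every statement, including nested ones."""
    labels = set()
    for stmt in stmts:
        labels.add(stmt.label)
        if isinstance(stmt, IfPositive):
            labels |= all_labels(stmt.then) | all_labels(stmt.orelse)
    return labels


program = [
    Assign("a1", "x", 5),
    IfPositive("if1", "x", [Assign("a2", "y", 1)], [Assign("a3", "y", -1)]),
    Assign("a4", "z", None),  # e.g. user input
    IfPositive("if2", "z", [Assign("a5", "w", 1)], [Assign("a6", "w", 0)]),
]

reached = set()
paths = explore_paths(program, {}, reached)
print("paths explored:", len(paths))
print("dead code:", sorted(all_labels(program) - reached))
```

In this sketch the `else` branch of `if1` is never reached because `x` is known to be 5, so `a3` is reported as dead. Each `if` on an unknown value doubles the number of paths, so ten such tests in a row already yield 1,024 of them.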
There aren’t any good static analysis tools for .NET, and what I am offering should be smarter than many commercial implementations. I haven’t checked out Team System, but the free version of FxCop appears to be mostly a style checker. Given that static analysis tools are already commercially successful, I should be able to command a good price.
There are many options for delivering a free lite and a paid pro version of the tool. For instance, my tool can produce good results in seconds, but, if it is allowed to run continuously for days or weeks, it would explore more of the abstract state space of the program and uncover more difficult-to-find bugs.
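The "run longer, find more" idea can be sketched as a budgeted search. The Python below is only a toy under stated assumptions: `bounded_search`, the integer state space, and the "bug" predicate are all hypothetical stand-ins, and the budget counts expanded states rather than wall-clock time so that the result is repeatable.

```python
from collections import deque


def bounded_search(start, successors, is_bug, budget):
    """Breadth-first search that gives up after expanding `budget` states.

    Returns (bug_state_or_None, number_of_states_expanded).
    """
    seen = {start}
    queue = deque([start])
    expanded = 0
    while queue and expanded < budget:
        state = queue.popleft()
        expanded += 1
        if is_bug(state):
            return state, expanded
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None, expanded


# Toy state space: from n you can reach n + 1 and n * 2; state 50 is "the bug".
def successors(n):
    return (n + 1, n * 2)


def is_bug(n):
    return n == 50


lite_result, lite_cost = bounded_search(1, successors, is_bug, budget=10)
pro_result, pro_cost = bounded_search(1, successors, is_bug, budget=10_000)
print("lite:", lite_result, "after", lite_cost, "states")
print("pro: ", pro_result, "after", pro_cost, "states")
```

With the small budget the search stops before it gets anywhere near the bug; with the large one it finds it. A lite/pro split could be as simple as a different cap on that budget.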
You state that there aren't any good static analysis tools for .NET, yet you obviously have not checked FxCop if you think that it's a 'style' checker; FxCop doesn't even check source.
It is also the static code analysis that is available in Team System.
Posted by: David M. Kean | November 20, 2005 at 01:02 PM
You can do style checking in .NET without reading the source code since IL contains full metadata.
The emphasis of FxCop is on enforcing Framework design guidelines -- that's what it was designed for. Hence, it mainly checks stuff like casing rules, identifier suffixes, presence and naming of recommended methods, etc. That's definitely what I'd call style checking.
FxCop also does some more advanced stuff like detecting unused variables and methods, but nothing like the really in-depth code analysis that Wesner is outlining for his tool.
Posted by: Chris Nahr | November 20, 2005 at 11:24 PM
If that's what Wesner meant by style, then that is but a small portion of FxCop; there is security, performance and correct-usage analysis as well. On top of that, it is also extensible.
Posted by: David M. Kean | November 21, 2005 at 12:44 AM
FxCop rules tend to be based on metadata rather than code (IL).
Because of this, I referred to the current version as a style checker. There are a few instances of IL-based rules.
FxCop has the capacity to do more, but is currently underutilized; that might change in the next iteration of FxCop rules. As I have only used the free version, it's quite possible that the Team System edition already has more IL-based rules.
There is also some information lost in the translation from source languages to IL: high-level language constructs, variable names, XML summaries and comments.
Posted by: Wesner Moise | November 21, 2005 at 03:44 AM
About 6 months ago, I had an idea to use FxCop with DxCore (DevExpress, CodeRush) to make a plugin that did static analysis in the background while you were working in Visual Studio (I wanted to call it Fx-DriveBy). Unfortunately, FxCop is difficult to use like that because of a bunch of CAS nonsense, plus the introspection engine (?...or some other component, can't remember now) is not licensed for use outside of FxCop, so I gave up on the idea.
If you make a static analysis tool, may I suggest making it possible to use in other tools/applications.
Posted by: Chris Bilson | November 21, 2005 at 04:27 AM
I believe that the Phoenix compiler framework that Microsoft is developing will address extensibility issues, so third-parties can build all sorts of cool tools on top.
Since I am just one person, I need to do the minimal amount, and can’t build an extensible framework that has to anticipate and implement every developer’s needs.
There is Cecil, which is a free introspection engine. Also, I believe that there may be some reflection enhancements in Whidbey that eliminate some of the advantages of the introspection engine.
Posted by: Wesner Moise | November 21, 2005 at 07:39 AM
How extensible is this static analysis tool going to be?
I've had plans of creating one myself and even took steps to accomplish this, but never got around to finishing it.
Posted by: Omer van Kloeten | November 21, 2005 at 08:40 AM
Feature request: Exceptions!
I'm looking for a static analysis tool that can tell me all possible exceptions for an object's method call, constructor, etc. I would also like details on the code-path that generates the exception and not have specific ties to a specific framework version (so I could run an analysis of App X against .NET 1.1). From this post, it seems this should not be difficult for this tool to report.
Posted by: George Tsiokos | December 01, 2005 at 08:26 PM