About

I am a software developer in Seattle, building a new AI software company.

Ads

May 2008

Sun Mon Tue Wed Thu Fri Sat
        1 2 3
4 5 6 7 8 9 10
11 12 13 14 15 16 17
18 19 20 21 22 23 24
25 26 27 28 29 30 31

Recent Posts

Ads


« May 2007 | Main | July 2007 »

June 26, 2007

NStatic Plans

Update

In my last post, I just mentioned some thoughts about where I would be going with the post-V1. I not making major development changes at this point, so as not to delay the product any longer. Even before adding WPF and developing with Orcas, though, there are quite a few items such as VB.NET and partial or complete C# 3.0 support as well as other postponed items such as specifications that are relatively easy to add.

This week I am going through the user interface and getting the final appearance down and fixing all the little UI and keyboard issues. I moved up the Scan window from a docked pane to a ribbon tab per Scott Hanselman’s suggestion. It’s almost July… I don’t want to still be worrying about UI after this month; I am growing tired and want the program out.

I won’t be posting much this week. I’ll post more NStatic examples next week, revealing a few more aspects of the tool for the first time. These posts will make it easier for me to pull together documentation before I ship.

Market Plans

After I ship, there are some proposals I will be entertaining from a few companies. So, I may be selling NStatic independently for a few months before any deal is finalized, after which the price will likely go up dramatically; bargain hunters might want to buy early. For now, I am continuing on my original plans.

Part of my original viral strategy was to get people used to NStatic by making a minimal version of the software, which could also act as a Ribbon-based notepad clone with enhanced editing capabilities—maybe pull people into purchasing the full version. This goes with other forms of traditional marketing—PR, advertising, etc. I am not currently sure now how I will have people evaluate my software.

I am hoping that maybe, just maybe, that, with a unique sophisticated static analysis product that’s a must buy, I might be able to capture some percentage of the .NET developer marketplace. Other categories such as unit-testing, component libraries, source control, text editors, obfuscation, refactoring tools, bug-tracking are heavily fragmented with many competitors or face freely available alternatives. I am effectively all alone with an advanced technology and no overhead, too, so I can underprice any large competitor. This ties in with the central message I got from my business school’s strategy course, which is to “avoid competition.” (For example, Walmart eventually overtook Kmart because most Walmart stores were local monopolies based in rural areas.)

How big is this market, though? I used to have data on market sizes of different software application categories, but since I left business school, I lost access to all sorts of privileged market data. I have to rely on serendipitous information, such as slips by Microsoft executives about there being over 1 million Visual Studio 2005 shipments and 15 million VS Express downloads.

Market information is hard to extract from the web, but I did find a bevy of information from O’Reilly’s Radar weblog on the State of the Computer Book Market (Q1 07). The numbers come from a database containing information about Bookscan’s weekly top 10,000 titles sold. Approximately, 200K books on C# and an additional 120K books on .NET are sold each year. Visual Basic books sell at half the rate of C# books, so I could get a 50% boost in sales by including VB support.

The data also suggests that porting to Java and C++ would each effectively double my C# sales, which are not currently in my plans. I don’t want to damage the business model of companies with top-notch researchers who dedicated their lives to source code analysis. I don’t have a tight attachment to source code analysis; I am just applying my symbolic AI techniques to different application areas.

June 22, 2007

Jumping to Orcas and WPF

After releasing the first version of my product, I would most likely jump directly to using Orcas and WPF targeting 3.0, at which point I will begin blogging about my experiences and the new feature set. I firmly believe in moving to the latest released languages and technologies as soon as possible. I don’t want to invest in moribund technologies, and I do want the most productive tools as I have limited resources.

I wonder how practical it is for me to move to the Orcas beta right now. There will be probably be a number of problems that I encounter, but will it be more than made up by the expanded feature set of Visual Studio or will there be show-stoppers? Is compiler and IDE performance going to take a hit as it did in Whidbey? Both productivity tools, Resharper 3.0 and Refactor Pro/CodeRush, already support Orcas.

For now, I will play it safe and jump into Orcas after I released NStatic. Every change seems to drag my release ever farther away.

As for WPF, I halted my user interface development work in anticipation for WPF, since it would obsolete some of my existing work and also includes support for animations, 3D, and a declarative programming model. My 3D effects library that I spent some time developing, including perspective warping for images, will probably go to waste.

My plan is to switch over to WPF as the primary programming model and utilize its WinForms compatibility controls. I will purchase the WPF editions of the existing control library as there are already many WPF-based user interface components available now. I purchased Chris Sells and Chris Anderson’s books on WPF; I will probably also buy Adam Nathan’s based on Jeff Atwood’s recommendation. The WPF programming model is just so clean and beautiful, a sharp contrast from MFC.

June 21, 2007

Loops, Part 1

I was hoping to post at a rate of once a day highlighting some area of NStatic, but I have found myself busy eliminating bugs. This is my first post on loops, but I will likely be expanding on it on the future.

One feature that is possibly unique to NStatic is the exact treatment of loops. Other products tend to examine loops a fixed number of iterations (usually just once) or simply punt on the issue.

Variables that are changed within loops are converted into expressions containing functions or closed forms of specially recognized functions. In this example, I will use the Fibonacci function. The Fibonacci function returns the nth element of the sequence

1,1,3,5,8,13,21,34,55,...

It's alternatively expressed as a recurrence relation:

Fib(0) = 1
Fib(1) = 1
Fib(n) = Fib(n-1) + Fib(n-2)

Below are two alternate implementations of the Fibonacci function, one which is recursive and another which is implemented imperatively via for loops.

I have also included below a goto version of the for loop to show that loops are handled in a general way and that I offer no special treatment for the for loop operator.

Below is the actual result at the breakpoint in the FibGoto function.

The values of both prev and sum are expressible in closed forms. The closed form of the Fibonacci function is well-known and is expressed as the sums of the powers of the golden ratio (and its difference from one), all divided by the square root of five. However, I did not hardcode the formula, since the Fibonacci function falls into a general class of second-order homogenous recurrence formulas.

The function FibTest computes the results of the earlier three implementations of the Fibonacci function. Note that I pass in a double to the recursive function, so it doesn't produce a closed form. (The closed form can probably be generated by taking the ceiling of the floating-point value before applying the function.)

The output for FibTest is shown below.

June 18, 2007

Equation Solving

One of the side benefits of the NStatic Tool is the ability to solve various types of equations or systems of equations. One set of problems is the solving of systems of equations in linear algebra.

In the example above, we have three equations with three unknowns. Since all three equations are linearly independent, we are guaranteed a single solution as revealed below by NStatic.

I kept the parameter types as doubles, since even though all the coefficients are integers, we could easily get a solution set that incorporates numbers from the rational domain.

Just by changing the integer constant in the right hand side of the last equation from 10 to 11, all of the values would have to be rationals, and last assert would no longer be satisfiable if any of the parameter variables were declared an int.

The value column depicts all of the solutions values in rational form (with a numerator and denominator), even though they were declared as doubles. The use of rational numbers ensures against rounding errors, which could easily introduce hard-to-find bugs into the system.

Another set of problems are polynomial equations. NStatic has the ability to solve polynomial equalities and inequalities. Nonlinear equations arise quite commonly with recursion and looping control structures.

Here I have a loop that computes the sum of an arithmetic sequence and then performs a test on the sum.

In this case, we are able to place a bound on the value of n to ensure that the sum is over 20. The variable sum is a quadratic polynomial of a complex term, so we can find the roots of the equation to come out with a disjunction of all the possible ranges of n.

 

kick it on DotNetKicks.com

June 17, 2007

Execution Paths

There were a few items that I didn't show in previous posts on my attempts at visually illustrating execution traces leading to an error or breakpoint.

One is the superimposing of arrows of all possible paths together. For all errors or breakpoints of the same type in the same location with the same call stack, all possible paths leading to the error are shown simultaneously. This has interesting consequences inside various control structures such as finally clauses, loops.

For one thing, this does means NStatic's reported error count will actually be an underestimate of the number of errors it found, as each reported error count represents multiple bugs. On the other hand, it cuts down on false positive rates in which not every possible path will likely be explored in actual use.

There are some reasons for this. For example, there may be an exponential number of paths leading to the error, so it would not be possible for me enumerate all of them. I do plan on providing a later option in a minor update where additional assumptions may applied after the fact to selectively choose which arrows are most important.

In the bug above, there is a warning given for the cases where we may encounter an error. Inactive code in the function is blanked out.

The execution path is general. From within a function, all types of control structures are supported and accurately depicted from arbitrary gotos, throws, loops, finally clauses. Probably, the main limitations are due to asynchronous exception handling (thread aborts, stack overflows, out of memory), which I am still thinking about. The execution path is shown for all functions in the current call stack, but it would not be difficult in the future to add support for examining the traces of previously completed functions that have vanished from the call chain.

The implementation is simple but completely general--a real functional pearl. The execution path is internally represented as an expression with a sequence operator, which is only simplified with current set of assumptions at the time the user views the error. The expression becomes arbitrarily complex with lambda functions and conditionals.

June 16, 2007

Old School Programming

Scott Hanselman recently wrote about teaching children and kids to program the old school way by using the Commodore 64 emulator. It seems just recently that Zenzo, his 18–month old child, jumped off the cradle.

I wonder if old-school programming with direct access to the computer and the operating system is preferable to learning with scripting languages of today. What follows is my case for old-school programming, but scripting languages do provide a much better, functional programming language to learn from..

I first got into programming with the Commodore PET, followed by the Commodore 64. Like Scott, I would spend hours typing up long BASIC and machine language programs (written in hex) from various computer magazines such as Compute! and Run. It was then that I began programming full-time. I still remember 53280/1 as the addresses for changing the screen colors, SYS 64738 for rebooting the operating system. I still have most of the 6510 codes burned in my head. LDA was A9 (169), LDX A2 (162), JSR 20 (32), RTS 60 (96), and of course BRK was zero.

I wrote my own assembler and disassembler and used both to write primarily in assembly language for the next few years (near address location 40960 which was an unused section of memory), which was all before high school. I used my disassembler to decode and rewrite the entire 8K of BASIC ROM and parts of the 8K kernel back to source with meaningful symbols. I remember becoming mesmerized with how the 256 byte stack was used for parsing expressions. Even though processors were slow, linear search was used everywhere even during frequent operations such as changes in control flow and variable access. I eventually added my own extensions to the BASIC language to support structured programming and better graphics.

I also played around as if I was the operating system, disabling interrupts to create my own interrupt routines, which I used to implement a partial multitasker, an animation engine, as well as a rasterizer that bypassed hardware limits for 8 sprites. In the default screen mode, every character occupied one byte of memory for a total of 1000 characters in a 25x40 grid; the number of characters were limited to 255. I also replaced the default mode to bitmapped mode, which consume eight times as many bytes: This enabled me to support character styles (bold, italics and underline), independent background/foreground colors for each character, larger character sets, and integrated graphics. I also experimented with proportional fonts (based on automatic detection of character boundaries and bit shifting) and screen widths larger than 40 characters.

I wrote rather than purchased my own software, as, being a child, I had no money. I built a range of programs including a wordprocessor to figure out how things works; some of these programs were better than commercial versions, but I just did not know how to sell them. I did seriously think about whether I could one day develop and sell my own software independently. In college, I purchased a couple books on developing and marketing shareware.

Programming at an early age helped with my education. I often found myself exposed to college material as a result of my activities. At a self-pace program in my elementary school, I began taking some high school courses while in sixth grade. Because I was already comfortable with memory addresses and stack, I never had the slightest difficulty with pointers and recursion, when I encountered them in C and Pascal; they both seemed natural to me. (Back eight years earlier, when I encountered PEEKs and POKEs, streams of hexadecimals numbers inside DATA statements, and mysterious SYS statements in computer magazines, I did find myself confused.)