About

I am a software developer in Seattle, building a new AI software company.


August 17, 2007

Symbolic Ray-Tracing

I thought of an approach to ray-tracing that could result in faster performance using symbolic evaluation. I mention it here to illustrate the benefits of thinking symbolically and also because I don't see myself writing a ray tracer anytime in the next decade.

Images typically contain stretched out areas that correspond to the surfaces of individual objects. While each pixel in an area may have a unique color value, all of the pixels may easily be computed by a single, simple function, even for textured surfaces. In theory, the applicable area of a function could be extended to the full dimensions of the image by using conditionals, but such a function would be unwieldy.

Instead of calculating a color value for each ray we shoot out from each screen pixel, we could produce a partially evaluated function that returns a color. (This could work recursively in the event of reflective surfaces.) The computation for a single pixel would suffer, but the function could be reused to fill out the area occupied by adjacent related pixels, which also share the same function. Search and anti-aliasing costs for these pixels would be eliminated.

One approach to area detection is to produce a second function, which returns a boolean value indicating whether the first function is still valid for a given point, based on the information gathered from intersection tests of the original pixel. The second function basically serves as the boundary test for the flood fill operation. With this approach, we have the choice of making our first function a compiled function constructed from closures, a partially evaluated function represented as data, or a combination of both. Compiled functions might be faster, but data representations, being symbolic, could produce exact results.
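To make the idea concrete, here is a minimal sketch in Python, under loud assumptions: the scene, the orthographic camera, and the names SPHERES, inside, trace and render are all hypothetical, and the sphere silhouettes are assumed not to overlap on screen. It uses the closure variant rather than a symbolic data representation. The expensive search runs once per region and returns two functions, one for color and one for validity, and the validity function serves as the boundary test of the flood fill.

```python
import math
from collections import deque

# Hypothetical scene: spheres seen orthographically along the z axis,
# with silhouettes assumed not to overlap on screen.
SPHERES = [
    {"cx": 6.0, "cy": 6.0, "r": 4.0, "base": 200},
    {"cx": 15.0, "cy": 10.0, "r": 3.0, "base": 120},
]
WIDTH, HEIGHT, BACKGROUND = 20, 16, 0


def inside(s, x, y):
    """True if the ray through pixel (x, y) hits sphere s."""
    return (x - s["cx"]) ** 2 + (y - s["cy"]) ** 2 <= s["r"] ** 2


def trace(x, y):
    """Full search for one pixel. Returns (color_fn, valid_fn)."""
    for s in SPHERES:
        if inside(s, x, y):
            def color_fn(px, py, s=s):
                # Brightness follows the z component of the surface normal.
                d2 = (px - s["cx"]) ** 2 + (py - s["cy"]) ** 2
                height = math.sqrt(max(0.0, s["r"] ** 2 - d2))
                return round(s["base"] * height / s["r"])

            def valid_fn(px, py, s=s):
                # The color function stays valid while we remain on this sphere.
                return inside(s, px, py)

            return color_fn, valid_fn

    def background_valid(px, py):
        return not any(inside(s, px, py) for s in SPHERES)

    return (lambda px, py: BACKGROUND), background_valid


def render():
    """Fill the image, running the full search once per region."""
    image = [[None] * WIDTH for _ in range(HEIGHT)]
    searches = 0
    for y in range(HEIGHT):
        for x in range(WIDTH):
            if image[y][x] is not None:
                continue
            color_fn, valid_fn = trace(x, y)
            searches += 1
            image[y][x] = color_fn(x, y)
            queue = deque([(x, y)])
            while queue:  # flood fill bounded by valid_fn
                qx, qy = queue.popleft()
                for nx, ny in ((qx + 1, qy), (qx - 1, qy), (qx, qy + 1), (qx, qy - 1)):
                    if (0 <= nx < WIDTH and 0 <= ny < HEIGHT
                            and image[ny][nx] is None and valid_fn(nx, ny)):
                        image[ny][nx] = color_fn(nx, ny)
                        queue.append((nx, ny))
    return image, searches
```

Calling render() on this toy scene fills all 320 pixels while running the full search only a handful of times, once for the background and once per sphere. A real tracer would need a validity test that also accounts for occlusion, shadows and reflections.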

Since this approach reduces work for typical images, I suspect that high-quality images could one day be produced in the same amount of time that the lowest-quality images currently take without special hardware.

August 09, 2007

LISP

Randall Munroe posted some "XKCD" comics on LISP, which I thought were especially relevant to my situation.

The one below, drawn a while back, is called "LISP" and captures my fascination with functional programming and its remarkable ability to express simply and elegantly everything about the world.

This more recent one called "LISP Cycles" conveys my attempt to use the light side of the "force," functional programming, to overcome the dark side, imperative programming (and --shhh!-- singlehandedly defeat the evil empire in the process.)

June 16, 2007

Old School Programming

Scott Hanselman recently wrote about teaching kids to program the old-school way by using a Commodore 64 emulator. It seems like just recently that Zenzo, his 18-month-old child, jumped off the cradle.

I wonder if old-school programming with direct access to the computer and the operating system is preferable to learning with the scripting languages of today. What follows is my case for old-school programming, but scripting languages do provide a much better, functional programming language to learn from.

I first got into programming with the Commodore PET, followed by the Commodore 64. Like Scott, I would spend hours typing up long BASIC and machine language programs (written in hex) from various computer magazines such as Compute! and Run. It was then that I began programming full-time. I still remember 53280/1 as the addresses for changing the screen colors, SYS 64738 for rebooting the operating system. I still have most of the 6510 codes burned in my head. LDA was A9 (169), LDX A2 (162), JSR 20 (32), RTS 60 (96), and of course BRK was zero.

I wrote my own assembler and disassembler and, for the next few years, used both to write primarily in assembly language (near address location 40960, which was an unused section of memory), all before high school. I used my disassembler to decode and rewrite the entire 8K of BASIC ROM and parts of the 8K kernel back to source with meaningful symbols. I remember becoming mesmerized with how the 256-byte stack was used for parsing expressions. Even though processors were slow, linear search was used everywhere, even during frequent operations such as changes in control flow and variable access. I eventually added my own extensions to the BASIC language to support structured programming and better graphics.

I also played around as if I were the operating system, disabling interrupts to create my own interrupt routines, which I used to implement a partial multitasker, an animation engine, and a rasterizer that bypassed the hardware limit of 8 sprites. In the default screen mode, every character occupied one byte of memory for a total of 1000 characters in a 25x40 grid; the number of distinct characters was limited to 255. I also replaced the default mode with bitmapped mode, which consumes eight times as many bytes. This enabled me to support character styles (bold, italics and underline), independent background/foreground colors for each character, larger character sets, and integrated graphics. I also experimented with proportional fonts (based on automatic detection of character boundaries and bit shifting) and screen widths larger than 40 characters.

I wrote rather than purchased my own software because, being a child, I had no money. I built a range of programs, including a word processor, to figure out how things work; some of these programs were better than commercial versions, but I just did not know how to sell them. I did seriously think about whether I could one day develop and sell my own software independently. In college, I purchased a couple of books on developing and marketing shareware.

Programming at an early age helped with my education. I often found myself exposed to college material as a result of my activities. In a self-paced program at my elementary school, I began taking some high school courses while in sixth grade. Because I was already comfortable with memory addresses and the stack, I never had the slightest difficulty with pointers and recursion when I encountered them in C and Pascal; they both seemed natural to me. (Eight years earlier, when I first encountered PEEKs and POKEs, streams of hexadecimal numbers inside DATA statements, and mysterious SYS statements in computer magazines, I did find myself confused.)

 

December 27, 2006

Fabricated Complexity

There is a quote in computer science, “the solution to a hard problem, when solved, is simple.” I don’t know who to attribute it to, but I have repeatedly found myself arriving at very simple and elegant solutions to hard problems—problems in natural language, in AI, and in application development.

Anna Liu mentioned a talk by Willy Zwaenepoel on research and fabricated complexity.

He spoke of "Fabricated Complexity" - and basically about his observation that researchers often over complicate issues to make them seem 'interesting and novel' and to be accepted by the academic peer review process, while real practical/applicable ideas that lead to useful innovations often are actually based on 'simple ideas'.

He also came to the conclusion that 'design by contract/stable interfaces' are the key to successful (maintainable) innovation, despite he and his team spent many years of building some of the most 'sophisticated/complex' algorithms in distributed systems technologies, it was down to these simple software engineering concepts that would lead any innovation to wide adoption.

One paper that comes to mind is from the Spec# team at Microsoft, Abstract Interpretation with Alien Expressions and Heap Structures. It took me a few readings to digest this paper. I gathered that “polyhedra domains” refers to the feasible region in the simplex algorithm, but much of the other content made more sense after going through related literature and course material.

What is more surprising is that I came up with a simpler, more elegant and more general solution than those mentioned in this paper. Perhaps, fabricated complexity leads to complex solutions. If simple, familiar ideas were used, would our mind be able to discover more associations with other related ideas? There may also be a greater tendency to preserve the beauty of a system when the components themselves are beautiful; when not, anything goes.

In economics, unions effectively drive up wages while limiting employment. Professional organizations maintain the high incomes of their members by creating legal barriers to entry for new (usually younger) entrants. They lobby the government to impose stricter requirements for licensing while, at the same time, urging grandfather clauses for existing practitioners. New interns must undergo many years of schooling, learn a new vocabulary, and work long hours with low pay for a number of years. The same holds especially true for researchers in academia.

I am not making a value judgment here; such practices might attract only the most dedicated people and also improve quality. We don’t want to reward those who only see the monetary incentive, but are not willing to put in effort. For the most part, the tendency for professionals to sharply divide the world between “knows” and “knows-not” is unconscious but convenient.

Just as professionals introduce a more complicated vocabulary, software companies often come up with complicated names for their products and libraries when simple ones would do. Since computer programming is so broadly applicable and valuable, an arcane vocabulary in this industry does not serve us well. While programming is not as basic a skill as reading and elementary mathematics, it ranks with business vocabulary, which is more accessible than the vocabulary of other professions.

Apple has always used friendly names in developer APIs. In contrast, Microsoft often unnecessarily complicated its APIs, especially COM, with cryptic names and Hungarian notation, unintentionally driving users to the Java space and other platforms. Perhaps this is partly because the architect of COM/OLE used to be a high-energy physicist from Oxford. His influence can be seen through the use of physical terms like source and sink in some COM interfaces. I once spoke to an OLE dev lead, who remarked that OLE development was incompatible with a short development time. It became clear to me that Microsoft APIs were unconsciously designed for other large software companies, most of which have since been vanquished by you know who.

The .NET Framework, especially with the help of the FxCop tool, enforces that the name of every type and method can be found in a certain dictionary. Because it was designed for multiple languages, it had to serve the lowest common denominator; it had to be easy to use by the VB developer. This I think is one of the many reasons for its success.

One problem with functional languages comes from their heritage in academia. Newcomers are often intimidated by the unfamiliar terminology, which actually refers to simple ideas. Functional programming languages look hard or too mathematical, when in fact they should be conceptually easier. The LINQ team has made many functional constructs dramatically more accessible by giving them names that sound more like conversational English. Operators like map, filter, reduce, and fold were given the names of their SQL counterparts, such as select, where, sum, aggregate and so on. Terms like lambda expressions or closures might also become more accessible if described as anonymous functions or blocks. (Don’t try to figure out why closures are called “closures.” The origin of the name, which I have forgotten, is meaningless today, but there used to be an “open” counterpart to the concept.)
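The renaming is easy to illustrate. The sketch below is in Python rather than C#, and the helpers select, where and aggregate are hypothetical aliases written for this example, not LINQ's actual API; they simply wrap the traditional map, filter and reduce to show that only the names differ.

```python
from functools import reduce


# Hypothetical SQL-flavored aliases for the classic functional operators.
def select(items, projection):
    return map(projection, items)          # "map"


def where(items, predicate):
    return filter(predicate, items)        # "filter"


def aggregate(items, combine, seed):
    return reduce(combine, items, seed)    # "reduce" / "fold"


orders = [12, 7, 30, 5]
large = where(orders, lambda n: n >= 10)                 # keeps 12 and 30
doubled = select(large, lambda n: n * 2)                 # yields 24 and 60
total = aggregate(doubled, lambda acc, n: acc + n, 0)    # 84
```

Read aloud, "where the order is at least 10, select double the amount, aggregate by adding" sounds closer to conversational English than "filter, map, fold."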

December 14, 2006

Patents

Google just launched a search engine for finding patents. I typed in my name and discovered that I had been awarded a patent (assigned to Microsoft) for some obscure PivotTable feature in Excel.

In my MBA program, I attended some classes and seminars in a variety of laws including intellectual property law. I also used to regularly attend monthly sessions given by the Washington Software Association to keep up to date on the latest changes to the law regarding software.

I have ambivalent feelings about whether I should seek patents. Software patents limit technological progress, and I would like to see the day when computers take over the world before I die.

There are a number of genuine inventions in my product that I think are patent-worthy. My needs are primarily defensive, though. I am not inclined to sue anyone, especially the open-source community and small companies, and have even thought about opening up the technology after I have made an independently livable amount of wealth.

I could use this as a bargaining chip if I ever sold my technology. However, if that happened, I could get into the awful position of not being able to use what I invented. There’s a staggering amount of reuse between different areas of my product, and this is principally due to the inherently compositional nature of functional-style programming.

Incidentally, I have been contacted by the chief scientist of a leading source code analysis company, flowing with PhDs. He called my product “unique and interesting” and inquired about how my product works; he also mentioned potential business cooperation opportunities. Maybe I struck gold.

I can’t afford to actually hire a lawyer, but, since I am a do-it-yourselfer, last year I purchased Patent It Yourself by Nolo Press and other manuals. The law is constantly changing, so you need the latest copy.

If I do go ahead, I may also buy Nolo’s software package, PatentEase. First, I would do a provisional patent application, which allows me to label my product as “patent-pending” and gives me a one-year window to actually file an application (I believe, but ask a real lawyer). Since I have been through the process before and have taken a number of basic courses, I think that I am prepared, even though I am sure I will make some mistakes.

My motivation isn’t primarily money. I will probably never buy a large house or a car that costs over 40K. I inherited my father’s value system, which regards excessive, ostentatious wealth as vulgar. My motivation is to avoid meaningless work at a large corporation, to eliminate stress and to maintain a high level of freedom. I like the idea of staying a small company. My motivation is also intellectual: to explore the unrealized possibilities of computers, to find that intersection of human and computer intelligence.

December 05, 2006

Lego Programming

Joel reviewed the book Beyond Java, and, in his review, he enthusiastically recommended an essay by Fred Brooks called "No Silver Bullet: Essence and Accidents of Software Engineering." He recently mentioned it again in his post Lego Programming. Brooks wrote The Mythical Man-Month, which was really the first software engineering text. It was remarkable in identifying surprising asymmetries in software development. For example, Brooks identified the network costs of adding more developers to a project (e.g., "adding manpower to a late software project makes it later") and the dramatic disparities in individual developer productivity. He also made a number of poor predictions in the same book that are now forgotten, some of which he confesses to in "The Mythical Man-Month After 20 Years," such as "I am convinced that interactive systems will never displace batch systems for many applications."

I have written about silver bullets in the past, and I emphatically feel that the widely regarded author was irresponsible and premature in his assessment that there are no silver bullets. That assessment is leading many developers, not the least of whom is Joel, to be unimaginative and pessimistic about advances in software development.

For example, Brooks asserts at the start of his essay this bit of false wisdom:

There is no single development, in either technology or in management technique, that by itself promises even one order-of-magnitude improvement in productivity, in reliability, in simplicity.

That assertion turns out to be pure nonsense, amply disproven by numerous advances in IDEs, languages, frameworks, and componentization over the past few decades. Our expectations of software and our ability have risen. A year of work takes a month or a month of work takes a day. An order of magnitude improvement usually results in major qualitative changes, often resulting in an existing lengthy project becoming a short task item or a new project suddenly becoming feasible, such as when end users start writing applications (using scripting and RAD tools) that were once exclusively the domain of IT.

The net effect is that we often don't consciously recognize tenfold improvements in productivity. We forget how hard it was to program decades ago. Consider developing a game for the simple Atari 2600 gaming system back in the early 1980s.

I see the driving force towards tenfold productivity as the move to more declarative and compositional approaches, be it through functional, object-oriented, or component-based programming. Kinda like Lego.

December 04, 2006

Technical Assistant

I just read about the next technical assistant of Bill Gates, Joshua Goodman, from the Natural Language Blog. He was another classmate of mine at Harvard University majoring in computer science. We knew each other especially because the computer science department was so small, ~25 students out of 1600 per class.

I joined Microsoft as a Researcher in MSR in 1998. During that time, I’ve worked on speech recognition, stopping spam, and email among other topics. I helped start the MSN Safety Team that still builds spam filters for Microsoft’s email products, and I spent two years on loan to that team, getting some product experience. Early in my career, I worked as a developer at Dragon Systems, a long gone speech recognition company: some of my code has survived 12 years, two acquisitions and a bankruptcy. I interned at Microsoft in 1989 as a PM on the Excel team, and in 1991 as a developer on the iFax project (an intelligent fax machine – this would have been huge if it were not for the internet.) Until I joined Microsoft, I spent most of my life in the Boston area, including time at Harvard earning a Bachelors, Masters and Ph.D., all in computer science. I’m married with three young boys who enjoy playgrounds, lakes and pools.

The job of technical assistant was created basically to assist Bill Gates in responding to the large volume of requests for technical feedback and guidance that he receives from across the company. Pretty cool to have someone who knows about natural language work in that role!

Funny. It seems that all three of us, Joshua Goodman, Alex Gounares (the previous technical assistant, with whom I was also on a first-name basis), and I, have had some involvement in natural language.

August 16, 2006

Powers Of Ten

The New York Times recently included an interesting graphic, “Separated at Birth,” which compares the image of the universe to that of a mouse’s neurons. The graphic strangely suggests that the universe may wrap around itself as we delve more into the infinite or the infinitesimal.

This notion is captured nicely in this Simpsons Powers of Ten video. The Simpsons video is actually a parody of the original Powers of Ten film produced by Charles and Ray Eames for IBM; there is also a www.Powersof10.com website that depicts 10X transitions across both space and time. The film has inspired numerous copycats: “Secret Worlds: The Universe Within,” and “CellsAlive! How Big.”

In the early nineties, I read a book by Andy Grove, former CEO of Intel, called Only the Paranoid Survive, which describes how 10X forces create strategic inflection points, which can topple established companies. Intel faced such a disruptive transition as it moved from being primarily a memory chip company to a CPU company in the late eighties. It left such an impression on me that it’s the only thing I recall from the book.

Being a quantitative person, I tend to evaluate features in a product in terms of quantifiable attributes. Mark Miller, of DevExpress, also uses quantifiable metrics to evaluate the usability of a feature. For example, he measures the amount of mouse distance and keyboard costs required by each new feature. When viewing feature sets quantitatively, I often look at what the introduction of an order-of-magnitude change in productivity would mean for the design of a product. I often tell people that the goal of one of my future products is to improve the writing process by a factor of ten.

 

August 03, 2006

Research Pipeline

On the last day of the Lang.NET Symposium, I sat through an interesting lecture on F# with Don Syme. Don Syme is a researcher at Microsoft Research’s Cambridge office. He and Andrew Kennedy previously researched and designed generics years before its eventual incorporation into the Whidbey version (2.0) of the .NET Framework. The research project was called Gyro. Don’s work was instantly credible because it was implemented on top of the CLR infrastructure.

Research Pipeline at Intel

I congratulated him on being able to navigate the research and development divide, and asked him how he was able to do it. Don said that he previously interned at Intel, which, unlike Microsoft, had an established notion of a research pipeline and trained researchers on the proper steps to move research into development.

At Intel, researchers were more closely involved in development. They were required to make proposals and identify the stage of their research: prototype, design, development, and so on. As each stage proceeded, more development resources would be allocated to it, roughly double the amount before. The researcher had to be associated with a group, and had to be able to name a number of development contacts. A critical success factor was the researcher's ability to convince a development group to invest money and resources into the idea, bringing in important stakeholders.

The culture of integrating research and development at Intel is probably due to the founders, Robert Noyce, Gordon Moore and Andy Grove, all having doctoral degrees and conducting significant research. Robert Noyce coinvented the integrated circuit and, had he lived long enough, would have shared the 2000 Nobel Prize given to coinventor Kilby; compare that to Bill Gates’s intellectual accomplishments.

Because of his prior involvement at Intel, Don Syme successfully spearheaded generics into the CLR, and made sure the implementation of generics was high-quality and completely orthogonal. As a result, generics introduced hardly any seams into the CLR. Underneath the execution engine, though, a single IL instruction may translate into substantial code, as some features like generic virtual methods, generic interfaces, and generic static fields may use dictionary lookups rather than simple vtable dispatch.

Don also paved the way for additional research projects to migrate into development tools like LINQ, which was incubated in research as Comega. Future research projects, Spec# and Polyphonic C#, will probably migrate into the fourth iteration of .NET.

Google

This happens to be why I feel optimistic about Google’s chances, given that both top management and the founders of the company are researchers themselves. Google also institutes the now famous 20% time and embeds a researcher into each product team. The executives recognize the bottom-up nature of innovation as well as the limits that bounded rationality places on top-down management.

 Update: Don Syme has a presentation on Tech Transfer at Microsoft.

July 30, 2006

Incompletely Undecidable

In one of my computer science courses, a professor prefaced his proof on the impossibility of translating from one language to another with the comment: “Next time you are ever asked to write a converter from Pascal to C, consider this.” I immediately thought, skeptically, “while arbitrary translation is impossible in general, translation between C and Pascal is actually trivial, since these two languages are virtually isomorphic.”

I have always been skeptical of the real-world applicability of various proofs that place a limit on human knowledge, particularly Gödel’s Incompleteness Theorem in the mathematical realm and undecidability in the computing realm. The standard proofs typically contain contrived, self-referential arguments. GIT states that the sentence “this statement cannot be proved” can be neither proved nor disproved, while the classic undecidability problem, “the Halting Problem,” defines a halt function, which determines whether any given function halts on a given input, and then feeds that same function to itself as input.

Such proofs cause people to exclude useful programs like powerful static checkers from consideration, when, in fact, most interesting inputs that occur in practice are solvable.

In this introductory paper “Extended Static Checking for Java” by the developers of ESC/Java, the authors come roughly to the same conclusion:


The horizontal line in Figure 1 labeled the “decidability ceiling” reflects the well-known fact that the static detection of many errors of engineering importance (including array bounds errors, null dereferences, etc.) is undecidable. Nevertheless, we aim to catch these errors, since in our engineering experience, they are targets of choice after type errors have been corrected, and the kinds of programs that occur in undecidability proofs rarely occur in practice. To be of value, all a checker needs to do is handle enough simple cases and call attention to the remaining hard cases, which can then be the focus of a manual code review.

Update: Interestingly, the not-so-reliable Wikipedia actually indicates the halting problem is decidable for deterministic machines with finite memory, which pretty much covers all existing computers and adds more fuel to the fire.
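The finite-memory case is simple enough to sketch. The Python below is an illustration under assumptions, not a practical checker: the function halts and the convention that step returns None on halting are hypothetical. Because a deterministic machine with finitely many configurations must either halt or revisit a configuration, remembering every configuration seen is enough to decide which happens.

```python
def halts(step, state):
    """Decide halting for a deterministic machine with finitely many states.

    step(state) returns the next state, or None once the machine halts.
    States must be hashable. (Both conventions are assumptions of this sketch.)
    """
    seen = set()
    while state is not None:
        if state in seen:
            return False  # a configuration repeated, so the machine loops forever
        seen.add(state)
        state = step(state)
    return True


def countdown(n):
    # Halts when the counter reaches zero.
    return None if n == 0 else n - 1


def spin(n):
    # Never halts: cycles through residues modulo 6.
    return (n + 2) % 6


print(halts(countdown, 5))  # True
print(halts(spin, 1))       # False
```

The catch is cost: a machine with n bits of memory has up to 2^n configurations, so this decides the problem in principle while remaining hopeless for real programs, which is why practical checkers settle for handling the common cases.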

June 20, 2006

.999... = 1

I looked at this post by a mathematics teacher explaining that 0.999… = 1.000 via Digg. I always assumed that this was common knowledge, but Digg warns that “readers indicate that this story contains information that may not be accurate.”

I don’t recall encountering this relationship in school but only stumbling on it on my own as a child. The poster includes an algebraic proof and the calculation of a geometric series, but he doesn’t include my own childhood observation.

To arrive at the fractional representation of a repeating decimal value, take the repeating digits and divide them by a number consisting only of nines with the same number of digits.

So,
.000… becomes 0/9 = 0.
.111… becomes 1/9.
.222… becomes 2/9.
.333… becomes 3/9 = 1/3.
.999… becomes 9/9 = 1.

Similarly,
.090909… becomes 09/99 = 1/11.
.121212… becomes 12/99 = 4/33.
.333333… becomes 33/99 = 3/9 = 1/3.
.868686… becomes 86/99.

Any repeating decimal is a rational value, expressible as a fraction of two integers; all rational values are repeating decimals. For any arbitrary base b, dividing any integer from 0 to b-1 by b-1, results in that integer turning into a repeating number in base b.
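The rule above fits in a few lines of Python. This is a minimal sketch; the function name repeating_to_fraction is made up for the example, and it handles only purely repeating values between 0 and 1.

```python
from fractions import Fraction


def repeating_to_fraction(digits: str, base: int = 10) -> Fraction:
    """Value of 0.(digits)(digits)... in the given base.

    The repeating block is divided by a number made only of the digit
    (base - 1), with as many digits as the block itself.
    """
    return Fraction(int(digits, base), base ** len(digits) - 1)


print(repeating_to_fraction("9"))       # 1
print(repeating_to_fraction("09"))      # 1/11
print(repeating_to_fraction("12"))      # 4/33
print(repeating_to_fraction("86"))      # 86/99
print(repeating_to_fraction("1", 2))    # 1, since 0.111... in binary equals 1
```

Fraction reduces to lowest terms automatically, which is why 33/99 and 3/9 both come out as 1/3.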

This knowledge can be used to quickly convert a number to its rational form with a numerator and denominator. Another cool algorithm for doing this, though one that suffers from rounding errors, is to take the integer portion of the number as a partial result. The remaining fraction can then be calculated recursively: take its reciprocal, convert that value by the same procedure, and take the reciprocal of the result. The reciprocal of the fraction, whose magnitude is less than one, will necessarily have a non-zero integer portion that can be stripped in turn.
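Here is a rough Python sketch of that reciprocal-based recursion, which amounts to a continued-fraction expansion. The name to_fraction and the depth and eps cutoffs are assumptions added for this example; they are one way of taming the rounding errors, not part of the algorithm as originally described.

```python
import math
from fractions import Fraction


def to_fraction(x: float, depth: int = 12, eps: float = 1e-9) -> Fraction:
    """Approximate x as a fraction by repeatedly stripping the integer part.

    depth and eps are assumed cutoffs: recursion stops when the leftover
    fraction is negligible or after a fixed number of levels.
    """
    whole = math.floor(x)
    frac = x - whole
    if depth == 0 or frac < eps:
        return Fraction(whole)
    # 1 / frac is greater than 1, so its integer part is non-zero.
    return whole + 1 / to_fraction(1 / frac, depth - 1, eps)


print(to_fraction(0.75))    # 3/4
print(to_fraction(3.5))     # 7/2
print(to_fraction(0.125))   # 1/8
```

For 0.75 the recursion strips 0, then works on 1/0.75 = 1.333..., strips 1, works on 3.000..., and stops, giving 0 + 1/(1 + 1/3) = 3/4.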

June 13, 2006

Microsoft Stock

Due to recent stock price activity, Dare lost his down payment on his first house. My advice for him is to hang in there. History has shown me that stock price is in the doldrums up to the day of each major software release, only to rocket shortly afterwards, which I guess illustrates that technologically unsavvy investors are unable to appreciate new technological advances before they occur. 

The market is excessively short-sighted. Microsoft has significant (relatively speaking) upgrades of two core software platforms in the pipeline, and I sense latent demand that will manifest itself in a dramatic rise in the stock price.

I am not sure that I wholly subscribe to the theory that the stock price is the present value of all present and future dividends—a relationship that holds even if the stock is from a growth company that doesn’t offer dividends.
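For readers who have not seen the theory written out, here is a small Python sketch of the discounting it refers to. The function name present_value, the dividend figures and the 10% discount rate are all made-up illustrations, not data about any real stock.

```python
def present_value(dividends, rate):
    """Discount a stream of yearly dividends back to today.

    dividends[0] is assumed to be paid one year from now, dividends[1]
    two years from now, and so on; rate is the yearly discount rate.
    """
    return sum(d / (1 + rate) ** year
               for year, d in enumerate(dividends, start=1))


# A single payment of 110 a year from now is worth 100 today at 10%.
print(round(present_value([110], 0.10), 2))

# A growth company paying nothing for five years and 50 per year for the
# following twenty still has a positive value today under this theory.
print(round(present_value([0] * 5 + [50] * 20, 0.10), 2))
```

The second case shows how the relationship can hold for a company that pays no dividends at present: the value comes entirely from payments expected later.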

How would that explain the rapid rise of stock prices in general when pension funds were deregulated and allowed to invest in the stock market in the 1990s? The addition of Microsoft to the S&P 500 index and the DJIA compelled index investors to pour more dollars into Microsoft at the going price. The law of supply and demand dictates that increased demand would drive up prices, which would negate the assumption that the value of a firm depends only on earnings. One way out of this predicament is to say that stock price is also positively correlated with liquidity, which increases with greater availability of buyers. Too bad all the major indices include Microsoft this time around.

There’s clearly some psychology involved in the stock market, but that doesn’t negate the theory, since the stock price is actually the “perceived” present value. This does give an opening for the “technology enthusiast” to make a bundle. Since mainstream investors are slower to recognize the impact of new technology than those inside the industry and can typically only see the immediate past, the stock price tends to “correct” itself only after the revenues of new products pour in.

Disclaimer: These are just my thoughts. I am not responsible for any financial downside that you may experience. Take my advice at your own risk. (Any upside, you can contact me through email.) 

May 29, 2006

Playing with Office

I continued playing around with the new versions of Microsoft Office to check up on changes. I have to look at every feature again, because anything could have changed. The Office beta website provides minimal details.

Clearly, there’s a huge investment in the user interface. I wonder how much time was spent on galleries, or if that work was delegated to interns. All of the little details in the user interface have been fixed, such as Word’s nonstandard selection highlighting. The new fonts are beautiful, as is everything else.

A number of commands appear to have been consolidated. Some functionality has been lost, probably due to a poor showing in usage results from instrumented versions of Office. For example, full screen and reading layout, once independent concepts, have been consolidated into one button, so Normal View in full screen mode is no longer available. Incidentally, Normal view has been renamed to “Draft” view. There actually used to be an additional Draft view distinct from Normal view, but that view wasn’t available by default and required customization. I was desperately looking for my macro support. Apparently, the developer toolbar needs to be turned on in the options dialog.

My favorite feature in Word is native, inline support for equations, which used to be provided by a third-party add-in. That and bibliographies (which are no longer very useful for me now) were features that I longed for since college, and whose presence now in Word gives me some sense of completion.

Even with all the new icons, the initial working sets of Excel and Word appear to be about the same as in Office 2003. There also seems to be some reclamation of memory after a long idle.

I also looked at how Office will impact my own product plans.

One item that caused me concern earlier was that, post 2007, Office will have a different version for each type of worker. A sales person will have a sales version of Office that will be different from one used by an R&D professional.

Word’s grammar checker is still poor. I ran a document, written by a non-native English speaker and filled with numerous obvious errors, and Word responded back instantly “The speller and grammar check is complete.” At least, it did not have me walk through instances of passive voice and long sentences this time.

 

May 25, 2006

Office 12 UI

I have been playing around with the second beta of Microsoft Office, and I am very impressed with the changes made to the user interface.

In attempting to design a fresh new UI and distinguish it from those of other software companies, though, Microsoft is in a weaker position than ten years ago, partly from its own doing. Ten years ago, when Office 97 was introduced, it took years for software companies to replicate all the new controls like commandbars. Now, because of advances like .NET and Windows Forms and a flourishing ecosystem of component providers, thousands of applications may sport the Office UI even before Office 2007 ships.

To demonstrate my point, I will completely redo the user interface of my own application in the next three days to incorporate Office 2007–style ribbons instead of the menus and toolbars of yesteryear. In addition, the icons will be more stylish and Vista-compatible than those in Microsoft Office. My application UI was starting to look stale, despite the additional graphics, controls and animations I added this month. I’ll have screenshots of the new Office look by Sunday morning.

With Avalon coming on the horizon, things get even murkier. Component developer Phil Wright speculated a while back on the disruptive impact of Avalon on the marketplace for controls in his post, WPF Tidal Wave. With Avalon, third-party software applications may even look flashier than Microsoft Office. Some may say that was already the case with the last couple of versions.

May 23, 2006

Google Interviews

Chris Sells points to a blog post in which someone undergoes two days of interviews for a contracting position at Google. 

The poster mentions a Google interview question that refers to the famous birthday paradox. However, the poster seems to have recalled the interview question incorrectly, as it has a trivial, uninteresting solution. The poster stated that the birthday of one of 9 people at the party must match his, rather than that of anyone else at the party. Even then, there would only be a twelve percent chance of getting a matching birthday between any two persons. The interviewee apparently struggled a bit through the problem, before subsequently being offered a job many weeks later (apparently, not the first or second choice).

Google seems to have topped Microsoft in its approach of recruiting and interviewing eggheads. For example, the search engine company, in the past, has placed complex mathematical puzzles in billboard ads, which, when solved, directed the person to a website.
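The difference between the two versions of the question can be checked with a few lines of Python. This is only a sketch under simplifying assumptions that are mine, not the poster's: 365 equally likely birthdays, no leap years, and a party of 10 people in total (the poster plus 9 guests). The function names are made up for illustration.

```python
from math import prod

def p_any_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    # Complement of "all n birthdays are distinct".
    return 1 - prod((days - k) / days for k in range(n))

def p_match_specific(n_others, days=365):
    """Probability that at least one of n_others shares one given person's birthday."""
    # Each other person independently misses the given day with probability (days-1)/days.
    return 1 - ((days - 1) / days) ** n_others

if __name__ == "__main__":
    print(f"any two of 10 people match:   {p_any_shared_birthday(10):.3f}")
    print(f"one of 9 guests matches mine: {p_match_specific(9):.3f}")
```

Under these assumptions the "any two people" version comes out near twelve percent for a party of 10, while the "matches my birthday" version is only a couple of percent, which is why the misremembered question is the less interesting one.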

Contrary to popular belief, I always felt that Microsoft interview questions were actually rather straightforward for seasoned, talented software developers; the inability to correctly answer a question is a flashing red sign. Google, on the other hand, seems to screen for higher-level mathematical reasoning. This explains much of my special affinity for a company founded by two PhD students.

Despite this, I am not sure that screening applicants through puzzles should be the only approach. (Google, by the way, also puts equal emphasis on ability to work on a team.)

May 19, 2006

Professor Sleator

I was doing a search on persistent data structures, and came across this paper by CMU Professor Daniel Sleator from the late 1980s. I have encountered his name so often in the course of my work, I wondered if we shared similar interests.

Daniel Sleator coinvented the splay tree data structure, which I have used quite often. He arrived at the startling conclusion that just by rebalancing the most recently accessed node, one can obtain amortized logarithmic behavior for most operations. Sleator also founded the Internet Chess Club, the most popular chess site on the Internet and one which I was once a member of. 
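For readers who have not seen one, here is a rough sketch of the splay idea in Python. It is a common recursive textbook formulation, not Sleator and Tarjan's original presentation and not the code I use; the names Node, splay, insert and in_order are just for illustration. Every access rotates the touched node (or the last node on the search path) up to the root.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def rotate_right(x):
    y = x.left
    x.left = y.right
    y.right = x
    return y

def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    return y

def splay(root, key):
    """Bring key (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:            # zig-zig: left-left
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:          # zig-zag: left-right
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return root if root.left is None else rotate_right(root)
    else:
        if root.right is None:
            return root
        if key > root.right.key:           # zig-zig: right-right
            root.right.right = splay(root.right.right, key)
            root = rotate_left(root)
        elif key < root.right.key:         # zig-zag: right-left
            root.right.left = splay(root.right.left, key)
            if root.right.left is not None:
                root.right = rotate_right(root.right)
        return root if root.right is None else rotate_left(root)

def insert(root, key):
    """Insert key and return the new root (the inserted node)."""
    if root is None:
        return Node(key)
    root = splay(root, key)
    if root.key == key:
        return root
    node = Node(key)
    if key < root.key:                     # old root and its right side go right
        node.right, node.left, root.left = root, root.left, None
    else:                                  # old root and its left side go left
        node.left, node.right, root.right = root, root.right, None
    return node

def in_order(root):
    return [] if root is None else in_order(root.left) + [root.key] + in_order(root.right)
```

Note that there is no balance information stored anywhere, which is a large part of the appeal compared to AVL or Red-Black trees.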

Most importantly, for me, he codeveloped the link grammar parser that I used as the starting point for my own natural language parser. Link grammars are based on semantic relationships between words versus syntactical grouping of phrases. Sleator’s algorithm for parsing English text is very elegant. The open-source word processor, AbiWord, reportedly uses the parser for grammar checking.

One of my proudest accomplishments is having noticed an optimization that he missed and coming up with an entirely different, simpler, faster, and more general parsing algorithm. I wonder if I should make it public.

Anyway, great minds think alike…

 

April 01, 2006

Microsoft Circles

Wonder what Microsoftees (or Googlers, etc) are thinking? You can find out through circles.

Amazon has a Microsoft Circle (among other circles) in which the online store ranks top-selling or uniquely popular items at Microsoft Corporation, including books, DVDs, toys, electronics and music.

I noticed the book Corporate Confidential : 50 Secrets Your Company Doesn't Want You to Know---and What to Do About Them was next to the top of the list. It’s an Art of War book about navigating corporate politics. I wonder why supposedly meritocratic Microsoftees are reading this.

"Your number one job is to keep your job," Shapiro, a former human resources executive, writes in this informed and disillusioned take on the corporate life, so don't ever "publicly complain, disagree or express a negative view," take more than one week of vacation at a time, "volunteer," or "tell anyone what you're doing." When asked to do anything, acceptable responses are "sure" and "of course," always accompanied by a smile. Your dress style "should match as closely as possible the style of those at the top." Don't make friends at work-it's "deadly" to want to be liked. The book reads like a guerilla survival manual for the employment jungle written by a hardened survivor ("Do you feel there's something...looming over your career, but can't quite put your finger on it? It's not your imagination. It's real."), and explains why companies preach enlightened attitudes-but don't practice them-and why managers and co-workers will not tell you about your career-limiting moves. Though Shapiro's this-is-war outlook may fit some workplaces, her mercenary advice won't work for people whose number one job is to get a job that doesn't require these sacrifices.

The book was previously mentioned by Mini-Microsoft, and, while the only evidence that I have is that Cynthia Shapiro lives in Seattle, Washington, I can’t help thinking that she previously worked at Microsoft.

Google Circles also has a list of the most popular significantly unique searches at Microsoft, two of which include searches for jobs at Google and Apple. (This may be an April Fool’s joke.)

 

March 02, 2006

Origami

Engadget has details about the Microsoft Origami Project, which has been generating a lot of buzz recently. Microsoft also has a mysterious website called origamiproject.com.

Upon reading the Engadget article, I felt a strange sense of deja vu. Maybe it was the blog post that I wrote a couple of years ago? —> Move Over, PocketPC.

March 01, 2006

The Microsoft Touch

Microsoft is often noted for its marketing genius, but the following set of amusing videos and articles illustrates that the company’s ultra-rationalist approach often fails the fuzzier side of its marketing efforts like advertising and branding.

February 16, 2006

Ridiculous EU Fine

The European Union is planning to fine Microsoft $2.4 million daily for not providing technical information to allow rivals to work with Microsoft protocols.

Microsoft submitted 12,000 pages of technical documentation, which it claimed took hundreds of employees and contractors over 30,000 hours to create. The EU hired an independent college professor to evaluate whether the documentation performed its intended purpose, and the professor claimed it did not.

Microsoft then offered access to the source code as well. The EU rejected the source code as irrelevant to its demands, because developers are required to produce documentation with code. It appears the officials never heard the saying “the code is documentation” or read Cusumano and Selby’s “Microsoft Secrets,” in which the authors observed that Microsoft “programmers in general did not write detailed designs, but went straight from a functional specification to coding in order to save time and not waste effort writing specs for features that teams might later delete.” I have heard that software engineering is a more formal discipline in Europe than in the U.S., so the officials are probably working from different cultural assumptions.

The whole EU position seems like a no-win situation for Microsoft. What Microsoft is offering, documentation and source code, seems more than sufficient. Of course, source code is relevant; it often is the best documentation. I wouldn’t be surprised if Microsoft is offering more documentation than their own developers are using.

I also question the use of a single consultant from academia to evaluate Microsoft’s offerings—someone who is not a practitioner and may have unrealistic expectations from the documentation (something as user-friendly as MSDN)—and also whether one person can adequately evaluate complex, low-level interfaces, developed by many people. I suspect no documentation would pass his test.

February 07, 2006

Programming Languages in the Future

In the future, mainstream programming languages will:

  • allow base classes to be extended
  • eliminate several new classes of errors from the language by design
    • accessing arrays out-of-bounds
    • dereferencing null pointers
    • integer overflow
    • accessing uninitialized variables
  • introduce new concepts
    • dependent types
    • dependent functions
    • universal quantification
  • incorporate Haskell-style comprehensions
  • perform lenient evaluation (as opposed to lazy evaluation)
  • address concurrency through
    • software transactional memory
    • implicit data and thread parallelism

So says Tim Sweeney, who gave a talk at POPL entitled “The Next Mainstream Programming Languages: A Game Developer’s Perspective” (via Lambda the Ultimate).

 

January 26, 2006

Software Paradoxes: The Cost of Trying Too Hard

Software development is full of paradoxes, the classic example of which is Fred Brooks’s claim that adding more programmers to a project tends to produce the opposite result of longer development times and inferior products, primarily due to the quadratic increase in the cost of communication between team members.
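The quadratic growth is easy to see if one assumes, as a simplification of my own, that every pair of team members needs its own line of communication. The number of pairs among n people is n(n-1)/2; the function name below is just for illustration.

```python
def communication_channels(team_size):
    """Number of distinct pairs in a team, assuming every pair must communicate."""
    return team_size * (team_size - 1) // 2

if __name__ == "__main__":
    for n in (2, 5, 10, 20, 50):
        print(f"{n:3d} people -> {communication_channels(n):5d} channels")
```

Doubling a team from 10 to 20 people roughly quadruples the number of pairwise channels (45 to 190), while the hands available for actual work only double.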

Raymond Chen likes to point out that some of the lessons taught in school may actually be counterproductive in the real world. He presented at PDC last year “What Every Developer Should Know,” which I mentioned in my post on algorithmic complexity. Two recent posts of his mention the pitfalls of two common algorithms/data structures with examples from the Windows operating system:

I am actually fond of splay trees, because of their simplicity (compared to AVL and Red-Black trees) and amortized behavior. The disadvantages that he points out—such as inconsistent search times—can easily be mitigated.

But I will agree that, despite their sizable presence in data structures/algorithm books, in most applications, advanced string searching algorithms such as KMP and Boyer-Moore are rarely more advantageous than naive search algorithms. This is because the search strings are usually small and the code for naive algorithms is typically much simpler and more maintainable.
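To make the simplicity argument concrete, here is roughly what a naive search looks like in Python. This is a sketch for illustration only (the name naive_find is mine), not code from Raymond Chen's posts or from Windows.

```python
def naive_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    # Try every alignment of the pattern against the text, left to right.
    for i in range(n - m + 1):
        if text[i:i + m] == pattern:
            return i
    return -1

if __name__ == "__main__":
    print(naive_find("the quick brown fox", "brown"))
    print(naive_find("the quick brown fox", "purple"))
```

The worst case is proportional to len(text) times len(pattern), but with short patterns and ordinary text the inner comparison usually fails on the first character or two. There are no preprocessing tables to build or get wrong, which is most of the maintainability argument.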

January 25, 2006

Innovation at Microsoft

I frequently mentioned Microsoft innovation in the past. I encountered some interesting observations about Microsoft innovation in the past month:

  • Eeyore designs Microsoft’s product: “It makes me think of how many feature meetings I've had and what a small percent of those features have actually ever shipped. Not that every feature is a good idea, but it's damn near wake-worthy sometimes for a feature to actually get out into shipping bits. Que Eeyore: ‘Oh no. Now we have to support it. I suppose a hotfix request will come in any moment now’”
  • Integrated Innovation is the Microsoft Way: “Microsoft's corporate culture is very much about looking at an established market leader then building a competing product which is (i) integrated with a family of Microsoft products and (ii) fixes some of the weakneses in the competitors offerings… when Microsoft does try to push disruptive new ideas the lack of a competitor to focus on leads to floundering by the product teams involved.”

With new product offerings like Office 12, Microsoft may be turning a corner, but it is hard to tell at this point. The more I read about it, the more I like. When Microsoft does notice a shortcoming and devote resources to it as in past software crises such as the Internet and security and currently in user interface design and graphics, they tend to eventually solve the problem in question. Maybe, innovation is one of those shortcomings waiting to be addressed.

Also, the company may be bogged down by the weight of addressing software fundamentals such as reliability, performance, security, and compatibility, which I initially thought were solved with the movement of the 9x platform to Windows XP. I was dismayed by the underwhelming PDC preview of Windows Vista, but then I realized that many improvements were under the hood and that at least sixty groups at Microsoft will not have integrated their offerings into Vista until the second beta. As an example of an under-the-hood improvement, Vista was sluggish when I first installed it in my laptop, but the operating system became springier and more responsive over time; this may have been due to SuperFetch and other memory optimizations in Vista.

Of course, Vista will be a great product, which fixes existing problems in XP and offers a few new things like Avalon, but it looks like I might have to wait for Windows Vienna to move beyond fundamentals to more dramatic, enabling technologies.

I don’t want to diminish the work of thousands of employees, whose job, if done correctly, is to make sure that their feature area silently succeeds, crashes less, or allows users to work more securely and confidently, but I think that we are still way off from the potential of computer technology.

January 06, 2006

Trojan Dialers and Identity Theft

A couple of recent personal incidents highlighted the security problems in the Internet age.

Trojan Dialers

Over the holidays, I helped my minimally computer-literate father clean up his work computer of spyware and viruses.

After he purchased DSL service for his private practice, his secretaries apparently made extensive use of the internet during work hours, which probably included downloading songs. He had no security software installed.

Shortly after, his computers, which are used for claims processing for his medical practice, became unusably slow and my father started noticing hundreds of dollars in extra charges from his telephone company. He blamed Verizon and later canceled his DSL service, but the mysterious charges continued. I suspected malware.

I discovered several viruses and spyware programs on his computer, including the trojan dialer (!) which racked up charges on his telephone bill. Internet Explorer contained a number of malicious Browser Helper Objects with mutating names.

I had him purchase Norton Internet Security and another spyware-cleaning program. Unfortunately, he wasn’t willing to reinstall the operating system on his computer. I warned him not to install a broadband connection on his work computers because his computer was compromised and his assistants could do further damage (or, at the very least, be Internet-surfing during business hours).

Identity Theft

Just yesterday, I received a call from the fraud detection department of my credit card company about two charges for approximately $4000 each. I disputed the charges and the card company canceled my existing account and issued me a new card.

I use my card infrequently and still had it in my possession. I am very careful about my credit card information and monitor my financial accounts about twice a week. Identity theft should not be happening to me.

Looking at my statements, I determined that the most likely cause of the violation was some tampering of an automated ticketing machine for either the Amtrak or LIRR train station. There could have been other reasons such as skimming or theft of customer data from any company I previously purchased from, but those would involve transactions from several months ago.

UPDATE: The fraud may have been due to my use of an unsecured wireless network to pay my card balance while visiting my sister in Philadelphia last week.

December 24, 2005

Blog Name

I changed the name of my blog to “Smart Software.”

  1. The original blog name “.NET Undocumented” was too “dorky.” Family members and non-technical friends view my blog and I wanted it to be friendlier. (Just a few days ago, the person I was interviewing for admission to Harvard happened on my blog.)
  2. The name better reflects where my passions lie. The .NET platform for me is a means to achieving smart software, not an end in itself.

 

November 23, 2005

Office Open XML

Microsoft, in case you haven’t heard, just announced that they are submitting Office 12 XML to the ECMA standards body for standardization after feeling pressure from the European Union. ECMA will produce the official documentation for unencumbered use by all, but Microsoft will retain exclusive ownership of the format.

This is the same procedure used by the C# standard, although I do believe that the standards body in that case actually forced Microsoft to make a few changes such as disallowing null method invocation. This submission to a standards body also mirrors the approach used by Adobe in standardizing PDF, which has been accepted by certain governments, notably Massachusetts and the EU, as open.

Microsoft once had the dream (“Word Everywhere”) of making its formats pervasive on the Internet, not just the intranet, with the mere belief that simply building and distributing free Office document viewers would do so in the same manner as Acrobat Reader. Such goals, however, went unrealized because of the closed nature of the formats. Potential users needed to contact Microsoft to secure access to file format documentation.

If Microsoft had opened up their formats earlier, maybe its formats could have been somehow embedded into the fabric of the Web just as PDF has been. Maybe now they can be. One thing seems inevitable: more documents will be created by third parties than by Microsoft software.

After a quick browse around the blogosphere, perennial naysayers are still offering their own negative spin to the announcement: (1) The formats are not truly open, since only Microsoft can extend them, and (2) Microsoft is forcing customers to deal with two document standards instead of one, OpenDocument.

My Own Tangle With Microsoft Formats

I previously looked at reading and writing Word binary documents from my own application. Available documentation on the web was outdated (current as of Office 97) and access to the latest formats required a trip to Microsoft’s legal department.

I did manage to crack the OLE compound document container and extract the main text and various other document records before determining that working directly with DOC files would be a waste of time, especially since binary formats will become obsolete in another year and other products, such as OpenOffice, still have some difficulties with Office formats.

I also looked at some third-party vendors’ libraries. One company wanted me to pay a minimum of $60,000 plus substantial royalties. Others designed licenses for intranet use and placed various restrictions that weren’t viable in a mainstream application.

I decided to only read and write RTF and HTML directly. RTF has a number of advantages over DOC files: RTF is as open as Office 12 XML; retains all document features; is text-based and regularized so it is easy to parse and roundtrip; and includes public, up-to-date and full documentation. Aside from lack of XML and tool support, RTF has most of the advantages of Office 12 XML plus full support in all versions of Microsoft Word.

By reading and writing RTF, I also trivially enabled support for DOC and other document formats by calling functions to convert files in the Word object model, whenever my application detected an installed version of Microsoft Word in the system.

I will be supporting Microsoft Word’s XML format directly when it comes out.

November 15, 2005

Orcas and Open Specs

I have been reading through a post on Microsoft’s future plans regarding Orcas (Visual Studio 2007) from my secret Microsoft informant, high up in the corporate ladder.

The big thing for me was that Microsoft plans to publish its internal product specs as they are written…

One thing that I want to start doing with Orcas is to be able to share specification documents as and when we write them with you so that you know what features we are thinking about.  The thing we need to be thoughtful is that at the specification stage, we would not know whether that feature would make it into a particular release or not.  So, we will have to think through how to make it clear to people when something is at a specification/design stage, when some feature is committed in a particular release, etc.  But like I have said before we are committed to continue driving more transparency in terms of what we are doing and how we are doing with you.

I have been clamoring for opening up product specs in my prior posts.

Published specs are the last veil left to uncover to achieve total transparency at Microsoft. Published specs mean that external developers will have the same inside information about VS plans as non-VS employees already have at Microsoft.

We can finally eavesdrop into the conversation. These specs and the bug databases are the primary communication devices used by the various members of product feature teams—testers, developers, program managers, user ed, and so on.

The specs provide very rich detail—rich enough to put some MS bloggers out of business. These specs also provide the underlying motivation or justification for the feature.

Some other plans mentioned in the post are the following:

  • Agile processes in the VS division: quicker and more responsive
  • Orcas is focused on new tools and designers for Windows Vista, Office 12, and WinFX
  • Incubation work on concurrency and parallel programming

November 08, 2005

Google AdSense

Simply by putting Google AdSense on my blog, an expenditure of about two hours of my time, I just increased my annual income by about $1000. How much higher can it go if I become more aggressive or if my audience traffic grows through the passage of time?

Since most blog traffic comes through RSS, doesn’t it make sense for ads to be included in RSS feeds? The revenue potential appears huge (relatively). I wondered about this and it appears that Google is indeed beta testing AdSense for RSS.

However, I could see RSS ads impacting readership negatively, unless they are subtle and out of the way. Click-through rates should fall dramatically, because most feeds are skimmed quickly or skipped entirely due to their heading. Also, these ads may not fit well within certain types of views—newspaper-style columns—in various RSS readers. 

Blog Redesign II

I have recently redesigned my weblog and am planning to post more frequently than in the past. There are a few additional steps to enhance my weblog.

Blog Design & Usability

First, I am attempting to free my blog of the Top Ten Blog Design Mistakes according to Jakob Nielsen (via Jeff Atwood).

  1. No Author Biographies
  2. No Author Photo
  3. Nondescript Posting Titles
  4. Links Don't Say Where They Go
  5. Classic Hits are Buried
  6. The Calendar is the Only Navigation
  7. Irregular Publishing Frequency
  8. Mixing Topics
  9. Forgetting That You Write for Your Future Boss
  10. Having a Domain Name Owned by a Weblog Service

I am guilty of almost all of these abuses save 1, 2, and 6, and I will correct most of them. In addition to these ten design mistakes, I am also attempting to become more proactive in responding to comments, as Jeff Atwood recommended.

Monetization

Second, I am also looking to monetize my weblog in a non-intrusive way through Google AdSense and Amazon links. It’ll be more like product placement rather than in-your-face advertising; I don’t want to scare my readers. 

For advice, I looked to Steve Pavlina’s podcast How to Make Money Without a Job. He, by the way, makes thousands of dollars through AdSense every month.

I don’t expect a substantial income, but I do believe that this will provide me with the incentive to produce more posts more frequently, cover my costs and be quite educational. So far, my first half a day of using AdSense netted me a dollar.

Not counting Craigslist, I made my first sale on the Internet through eBay in August, when I sold the book Windows Graphics Programming for $108 (substantially more than the $60 I originally paid) after a bidding war ensued. I was very surprised by my ability to sell the book at above cost, because previously I gave away some very nice books for almost nothing to the local used bookstore.

November 02, 2005

Big, Bad Microsoft

There have been some conspiracy theories about Microsoft’s decision to include a three-year-old version of Word with Microsoft Works Suite 2006.

Outsiders automatically assume the worst from Microsoft. Most Microsoft employees are actually passionate about their work, which they see as contributing positively and enormously to the world. The mindset is not to milk customers, but to create additional value for which they get paid in exchange.

On a related note, Microsoft employees are not intentionally trying to snuff out third-party utility vendors when a new feature is added to an operating system; they are just trying to address persistent problems that customers faced when using Microsoft products. When I was involved in the addition of OLAP capabilities to PivotTables, I wasn’t thinking about putting any OLAP vendors out of business, though we probably did. I feel like I am stating the obvious…

Microsoft’s decision to include an earlier version of Word has a perfectly valid explanation. The latest version, Microsoft Word 2003 (of Office 2003), runs only on Windows 2000 and XP, which are primarily used by businesses.

Microsoft Works Suite 2006 has to run on typical operating systems used by consumers, which include Windows 98 and ME. In that case, Word 2002 (of Office XP) is the best option. In fact, Microsoft is still selling Office XP for this very reason.

The mood has gotten better in recent years, though, due to Microsoft’s efforts at increasing openness and transparency.

October 31, 2005

Microsoft and Google Innovation

I have written about the perceived lack of innovation at Microsoft in the past. There has been a Wired article on the topic of whether Microsoft is an innovator or an integrator.

One Microsoftee commenter remarked that Microsoft was indeed an innovator, but that these innovations are missed because they are packaged within larger, established applications. Bill Gates admits in the Wired article that Microsoft produces mostly “incremental innovations.”

I am inclined to believe that many of the PDC announcements such as LINQ and the Office 12 ribbon interface are indeed first-rate innovations. Also, other inventions that come to mind include TabletPC and OneNote. However, there is still the lingering feeling that these innovations would have come to pass anyway… that they are not disruptive enough.

I spoke to an ex-Microsoftee who left to work for Google, and stated that I could not imagine Microsoft coming up with the product that I am developing (copying, sure, but inventing, no). We came to the conclusion that Microsoft can’t really embark on these kinds of disruptively innovative products for a couple of reasons:

  1. Microsoft depends on a predictable, schedulable development process, which limits the amount of research. Microsoft does have a dedicated research division, which has improved its pipeline from research to development in recent years.
  2. Most developers that Microsoft hires are straight out of college with bachelor’s degrees. Most of them are therefore generalists, in contrast to specialists, who have advanced degrees or domain expertise from years in specialized industries. The lack of specialization and depth in Microsoft engineers probably limits, to a great extent, the level of innovation to the incremental variety.

Interestingly, Google is somewhat different in both these respects:

  1. Google offers 20% time, which frees up one day of the work week for employees to do research projects that may ultimately benefit the company.
  2. The two founders, Sergey Brin and Larry Page, were adamant that the future CEO, who turned out to be Eric Schmidt, have a PhD. I assume that the makeup of the rest of the company is more heavily weighted to advanced degrees than at Microsoft.

Chris Anderson, an architect at Microsoft, recently defended Microsoft's practice of asking PhDs simple programming questions in interviews despite risking insult to the interviewee. I know that Microsoft used to be very pragmatic and business-minded and tended to eschew the (highly) academic, but I suspect the atmosphere is much different now. Joel reports that another ex-Microsoftee, who relocated to Google, remarked that “Google works and thinks at a higher level of abstraction than Microsoft.”

Hmmm… Will we see Google churn out interesting disruptive innovations faster than Microsoft? Let’s wait and see.

October 29, 2005

Programmer's Myopia -- Natural Language Grammars

I previously wrote about Programmer’s Myopia, an affliction that a developer acquires, when he becomes fascinated and absorbed by a particular programming concept. In the writing profession, authors are warned not to become too attached to their “baby,” which is probably best edited out.

I recently talked with another ex-Microsoft employee, Erich Finkelstein, who is also working with a Natural Language Processing company. I went through some of the issues that I had in developing a natural language parser. I let him know that I had a low opinion of mainstream NLP practices that are taught in CS courses or are used in commercial software. Developers seemed to be following a herd mentality. A lot of NLP software relies on statistical techniques, which are heuristic: basically, hacks that don’t provide any guarantees for their results. These statistical techniques often fail on short sentences, where little content is available. The other, symbolic parsers tend to incorporate multiple stages and rely on phrase structure grammars.

In developing my parser, I wanted to accomplish several goals.

  • Simple and Expressive. As output, the parser would emit a data structure that would be expressive enough to represent all English phenomena (including some ambiguities as well as erroneous words) yet be simple enough to manipulate and transform.
  • Automatic Correction. It should be able to deal with bad input text, such as omitted and extra words. In addition, the parser should automatically correct grammatical errors, misspellings, and confused words. The latter two cases would be handled by a parallel parse of a word and its best alternatives.
  • Incomplete Sentence Parsing. The parser should correctly parse partial sentences that a user is currently entering.
  • Fast. The parser needed to be able to process sentences quickly and interactively in a desktop application environment, introducing no perceptible delay to the user.
  • Accurate.  The parser, of course, should generate an accurate parse.
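
The "parallel parse of a word and its best alternatives" can be pictured as giving the parser a list of candidates at each word position instead of a single word. The following is a minimal sketch of that input only, not the parser described here. The confusion sets and the names `CONFUSABLE`, `candidates`, and `candidate_lattice` are hypothetical, made up for illustration.

```python
from typing import Dict, List

# Hypothetical confusion sets; a real system would derive these from
# spelling and usage data rather than a hand-written table.
CONFUSABLE: Dict[str, List[str]] = {
    "their": ["there", "they're"],
    "there": ["their", "they're"],
    "to": ["too", "two"],
}


def candidates(word: str) -> List[str]:
    """Return the word as typed, followed by its known alternatives."""
    return [word] + CONFUSABLE.get(word.lower(), [])


def candidate_lattice(words: List[str]) -> List[List[str]]:
    """Build one candidate list per word position.

    A parser working in parallel would consider every candidate at a
    position and keep whichever one yields the best overall parse.
    """
    return [candidates(w) for w in words]
```

For the input `["I", "went", "their"]`, the last position carries three candidates (`their`, `there`, `they're`), while the first two carry only the word as typed.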

Undaunted, I went on to create my perfect parser, with the help of some natural language data that I licensed from various universities and the Linguistic Data Consortium. Along the way, I came to the conclusion that the Natural Language Processing techniques using phrase structure grammars typically taught in courses and in popular textbooks are very poor.

I firmly believe that “hard problems when solved often have simple solutions.” I think that software developers at universities and large firms, like M——t, who have built their own parsers, felt that natural language was a hard problem, so they built baroque data structures and algorithms and kept piling on complexity. They completely missed the simple solution. They figured that, since everyone is using phrase structure grammars and abundant tools and corpora exist for them, they were pursuing the correct path of development.

PSGs owe their fame to the outspoken MIT professor Noam Chomsky, the “father” of modern linguistics, who formalized context-free grammars. The problem is that we are dealing with a mathematical formalism that is an imprecise and inaccurate representation of natural language. I am reminded of my computer science professor, Stuart Shieber, who proved that natural language is not context-free. We abandoned “Freud” and his below-the-belt preoccupations, and we should probably sidestep ol’ Noam.

This reminds me of a joke in IBM's speech recognition research labs. "Every time a linguist leaves the room, the speech recognition rate goes up."

This is why I think that dependency grammars are superior to phrase-structure grammars. Dependency grammars represent words as nodes in a graph, and each relationship between two words is represented by a single link between them. For example, a noun node and a verb node could be connected with a “subject” or "direct object" link that describes the relationship of the two words.

  • Better model of English. Phrase-structure grammars use artificial structures that linguists have built up, such as noun phrases and verb phrases, which are flawed. There are many types of ad hoc phrases in the English language, and each type of phrase can often play the role of another type. I think that dependency grammars model the English language far more accurately and in the same natural way as humans.
  • Well-factored. Dependency grammars are well-factored and result in a tiny fraction of the rules of the equivalent phrase structure grammar. PSGs suffer from rule proliferation and don’t handle “gaps” from questions and clauses very well. PSGs are too rigid. Dependency grammars have better support for idioms and collocations (word phrases that act as one word), and they handle sentence interruptions and noise words better. Any word of any part of speech can acquire any nontraditional role in the sentence, which is frequently the case in normal language.
  • Graphs versus trees. In a few instances in the past, I have discovered that by using a more advanced data structure (trees instead of lists, graphs instead of trees), I can obtain a more expressive data structure and a simpler overall algorithm. The properties of graphs are also well understood.
  • Simple data structures. A dependency parser only needs to emit a graph as output: a node for each word and a set of edges for each relation the word has to other nodes. Simpler data structures are easier to program, test, manipulate, and optimize.
  • Delayerization. Dependency parsers have less need for layering into tagging, parsing, and semantic phases. Each dependency relation is not just a syntactic relation but a semantic one as well, allowing me to combine two traditionally separate parsing phases into one. This is important, because human parsing integrates all phases together. I also get tagging for free, since it adds minimal cost to parsing. My dependency parser is fast and easily incorporates parallel parsing with minimal overhead. This allows it to recognize when the user has substituted an incorrect word into the sentence and to correct automatically, as part of the parsing process, a confusable word or a misspelling (that happens to be another real word).
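
To make the "node for each word, edge for each relation" output concrete, here is a minimal sketch of such a graph. It is an illustration under my own assumptions, not the data structure my parser actually uses: the names `Edge`, `DependencyGraph`, `link`, `relations_of`, and `example_graph` are hypothetical, and only the "subject" and "direct object" labels come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Edge:
    head: int        # index of the governing word (e.g. the verb)
    dependent: int   # index of the word that depends on it
    relation: str    # label such as "subject" or "direct object"


@dataclass
class DependencyGraph:
    words: List[str]                       # one node per word
    edges: List[Edge] = field(default_factory=list)

    def link(self, head: int, dependent: int, relation: str) -> None:
        """Add a single labeled link between two words."""
        self.edges.append(Edge(head, dependent, relation))

    def relations_of(self, index: int) -> List[Tuple[str, str]]:
        """Return (relation, other word) for every edge touching a word."""
        result = []
        for e in self.edges:
            if e.head == index:
                result.append((e.relation, self.words[e.dependent]))
            elif e.dependent == index:
                result.append((e.relation, self.words[e.head]))
        return result


def example_graph() -> DependencyGraph:
    """Hand-built graph for the toy sentence "dogs chase cats"."""
    g = DependencyGraph(["dogs", "chase", "cats"])
    g.link(1, 0, "subject")
    g.link(1, 2, "direct object")
    return g
```

Calling `example_graph().relations_of(1)` returns `[("subject", "dogs"), ("direct object", "cats")]`. Nothing in the structure forces a tree, so a word can take part in as many relations as the sentence calls for.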

The advantage of PSGs is that everyone knows them. More tools and corpora have been built with them in mind. Unfortunately, the problems with them have led researchers to adopt statistical methods to escape the complexity.

Did I fall into a programmer’s myopia trap with my dependency grammars, or did the rest of the computer science community with their insistence on phrase structure grammars?

50 Years From Now

I was speaking to a woman in her 80s, and it was remarkable hearing about a different age, one in which she lived through World War II as a young adult. Fifty years from now, I will be over 80 and living in a changed world that will be every bit as different as the 1950s were from today.

These are some of the published differences in the makeup of the US and world population. The world will look very different from today! What is shocking is the advancement of third-world and developing countries and the relative decline in Western countries.

  • The top ten most populous countries will consist largely of newcomers such as Indonesia, Pakistan, Ethiopia, Nigeria and Bangladesh. The only countries expected to remain on the list will be China, India and the United States; India and China will swap places. European countries will fall out of the list because of their low birth rates; most won’t even make the top twenty.
  • Muslims may eventually outnumber Christians on the planet. By 2050, Muslims will rise to about 28% of the world’s population versus 20% today, while Christians will fall to 30%. The crossover point will occur soon after that due to the higher birthrate of Muslims. The growing proportion of Christians in Africa and South America means that Christianity will become less of a Western rel