About

I am a software developer in Seattle, building a new AI software company.


January 07, 2009

Smart Software Should Get Out Of Your Way

Nick Bradbury, author of several successful software products, writes that "Smart Software Should Get Out of Your Way."

If you believe the tech pundits, “smart” software should predict what we’ll do so it can perform the next action faster.  “Smart” software should automatically correct our mistakes.  And “smart” software should adjust its user interface based on the features we’ve used in the past.

Sounds nice enough, but I’ve rarely seen software do these things without causing even more frustration than it attempts to solve.  It ends up being less like a helpful coworker and more like that annoying braniac every office is plagued with who constantly interrupts you with advice on working smarter by doing things his way.

The trouble with existing "smart" software is that it rarely incorporates genuine smarts. There is rarely any actual intelligence underneath the actions, just a set of crude heuristics like pattern matching.

At the low end of the scale is Windows Explorer, with the lengthy pre-scans that occur when inserting a flash drive or performing a file operation; a single picture among diverse files, for instance, switches the folder to the Picture view with a "Date Taken" column. Slightly better is Microsoft Word, whose "Auto" features, based on raw document text, miss all too often. Towards the higher end of the scale is the Visual Studio IDE, which maintains a dynamic internal representation of the code base. Even higher are products like JetBrains' ReSharper, which incorporates substantial code analysis.

Then he seems to take a dig at me.

We all know that guy – he’s textbook smart but socially inept.  Which is a good description of much of today’s software.

January 11, 2008

Principle of Most Power

Turing Completeness

Yesterday was Donald Knuth's seventieth birthday. I used to own the three volumes of his Art of Computer Programming, and even quoted the book once in my blog regarding coroutines. Although the content was valuable, I could not tolerate the MIX assembly language that Knuth used to implement his algorithms.

Donald Knuth also invented TeX, a markup language for typesetting. As proof of its greatness, Mark Chu-Carroll notes that TeX remains dominant 30 years after Knuth wrote the original version. It also happens to be Turing-complete. Other widely used languages are Turing-complete as well, such as PostScript for generating graphical images. The template facility in C++ is Turing-complete by accident, and is used for metaprogramming and for performing all sorts of optimizations that used to be done by the compiler.

Principle of Least Power

A few years ago, I encountered Tim Berners-Lee's "Principles of Design," which lists a number of design principles for the web architecture. One of them is the "Principle of Least Power," which states that we should use a low-power language that is easy to implement and use.

In choosing computer languages, there are classes of program which range from the plainly descriptive (...HTML) though logical languages of limited power ...which include limited propositional logic, though declarative languages which verge on the Turing Complete (PDF) through those which are in fact Turing Complete though one is led not to use them that way (XSLT, SQL) to those which are unashamedly procedural (Java, C).

The choice of language is a common design choice. The low power end of the scale is typically simpler to design, implement and use, but the high power end of the scale has all the attraction of being an open-ended hook into which anything can be placed: a door to uses bounded only by the imagination of the programmer.

Computer Science in the 1960s to 80s spent a lot of effort making languages which were as powerful as possible. Nowadays we have to appreciate the reasons for picking not the most powerful solution but the least powerful. The reason for this is that the less powerful the language, the more you can do with the data stored in that language. If you write it in a simple declarative from, anyone can write a program to analyze it in many ways. The Semantic Web is an attempt, largely, to map large quantities of existing data onto a common language so that the data can be analyzed in ways never dreamed of by its creators. If, for example, a web page with weather data has RDF describing that data, a user can retrieve it as a table, perhaps average it, plot it, deduce things from it in combination with other information. At the other end of the scale is the weather information portrayed by the cunning Java applet. While this might allow a very cool user interface, it cannot be analyzed at all. The search engine finding the page will have no idea of what the data is or what it is about. This the only way to find out what a Java applet means is to set it running in front of a person.

The principle makes sense for the Web in which a new device or a new application with unforeseeable constraints may need to interact with or browse the Internet. However, I do not think that we should automatically apply this principle to other domains, yet I have seen this particular design principle crop up again in a few blog posts since then. There are widely used "standard" languages like PostScript, mentioned earlier, that are Turing-complete. When I look at XAML, I see a language crying for Turing power, because I see a lot of half-measures like resources, triggers, property expressions, templates, etc.

Another View -- Principle of Most Power

In "Software Factories and the modeling problem," Steve Maine posits that a triangular tradeoff between efficiency, generality, and precision exists for any modeling language, which I don't think is necessarily true. His examples of languages favoring generality but sacrificing efficiency involve procedural languages like C# and Cw, which are more complicated than purely functional languages.

My own work over the years has focused on designing a language that is Turing-complete in some respects and provides many of the advantages of functional and logical languages. My concern was that some of the declarative languages used in academia were not practical or fast enough for building applications.

My primary implementation language is C#, but programs in this secondary language are constructed and executed at runtime. Being functional, the language consists of immutable symbolic expressions, which are much easier to analyze than procedural programs. The language also incorporates various forms of logical reasoning. It is used for representing all types of data and knowledge, including documents, mathematical expressions and natural language.

My other concern was non-termination. Execution of the language is through partial simplification, always terminating in a reasonable amount of time, enabling its use within an interactive application. I satisfy myself with a notion of Turing-expressiveness in which the partially simplified expression is extensionally equivalent to its fully simplified value.

Many Different Languages

Usually, advocates of the "Principle of Least Power" also believe that, in the future, people will be using all sorts of languages, because no one language can serve every purpose.

I do not agree with the idea of a need for a dozen different languages. I guess I would prefer just one. With all the new LINQ features, I am seriously considering moving all my scripts from various dynamic languages to C# 3.0 with Dot Net Script or Code Runner.Net.

One of the original goals of C++ was to end the use of many specialized languages by supporting the creation of libraries for custom data types through templates, operator overloading, and object orientation. One problem with operator overloading in the "C"-based languages was operator "overloading," because there were very few standard operators offered in the language.

Emerging trends include metaprogramming and integrated support for DSLs within the languages, allowing these DSLs to use the symbols and rich computational capabilities of the host language.

I think the various advantages and disadvantages of dynamic and static languages (and other language classifications) are based on existing implementations, and they should disappear over time.

This is not to say there won't be many languages in the future. We now have a common language runtime and a dynamic language runtime that raise the base level of new languages with automatic support for garbage collection and reflection. Microsoft is also working on the Phoenix project, which promises a common compiler infrastructure that lets many different languages benefit from advances in parsing, compiler, and optimization technologies at the same time.

January 02, 2008

Human-like Reasoning

My goal in my static analysis work is not to solve general problems like the Four Color Theorem, but rather to simulate closely the human reasoning process to solve problems that are generated, comprehensible, and solvable by normal people. Practical and deep, rather than complete.

Humans write programs to be understood by other humans. Human understanding comes about instantaneously or with relatively little effort compared to the work performed by many analysis tools. We typically don't unroll loops and recursion, but instead "get" the meaning of programs by recognizing a few standard patterns. We look at a program and instantly see a sorting operation taking place, for instance. Programs typically consist of these recurring patterns. In many programs, the only complex data structures used are a standard few like arrays, hashtables, linked lists and stacks. My approach to reasoning adheres closely to the way humans think, even when more efficient computational approaches exist, so as to align my software with human performance; the one exception is when a non-human-like approach is strictly better in every way, yet yields identical results.

Stephen Wolfram, inventor of Mathematica, wrote a post, "Mathematics, Mathematica and Uncertainty," touching on some of these themes. Even with the more advanced symbolic capabilities of Mathematica (virtually infinite compared to my own software package), Wolfram encountered limits trying to use his own system to verify itself.

Sometimes we use the symbolic capabilities of Mathematica to analyze the raw code of Mathematica. But pretty quickly we tend to run right up against undecidability: there's a theoretical limit to what we can automatically verify.

Yet, I think that our approaches deviate somewhat in that Mathematica isn't geared to testing programs. Also, the Mathematica code base is not typical. Being a program for doing mathematics, it naturally consists of advanced mathematics, difficult for mortals to understand and involving non-linear arithmetic, which is theoretically undecidable.

The code was pretty clean. But it was mathematically very sophisticated. And only a very small number of experts (quite a few of whom had actually worked on the code for us) could understand it.

Another point that Wolfram makes is that, contrary to my approach, Mathematica typically avoids imitating humans but instead uses complicated, non-human-friendly procedures extending hundreds of pages in length, roughly the size of the entire analysis portion of my code base.

Think of almost any mathematical operation. Multiplying numbers. Factoring polynomials. Doing integrals. There are traditional human algorithms for doing these things.

And if the code of Mathematica just used those algorithms, it'd probably be quite easy to read. But it'd also be really slow, and fragile.

And in fact one of the things that's happened in Mathematica over the past decade or so is that almost all its internal algorithms have become very broad "polyalgorithms"---with lots of separate algorithms automatically being selected when they're needed. And routinely making use of algorithms from very different parts of the system.

So even if there's some algorithm for doing something that's written out as a page of pseudocode in a paper or a book, the chances are that the real production algorithm in Mathematica is hundreds of pages long---and makes use of perhaps hundreds of other sophisticated algorithms in other parts of the system.

But such algorithms were not constructed to be understandable---and much like things we see in physics or biology, we can see that they work efficiently, but they're really difficult for us humans to understand.

And in general, the way Mathematica works inside just isn't very "human friendly"... Like when Mathematica does an integral. It doesn't use all those little tricks people learn in calculus classes. It uses very systematic and powerful methods that involve elaborate higher mathematics.

Computation as the Ultimate Metaphor

Rodney Brooks, an AI professor at MIT and author of Flesh and Machines, wrote this response, "Computation as the Ultimate Metaphor," to a question posed by the Edge Foundation, "What have you changed your mind about during 2007?"

The Edge Foundation is a site that I have long subscribed to by email. [No RSS feed is available]. The Edge promotes discussion by the "third culture," which, according to its words, consists

of those scientists and other thinkers in the empirical world, who through their work and expository writing, are taking the place of the traditional intellectual in rendering visible the deeper meanings of our lives, redefining who and what we are.

Rodney recently questioned his belief in computation as the ultimate metaphor. Computer scientists, he writes, are ultimately biased towards their discipline when they attempt to explain human behavior in terms of computers, just as molecular biologists see "their level of mechanistic explanation as being ultimately adequate for high level mechanistic descriptions such as physiology and neuroscience to build on as a foundation."

Those of us who are computer scientists by training, and I'm afraid many collaterally damaged scientists of other stripes, tend to use computation as the mechanistic level of explanation for how living systems behave and "think".  I originally gleefully embraced the computational metaphor.

Such a pattern, he argues, in fact recurs regularly at different stages in the history of technological advancements.

If we look back over recent centuries we will see the brain described as a hydrodynamic machine, clockwork, and as a steam engine.  When I was a child in the 1950's I read that the human brain was a telephone switching network.  Later it became a digital computer, and then a massively parallel digital computer.  A few years ago someone put up their hand after a talk I had given at the University of Utah and asked a question I had been waiting for for a couple of years: "Isn't the human brain just like the world wide web?".  The brain always seems to be one of the most advanced technologies that we humans currently have.

I must admit that I do subscribe to the view of computation as the ultimate metaphor, at least in its applicability to the human thought process. While I admit that I may be biased as a software developer, I still retain that belief despite his arguments. It may even be true that the computer scientist and the molecular biologist are both right in their thinking, but that each one is exploring the facet of human behavior most appropriate to his discipline.

The universe is a computer, and everything that happens in it--everything produced--is the result of a computation. Everything is a program. By the principle of universality, we can construct a functional expression for it all. My static analysis tool, for example, instantly produces a functional expression for every variable in the program. Each expression is basically a program showing how the value could be computed, but the expression can also be viewed as the value itself, possibly unsimplified.
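As a rough illustration of "a functional expression for every variable," here is a minimal Python sketch. It is not how my static analysis tool is implemented; the nested-tuple expression format and the names `substitute` and `analyze` are assumptions made up for this example, and it only handles straight-line assignments.

```python
# Hypothetical mini-analysis: each variable maps to a symbolic expression
# over the program's inputs. Expressions are nested tuples: (op, arg, ...).

def substitute(expr, env):
    """Replace variable names in expr with their current symbolic expressions."""
    if isinstance(expr, str):
        return env.get(expr, expr)       # unassigned names stay symbolic inputs
    if isinstance(expr, tuple):
        op, *args = expr
        return (op, *(substitute(a, env) for a in args))
    return expr                           # numeric literal

def analyze(assignments):
    """Return a dict mapping each assigned variable to an expression over inputs."""
    env = {}
    for name, expr in assignments:
        env[name] = substitute(expr, env)
    return env

program = [
    ("y", ("+", "x", 1)),     # y = x + 1
    ("z", ("*", "y", "y")),   # z = y * y
    ("y", ("-", "z", "x")),   # y = z - x
]
result = analyze(program)
print(result["y"])
```

Each entry in `result` can be read two ways: as a little program that computes the variable from the input `x`, or as the (unsimplified) value itself.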

The same applies to our thoughts and all our natural language utterances. My own work focuses on representing and manipulating natural language as little programs or functional expressions rather than using traditional forms of knowledge representation. Words are "evaluated" by applying their definitions. This has the nice benefit of a tighter relationship between syntax and semantics, better handling of ambiguity, and ease of use similar to other primitive data types. I only became aware of other research manipulating natural language in a functional way through this paper entitled "Realization of Natural Language Interfaces Using Lazy Functional Programming" mentioned in Lambda the Ultimate. "Oh, my god! they stole my idea," I thought until I discovered that functional Montague grammars were invented way back in the 1960s.

November 14, 2007

Semantic Computing

Walter Stiers from the Academic Relations Team at Microsoft wrote that Microsoft Research is accepting proposals for Semantic Computing. His post caught my eye, because of what I am working on, which can basically be described as semantic computing.

I like the phrase used, "semantic computing." It reminded me of another phrase, "conceptual layer," coined by the Data Access Team to refer to the higher level of abstraction offered by LINQ and expression trees.

My main concern was whether Microsoft was invading my turf and whether I would have to put up a fight. It was just a false alarm, as Microsoft seems preoccupied with understanding the web to aid in search requests: the program is entitled "Beyond Search" and has two tracks, Semantic Computing and Internet Economics. Microsoft's intent, laid out explicitly in the actual program description, is to improve search and then milk the advertising revenue: "The total spent by Internet advertisers in 2005 is estimated at $8.3 billion, a growth of 13.3% from 2004."

I noticed that the industry's natural language and other "semantic" efforts seem focused on a narrow range of uses such as search engines and command-and-control. Search seems to be in Microsoft's vision because it is demonstrably monetizable and currently dominated by mortal enemy Google. The company seems appropriately wary of putting money on risky black hole projects with poor business prospects.

It's going to be a while before we see deep understanding in every application. Well, it already takes forever for new features to be added to existing applications, where fifty developers add less than fifty features over a couple years.

May 15, 2007

Symbolic Computing

I picked up from Slashdot that Mathematica 6, a program for doing computer algebra, was released. Among the features is equational theorem proving, which is similar to the work that I am doing.

More than any other product, Mathematica embodies symbolic computing, and this recent post on the Wolfram blog, "Symbolic Programming Visualized," hints at why. The Mathematica language is based entirely on repeated transformations of symbolic expressions through pattern-matched rules.
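To make "repeated transformations through pattern-matched rules" concrete, here is a small Python sketch of a rewriting loop. It is only an illustration of the general idea, not Mathematica's evaluator; the `?name` pattern-variable convention, the tuple expressions, and the three rules are all assumptions chosen for the example.

```python
# Expressions are nested tuples; pattern variables are strings starting with "?".

def match(pattern, expr, bindings):
    """Return extended bindings if pattern matches expr, else None."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in bindings:
            return bindings if bindings[pattern] == expr else None
        return {**bindings, pattern: expr}
    if (isinstance(pattern, tuple) and isinstance(expr, tuple)
            and len(pattern) == len(expr)):
        for p, e in zip(pattern, expr):
            bindings = match(p, e, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == expr else None

def instantiate(template, bindings):
    """Build the right-hand side of a rule from the matched bindings."""
    if isinstance(template, str) and template.startswith("?"):
        return bindings[template]
    if isinstance(template, tuple):
        return tuple(instantiate(t, bindings) for t in template)
    return template

RULES = [
    (("+", "?a", 0), "?a"),   # a + 0  ->  a
    (("*", "?a", 1), "?a"),   # a * 1  ->  a
    (("*", "?a", 0), 0),      # a * 0  ->  0
]

def rewrite_once(expr):
    """Rewrite subexpressions first, then try each rule at this node."""
    if isinstance(expr, tuple):
        expr = tuple(rewrite_once(e) for e in expr)
    for pattern, template in RULES:
        found = match(pattern, expr, {})
        if found is not None:
            return instantiate(template, found)
    return expr

def rewrite(expr):
    """Apply the rules repeatedly until nothing changes."""
    while True:
        new = rewrite_once(expr)
        if new == expr:
            return expr
        expr = new

print(rewrite(("+", ("*", "x", 1), ("*", "y", 0))))
```

The whole "language" here is the rule table: adding a capability means adding a rule, not writing new control flow.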

I have never seen any mention of symbolic computing in anyone's dreams for future programming languages. Instead, I see a laundry list of incremental features for the next "big" programming language. Why can't the language be small, incorporating only the most expressive ideas?

Symbolic computing is the blind spot of the technology community. Even Turing thought of both brains and computers as manipulating symbols. Methinks, several years from now, we will probably see yet another revenge of LISP, a language built on operations over symbols.

According to Microsoft researchers in their roadmap "Towards 2020 Science," symbolic computation will not be integrated into programming languages until 2012. (We do see a precursor in LINQ expression trees, being introduced in Visual Studio Orcas. There are also glimpses in some of the newer functional languages.) This gives me a five-year head start.

If we want intelligent software, symbolic computing is the way to get there.

March 16, 2007

Code and Data

Usually, when I mention functional programming to other developers, they will unconsciously cross their arms and legs, a sure sign of resistance. These developers are naturally suspicious of yet another new, better “programming” paradigm, so I have learned to zip my mouth. They don’t see the possibility of code becoming more declarative and looking exactly like specification. My blog traffic has diminished since I started talking about programming in a more functional way.

At the MVP summit, I talked to a developer, a known functional programming advocate, about how I merge the notions of code and data.

I mentioned my static analysis tool, NStatic, and how I use everywhere an “expressions” data structure, which is like an immutable abstract syntax tree, but also a functional language in which I can evaluate any expression to its normal form. Expressions are used to represent such things as traditional algebraic expressions, code, natural language, and all sorts of other documents.

Document Operations & Transforms

In the case of a wordprocessor that I am also developing, documents are represented as expressions. In addition, document operations are constructed as functions in my expression language which are then applied to the document expression to return an entirely new expression.

The advantage of this approach is that I can support application-wide functionality by applying a transform to all document operations in my application. Transforms are functions that convert any function into a new function by recursively walking through an expression and applying changes based on pattern-matching (very much like a derivative or Fourier transform). Control-flow operators like lambdas and conditionals themselves have higher-order transformations that work with any function, and thus any transform.
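The derivative is the easiest transform to show in a few lines, so here is a hedged Python sketch of it. This is a stand-in for the idea, not the transform machinery in my applications; the tuple expression format and the names `derivative` and `evaluate` are assumptions, and only binary `+` and `*` are supported.

```python
# A transform maps the expression for one function to the expression for another.
# Expressions: numbers, variable names (str), or (op, left, right) tuples.

def derivative(expr, var):
    """Return the expression for d(expr)/d(var), built by walking expr."""
    if isinstance(expr, (int, float)):
        return 0
    if isinstance(expr, str):
        return 1 if expr == var else 0
    op, left, right = expr
    if op == "+":                         # sum rule
        return ("+", derivative(left, var), derivative(right, var))
    if op == "*":                         # product rule
        return ("+", ("*", derivative(left, var), right),
                     ("*", left, derivative(right, var)))
    raise ValueError(f"unsupported operator: {op}")

def evaluate(expr, env):
    """Evaluate an expression given values for its variables."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    a, b = evaluate(left, env), evaluate(right, env)
    if op == "+":
        return a + b
    if op == "*":
        return a * b
    raise ValueError(f"unsupported operator: {op}")

f = ("+", ("*", "x", "x"), ("*", 3, "x"))    # f(x) = x*x + 3*x
df = derivative(f, "x")                        # an expression equivalent to 2*x + 3
print(evaluate(f, {"x": 2}), evaluate(df, {"x": 2}))
```

The point is that `derivative` never needs to know what `f` is for. It works on any expression built from the operators it understands, which is what makes a transform applicable across a whole application.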

Transforms can only easily be implemented over code written in a functional way. Code in other paradigms would need to be converted to a functional representation first, as I have done with NStatic, and then converted back (or retained) after the transformation. A common refrain among FP enthusiasts is that functional code is easier to prove, but the real benefit is that functional code is easier for a computer to analyze and manipulate; that is, easier to transform.

Have you ever written an application in which the same transformation needed to be manually coded into a number of different classes? The transform is stored inside our head and then mentally applied to each method.

Such transformations couldn’t previously be automated because functions were opaque and language mechanisms like virtual functions are too coarse. With transforms, new application-wide functionality can be written just once, not interleaved within code in several other unrelated classes. I can also easily integrate third-party functionality or externalize features this way, because the transformations are not burned in at compile-time.

In Microsoft Secrets, Michael Cusumano wrote about Excel's "Am I Done Yet" list on page 319, a list of concerns that must be considered for every new application feature added:

Microsoft projects have also been compiling metric-based checklists to help determine feature and product completion. For example, Excel has a six-page list of criteria entitled "Am I Done Yet?" This groups completion criteria into twenty-six categories, such as menu commands, printing, interaction with the operating system, and application interoperability (see Table 5.4). A program manager, developer, or test uses the checklist to help evaluate whether a feature is complete.

I wonder how many of the items in the entire "Am I Done Yet" list would just disappear if program features were written using transforms. Probably a lot. Transforms would make things like revision marking, selection tracking, and simultaneous editing a lot easier to write. The application developer could just focus on writing the essential feature and rest assured that transforms would take care of much of the rest.

Code & Data And Lisp

The developer whom I spoke to at the summit then brought up that Lisp supports treating code as data, simply by quoting code and performing explicit evaluation using the “EVAL” operator. Having coded in the language in the late ‘80s, I was already familiar with Lisp.

The problem with LISP is that it is not as tight and clean as it could be. Unfortunately, the way it mixes code and data is seen as the correct approach, and thus people look no further for cleaner alternatives. LISP still distinguishes between code and data: functions are opaque (their definitions are inaccessible), just as in imperative programming languages. Also, code stored as data can only be executed once all free symbols are removed. For example, (+ x 1 2) would evaluate to an undefined-symbol error or, if the symbol x were quoted, a type-mismatch error, rather than the correct result, (+ x 3).
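The (+ x 1 2) case can be sketched in Python as a partial simplifier that folds the known numbers and leaves free symbols alone. This is a toy under stated assumptions: tuples stand in for s-expressions, strings stand in for symbols, and `simplify_sum` is a name invented for the example that handles only `+`.

```python
def simplify_sum(expr):
    """Partially evaluate ("+", arg, ...): fold numbers, keep free symbols."""
    if not isinstance(expr, tuple):
        return expr                            # a number or a free symbol
    op, *args = expr
    args = [simplify_sum(a) for a in args]
    if op != "+":
        return (op, *args)                     # other operators are left as-is
    total = sum(a for a in args if isinstance(a, (int, float)))
    symbolic = [a for a in args if not isinstance(a, (int, float))]
    if not symbolic:
        return total                           # fully numeric: an ordinary value
    if total != 0:
        symbolic.append(total)
    if len(symbolic) == 1:
        return symbolic[0]
    return ("+", *symbolic)

print(simplify_sum(("+", "x", 1, 2)))          # the (+ x 1 2) example
```

No quoting and no explicit evaluation step are needed: the same function handles closed expressions and expressions with free symbols, and simply stops simplifying where it runs out of information.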

However, if the standard LISP functions could already operate on symbols, then both quoting and calling the EVAL function would become unnecessary. In addition, if execution proceeded lazily, LISP macros would also become unnecessary, as every function would also be a macro.

December 28, 2006

Hard Problems, Simple Solutions

In my previous post on Fabricated Complexity, I wrote about a quote that I found myself repeatedly agreeing with:

The solution to a hard problem, when solved, is simple.

A commenter remarked:

I disagree, though - certainly many things are overcomplicated due to reasons you state, but this doesn't mean every hard problem has a seemingly simple solution.

Two things I want to point out: (1) Notice that I didn’t say every hard problem has a simple solution, since not every hard problem is solvable. (2) This, by the way, is not a hard belief of mine.

I think the basis of my “belief” comes from experiences with functional style programming, where the solution of the problem often turns out to be its specification. I found that the new functional features in C#, iterators (essentially, an efficient implementation of functions that return sets) and anonymous functions, allow me to uncover more simple solutions.

What’s a moderately hard problem to use as an example? Let’s see. Perhaps, building a generalized regular expression matcher might be. Microsoft’s implementation of regular expression matching over strings is spread across 24 files and 14,455 lines of code including comments and whitespace.

Let’s see what happens if we switch to a functional style. We could use the implementation of lazy lists, LazyChain, from the blog of Wes Dyer (a developer on the C# team) and then port the 14-line Python implementation of a regular expression engine to C# using iterators and anonymous functions. The regular expression would be composed functionally rather than through a string:

re = Sequence( term1, Plus( term2, Alternate( term3, term4 )), term5 )

and the results would be returned as a collection by applying the regular expression to the list to be searched:

results = re( list_to_search )

I contend that, with this approach, we can get a reasonable implementation of regular expressions over lists of arbitrary types in fewer than 200 lines of code, which is nearly two orders of magnitude more concise than Microsoft’s more specific implementation over strings. I’ll produce a port of this one day and post it on my blog.
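In the meantime, here is a rough Python sketch of what such combinators could look like, using generators in place of C# iterators. It is neither the 14-line Python engine mentioned above nor the promised port; the function names mirror Sequence, Plus and Alternate from the example, while `term`, `search` and the convention that a matcher yields end positions are assumptions of this sketch.

```python
# A matcher is a function (items, pos) -> generator of end positions.

def term(predicate):
    """Match a single element that satisfies predicate."""
    def matcher(items, pos):
        if pos < len(items) and predicate(items[pos]):
            yield pos + 1
    return matcher

def sequence(*matchers):
    """Match each matcher in turn, one after another."""
    def matcher(items, pos):
        if not matchers:
            yield pos
            return
        first, rest = matchers[0], sequence(*matchers[1:])
        for mid in first(items, pos):
            yield from rest(items, mid)
    return matcher

def alternate(*matchers):
    """Match any one of the alternatives."""
    def matcher(items, pos):
        for m in matchers:
            yield from m(items, pos)
    return matcher

def plus(inner):
    """Match one or more repetitions of inner."""
    def matcher(items, pos):
        for mid in inner(items, pos):
            yield mid
            if mid > pos:                      # guard against zero-width loops
                yield from matcher(items, mid)
    return matcher

def search(matcher, items):
    """Yield (start, end) for every match beginning at any position."""
    for start in range(len(items) + 1):
        for end in matcher(items, start):
            yield (start, end)

odd = term(lambda v: v % 2 == 1)
even = term(lambda v: v % 2 == 0)
pattern = sequence(odd, plus(even))            # an odd number, then 1+ evens
results = list(search(pattern, [1, 2, 4, 7, 8]))
print(results)
```

Because the terms are predicates over arbitrary elements, the same combinators work over lists of numbers, tokens, or objects, not just characters in a string.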

In another point, the fact that a human outperforms a computer in a given programming task serves as a cue to me that maybe the computer algorithm is poorly written. In such cases, I look for an alternative functional implementation which tends to more accurately model how we really think.

Let’s look at resolution-based theorem proving, for example. For a large problem, I wonder if it’s just faster for the prover to extract from the clauses in the knowledge base a functional expression equivalent in truth value to the user’s query rather than actually performing the proof. Quantified expressions get turned into lambda expressions, and boolean operations into conditionals. This extraction might be possible in polynomial time, after which we could go about reducing the returned expression to normal form, which is far more efficient than performing a lengthy nondeterministic search.

December 20, 2006

Deduction versus Reduction

In my last post, Software Verification, I criticized the field’s heavy reliance on theorem provers based on predicate logic. I mentioned some obvious problems such as exponential proof procedures, but a Coverity presentation “Selling an Idea or a Product” pointed me to other nonobvious problems. A slide (#17) titled “Myth: more analysis is always better” hinted that more analysis does not always improve results; it can sometimes produce worse results.

  • Ease of Diagnosis – The longer the chain of reasoning, the harder the bug is to reason about, since the user has to manually emulate each step, but, unfortunately, long chains of reasoning are also needed to find some high-level bugs.

    Another related example is the Simplify theorem prover, which can determine if linear arithmetic constraints are contradictory using the Simplex Method. However, such a general procedure may not be very advantageous, since people will typically write simple inequalities that they can easily reason about. A general analysis consumes more resources than a narrow solver. Secondly, a human being may not be able to understand a complicated error (an empty intersection of several inequalities) reported by the Simplex solver and simply dismiss it as a false positive. Once again, such errors are not likely, because humans do not typically write such code in the first place. [I believe that the Simplify prover actually uses a more optimized algorithm, Nelson-Oppen, for simple inequalities.]
  • False Errors – The greater the number of steps, the more likely that the analysis contains an error. If there is false information in the knowledge base, then any assertion can be proved via a refutation-based proof procedure.

The alternative to logical deduction is functional reduction in which expressions are continually reduced to normal forms, so that equivalent expressions will eventually look the same. In other words, algebra not logic. The benefit is greater efficiency due to the elimination of much nondeterminism, deeper inferences, and more opportunities for dynamic programming to tame the unwieldy beast. This approach is also referred to as equational logic and is very similar to high-school algebraic proofs.
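A tiny Python sketch shows why normal forms replace search: once both sides are reduced, checking equivalence is just an equality test. This handles only associativity and commutativity of `+` and `*`, and the tuple format and the name `normalize` are assumptions for illustration, not my actual reduction rules.

```python
def normalize(expr):
    """Reduce an expression to a canonical form: flatten and order + and *."""
    if not isinstance(expr, tuple):
        return expr
    op, *args = expr
    flat = []
    for a in map(normalize, args):
        if op in ("+", "*") and isinstance(a, tuple) and a[0] == op:
            flat.extend(a[1:])                 # associativity: merge nested same-op
        else:
            flat.append(a)
    if op in ("+", "*"):
        flat.sort(key=repr)                    # commutativity: one fixed order
    return (op, *flat)

left = ("+", "a", ("+", "b", "c"))
right = ("+", ("+", "c", "a"), "b")
print(normalize(left) == normalize(right))
```

There is no proof search here at all. Each expression is reduced once, deterministically, and the reduced forms can be cached and reused, which is where the opportunities for dynamic programming come from.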

Last year I wrote a post, On Intelligence, which is relevant to this post. In it, I quote the words of Stephen Wolfram, on the subject of logic and rules…

But what about capabilities like logical reasoning? Do these perhaps correspond to a higher-level of human thinking?

In the past it was often thought that logic might be an appropriate idealization for all of human thinking. And largely as a result of this, practical computer systems have always treated logic as something quite fundamental. But it is my strong suspicion that in fact logic is very far from fundamental, particularly in human thinking.

For among other things, whereas in the process of thinking we routinely manage to retrieve remarkable connections almost instantaneously from memory, we tend to be able to carry out logical reasoning only by laboriously going from one step to the next. And my strong suspicion is that when we do this we are in effect again just using memory and retrieving patterns of logical argument that we have learned from experience.

The primacy of reduction rules (patterns stored in memory) over logical inference in human thought shouldn’t be surprising given the efficiency of the former; the latter is as “laborious” for humans as it is for computers.

December 06, 2006

Bullets Flying My Way

Larry O'Brien has been questioning my critique of Fred Brooks's essay, "No Silver Bullet," in my post Lego Programming. Actually, he turned my post into a strawman and tied my name to an argument that I didn’t really make: that IDEs are a silver bullet. IDEs certainly have introduced a number of tenfold improvements in some areas; in most cases, however, the tasks that IDEs improve are not the bottlenecks in software development. The ability to build an interface in a few hours instead of a few days doesn’t significantly impact a six-month project. Support for third-party controls and object-oriented frameworks, on the other hand, has eliminated bottlenecks and resulted in dramatic improvements in delivery times. Lotus was able to develop Improv in six months on the NeXT, because of its object-oriented capabilities.

I wrote some facetious comments, which he took up in a second post, "Bullets Over Wrong Ways." I'll answer those posts with more persuasive arguments in another month. My silence now does not indicate agreement or resignation.

I don't want to be drawn into a debate right now, while I am trying to bring a product into a beta. I'll just reprint an edited version of my comment to his last post:

To some extent, I was humoring you in my comments; but I don't think the functional paradigm has been fully fleshed out, and it will most likely become the dominant approach decades from now. Much of Brooks's thinking on silver bullets rests on artifacts of imperative programming like state and side effects. If we eliminate state, for example, our program becomes more like a specification, which I feel also impacts some of the other costs Brooks mentioned such as testing and communication.

I am going to make my point in an upcoming post.

I don't think object-oriented programming is entirely distinct from functional programming. To the extent that object-orientation supports compositional programming, to the extent that an object can be interpreted as an inert function whose arguments correspond to the object's members, an argument can be made that it is close to a functional style.

That's a subject of a future post.