About

I am a software developer in Seattle, building a new AI software company.


November 14, 2007

Smart Machines

Recently, I got an email from a reader...

You are only about money, yes? You want to start a new company for research and development into AI technology.. For the money.

What else? How else could anyone be so ignorant of consciousness? It is money the blinds.

I searched my posts related to consciousness, but only uncovered this one on Will Machines Become Conscious?

We are naturally resistant to the idea that machines can think because of society, religion, or maybe our natural desire to be special--to be the center of our universe. A good illustration is the AI effect, where people stop counting advances in AI as AI once they have been accomplished. "AI is whatever hasn't been done yet." Wikipedia sums up:

This change of perception can be traced to the mystery being removed from the system: that being able to trace the cause of events implies that it's a form of automation rather than intelligence. Michael Kearns suggests that "people subconsciously are trying to preserve for themselves some special role in the universe".[1]. By discounting artificial intelligence people can continue to feel unique and special.

This tendency occurs with our views of animals, women, and other races as well. Psychologists joke that, throughout history, they have often tried to define mankind as the "only animal that can..." feel, think, reason, plan, reflect, and so on--only to narrow their assertions in the wake of new evidence, such as animals with human-like abilities like the sign language monkey or the super smart parrot. (We also know that children raised in the wild lose many of these same abilities.)

A related effect has been noted in the history of animal cognition and in consciousness studies, where every time a capacity formerly thought as uniquely human is discovered in animals (e.g. the ability to make tools, or passing the mirror test), the overall importance of that capacity is deprecated.

I think that, in order to make advances in technology such as producing smart software, as I am doing, one must abandon preconceived human-centric notions of the world.

I sometimes think that AI research in large companies is compromised by coworkers and executives (who control funding and project goals) who fundamentally don't believe in computer intelligence. One female pioneer, who invented COBOL, was criticized by an executive for her attempt and warned that “computer programs could not understand English.” Nowadays, it's companies diverting natural language research to search technology.

In order for Alan Turing to envision computing, he had to believe that a machine could think. It may have been easier for him to take such an outsider’s perspective because his homosexuality was taboo at the time and considered a mental illness. He may also have had to contemplate that the mind was also a machine. As Janna Levin illustrates in her book A Madman Dreams of Turing Machines, Turing had a controversial mechanistic view of the human mind: (Via Lispmeister)

The human mind can also be reduced to a machine. This idea drives all the others as he runs on grass, past trees, over bridges, through cattle. States of mind can be replaced by states of the machine. Human thought can be broken down into simple rules, instructions a machine can follow. Thought can be mechanized. The connection isn't perfectly clear, but it is there, the catalyst of a great crystal. It is not just that thought can be mechanized. It is mechanized. The brain is a machine. A biological machine. The idea cools him from head to toe, a wave of understanding washing clean his confusion, his muddled notions, and his breath. Shock feels like this: There is no sky or earth. No time, no meaning. It's a throb—a hard silence, a pulse. It is colorless, tasteless, senseless. A white-hot explosion[…]

At the age of twenty-three and for the rest of his life he embraces, without reservation, a mathematics that exists independently of us—although we, by contrast, do not live independently of it. We are biological machines…. bound to mathematics and mathematics is flawless. This has to be true.

While the above was a fictional example based on facts, we do have an AI pioneer, Marvin Minsky, who shares thoughts more explicitly on whether computers can think, understand or even be conscious; he even questions whether humans are self-aware.

But if mind is machine, where does that leave free will? Like another AI pioneer, John McCarthy, I am a compatibilist determinist, which I think all determinists really are anyway. Like Scott Adams, I am a firm believer that free will is an illusion. One of my readers, after having lunch with me, asked if I believed in determinism, as if he had been able to glean this from my many blog posts. I hesitantly said yes, not wanting to stir up a debate. In fact, my very first college paper was on determinism for my English composition course; as if predetermined, another classmate also wrote on determinism but from a biochemical rather than physical angle.

The New York Times article, “Free Will: Now You Have It, Now You Don’t,” reported early this year on experiments suggesting conscious choice to be an illusion, even as some philosophers and physicists continue to disagree. New Scientist also chimed in soon afterwards with “Free Will – you only think you have it.” Hofstadter has an interesting video, Victim of the Brain, containing thought experiments on identity, consciousness, and determinism.

November 13, 2007

What's Wrong With Reason?

Earlier this year, Slashdot pointed to a set of Flickr photos of someone’s visit to a newly built creationist museum in Kentucky. I have often assumed that many creationists live in households and communities where access to information is heavily regulated. After looking through the photos, I discovered that many creationists actually do see the same things that non-creationists see, but simply reach different conclusions. Below are three photos from a set of about a half dozen juxtaposing human reason as a faulty device in opposition to God's Word.

[Images: three photos from the museum exhibit]

I guess it makes sense, if one takes scripture as absolute, literal truth, to think that reason, despite its necessity for understanding scripture, could also lead us astray if it contradicts scripture.

Roman Catholicism (the faith in which I was raised) regards the stories of creation as metaphorical and assures us that reason is fully compatible with truth. Other Christian denominations, lacking a centralized organization and authority, are more willing to accept a literal interpretation. The Koran, the holy book of Islam, even proclaims itself to be free of imperfection.

There were other museum photos touching on topics such as Noah's flood or reconciling apparent contradictions between the world and scripture. I was a bit stunned by the directness of the comparisons, which could potentially introduce doubt to faithful visitors by exposing them to disturbing arguments alongside their creationist explanations.

Maybe I shouldn't be surprised, as I have seen such direct confrontation before. I have, for some time, been following the Uncommon Descent weblog, which promotes William Dembski's ideas on Intelligent Design and attacks the "materialist" beliefs of its principal antagonist and promoter of evolution, Richard Dawkins. One of Dembski's frequent claims is the impossibility of computer intelligence, or, indirectly, of my software. I don't agree with most of Dembski's arguments, but reading his posts does help me to recognize my own biases and flaws in reasoning.

October 01, 2006

Turing Test and Loebner Prize Competition

A couple of weeks ago, the 2006 Loebner Prize competition was held. Back when I was at Harvard in the early 1990s,  the annual Loebner Prize Competition was created as the first real-life version of the "Turing Test," described in Turing's article "Computing Machinery and Intelligence" to answer the question "Can Computers Think?" Turing wrote:

It is proposed that a machine may be deemed intelligent, if it can act in such a manner that a human cannot distinguish the machine from another human merely by asking questions via a mechanical link.

The test is a natural extension, to the human mind, of his earlier work on Universal Turing Machines, which can simulate any other machine. The test, however, is controversial. Searle famously counterargued with the Chinese Room thought experiment. Also, Turing himself incorrectly predicted that the Turing Test would be passed by 2000, but tech visionary Ray Kurzweil has bet that the Turing Test will be passed by 2029.

The first competition made big news on campus, especially since it was held locally, and I followed the results of the initial competition closely. Since I had a prior interest in AI and natural language processing, I envisioned that one day I might be the one to actually win the $100,000 prize; unfortunately, the terms of the grand prize have since been expanded to include audio and visual input. The direction of my work has been increasingly intersecting with the aims of the competition, and so maybe one day (perhaps in ten years) it might actually compete.

Instead of summarizing the competition, I'll quote the text on the Loebner Prize website:

The Loebner Prize for artificial intelligence ( AI ) is the first formal instantiation of a Turing Test. The test is named after Alan Turing the brilliant British mathematician. Among his many accomplishments was basic research in computing science. In 1950, in the article Computing Machinery and Intelligence which appeared in the philosophy journal Mind, Alan Turing asked the question "Can a Machine Think?" He answered in the affirmative, but a central question was: "If a computer could think, how could we tell?" Turing's suggestion was, that if the responses from the computer were indistinguishable from that of a human,the computer could be said to be thinking. This field is generally known as natural language processing.

In 1990 Hugh Loebner agreed with The Cambridge Center for Behavioral Studies to underwrite a contest designed to implement the Turing Test. Dr. Loebner pledged a Grand Prize of $100,000 and a Gold Medal (pictured above) for the first computer whose responses were indistinguishable from a human's. Such a computer can be said "to think." Each year an annual prize of $2000 and a bronze medal is awarded to the most human-like computer. The winner of the annual contest is the best entry relative to other entries that year, irrespective of how good it is in an absolute sense.

Here's a sample transcript from the 1996 contest winner. There are many other transcripts available from the prize website. A conversation with some of these contestants can be had online: [TuringHub] [ALICE] [Jabberwacky] [Others]

My computer science professor Stuart Shieber, who, by the way, stoked my interest in natural language, wrote a critique of the contest, "Lessons from a Restricted Turing Test," in Communications of the ACM.

Stuart noted that the technology used by most, if not all, of the contestants is still very primitive. The programs are basically variants of the 1966 computer program ELIZA, just with a larger database of responses. They don't do any natural language parsing--relying instead on simple string searches and manipulation--and don't do any logical inferencing.

For example, a longtime winner of the annual competition was the ALICE (Artificial Linguistic Internet Computer Entity) chatterbot. I downloaded the open-source software six years ago and was dismayed to see how primitive its technology was. Essentially, the program consisted of a database of common questions (with some wildcard support) and canned answers. Another winner, MegaHAL, uses statistical methods (Markov models) to generate responses based on prior data such as movie dialogues, encyclopedias, popular quotations, and hand-crafted sentences.
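To make "a database of common questions with wildcard support and canned answers" concrete, here is a minimal sketch in Python. It is not ALICE's actual code or rule format; the rule table, the bot's name, and the function names are all invented for illustration.

```python
import re

# Hypothetical rule table: (pattern, canned reply). "*" is a wildcard whose
# matched text can be echoed back through {0} in the reply.
RULES = [
    ("WHAT IS YOUR NAME", "My name is Bot."),
    ("MY NAME IS *", "Nice to meet you, {0}."),
    ("I FEEL *", "Why do you feel {0}?"),
    ("*", "Tell me more."),  # catch-all when nothing more specific matches
]


def normalize(text):
    """Uppercase the input and drop punctuation, so matching is pure string work."""
    return re.sub(r"[^A-Z0-9 ]", "", text.upper()).strip()


def to_regex(pattern):
    """Turn a wildcard pattern into an anchored regular expression."""
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("^" + "(.+)".join(parts) + "$")


def respond(text):
    """Return the canned reply of the first rule whose pattern matches."""
    cleaned = normalize(text)
    for pattern, template in RULES:
        match = to_regex(pattern).match(cleaned)
        if match:
            captured = [group.strip().lower() for group in match.groups()]
            return template.format(*captured)
    return "I do not understand."


print(respond("My name is Ada!"))
print(respond("I feel tired."))
```

Nothing here parses the sentence or draws an inference; the apparent responsiveness comes entirely from string matching and echoing, which is the weakness described above.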

Despite the simplistic technology, a few judges over the history of the competition have been fooled into thinking a computer to be a person, and, interestingly, some humans have been thought to be computers. Curiously, the original ELIZA did fool the assistant of its creator, Weizenbaum, into revealing personal information as did another low-tech program in this webpage detailing "How my program passed the Turing Test!"

The Loebner Prize Competition has mostly relied on tricks, such as simulating typing speed and entering non sequiturs, rather than smarts. If I ever had some free time in the distant future and joined the competition, I would rely on a genuine attempt to replicate intelligence. It's the thought that counts.

Although Turing's 2000 prediction of a successful Turing Test failed to come true, Turing did rightly predict that computer programs would eventually defeat men in chess. The ability to play chess was Turing's hallmark example of human intelligence. Interestingly, "Turing Test and Intelligence," claims that in chess the Turing Test has already been passed.

Kasparov claims to be able to distinguish computer play from human play. During a simultaneous event over the Internet, he once stopped playing a certain game, saying that he was up against a computer (it was supposed to be human opposition). He was losing at the time! Kasparov has also claimed, while losing against a computer, that the computer was being given human assistance!!

It's interesting to see that modern variations of the Turing Test include CAPTCHAs for preventing robots from posting spam, and Amazon's Mechanical Turk. (The Turk, by the way, was a chess-playing machine centuries ago that was powered by a hidden human being.)

June 14, 2006

Anagrams and Combinations

In my post on Google Interviews, I referred to the birthday paradox, which provides but one example of the astonishing results one can obtain through combinations.

I do try to harness the power of combinations in my own work, both to break down the complexity of my software and to produce a greater semblance of intelligence; more on that in a later post. Complexity quickly disappears when one breaks a problem up into several orthogonal parts, a process I call logarithmic decomposition, which mirrors how a large number can be described with a small number of digits (e.g., the number of particles in the whole universe in just 120 digits).

In The Da Vinci Code, a cryptic phrase is discovered: “So Dark the Con of Man.” It seems to be saying something in its own right, but actually is hiding the true message “Madonna of the Rocks.” It’s rather surprising and clever until one realises that the nature of combinations makes it quite easy to find sensible anagrams from just about any phrase. Take “The Da Vinci Code,” which could be rewritten “Oh, even didactic,” “Novice did cheat” or “Deceit. Din. Havoc.” Another example, “Mary Magdalene,” produces “Anagram medley!” All of these examples can be found in the New York Times article “The Anagram Code,” which utilizes the ARS MAGNA anagram software program to find interesting anagrams. This anagram software has also inspired many other New York Times pieces.
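Verifying an anagram is the easy half of the problem: two phrases are anagrams exactly when they use the same letters the same number of times. The sketch below is a small Python illustration of that check only. It has nothing to do with how ARS MAGNA works internally, and the function names are made up here. The check is sensitive to spelling, so the phrases are written as “So Dark the Con of Man” and “Mary Magdalene.”

```python
from collections import Counter


def letters(phrase):
    """Count each letter in the phrase, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in phrase.lower() if ch.isalpha())


def is_anagram(first, second):
    """Two phrases are anagrams when their letter counts are identical."""
    return letters(first) == letters(second)


print(is_anagram("So Dark the Con of Man", "Madonna of the Rocks"))
print(is_anagram("The Da Vinci Code", "Novice did cheat"))
print(is_anagram("Mary Magdalene", "Anagram medley!"))
```

Finding the sensible rearrangements is the hard half. It requires searching a dictionary for word combinations that use up the same letter counts, and that is where the sheer number of combinations does the work.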

 

 

May 26, 2006

Software Built to Replace a Human

I just noticed this blog post on “software built to replace a human,” referred to by someone in Joel’s forums. That’s the whole raison d’être for my software company, SoftPerson (hence the name).

Currently at work I'm writing some EDI (electronic data interchange) software that I know for a fact will perform a job currently done by a human being. Whether this person will be assigned to a new duty or laid-off I don't know, though it is a little disconcerting to think about the later.

There must be many, many developers out there working on software that will replace the job of a human being; how do they feel about it?

personally I'm not going to lose sleep over it (I don't get enough as it is!), though I can't help but feel a slight "moral twinge". I am firm believer in survival-of-the-fittest and I'm not about to give up my own job so someone can keep theirs, but at the same time I still have a sense of morality.

I don’t feel any moral twinge. If I do my job well, perhaps some people will lose their jobs, but many more people will be able to do things they never reasonably expected to before.

(Personally, when I was on the Excel team helping build new OLAP functionality into PivotTables, I was well aware, and felt somewhat guilty, that many OLAP companies would lose business to the free functionality that came with the combination of Excel 2000 and SQL Server Analysis Services. I am over it now, though.)

January 27, 2006

AI Hubris

Chris McKinstry, a researcher in artificial intelligence, recently committed suicide, after posting suicide notes in his personal blog and a discussion board on Joel on Software. (Ryan Park has more details.)

Joel’s discussion board, which I read regularly, provides very useful information on software development and marketing. The suicide notes Chris left were in Joel’s little-known off-topic discussion board, ?off. This led Joel to shut down the uncensored board to squelch “psychopathic” behavior, as he wrote:

I used to have, hidden so deeply that almost nobody found it, a discussion forum on this site colloquially called ?off. It was the official off-topic forum and was virtually uncensored. It was foul, free-wheeling, and Not Safe For Work. It generated quite a few severely antisocial posts, which were sometimes funny.

Anyway, it wasn't really appropriate. It didn't have anything to do with software, I didn't participate, and there was no compelling reason to host it on my servers. Over time, the number of reasons to shut it down increased. Today, the last straw was broken when one of my actual friends (you know, a real-life person) told me that the discussion group was getting downright disturbing. Some of the participants in the group had probably crossed the line from common obnoxious online behavior to downright psychopathic behavior. In a discussion group which prides itself on "anything goes," this was impossible to control.

At 6 pm today, I closed that discussion group, having learned an important lesson about anarchy.

What made an impression on me was that Chris founded Mindpixel, a Web-based collaborative artificial intelligence project that accumulated a database of human facts, similar to both Cyc and ConceptNet.

Chris was even interviewed by Slashdot for his ambitious project. In the interview, he made some remarkable statements.

My primary inspiration for the project comes from observation: I observed that computers are stupid and know nothing of human existence. I concluded a very long time ago that either we had to write a "magic" program that was able to go out in the world and learn like a human child, or we just had to sit down and type in ALL the data. When I was studying psychology in the late 80's I wanted to begin to gnaw the bullet and start getting people to type in ALL the data.

… I would store my model of the human mind in binary propositions. I would make a digital model of the mind.

I realized within minutes that a giant database of these propositions could be used to train a neural net to mimic a conscious, thinking, feeling human being! I thought, maybe I'm missing something obvious. So, I emailed Marvin Minsky and asked him if he thought it would be possible to train a neural network into something resembling human using a database of binary propositions. He replied quickly saying "Yes, it is possible, but the training corpus would have to be enormous." The moment I finished reading that email, I knew I would spend the rest of my life building and validating the most enormous corpus I could.

His suicide reminded me of cautionary stories from my Catholic high school of various philosophers, like Nietzsche, who fell victim to hubris and either later became crazy or committed suicide. I never really put any credence into those tales.

Like Chris, I am an AI software entrepreneur and a blogger. His vision of computer intelligence is very similar to mine, and sometimes I do feel a little crazy. Perhaps I should be careful. The desert of AI is littered with the bones of ambitious researchers whose visions went unfulfilled.

 

January 13, 2006

Word Finder

In my post on Human versus Computer, I mentioned a Scrabble Word Finder program I had written several years ago. Here it is — Word Finder Zip File. This program allows one to perform pattern and anagram searches to locate every legitimate Scrabble word that matches.

[Screenshot: WordFinder2]

What I like about this program is how it demonstrates the benefits of using the computer to perform a brute-force search over all the possible permutations of a word. I ask the reader how many different legitimate three-letter words he can make from the word TEA. More than likely, the reader will fail to account for every possible word. Run this program, choose anagram search and find out. It becomes exponentially more difficult to find all combinations of larger words.
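For readers without the program, the brute-force idea fits in a few lines of Python. This is an illustrative sketch, not the Word Finder source (which no longer exists); the tiny word set below is a hand-picked stand-in for a real Scrabble dictionary, and `anagram_search` is an invented name.

```python
from itertools import permutations

# Hypothetical stand-in for a real Scrabble word list.
WORDS = {"ate", "eat", "eta", "tae", "tea", "cat", "tee"}


def anagram_search(rack, length, words=WORDS):
    """Try every ordering of `length` letters from the rack and keep the real words."""
    candidates = {"".join(p) for p in permutations(rack.lower(), length)}
    return sorted(candidates & words)


print(anagram_search("TEA", 3))  # ['ate', 'eat', 'eta', 'tae', 'tea']
```

Three letters give only 3! = 6 orderings, and five of them turn out to be words in this list. An n-letter rack has n! orderings, which is why a person quickly loses track while the computer does not.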

PS: I no longer have the source code, so I can’t really remove the Scrabble trademark except through a hex editor.

This program is free… since I doubt that  I could make any money off of it.

January 08, 2006

Intelligence vs Intellisense

My main gripe with Microsoft is that the company doesn’t know how to write “smart” software. (Much of the industry doesn’t either, but, since Microsoft is the leader…)

The current tendency among software developers is to determine the minimum amount of work that can be done to help users. This is a sensible strategy driven by short-term business concerns. A more ambitious developer would ask what more could be done: what can humans still do (or do better) that computer programs can’t yet?

Microsoft invented “Intellisense,” in which an application attempts to behave intelligently while observing a user’s actions. Intellisense uses a set of heuristics that more or less work without a real understanding of the document. However, heuristics offer no guarantee of correctness, so Intellisense often works unpredictably or produces errors, which sometimes force the user to waste time undoing them. These errors occur in all of Microsoft’s products, but they are more common in Office than in Visual Studio, because Microsoft Office doesn’t recognize structure within documents, while Visual Studio actually parses code in the background and therefore has richer knowledge of the user’s documents.

In contrast to Intellisense, “Intelligence” requires a genuine understanding of the document. This often consumes more time and memory. However, the benefits are significant: since Intelligence performs deeper analysis and guarantees correctness, it can be used reliably to perform major transformations of the document like refactoring, whereas Intellisense is typically limited to auto-correction, limited auto-formatting, and auto-completion. To offset the additional processing overhead, Intelligence offloads “higher-level” work from the user and takes advantage of the computer’s ability to outperform users in tedious, repetitive or brute-force activities.

General Principles versus Specific Rules

Intellisense typically requires hundreds of ad hoc rules to find errors, mostly of a trivial nature. With Intelligence, a general-purpose algorithm that has some genuine understanding of the document could perform the function of hundreds of rules. Whereas Intellisense utilizes specific rules, Intelligence focuses on general principles.

In my tool, NStatic, I strive for Intelligence over Intellisense. I continually ask myself how far I can take this, attempting to close the gap between errors that only a human being can find and those that a computer can find. These are some principles that I code to:

  • Exceptions. Any code that inevitably causes an exception is an error, unless it is caught and handled by a proximate catch block.
  • Comparison. Comparisons and conditions should never evaluate to always true or always false, unless a named constant or a literal is present.
  • Infinite Loops. All code should be able to terminate. (Yes, I know that the Halting Problem is undecidable, but it’s still attemptable plus that reasoning never stopped the CLR from verifying code.)
  • Redundancy. Any operation which is redundant adds no value.
    • Non-operation. A complex expression should not always produce a constant result, except in a few situations. An assignment should change the value of the assignee. An expression should be used.
    • Dead code. All code should be executable.
    • Dead store. A variable assignment or initialization should be used before it is reassigned or goes out of scope.

Most analysis tools contain rules that perform some matching at the syntax level. Instead of constructing rules to catch specific instantiations where the principles are violated, my tool simulates code execution symbolically and utilizes a constraint solver to find errors in a more general way.

A typical code analysis tool using heuristics may catch the following redundant assignment by checking if the left-hand expression matches the right-hand expression.

x = x

This heuristic would produce a false positive if the assignment was x[i++] = x[i++].
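A small Python sketch shows why the textual match goes wrong. It assumes C#-style evaluation, where the left-hand index is evaluated before the right-hand side; that ordering is an assumption of this example, and in C or C++ the same statement would be undefined behavior. Both function names are invented.

```python
def looks_redundant(lhs_text, rhs_text):
    """Purely syntactic heuristic: flag an assignment whose two sides read the same."""
    return lhs_text.strip() == rhs_text.strip()


def run_indexed_self_assignment(x, i):
    """Simulate `x[i++] = x[i++]`, evaluating the left index first, then the right."""
    left = i    # the left-hand x[i++] uses the current i ...
    i += 1      # ... then increments it
    right = i   # the right-hand x[i++] sees the incremented value
    i += 1
    x = list(x)  # work on a copy so the caller's list is untouched
    x[left] = x[right]
    return x, i


print(looks_redundant("x[i++]", "x[i++]"))            # True: flagged as redundant
print(run_indexed_self_assignment([10, 20, 30], 0))   # ([20, 20, 30], 2)
```

The heuristic flags the statement, yet executing it changes both the array and the index, so the warning is a false positive.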

An intelligent tool would determine if the right-hand side of an assignment evaluates to the value of the left-hand side before the assignment, such as in the following case.

x = a;
y = x + 1;
a = y - 1;
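The three statements above can be checked with a toy symbolic evaluator. The Python sketch below is only an illustration of the general idea, restricted to statements of the form target = source + constant; it is not NStatic's algorithm, and the tuple encoding and the name `find_redundant_assignments` are assumptions made for the example.

```python
def find_redundant_assignments(statements):
    """Return indexes of assignments that store the value the target already holds.

    Each statement is a tuple (target, source, constant), read as
    `target = source + constant`. Values are tracked symbolically as
    (initial symbol, offset), so no concrete inputs are needed.
    """
    env = {}
    redundant = []

    def value_of(var):
        # A variable that was never assigned still holds its unknown initial value.
        return env.get(var, (var + "0", 0))

    for index, (target, source, constant) in enumerate(statements):
        symbol, offset = value_of(source)
        new_value = (symbol, offset + constant)
        if new_value == value_of(target):
            redundant.append(index)
        env[target] = new_value
    return redundant


program = [
    ("x", "a", 0),   # x = a;
    ("y", "x", 1),   # y = x + 1;
    ("a", "y", -1),  # a = y - 1;
]
print(find_redundant_assignments(program))  # [2]
```

After the first two statements, y is known to be the initial value of a plus one, so the third statement writes back exactly what a already holds. No textual comparison of the two sides would find that.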

A heuristic tool might look for the specific case of the same variable being tested for different values within a conjunction as in the following case:

a == 4 && a == 5

but miss other cases that follow from general principles, such as the following:

x*x + a*a == 2*a*x && x == a
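Reading the first conjunct as x*x + a*a == 2*a*x, it rearranges to (x - a)*(x - a) == 0, which holds only when x == a. The second test therefore adds nothing. The Python sketch below checks that claim by brute force over a small range of integers. This is a sanity check under that reading, not a proof and not a constraint solver, and the helper names are invented.

```python
from itertools import product


def first(x, a):
    return x * x + a * a == 2 * a * x


def second(x, a):
    return x == a


def implies_everywhere(p, q, values):
    """True if q holds at every sampled (x, a) where p holds."""
    return all(q(x, a) for x, a in product(values, repeat=2) if p(x, a))


sample = range(-20, 21)
print(implies_everywhere(first, second, sample))   # True: the second test is redundant
print(implies_everywhere(second, first, sample))   # True: the two tests are equivalent
print(any(a == 4 and a == 5 for a in sample))      # False: this conjunction never holds
```

A real tool would establish the same facts algebraically or with a solver rather than by sampling, which is what lets it cover cases that no list of syntactic patterns anticipates.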

Approach

In trying to bring intelligence to software,

  1. I identify the actual rules and steps that people use to solve a problem in human terms. This often involves me researching the actual steps that humans use in a human setting.
  2. I accurately model the human concepts involved in code to avoid any difference between the computer representation and reality. (A couple of good examples in my code include word senses and symbolic expressions.) This often trades off performance for full fidelity.
  3. The final code is usually a straightforward implementation of the rules and steps from point 1 into code such that the code often reads like an instruction manual for humans. Sometimes, I need to perform humanistic techniques like searches, normalizations, pattern matching and permutations.

This is how I approached, for example, the problem of determining transitional relationships between sentences in my natural language product. I consulted textbooks on English composition and identified the four general ways that sentences are linked from most explicit to least.

  1. Transitional keywords and phrases, which fall into several categories: contrast and qualification (however), continuity (in addition), cause/effect (therefore), explanation (indeed), exemplification (for instance), and summation (finally)
  2. Pronoun references & determiners
  3. Repetition of keywords (or their synonyms)
  4. Repetition of sentence patterns

Typically, natural language software focuses only on the first point, which is the use of transitional expressions. All the other points require additional work such as tracking pronoun references, parsing and analyzing sentence structure, or utilizing an ontology. Some researchers remarked that it was not possible to accurately determine transitions between clauses and sentences. I disagreed, based on my belief that if a human can deduce a relationship, a computer should also be able to. These researchers also developed their software under limited available knowledge (i.e., without a dictionary backend).
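The first kind of link, explicit transitional expressions, is easy to detect, which may explain why most software stops there. The Python sketch below is a minimal illustration built only from the six example phrases listed above. A real system would need a far larger table, and the function name and matching rule are assumptions made for this example.

```python
# The six example phrases from the list above, mapped to their categories.
TRANSITIONS = {
    "however": "contrast and qualification",
    "in addition": "continuity",
    "therefore": "cause/effect",
    "indeed": "explanation",
    "for instance": "exemplification",
    "finally": "summation",
}


def explicit_transition(sentence):
    """Return the category of a transitional phrase that opens the sentence, or None."""
    lowered = sentence.lower().lstrip()
    # Try longer phrases first so a short one cannot shadow a longer one.
    for phrase in sorted(TRANSITIONS, key=len, reverse=True):
        if lowered.startswith(phrase):
            rest = lowered[len(phrase):]
            # Require a word boundary so "indeedy" does not match "indeed".
            if not rest or not rest[0].isalpha():
                return TRANSITIONS[phrase]
    return None


print(explicit_transition("However, the test is controversial."))
print(explicit_transition("The test is controversial."))
```

The second call returns None even though the sentence may well be linked to its neighbor by a pronoun, a repeated keyword, or a repeated pattern. Those are the cases that need reference tracking, parsing, or an ontology.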

There are still possible avenues of improvement in my upcoming static analysis tool. One area in which humans best computers is recognizing natural language within code. Since I have a library of natural language routines, I have asked myself if I could incorporate natural language understanding to locate another class of bugs that have previously eluded analysis tools.

November 05, 2005

Thinking in Ifs

Joel recently made this comment on Google versus Microsoft:

A very senior Microsoft developer who moved to Google told me that Google works and thinks at a higher level of abstraction than Microsoft. "Google uses Bayesian filtering the way Microsoft uses the if statement," he said. That's true. Google also uses full-text-search-of-the-entire-Internet the way Microsoft uses little tables that list what error IDs correspond to which help text. Look at how Google does spell checking: it's not based on dictionaries; it's based on word usage statistics of the entire Internet, which is why Google knows how to correct my name, misspelled, and Microsoft Word doesn't.

If Microsoft doesn't shed this habit of "thinking in if statements" they're only going to fall further behind.

I would tend to agree that abstraction is not a strength of Microsoft’s business-centric culture. Google has a history of utilizing general techniques besides Bayesian filtering, such as MapReduce and PageRank. Even Amazon employs collaborative filtering for personalization.

This view reflects some of my own thinking over the years, as I wondered about the future of control flow in mainstream programming languages. A lot of attention has shifted to constructs like continuations, closures and the like that release us from the restrictions of the stack-centric programming model that originated with Algol.

I came to the conclusion that low-level development with the imperative constructs of structured programming, like if, while, goto and for loops, and even the new stuff like continuations (old stuff for Scheme’rs), was somehow impeding the development of “smart” applications. We have already cleared some of these hurdles with automatic memory management, in which the runtime periodically searches for unreachable objects to reclaim, and perhaps new ideas in transactional memory will relieve developers from thinking about concurrency.

In the PDC talk, Scripting and Dynamic Languages in the CLR, one of the panelists, David Thomas, the brain behind Eclipse, who also worked with Smalltalk and Forth at IBM, mentioned that there was something wrong with a programming language if an AI backend is required. I wondered if he had thought about the converse statement: whether there is something wrong if developers need to acquire machine intelligence (i.e., think like a compiler) to work effectively within a programming language. Isn’t garbage collection sort of an AI system, in which the runtime periodically searches for unreachable objects to reclaim?

Another interesting video on the future of CLR language support is “CLR Team Tour, Part II – The Future of Languages (PDC panel preview).”

Asked what makes a good language, Erik Meijer pushed his view of the ideal programming language, which is to remove the cognitive dissonance between the human world and programming languages. Jim Miller then responded that, in designing the CLR, he tried to meet the needs of three different classes of languages (object-oriented, functional, and dynamic) in successive versions of the runtime. However, he left out support for the class of logical languages like Prolog, remarking “that, for my personal taste, it’s a fine one to leave out.” I wondered whether that might turn out to be a mistake years from now, because those languages, Prolog, Mercury, et al., may actually be more cognitively consonant than any of the mainstream languages. These languages don’t use explicit control flow, but instead rely on searches and backtracking, mechanisms that can be difficult to replicate through traditional programming.
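To give a feel for what a logic language does on the programmer's behalf, here is a minimal Python sketch of search with backtracking. It illustrates the general mechanism only, not how Prolog or Mercury is implemented; the tiny coloring puzzle, the region names, and the function names are all invented for the example.

```python
def solve(variables, domains, consistent, assignment=None):
    """Depth-first search with backtracking; yields every consistent full assignment."""
    assignment = {} if assignment is None else assignment
    if len(assignment) == len(variables):
        yield dict(assignment)
        return
    var = variables[len(assignment)]
    for value in domains[var]:
        assignment[var] = value  # tentatively choose a value
        if consistent(assignment):
            yield from solve(variables, domains, consistent, assignment)
        del assignment[var]      # backtrack: undo the choice and try the next


# Hypothetical puzzle: color regions A, B, C so that neighbors differ.
NEIGHBORS = [("A", "B"), ("B", "C")]


def different_colors(assignment):
    return all(
        assignment[p] != assignment[q]
        for p, q in NEIGHBORS
        if p in assignment and q in assignment
    )


regions = ["A", "B", "C"]
colors = {region: ["red", "green"] for region in regions}
solutions = list(solve(regions, colors, different_colors))
print(solutions)
```

In a logic language one would state only the facts (which regions are neighbors) and the rule (neighbors differ); the choose, test, and undo loop written out by hand in `solve` is supplied by the runtime.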

When we develop software with low-level control constructs, when we think in “ifs,” three things happen: (1) we develop and reason at the human pace, seconds instead of nanoseconds; (2) we tend to avoid styles of programming that are complicated, styles that only a “computer” can understand; and (3) the code doesn’t match the specification, so it’s difficult to read, test and maintain.

There are already declarative programming languages, both functional and logical, which do away with side-effects and traditional control flow structures. Control flow is managed by the compiler and may employ iteration, searches, and queuing.

I do a lot of declarative programming in XML and S-expressions, which is later generated into code or processed as data, to make up for deficiencies in today’s languages. I do think today’s programming languages will progressively become more declarative, while retaining their traditional efficiencies. We already see such progression in C# 3.0.

August 07, 2005

On Intelligence

Over my lifetime, I developed a set of unintuitive tenets from studying various disciplines such as psychology, economics, biology, statistics, computer science and so on. These tenets hold that from simple, unintelligent forces can emerge efficient, complex, and seemingly intelligent behavior. Obvious examples include the “invisible hand” in economics and natural selection in biology. In AI, there are the classic examples of neural networks and genetic programming.

I also believe that human intelligence is similarly derivable from simple phenomena. From personal experience: through practice, I developed the ability to obtain perfect scores on some common standardized tests, yet when I do so, I surprisingly do not feel intelligent, because I am simply, explicitly, and mechanically recognizing and applying fairly simple rules.

Recently, I skimmed through two books, which have something to say about the origin of human intelligence…

I looked briefly at On Intelligence by Jeff Hawkins, which I will probably purchase in the future… The book deals with artificial intelligence and the human brain. One section on the human brain on page 43 caught my eye. Jeff notes that Mountcastle, a brain specialist, observed that the cells in different areas of the brain for different activities like vision and motion were fundamentally similar.

… the neocortex is remarkably uniform. The same layers, cell types and connections exist throughout. It looks like the six business cards everywhere. The differences are often so subtle that trained anatomists can’t agree on them. Therefore … all regions of the cortex are performing the same operations. The thing that makes the vision area visual and the motor area motoric is how the regions of the cortex are connected to each other and to other parts of the central nervous system.

In fact, … the reason one region of the cortex looks slightly different from another is because of what it is connected to, and not because its basic function is different. He concludes that there is a common function, a common algorithm, that is performed by all the cortical regions. Vision is no different from hearing, which is no different from motor output. He allows that our genes specify how the regions of cortex are connected, which is very specific to functions and species, but the cortical tissue itself is doing the same thing.

Jeff found that observation surprising, given that sight, hearing and touch seem very different, with fundamentally different qualities. He concludes that the human brain is fundamentally a memory-driven machine using pattern recognition techniques, essentially a rules-based machine.

Another related book that I looked at was A New Kind of Science by Stephen Wolfram, who developed Mathematica and spent ten years of his life writing his tome. I had put off reading his book earlier because of its size, its singular focus on cellular automata, and some of the lukewarm or harsh critical reviews. Wolfram came across as arrogant, while the content was often deemed narrow and insignificant. Kinder critics agreed that, while the content was intellectually impressive, Wolfram usurps ideas discovered by others, such as the Church-Turing thesis, and claims them as his own.

Wolfram spends most of the book on simple cellular automata in order to push transformation rules, not traditional equations, as a basis for a new kind of science. This is not surprising, since Mathematica is founded on rules. Before his book was even released, I had anticipated his focus on rules, as I was similarly mesmerized by their power and simplicity in his product, so much so that my own AI product is partly reliant on a similar system.

Mathematica is a sophisticated computer algebra system that can manipulate mathematical expressions containing symbols (e.g., variables, functions, symbolic constants, …) just as easily as those containing numbers. Despite objections from one of my readers (optionsScalper), these systems are generally considered an area of AI. Mathematica uses a declarative style of programming in which one adds new rules. Mathematica evaluates expressions by repeatedly applying transformations from a dictionary of rules using symbolic pattern matching. One can create a new function and Mathematica can automatically deduce the derivative in a number of ways, ranging from simple algebraic rules to the complicated procedure of applying the limit to the function. I have been very impressed by the language. I especially like the way that subexpressions can remain unevaluated due to undefined symbols, yet the whole expression can still be evaluated and transformed.
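To make the rule-based style concrete, here is a minimal sketch in Python (not Mathematica, and not my own product's code). It assumes a made-up representation in which expressions are nested tuples such as ("*", "a", ("^", "x", 2)); the names d and simplify are hypothetical. Differentiation is a handful of structural rules, and anything that matches no rule is left as an unevaluated ("D", expr, var) node while the rest of the expression is still transformed.

```python
# Hypothetical representation: numbers are ints/floats, symbols are strings,
# and compound expressions are tuples of the form (operator, arg1, arg2).

def is_num(x):
    return isinstance(x, (int, float))

def simplify(expr):
    """Apply a few algebraic cleanup rules at the top of a binary expression."""
    if not isinstance(expr, tuple) or len(expr) != 3:
        return expr
    op, a, b = expr
    if op == "+":
        if a == 0: return b
        if b == 0: return a
        if is_num(a) and is_num(b): return a + b
    if op == "*":
        if a == 0 or b == 0: return 0
        if a == 1: return b
        if b == 1: return a
        if is_num(a) and is_num(b): return a * b
    if op == "^":
        if b == 0: return 1
        if b == 1: return a
    return expr

def d(expr, var):
    """Differentiate expr with respect to var by matching structural rules."""
    if is_num(expr):
        return 0                                  # constant rule
    if isinstance(expr, str):
        return 1 if expr == var else 0            # other symbols act as constants
    op, *args = expr
    if op == "+":                                 # sum rule
        return simplify(("+", d(args[0], var), d(args[1], var)))
    if op == "*":                                 # product rule
        u, v = args
        return simplify(("+", simplify(("*", d(u, var), v)),
                              simplify(("*", u, d(v, var)))))
    if op == "^" and is_num(args[1]):             # power rule with chain rule
        u, n = args
        outer = simplify(("*", n, simplify(("^", u, n - 1))))
        return simplify(("*", outer, d(u, var)))
    return ("D", expr, var)                       # no rule matched: stay unevaluated

print(d(("^", "x", 3), "x"))                      # ('*', 3, ('^', 'x', 2))
print(d(("+", ("f", "x"), "x"), "x"))             # ('+', ('D', ('f', 'x'), 'x'), 1)
```

The second call shows the behavior I like: f is undefined, so its derivative stays symbolic, yet the surrounding sum is still differentiated.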

The proper way to view this book is as his extrapolations about the world based on his experience designing Mathematica. One of these extrapolations involves human intelligence… (pgs 626–629)

But what about the whole process of human thinking? What does it ultimately involve? My strong suspicion is that the use of memory is what in fact underlies almost every major aspect of human thinking… Capabilities like generalization, analogy, and intuition immediately seem very closely related to the ability to retrieve data from memory on the basis of similarity.

Mathematica manipulates mathematical expressions without using any Prolog-like logical inferencing, which suggests that symbolic pattern-matching is a more general approach, yet extensive pattern-matching is very rare in software and commercial programming languages. My own software does include inferencing, which I believe is valuable, but not as much as the pattern-matching approach. About logical reasoning, Wolfram remarks…

But what about capabilities like logical reasoning? Do these perhaps correspond to a higher-level of human thinking?

In the past it was often thought that logic might be an appropriate idealization for all of human thinking. And largely as a result of this, practical computer systems have always treated logic as something quite fundamental. But it is my strong suspicion that in fact logic is very far from fundamental, particularly in human thinking.

For among other things, whereas in the process of thinking we routinely manage to retrieve remarkable connections almost instantaneously from memory, we tend to be able to carry out logical reasoning only by laboriously going from one step to the next. And my strong suspicion is that when we do this we are in effect again just using memory and retrieving patterns of logical argument that we have learned from experience.

In addition, the way rules are specified closely mirrors how humans would articulate such rules. My own insight is that the same system of rules can work just as well with natural language, not just mathematical expressions. Indeed, this is what he mentions in his book …

In modern times, computer languages have often been thought of as providing precise ways to represent processes that might otherwise be carried out by human thinking. But it turns out that almost all of the major languages in use today are based on setting up procedures that are in essence direct analogs of step-by-step logical arguments…

As it happens, however, one notable exception is Mathematica. And indeed, in designing Mathematica, I specifically tried to imitate the way that humans seem to think about many kinds of computations. And the structure that I ended up with for Mathematica can be viewed as being not unlike a precise idealization of the operation of human memory.

For the core of Mathematica is the notion of storing collections of rules in which each rule specifies how to transform all pieces of data that are similar enough to match a single Mathematica pattern. And the success of Mathematica provides considerable evidence for the power of that kind of approach.

He also draws the following conclusions on the nature of human intelligence …

There has been in the past a great tendency to assume that given all its apparent complexity, human thinking must somehow be an altogether fundamentally complex process, not amenable at any level to simple explanation or meaningful theory.

But from the discoveries in this book we now know that highly complex behavior can in fact arise even from very simple basic rules. And from this it immediately becomes conceivable that there could in reality be quite simple mechanisms that underlie human thinking. … And it is in the end my strong suspicion that most of the core processes needed for general human-like thinking will be able to be implemented with rather simple rules.

June 10, 2005

Human vs Computer

In my post Disruptive Programming Languages, I snared Ian Griffiths into my trap. Ian Griffiths is a .NET author and instructor at DevelopMentor and writes lots of low-level .NET articles on the web. I own his book .NET Windows Forms in a Nutshell. Maybe he’s getting back at me, after I was serendipitously able to score one up on him on weak delegates in a prior post. Thanks, Ian, for the opportunity to present myself as your peer and pit my sophomoric ideas against your experience and research.

Here’s a simple exercise: How many words can you come up with by rearranging the three letters in the word TEA, including TEA? Even though there are only six possible arrangements of the letters, it’s likely that you will not find every word. You can also try this for many other words with my WordFinder application, which I’ll post later. A computer can search for and find every instance of a valid word each time without fail. A human may miss some words that he knows as well as some he doesn’t know.

I often think about Mathematica, which I use to solve all kinds of symbolic math problems. The results that I get from this program are better than I can obtain by hand: much faster, often instantaneous, and without errors. It’s a good example of artificial intelligence, where the computer consistently outperforms humans in an intelligence task. I also believe that this is the ultimate destination of all AI.

When I was speaking about compiler output surpassing human output earlier, I was viewing this as part of a long-run trend. Most compilers don’t utilize logic programming, which is useful for emulating human reasoning, although we might be seeing some of that with the “whole program optimization” feature in C++.

Also, when I talk about compilers, I am speaking rather generally. In some sense, I am comparing a computer-managed process against a human-managed one. When we entrust more and more work to a computer, there may be more overhead initially, but, over time with more optimization and because of the computer’s advantages in speed, the computer-based approach will likely far surpass the manual labor.

I would include .NET garbage collection as a computer-managed process which competes well with C++’s deterministic finalization, a manual process, even though the latter has more domain information. Over time garbage collection will improve from better algorithms and coevolution. Through coevolution, processors have evolved to work better with compilers than with humans. Future architectures will be optimized for garbage collection. (I am reminded of processors that were optimized for Lisp decades ago.)

I believe future languages will be declarative and symbolic. Current control constructs will be replaced by various forms of search, which optimize away to conditionals and loops in the simplest case. Future languages will probably resemble functional and logical languages like Prolog more than the imperative languages most developers use today. I suspect that our present-day practice of writing low-level control constructs leads us to less sophisticated programs, because programming proceeds at a human pace rather than at a computer’s pace.

I wrote earlier that compilers can currently emit C code that can run faster than handwritten assembly code. Ian responded:

“Not that old chestnut!” This statement is only true for small values of developer capability.

Intel still release hand-crafted performance libraries. The ones I've used (for image processing) still beat the socks off anything produced by a compiler - and we're not talking something that's a few percent faster. These libraries run several times faster than the same algorithms written in C and compiled by the compiler.

Ian states that the statement is true only for small values of developer capability, but I imagine that still describes an overwhelming majority of developers, and not just those who don’t know assembly language. He means you, mainstream developer.

I admit that the current C# and JIT compilers aren’t very good optimizers. The C++ compilers are probably the better ones on the Windows platform, though C/C++ have some difficulties with aliasing because of pointers.

I don’t think the Intel libraries are a good example. The libraries themselves could almost be considered part of the compiler. Intel does develop its own optimizing compilers with its finer understanding of its processors, since it doesn’t have access to the commercial compilers of other vendors, which are more platform-agnostic. Intel contributes to these other compilers in the only way it can, by providing libraries. The code in Intel’s performance libraries may very well be handwritten; however, it is written by the chip designers themselves, very likely aided by extensive computer analysis, and has probably undergone lengthy development and testing. Also, these libraries can make additional assumptions that a general compiler cannot. Users are not likely to write equivalent code without the deep level of knowledge that Intel programmers possess.

I also wrote that if the compiler has the same information about a problem as a human, the compiler should theoretically always be able to produce superior code.

That's patently bogus. The human will always be able to do at least as well as the compiler because if all else fails, the human can work through the same set of rules the compiler is using. Indeed, there's an age-old technique for guaranteeing that you do at least as well as the compiler: get the compiler to write your first version for you, tweak what it gives you and always compare against the original when benchmarking. You are guaranteed to do at least as well as the compiler, because if you make any changes that make things worse, you just back them out again.

Here’s why. If a compiler encodes all the rules that a human would apply in optimizing the code, and both the human and the computer have the same information, then the compiler can successively apply rules until no more can be applied. Since the application of one rule can affect the subsequent application of another rule, the compiler may have to backtrack to consider alternative orderings of rules. Assuming the computer has applied all rules, a human could not perform better, because the computer, using the same rules, has already considered any enhancement the human could add.
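The argument can be illustrated with a toy optimizer, sketched here in Python. Everything in it is an assumption made up for illustration: the tuple representation of expressions, the two rules (factor and shift), and the cost table. It is not how any real compiler works. The point is only that applying one rule can block another, so the optimizer searches every ordering instead of applying rules greedily.

```python
# Hypothetical cost model: a multiply is assumed to cost more than an add or a shift.
OP_COST = {"+": 1, "*": 3, "<<": 1}

def cost(expr):
    """Total cost of all operators in a nested-tuple expression."""
    if not isinstance(expr, tuple):
        return 0
    return OP_COST[expr[0]] + sum(cost(arg) for arg in expr[1:])

def factor(e):
    """Rule: a*c + b*c  ->  (a + b)*c. Returns None when it does not apply."""
    if (isinstance(e, tuple) and e[0] == "+"
            and isinstance(e[1], tuple) and e[1][0] == "*"
            and isinstance(e[2], tuple) and e[2][0] == "*"
            and e[1][2] == e[2][2]):
        return ("*", ("+", e[1][1], e[2][1]), e[1][2])
    return None

def shift(e):
    """Rule: x*2  ->  x << 1 (valid for integers)."""
    if isinstance(e, tuple) and e[0] == "*" and e[2] == 2:
        return ("<<", e[1], 1)
    return None

def rewrites(expr, rules):
    """Yield every expression reachable by one rule applied at one position."""
    for rule in rules:
        out = rule(expr)
        if out is not None:
            yield out
    if isinstance(expr, tuple):
        for i in range(1, len(expr)):
            for sub in rewrites(expr[i], rules):
                yield expr[:i] + (sub,) + expr[i + 1:]

def optimize(expr, rules):
    """Explore every ordering of rule applications and keep the cheapest result."""
    best, seen, stack = expr, {expr}, [expr]
    while stack:
        current = stack.pop()
        if cost(current) < cost(best):
            best = current
        for nxt in rewrites(current, rules):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return best

expr = ("+", ("*", "a", 2), ("*", "b", 2))        # a*2 + b*2, cost 7
print(optimize(expr, [factor, shift]))            # ('<<', ('+', 'a', 'b'), 1)
```

Applying shift to each product first gives (a << 1) + (b << 1) at cost 3, and factor can no longer fire. Factoring first and then shifting gives (a + b) << 1 at cost 2, which the exhaustive search finds.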

Because humans built computers, my statement contains a few circularities. Anything produced by a compiler is also indirectly produced by a human. The other circularity that Ian mentions is that a human could modify the output of the compiler to produce better code than the compiler, unless the compiler has already considered all of the human’s enhancements prior to output. If a compiler supports inline assembly, as Visual C++ does, you could also use the same argument in reverse: feed the human-modified code back into the compiler, and the resulting code will be at least as good as the human’s.

Optimized code is notoriously difficult for humans to navigate. Something as simple as reusing a variable for a different purpose in assembly can make code very hard to follow. When modifying optimized code, humans often introduce subtle errors. Many times, an “obvious” improvement to optimized code actually reduces performance.

In many cases, it will not be feasible for a human to reasonably understand the compiler output. Code may be generated as a fully optimized state machine, for example. The output of a neural network (used in machine vision and speech recognition) or a genetic algorithm would be very hard to modify, if either technique were used in a compiler.

Processors are also increasingly being designed for compilers and not for humans. I haven't written in assembly since the early 1990s, but I know that one considers not only instruction cycles but a host of new constraints like instruction ordering.

Ian then asserts that my axiom "if the compiler has the same information about a problem as a human" has never been valid.

The compiler doesn't have the same information. There is almost always domain-specific information that can be brought to bear on the problem which cannot be put into source code. (And in any case, compilers still don't make use of all the information that is technically available to them today.)

There are at least two situations where the compiler can extract additional domain information, without the human having to include explicit invariants in the code.

1) Profile-guided optimizations. In this case, the compiler uses information from profile data from an instrumented executable to reorder and optimize code. Profiling provides information to the compiler about how people actually use the product, about which code paths actually get hit often.

2) Whole program optimization. The compiler, in this case, considers the relationships between functions by producing theorems about each function, and then going back to each function and reoptimizing based on those theorems.

These two optimizations can allow a compiler to indirectly deduce domain-specific information. Furthermore, the information obtained can be even more valuable, because the compiler can generate a far greater amount of information, most of which may be nonobvious. These kinds of optimizations also make it difficult for a human to modify code without adversely affecting its performance, because the assumptions that the compiler was able to make are not explicit in the code output.

So your statement is untrue. It's also not really relevant. The important thing is that compilers are good enough. They certainly do a better job in the time they take than a human could do in the same time, so it's all about saving developer time, not machine cycles. Using higher level languages offers a host of benefits, but the quality of the code is one of the costs, not one of the benefits - we trade quality of code for speed of development.

Of course the other reason it's not all that relevant is that an awful lot of code isn't limited by raw CPU speed. The usual killer is sitting around waiting for stuff to come out of main memory and into the cache. That's something that you can't fix by tweaking the compiled code - you need to address either the data structures or the algorithms or both.

So compiler output really isn't that hot. The key thing is that most of the time it doesn't need to be - it simply has to be good enough.

I agree the bottlenecks today have very little to do with instruction speed. The bottlenecks are memory access and the efficiency of the algorithms and data structures used. The compiler may still be able to help with the latter two cases, but more likely it makes more sense for a company to focus on other productivity features. This only strengthens my argument that languages need to get higher-level. I don’t believe that quality of code is necessarily worse with higher-level languages. You are more likely to write higher quality code in C than in assembly and, also, in C++ than in C.

I also want to point out that compiler output is not limited to assembly; it also includes intermediate languages. In such cases, the benefits of compiler-generated code over human-generated code may be more real in terms of performance, not just productivity.

By the way, there are five words that can be made from the exercise earlier: every combination of letters except AET.
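The exhaustive search is easy to sketch in Python. This is not my WordFinder application; find_words is a hypothetical name, and WORD_LIST is a tiny hand-made stand-in for a real dictionary, assumed here to contain the five words in question.

```python
from itertools import permutations

# Hypothetical stand-in for a real dictionary file.
WORD_LIST = {"TEA", "TAE", "EAT", "ATE", "ETA", "TEN", "NET"}

def find_words(letters, dictionary):
    """Return every arrangement of the letters that appears in the dictionary."""
    candidates = {"".join(p) for p in permutations(letters.upper())}
    return sorted(candidates & dictionary)

print(find_words("tea", WORD_LIST))   # ['ATE', 'EAT', 'ETA', 'TAE', 'TEA']
```

The computer checks all six arrangements every time, so it never overlooks a word the way a person does. AET is generated as a candidate but is not in the word list.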

June 28, 2004

Looking Back at My CS Education A Decade Later

When I was taking computer science (and applied math) back in college over a decade ago, nearly all of the content was heavily theoretical and virtually impractical to me at the time.

The Computer Science department proceeded with the belief that courses should not delve into any specific programming language or technology, which tend to become obsolete over time, but should focus on knowledge that will last forever.

My computer science studies actually seemed like a math class, with discussions of Turing machines, various "calculi", formal grammars and the like. Many concepts in CS can be described as the most minimalistic and essential representation of a machine, language or system--compact and convenient enough for formal analysis and proofs. As I entered the work force, though, it seemed as if the knowledge was completely useless. Virtually every technical skill and language that I needed for work, I learned on my own outside of college.

In my programming languages course, for example, there were, I believe, only two languages taught, both declarative: ML, a functional programming language, and Prolog, a logic programming language. They were both of a different paradigm altogether, and neither of them is used regularly (or even rarely) in industry. In addition, much time was spent on lambda calculus (some minimalistic funky formalism, with perhaps three symbols, one of which is the sole operator) and predicate calculus, weaving them into the discussion of both languages and proving their equivalence to Turing machines. Our exercises in Prolog utilized lambda calculus in an assignment on denotational semantics, which was to parse and convert English sentences into semantic form in predicate calculus and then make queries against it.

Other schools would have actually provided a comparative overview of different widely used languages and maybe brought up discussions of garbage collection and by-name parameters. I had often wondered whether I missed out on my education.

With my current work in natural language and AI, the seemingly impractical theory has come back to haunt me in a big way. Ideas like DFAs, CFGs, O notation, lambda calculus, and resolution profoundly affect my development. Without the foundations established in my computer science education, I may never have been able to proceed in this direction of producing highly intelligent software.

AI

Artificial Intelligence is my pet computer science topic. I first took an AI course at Columbia University while I was in high school, and was completely enthralled. Now a decade and a half later, I am devoting my life to developing intelligent software on my own.

I recently read a quote from a professor that, in the 1980s, investors were enthusiastic about AI while researchers were pessimistic. Unfortunately, AI in the 1980s was associated with numerous project failures and gained a bad reputation. Now the tables have turned. A lot of advances occurred in the field over the 1990s, not to mention improvements in computational speed and storage. The researchers who remain are now very optimistic, but investors, having been burned before, are pessimistic.

I am actually quite optimistic about my chances of succeeding with AI. First, a large amount of resources and technologies are available for licensing from consortiums and universities. The difficulties with natural language processing, for example, are not computational, but more related to the accumulation of linguistic data. Second, there are fewer people in the field, just when the possibilities in AI have dramatically opened up, and a large amount of skepticism--meaning little competition. Third, the amount of memory now available in desktop PCs makes it feasible to store megabytes of machine-readable dictionary and knowledgebase in memory.

I have noticed an almost exclusive emphasis nowadays on quick and dirty statistical approaches in AI, like neural networks, Bayesian reasoning, and genetic programming. These approaches often result in nonsensical output.

People have lost interest in or don't appreciate the power and exactness of symbolic approaches. I am in the symbolic AI camp, though I do see statistical approaches as complementary in the areas of disambiguation and prediction.

January 29, 2004

Will Machines Become Conscious?

I came across this link "Will Machines Become Conscious?" in KurzweilAI.NET.

Ray Kurzweil is a leading expert on artificial intelligence and wrote the book "The Age of Intelligent Machines." One of the companies that Kurzweil founded is ScanSoft, which develops scanner technology to convert images to text. It is a very strong piece of technology.

Kurzweil believes that computers will eventually reach human intelligence ("strong AI") and he debunks many of the existing criticisms levied against strong AI. He also makes an important distinction between AI programs (like Deep Blue, the chess-playing program) that rely on exhaustive brute-force searches and real AI programs that more closely simulate the way humans think. Critics tend to point to the former when discussing limitations of AI programs, when they really should be concerned with the latter. Newer chess-playing programs, which have only a fraction of the horsepower of Deep Blue but rely on human-like techniques instead of brute force, are better chess players.

I do believe that machines will someday reach and surpass the intelligence of humans, a belief that I did not originally have, but have developed after studying how humans think and how computers operate.

As for developing consciousness, I am not quite as sure, since we are moving into nebulous territory. Although it is a provocative question, would this even be desirable or useful? For me, a "conscious" machine would refer to a machine having sensory capabilities such as sight and hearing, some emotion-like capacity to infer threats and opportunities ("fear" and "lust"; "pain" and "pleasure") from the environment, an ability to learn from the environment and to communicate or react to it, and an awareness of one's self in the environment. Even if that definition doesn't quite equate with human "consciousness," it would be hard to tell the difference. I have a hunch it will happen, if humans don't destroy themselves first, but probably not in my lifetime.

I looked at an AI book a few days ago that estimated the neural capacity of the human brain, by multiplying the number of neurons by the speed of each individual neuron, to be equivalent to a CPU speed of 10^17, which would take computers a few decades to catch up to, if Moore's law maintains its current rate. Of course, computers can potentially be more efficient than humans at utilizing their processing speed, so they would never need to reach that level of power. (I don't necessarily think that this line of reasoning is valid.)
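The "few decades" figure can be checked with a rough calculation, sketched below in Python. Only the 10^17 target comes from the book. The rest are my own assumptions for illustration: that the figure means operations per second, that a current desktop manages about 10^10 operations per second, and that speed doubles every 18 or 24 months. The function name years_to_catch_up is hypothetical.

```python
import math

def years_to_catch_up(target_ops, current_ops, doubling_years):
    """Years until current_ops reaches target_ops if speed doubles every doubling_years."""
    doublings = math.log2(target_ops / current_ops)
    return doublings * doubling_years

# ASSUMPTIONS (not from the book): a baseline of 1e10 operations per second
# and doubling periods of 1.5 and 2.0 years.
for period in (1.5, 2.0):
    print(period, round(years_to_catch_up(1e17, 1e10, period), 1))
```

Closing a gap of seven orders of magnitude takes about 23 doublings, so under these assumptions the answer comes out at roughly 35 to 47 years. A different baseline shifts the result, but only by about 3.3 doublings per factor of ten.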

As you probably know if you have read my blog in the past, I am developing a software company that produces common commercial desktop applications employing artificial intelligence. My software won't be ready for another year.

Part of the reason why we don't see smart applications today is because there are no existing libraries available for AI in popular platforms like Windows and Mac. Few applications used gradients or alpha blending, before Microsoft offered direct API support in Windows; it's no different for even more complex technology like AI. (Longhorn will ship with a Natural Language API, but it will be limited to tasks such as spell-checking; there will be no ability to parse natural language text or convert it into semantic forms.) There are other types of AI as well besides natural language processing.

Another reason is that many of the professional languages and tools we are using don't work well with AI. Garbage collection, I believe, is essential, for example, since much of AI relies on complex cyclic graphs and searching algorithms, which make manual memory management more difficult. Hierarchical list data structures map more closely to the way we think than class objects do.
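The trouble with cyclic graphs can be shown in a few lines. The sketch below uses Python rather than .NET, and relies on a CPython-specific detail: objects are normally freed by reference counting, with a separate tracing collector for cycles. The Node class and build_cycle function are hypothetical names for illustration.

```python
import gc
import weakref

class Node:
    """A graph node that can link to other nodes, including back to itself."""
    def __init__(self, name):
        self.name = name
        self.links = []

def build_cycle():
    """Create two nodes that reference each other and return a weak probe to one."""
    a, b = Node("a"), Node("b")
    a.links.append(b)
    b.links.append(a)
    return weakref.ref(a)   # a weak reference does not keep the node alive

gc.disable()                # turn off the cycle collector to isolate reference counting
probe = build_cycle()
alive_before = probe() is not None   # the cycle keeps both reference counts above zero
gc.collect()                         # a tracing pass finds the unreachable cycle
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)     # True False
```

Nothing in the program can reach either node after build_cycle returns, yet counting references alone never frees them. With manual memory management, the programmer has to notice and break such cycles by hand.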

A final reason is memory requirements. In addition to libraries, operating systems would have to ship databases consisting of world knowledge, such as the semantic relationships between words. My current software has an in-memory database over 10 megabytes, which several years ago was unthinkable because that was more memory than most computers had.

Some of my MBA friends reminded me of the AI debacle in the 1980s. AI is a broad term, and I don't believe the bad investments of the past relate to anything I am doing today. I am providing a radical improvement on existing applications. The field is much more mature and advanced than it was then, and the hardware on today's desktop is more powerful than the mainframes of that era.

October 07, 2003

Human-like Software

I have started a software company, called SoftPerson, and am currently writing software which uses artificial intelligence to create new kinds of desktop applications. In the type of software we develop, the "software" acts like a "person" (typically referred to as an agent in AI literature) in its ability to help construct documents and to reason about the content.

Very few commercial applications employ AI, but I think there will eventually be an AI wave, especially now that computers have the resources. The software that I am developing requires megabytes of compressed memory just to store world knowledge; this would not have been commercially feasible 7 years ago, when computers did not even have that much RAM.

Whenever a task needs to be accomplished, I always imagine the software as a personal servant, and ask myself how I would interact with this servant, how the dialogue would proceed, and what steps the servant would need to take. (That's why I am intrigued by inductive UI.)

Then I translate the steps to code. The tasks that the software needs to perform for each step are of a higher level of intelligence and require a bit of artificial intelligence, which I found to be difficult with traditional "C"-like languages. The programming language I use is C#. First, C# is more high-level than C++--the low-level details of C++ are a deterrent for software that needs to mimic high-level human thinking. Garbage collection is essential because references abound everywhere, forming complex tree and graph relationships whose lifetimes are impractical to track.

To support human-like thinking, I used techniques from declarative programming, which involves constructing a model, sort of like an XML document object model, free of control flow, that can be easily manipulated and searched, and can be rendered in a variety of views. I have an optimized list-based interpreter, modeled on Mathematica--basically it is a programming language within a programming language (C#). Mathematica has many advantages over Scheme, because it supports functional and rule-based programming in addition to procedural programming, and because the list data types are based on vectors rather than linked lists, which improves performance. This interpreted language can perform lazy evaluations, handle unknown or ambiguous list expressions, and manipulate the whole expression without understanding the parts; these are hard to deal with in C. I have also built a Prolog-like inference engine on top of the interpreter, complete with a knowledgebase implementation and the ability to solve basic mathematical equations.

I have also added statistical libraries for probabilistic reasoning and other mathematical support. (I am so glad that I dual majored in math and computer science.)

In order to act like a human, software needs to understand basic human concepts. I had to build several new data types to represent these concepts, just as we have ints and strings. So, there are data types for Words and WordSenses (a word can have multiple senses, and multiple words can have the same sense).
I use several natural language databases like WordNet, which contains real-world semantic knowledge. WordNet is nice because it is a machine-readable dictionary that categorizes all the words of the English language and maintains dozens of relationships between these words, such as synonymy, hypernymy (dog IS-A animal), and meronymy (finger IS-PART-OF hand).
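A minimal sketch of such data types, written in Python rather than the C# I actually use, might look like the following. The class layout, the field names, the is_a helper, and the sample glosses are all hypothetical; the entries are hand-made and do not come from WordNet or use its API.

```python
from dataclasses import dataclass, field

@dataclass
class WordSense:
    """One meaning, with links to related senses."""
    gloss: str
    hypernyms: list = field(default_factory=list)   # IS-A links
    holonyms: list = field(default_factory=list)    # IS-PART-OF links

@dataclass
class Word:
    """A spelling that may carry several senses."""
    text: str
    senses: list = field(default_factory=list)

def is_a(sense, ancestor):
    """True if ancestor is reachable by following IS-A links upward."""
    return any(h is ancestor or is_a(h, ancestor) for h in sense.hypernyms)

# Hand-made sample entries (illustrative only).
animal = WordSense("a living creature that can move")
canine = WordSense("a member of the dog family", hypernyms=[animal])
dog = WordSense("a domesticated canine", hypernyms=[canine])
pursue = WordSense("to follow persistently")
hand = WordSense("the end of the arm")
finger = WordSense("a digit of the hand", holonyms=[hand])
seat = WordSense("an upholstered seat for several people")

words = [
    Word("dog", [dog, pursue]),      # one word, two senses
    Word("sofa", [seat]),            # two words sharing one sense
    Word("couch", [seat]),
    Word("finger", [finger]),
]

print(is_a(dog, animal))             # True, via dog IS-A canine IS-A animal
```

Because senses are shared objects rather than copies, synonymy falls out for free: "sofa" and "couch" point at the same WordSense.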

There are other data structures for Sentences, Paragraphs and Discourse. If computers seem rigid and mechanical and not smart today, it's because current software programs don't have notions of these real-world concepts. (Of course, to implement these I had to write a natural language parsing engine. It's ironic that all the theoretical concepts you learned in college like NDFAs and CFGs, which seemed, at the time, overly academic and impractical, come back to haunt you in a big way when you implement anything moderately ambitious.)