The Undoables of Computer Science

Doables and undoables

As you work in a field for a long time, you develop a sense of what is doable and what is not. You try to work on problems that are doable, and avoid those that are undoable. The categories are not black and white, and the interesting problems are those which are hard but doable.

As I waded through my undergraduate and graduate courses in Computer Science, I formed opinions about the various fields. Some fields tackled doable problems, while others tackled undoable problems. The latter were to be avoided. There were yet other fields which tackled non-problems; one could study those for fun.

Artificial Intelligence was the flashiest of the fields, but it overpromised and underdelivered. I always thought Machine Learning was a more sober name for the field, if it wanted to be separated from Algorithms at all. Deep Blue defeating Garry Kasparov was thought to be a high point of AI, but if you scratched the surface there was nothing that justified the name "Artificial Intelligence": a lot of opening theory is fed into the computer, endgames can be looked up, and there is a lot of hand tuning. Other things we learnt in AI courses were the VC dimension, regression trees, naive Bayes classifiers, SVMs and ANNs. Recognizing handwritten digits was where AI gave up, though I am sure there were research papers that showed much more.

Then, starting in 2012, AI started solving problems I had hitherto considered unsolvable. If, in 2010, you had asked me to write a program that could tell cats from dogs, I would have given up immediately. (Ok, perhaps not immediately!) But the deep learning area of AI provided practical ways to solve problems that had until then been intractable. [It is another matter that the techniques had been discovered decades earlier, and became applicable only recently with the advent of large training data and compute capacity.]

This forced me to revise my perception of AI. The core learning was that the image classification function (and other similar functions) was learnable in a way that generalized well to images beyond the training examples.
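To see what that means concretely, here is a minimal sketch (assuming scikit-learn is available; the dataset, model and parameters are my own illustrative choices, not anything from the original argument): a small neural network learns to recognize handwritten digits from examples, and is then scored on digits it has never seen.

```python
# A minimal sketch: learn the handwritten-digit classification function
# from examples and check that it generalizes to unseen images.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# A small multi-layer perceptron, trained only on the training split.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=1000, random_state=0)
clf.fit(X_train, y_train)

# Accuracy on held-out images is the "generalization" referred to above.
print("test accuracy:", clf.score(X_test, y_test))
```

The held-out accuracy is the "generalized well to images beyond the training examples" part; the learned function works on inputs it was never shown.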

So, which other fields are tackling undoable problems, or non-problems? Here is a list. It would be great if a few of these items make it off the list in my lifetime.

1. Quantum Computing

Quantum Computing has been perennially promising that once Quantum Computers are developed, they will be a million times faster than existing computers. During the past 30 years that Quantum Computing has been promising us million-times-faster computers, traditional computers themselves have become a million times faster. Meanwhile, the field keeps celebrating Shor's algorithm and is happy to report that quantum computers have factorized 15 as 5 * 3 (2001) and 143 as 13 * 11 (2011).
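To put those factorization milestones in perspective, here is a toy sketch (the function names and the choice a=7 are mine, purely for illustration) of the classical scaffolding around Shor's algorithm, with the period-finding step, the only part a quantum computer is supposed to speed up, done by brute force:

```python
# Toy classical version of Shor's algorithm, factoring 15 as in the
# 2001 experiment. The period-finding step is brute force here, so this
# runs in exponential time; the quantum computer's sole job is to do
# that one step in polynomial time.
from math import gcd

def find_period(a, n):
    # Smallest r > 0 with a^r ≡ 1 (mod n).
    x, r = a % n, 1
    while x != 1:
        x = (x * a) % n
        r += 1
    return r

def shor_classical(n, a=7):
    if gcd(a, n) != 1:
        return gcd(a, n), n // gcd(a, n)   # lucky: a shares a factor with n
    r = find_period(a, n)
    if r % 2 == 1:
        return None                        # unlucky choice of a; retry
    p = gcd(pow(a, r // 2) - 1, n)
    q = gcd(pow(a, r // 2) + 1, n)
    if p * q == n and p != 1 and q != 1:
        return p, q
    return None

print(shor_classical(15))  # (3, 5)
```

Everything except the period finding is routine classical arithmetic, which is why factoring 15 or 143 on a quantum device is less impressive than the headlines suggest.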

Will Quantum Computing ever be widely available?

2. Automatic Parallelization of Programs

With multi-core, superscalar processors available, wouldn't it be great if there were a system that could take your program, written with a single stream of execution in mind, and parallelize it to run across cores?

Turns out that this is a hard problem, and not for lack of trying. See, for example, this or this.

In the absence of automatic parallelization, there have been two approaches to executing programs in parallel: (a) The programmer writes his program in a "parallel augmentation" of his favorite language, the most popular augmentation being the Message Passing Interface (MPI). The programmer is thus responsible for defining the parallelism in his algorithm. (b) The programmer writes his program in a framework like Hadoop MapReduce. He can then write parallel algorithms for a large class of problems very easily, but not all problems (e.g. graph algorithms) lend themselves to map-reduce.
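As a concrete illustration of approach (b), here is a minimal sketch of the MapReduce model, simulated locally with Python's multiprocessing (the function names are mine; a real Hadoop job would look different, but the division of labor is the same): the programmer supplies only the map and reduce functions, and the framework handles distribution.

```python
# Word count in the MapReduce style, simulated on one machine.
from collections import defaultdict
from multiprocessing import Pool

def map_phase(line):
    # Emit (word, 1) pairs for one line of input.
    return [(word, 1) for word in line.split()]

def reduce_phase(grouped):
    # Sum the counts for each word.
    return {word: sum(counts) for word, counts in grouped.items()}

if __name__ == "__main__":
    lines = ["the quick brown fox", "the lazy dog", "the fox"]
    with Pool() as pool:
        mapped = pool.map(map_phase, lines)   # map step runs in parallel
    grouped = defaultdict(list)
    for pairs in mapped:
        for word, count in pairs:
            grouped[word].append(count)       # shuffle/group step
    print(reduce_phase(grouped))              # {'the': 3, 'fox': 2, ...}
```

The parallelism lives entirely in the map phase; the programmer never writes any threading code, which is exactly the trade-off described above.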

So, while in practice we have made progress in running our programs across compute threads, the holy grail of automatic program parallelization has eluded us.

3. Automatic Theorem Proving

At the base of Maths are axioms. Any theorem can be derived from those axioms using the rules of derivation. For example, at the base of Euclidean Geometry are five axioms, and at the base of number theory are the Peano Axioms.

So, can we write programs to automate theorem proving? It turns out that we do not yet know of ways to enable a computer program to wade through theorems, applying logic to go from one theorem to another, and arrive at a given theorem. What is more, even automated proof checking is hard, though it is used in some domains like integrated circuit design.
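For a feel of what the doable half looks like today, here is a minimal Lean 4 sketch (assuming a standard Lean 4 toolchain; `Nat.add_comm` is from Lean's core library): the checker mechanically verifies each step, but a human still has to supply the proof, which is exactly the part we cannot yet automate in general.

```lean
-- The checker verifies this proof term mechanically; finding the proof
-- is the part a human (or library author) still had to do.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A tiny piece of automation: `decide` proves decidable propositions by
-- brute-force evaluation, which does not scale to real mathematics.
example : 3 * 5 = 15 := by decide
```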

4. Genetic Algorithms

Genetic algorithms look plausible to begin with: just as in nature, where offspring combine genes from their parents and have some mutations of their own, let us try to find the maxima/minima of a function by starting with a few random points and then creating offspring repeatedly. Just as in nature, we kill off the unfit data points that we generate, thus operating with a fixed-size population from generation to generation.
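To make the recipe concrete, here is a toy sketch (the fitness function, mutation rate and population size are all made up for illustration) of the generate-offspring-and-kill-the-unfit loop, used to minimize a simple function:

```python
# A toy genetic algorithm: fixed-size population, crossover, mutation,
# and survival of the fittest. Here it minimizes f(x, y) = x^2 + y^2.
import random

def fitness(ind):
    x, y = ind
    return -(x * x + y * y)          # higher is better, so negate

def crossover(p1, p2):
    # Each "gene" (coordinate) comes from one of the two parents.
    return [random.choice(pair) for pair in zip(p1, p2)]

def mutate(ind, rate=0.1):
    return [g + random.gauss(0, 1) if random.random() < rate else g
            for g in ind]

population = [[random.uniform(-10, 10) for _ in range(2)] for _ in range(30)]
for generation in range(100):
    parents = sorted(population, key=fitness, reverse=True)[:10]  # keep the fittest
    population = [mutate(crossover(*random.sample(parents, 2)))
                  for _ in range(30)]

print(max(population, key=fitness))  # typically lands near (0, 0)
```

Running it a few times does give points near (0, 0), but only after a few thousand fitness evaluations on a function that is trivial to optimize directly.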

The problem is: nature is not particularly efficient at generating optimal species. In fact, nature is not even required to generate species that fit the environment: species go extinct all the time.

Thus, drawing inspiration from nature is a bad idea in this case. That, however, is not an opinion shared by conferences like this.

Note that drawing inspiration from nature is not always a bad idea: animals are very good at vision, and thus drawing inspiration from them to develop algorithms for computer vision is a good idea.