Dan Byström’s Bwain

Blog without an interesting name

Archive for the ‘Nostalgia’ Category

Real Programmers don’t use Pascal – 25 year annivarsary

Posted by Dan Byström on July 31, 2008

I just happened to notice that the legendary text “Real Programmers Don’t Use Pascal” was published exactly 25 years ago this month (and I just have a couple of hours left before this month ends, so I thouht I had to write up something quickly). You can often find it on the Net filed under “humor”. This can, of course, only be done by Quiche Eaters! Sometimes someone has replaced Pascal with Visual Basic. I have no problem with that. 🙂

Among all the words of wisdom, I found these:

  • Real Programmers aren’t afraid to use GOTOs.
  • Real Programmers can write five page long DO loops without getting confused.
  • Real Programmers like Arithmetic IF statements– they make the code more interesting.
  • Real Programmers write self-modifying code, especially if they can save 20 nanoseconds in the middle of a tight loop.
  • Real Programmers don’t need comments– the code is obvious.
  • Since Fortran doesn’t have a structured IF, REPEAT … UNTIL, or CASE statement, Real Programmers don’t have to worry about not using them. Besides, they can be simulated when necessary using assigned GOTOs.

I guess that the number of programmers out there who have written (or even know what it means to write) arithmetic IF-statements or self-modifying code are decreasing rapidly by the day. And writing a five page long DO loops without getting confused may be a cool party trick, but nothing to applause in real work. But the remaining two points are something I’d like to comment!

Real Programmers aren’t afraid to use GOTOs

Well, why on earth should they be? Really? This “truth” seems to be something that all student know by heart. Thou shalt not use GOTO. Thou shalt not use GOTO. Without even understanding why GOTO was such a big problem in the first place. In a structured language it is not. How could it be that more students seem to know about this silly “non-problem” while at the same time failing to understand the most fundamental thing about programming: Thou shalt never duplicate code. Thou shalt not use copy-and-paste. I just wonder…

Real Programmers don’t need comments– the code is obvious

I think this is almost true! It only needs slight modification:

Real Programmes write code in such a way that the code becomes obvious

I have come to realise that when I download a piece of code from the Net, the first thing I usually have to do is delete all the comments so that I can read the code!!! Most of the time, comments seem to have been put there just “because you must write comments”. Then after a time, changes are made to the code and the comments are not updated, leaving any reader totally confused! Skip the comments and write clean code with long descriptive methods and variable names I say!

(This goes for much of the documentation too, I think. The code is the best documentations because it is “the truth”! This is one of the reasons Tests (Test Driven Development) is such a great thing: you get “true documentation” for free.)

Posted in Nostalgia, Programming | Leave a Comment »

Insomnia

Posted by Dan Byström on March 6, 2008

To iterate is human, to recurse divine.

L. Peter Deutsch

To iterate is boring, to recurse plesantly dizzying.

Dan Byström

Have you ever tried counting sheep when trying to sleep? For twenty years I’ve been utilizing a different method. I’ve been trying to solve the following problem:

How do you build a balanced binary search tree from a stream of sorted data of unknown size?

That is, given a stream of the letters: ABCDEFGHIJKLMNO, construct the following tree:

H
/ \
D L
/ \ / \
B F J N

/

\

/

\

/

\

/

\
A C E G I K M O

directly by reading the characters one by one and place them DIRECTLY in the correct position WITHOUT using any intermediate storage.

I’m confident that this has been thoroughly researched at least half a century ago, but I have never wanted to look up any solution – what’s the fun in solving a crossword puzzle if you look at the solution first?

This sleeping strategy has been very successful, up until last week, when baby Ellen gave up a horrible cry in the middle of the night – and while trying to go to sleep again I remembered this old problem. Accidentally I solved it and now it’s useless as a sleeping pill!!! But since I have a history with this problem for twenty years now, I thought I had to write a little memorial over it before letting go.

This will also be a tribute to the most powerful solving method I know of: Divide and Conquer!

Background

Two decades ago, when I was a student (that was at a time when I could also be called both young and promising) taking a programming course I chose to write a record register application using Turbo Pascal 3.0. This was a time when CD players were just as common a Bly-ray players today (that is: most people were still waiting for the prices to drop). Vinyl was the name of the game. (Turbo Pascal 3.0, by the way, was an incredible piece of work; the whole environment consisted of one file with the whopping size of 38kb containing the full runtime system, the compiler and the source code editor! You read right: 38kb – not 38Mb!)

I wanted this application to be able to run smoothly with its data files residing on a floppy disc (and of course, memory consumption had to be low). I chose to use binary search trees to keep track of the albums, artists and song titles. This worked well and it was quite fun figuring out all the things that were not in the books, like: given an arbitrary node in a tree, how do you implement a “next” function that will bring you to the next node in sort order when the user hits the “next” button? Not completely obvious the first time you try to acquaintance yourself with the concept of trees. Adn then embarrasingly simples once you figured it out.

However, there was this one thing that I never bothered to figure out, but took an easy way out. That has nagged me ever since.

Prequisities

Let’s declare a simple binary tree node:

public class Node<T> where T : IComparable<T>
{
  public readonly T Data;
  public Node<T> Left;
  public Node<T> Right;
  public Node( T data )
  {
    Data = data;
  }
}

Then let’s define a data pump that will read data/records/entities (or whatever we like to call them) from some kind of stream. The important thing here is that there is no look-ahead facility:

interface IStreamingDataPump<T> where T : IComparable<T>
{
  T GetNextData();
  boolEOF { get; }
}

Maybe it helps our thinking if we have a concrete example implementation of the interface as well:

public class TestCharPump : IStreamingDataPump<char>
{
  string data = "ABCDEFGHIJKLMNO";
  int index = 0;
  public char GetNextData()
  {
    return data[index++];
  }
  public bool EOF
  {
    get { return index >= data.Length; }
  }
}

Obviously, this data pump will deliver the sorted characters A to O (in order) to a consumer, throwing an error if trying read past the last character.

The problem with inserting already sorted data

The standard way of inserting data into a binary tree:

public static Node<T> Insert( Node<T> tree, T data )
{
  while ( true )
  switch ( Math.Sign( tree.Data.CompareTo( data ) ) )
  {
    case 1:
      if ( tree.Left == null )
        return tree.Left = new Node<T>( data );
      tree = tree.Left;
      break;
    case -1:
      if ( tree.Right == null )
        return tree.Right = new Node<T>( data );
      tree = tree.Right;
      break;
    case 0:
      throw new Exception( "Duplicate keys!" );
  }
}

If we read all the data from our stream containing sorted data and use the method above:

Node<T> tree = _pump.EOF ? null : new Node<T>( _pump.GetNextData() );
while( !_pump.EOF )
  Node<T>.Insert( tree, _pump.GetNextData() );

…then we effectively end up with a linked list instead of a binary tree:

A
B
C
D
E
F
G
H
I
J
K
L
M
N
O

The standard(?) solution

So, what I’ve did in my record register application was something like this:

public Node<T> CreateTreeFromList( List<T> list, int left, int right )
{
  if ( left > right )
    return null;
  int middle = (left + right) / 2;
  Node<T> tree = new Node<T>( list[middle] );
  tree.Left = CreateTreeFromList( list, left, middle - 1 );
  tree.Right = CreateTreeFromList( list, middle + 1, right );
  return tree;
}

And then made use of that function like this:

List<Song> list = new List<Song>();
while( !_pump.EOF )
  list.Add( new Song( _pump ) );
Node<Song> tree = Node<Song>.CreateTreeFromList( list, 0, list.Count - 1 );

As you can see, I first read the whole stream into memory and then I construct the tree from there.

This will construct a perfectly balanced tree and the performance will be good. So what’s the problem? Well, actually there is no real problem, it is just that I feel that first storing the data in a vector, when it is a tree we want is an unnecessary step!!!

Another approach

So, given ABCDEFGHIJKLMNO, I want to read one character at a time and place it at the correct position in the balanced tree:

H
/ \
D L
/ \ / \
B F J N

/

\

/

\

/

\

/

\
A C E G I K M O

And again, the number of nodes must not be known in advance!

So let’s think about it. Reading A into a node seems uncomplicated. Then we should read B and somehow realize that it must be placed in such a manner that it will become the new tree root and that the previous tree root should become the left node of the new root.

Reading C and placing it as the right node of the current tree root seems fair enough, and the next node D should be handled similar to the B node… but now it gets strange, the E node should be placed in a void because F hasn’t been seen yet… and what if there IS no F node? If E is the last element in the stream it should come directly under D but not if the F appears.

And thinking about what will happen after the G is read seems even stranger, although there clearly isa pattern here if you say “EFG” will be handled exactly like “ABC”, and in fact, B and F behaves similar… so… what makes A and E alike… aaaa….

This is were I used to fall asleep. 🙂

Up until a night last week, then my brain got bored of having to start all over again with this problem and thought that it was best to settle the thing once and for all. There was no quantum leap – just the most common tool we use every day: divide and conquer!

With the aid of first one and then two helper functions the solution was suddenly crystal clear.

Reasoning away…

Firstly, one thing that disturbs my thinking a great deal is that we always have the risk of running into EOF. We can reduce that problem somewhat by a tiny helper:

public Node<T> ReadNodeFromStream()
{
  if( _pump.EOF )
    return null;
  return new Node<T>( _pump.GetNextData() );
}

Thinking about it, what we now have is a function that can create a zero or one element sized tree. Not bad really! 😉 Also note one other important fact: we are allowed to call this many times after reaching EOF – it will return null every time.Can we extend this into a three element sized tree (=height of two)?

public Node<T> thinkingOutLoud_1()
{
  Node<T> left = ReadNodeFromStream();
  Node<T> tree = ReadNodeFromStream();
  Node<T> right = ReadNodeFromStream();
  tree.Left = left;
  tree.Right = right;
  return tree;
}

Well, not quite… this will work if there are at least two elements in the stream, but otherwise it will crash. It can be fixed very easily though:

public Node<T> thinkingOutLoud_2()
{
  Node<T> left = ReadNodeFromStream();
  Node<T> tree = ReadNodeFromStream();
  if ( tree == null )
    return left;
  Node<T> right = ReadNodeFromStream();
  tree.Left = left;
  tree.Right = right;
  return tree;
}

Spot on! What about a next step… modifying the method to create a one or two level high tree?

public Node<T> thinkingOutLoud_3( int height )
{
  if ( height == 1 )
    return ReadNodeFromStream();
  Node<T> left = ReadNodeFromStream();
  Node<T> tree = ReadNodeFromStream();
  if ( tree == null )
    return left;
  Node<T> right = ReadNodeFromStream();
  tree.Left = left;
  tree.Right = right;
  return tree;
}

This method knows how to create two different sized trees, and it has no trouble at all dealing with EOF. But wait a minute… can’t we just as well write it this way, it wouldn’t change a thing if we call it with parameter of 1 or 2:

public Node<T> thinkingOutLoud_4( int height )
{
  if ( height == 1 )
    return ReadNodeFromStream();
  Node<T> left = thinkingOutLoud_4( height-1 );
  Node<T> tree = ReadNodeFromStream();
  if ( tree == null )
    return left;
  Node<T> right = thinkingOutLoud_4( height-1 );
  tree.Left = left;
  tree.Right = right;
  return tree;
}

But what if we call it with a desired height of three… or four…? Wow! I think that we safely can rename this method:

public Node<T> CreateTreeOfDesiredHeight( int height )
{
  if ( height == 1 )
    return ReadNodeFromStream();
  Node<T> left = CreateTreeOfDesiredHeight( height-1 );
  Node<T> tree = ReadNodeFromStream();
  if ( tree == null )
    return left;
  Node<T> right = CreateTreeOfDesiredHeight( height-1 );
  tree.Left = left;
  tree.Right = right;
  return tree;
}

AHHH! Recursion. (Said like Homer Simpson.)

If we knew the length of the stream we could create our tree like this:

Node<T> tree = createTreeOfDesiredHeight(
  (
int)Math.Ceiling( Math.Log( _pump.Length ) / Math.Log( 2 ) ) );

But we don’t. So exactly how does this help us anyway? It actually helps quite a lot – we’re almost done. We just have an iterative method left. And since iterative methods are boring, I won’t say much about it:

public Node<Data> CreateBalancedTree()
{
  Node<Data> left = ReadNodeFromStream();
  for ( int height = 1 ; ; height++ )
  {
    Node<Data> tree = ReadNodeFromStream();
    if ( tree == null )
      return left;
    tree.Left = left;
    tree.Right = CreateTreeOfDesiredHeight( height );
    left = tree;
  }
}

Well, maybe I should say something…

  1. Create a “left tree” of height one and then try to tuck on a root above it.
  2. If there are no more elements, then out left tree is the tree and we´re done.
  3. Otherwise create a right tree of the same height as our left tree
  4. Name our newly created tree the new “left tree” and repeat step two

Note that the key to this solution is step 3 – creating a subtree of a desired height. The part that made me pleasantly dizzy! 🙂

But… wait a minute… is this really it? Isn’t it strange that we now use completely different pieces of code to produce the left tree and the right tree??? Indeed it is! The CreateTreeOfDesiredHeight method is only a helper method I used to arrive at the solution!!! Since our CreateBalancedTree method is obviously fully capable of of creating the left tree and there is no difference between a left tree and a right tree:

The solution

public Node<Data> CreateBalancedTree( int maxHeight )
{
  Node<Data> left = ReadNodeFromStream();
  for ( int height = 1 ; height<maxHeight ; height++ )
  {
    Node<Data> tree = ReadNodeFromStream();
    if ( tree == null )
      return left;
    tree.Left = left;
    tree.Right = CreateBalancedTree( height );
    left = tree;
  }
  return left;
}

Iteration begone! And since a caller doesn’t care about any maxHeight, we can create our final (overloaded) method as such:

public Node<Data> CreateBalancedTree()
{
  return CreateBalancedTree( int.MaxValue );
}

Again, it is so embarrassingly simple once you figured it out! And this tiny piece of (almost) trivial code eluded me for so long!

OK, so now this was really really old news I guess. And there are things to criticize as well. First of all, what’s the point really? The method of loading the stream into a vector works faster AND produces a better tree, since my solution actually isn’t balanced (it is in our case where the number of elements is 2N-1) – it just has the same height as a balanced tree. But I’m fine with that.

There seems to be a lot of misconceptions around recursion out there, like that it is slow and uses an excessive amount of stack memory. Of course, calculating Fibonacci numbers using recursion instead of iteration should only be done as an intellectual exercise. But the fact that people think that it’s slow may be due to reasoning like this:

Fact: A recursive algorithm is slower than an iterative method when an equal iterative method exist.
Conclusion: All recursive algorithms are slow.

This is of course exactly the same reasoning as:

Fact: All cheeses are round and the moon is round.
Conclusion: The moon is made out of cheese.

Well, well. I don’t know how to end this, but I’ll give it a try anyway:

Try and be nice to people, avoid eating fat, read a good book every now and then, get some walking in, and try and live together in peace and harmony with people of all creeds and nations.

Monty Python
The meaning of Life

Posted in Nostalgia, Programming | 4 Comments »

Quest Completed

Posted by Dan Byström on December 8, 2004

10,000 experience points gained.

I’ve done some serious searching in the .NET help files as well as hard core googling to find out how to do design time interaction with a UserControl. I thought this would be one of the first tings everyone would like to try out in .NET. 😉 But it was amazingly hard to come up with the relevant keywords. I found numerous of really really exciting stuff along the way, so this time was by no means wasted.

This is a reminder of how easily we (normally) can find information these days. Back in the eighties information was found in magazine articles, in books arriving three months after ordering them and by disassembling core products. Not a day passed without tracing and disassembling parts of MS-DOS and the BIOS. Those were the days.

In the nineties, all of a sudden, it was possible to use CompuServe to get in touch with other people to find answers! Then came news servers and finally the whole thing just exploded.

It was way back then, when I got a strange thing tossed in my hands, called MFC. It was called a framework and used some buzz-word technology called object orientation. Surely nothing a real programmer can benefit from I thought (and was proven right, right?), but anyway it was raining and I had nothing better to do so I did something that I had wanted to do for more than ten years: I wrote my own Qix-clone. What? Did I notice you shake your head and smacking your tongue??? You’re obviously no geek at all and shouldn’t be here! Now; just leave! Go on! 😉

The concepts of inheritance, abstract classes, virtual methods, overloading, overriding, aggregation and so on were all completely new to me, but I immediately felt totally at ease with them. I was thrilled realizing that, after creating my monster class, I could just inherit from it in order to get my fuse! Way cool I thought! I created my game vector-based and had to come up with a lot of vector manipulation routines. How do you calculate the area of a (closed) polygon? How do you find out if a point is inside or outside a (closed) polygon? I found the answer to the latter question and was immensely proud of it. I later found out that it was a well known algorithm, but anyway… nothing beats doing it yourself.

Qix remained on some backup disk for some years until another buzz-word appeared: Java Applets. Converting the MFC C++ code into Java wasn’t that hard, and boy was I lucky that I hadn’t relied on the Windows API to help me with regions. How do you fill a polygon with a bitmap without any framework support? Go figure! 😉 vdQix can still be played here, at some periods it has had more than 10,000 hits a month. But this story is not about Qix, instead it is the story of my “point inside a polygon” algorithm.

At roughly the same time, I was working on a VB project where we wanted nicer looking buttons than Sheridan’s controls bundled with VB3 could accomplish. We bought a package calls VB-Tools (more popularly called VB-Fools) which contained a vast number of controls. Each being so poorly designed and full of bugs that I got irritated enough to roll my own.

I ended up with a button control, a tooltip control, a control to monitor processes and a hotspot control (DBPush.vbx, DBTTip.vbx, DBAppMon.vbx and DBHots.vbx). They came out pretty well and I uploaded them to various places and they spread like wildfire. For years I had daily conversations with happy users. A few quotes can be found here. Very rewarding indeed. They almost even made it onto the MSDN CDs!

The DBHots.vbx was directly inspired by my “point inside a polygon” algorithm developed for my Qix game. It relied upon the fact that a VBX was able to interact with a user at design time and that a VBX could be transparent. VBXs were just great in their simplicity! Then along came ActiveX with 32-bit VB4 and both these features were gone! With VB5 and the OCX96 revision these features were put back in, but then it was too late. No ActiveX version of DBHots ever saw the light of day.

Now we’ve come full circle in explaining why I would like to do mouse interaction with a UserControl in .NET. Some things may lurk for years, waiting for the right moment. Creating a .NET version of DBHots is something that simply must be done before I can rest. The next thing I’d like to know if it is possible to have my UserControl not being created as a control at all at run-time, but rather become a component so I don’t have to waste a Window handle on it.

The class I was looking for? It had the most obvious name imaginable: ControlDesigner.

Posted in Nostalgia, Programming | Leave a Comment »