And today's hard-learned lesson is: that's easy for you to say, but may be hard for me to do. In particular, I'm referring to generics, and some seemingly good advice: when moving to 2.0, replace all of your typeless collections with generic collections, so that you can prevent bugs in the future, and maybe even catch bugs that you have right now and don't even know it. Well, that advice is good, in general; but sometimes, it's not practical in existing code, for reasons of backward compatibility. But to understand why generics can be a problem, I'm going to have to lay some groundwork by explaining some terms.
First, let's talk about typeless collections, which were the only general-purpose collections available in .NET 1.0. The idea of a collection class is to provide the logic for maintaining a collection of related things. Common collections include: arrays (one or more dimensions of fixed size that define a set of related things); lists (one-dimensional collections, similar to one-dimensional arrays but resizable); hashes (a collection of things that can be looked up by some key value); and collections built on arrays, lists, and hashes. For instance, a stack is a list where the last thing added is the first thing removed, while a queue is a list where the first thing added is the first thing removed.
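If the stack-versus-queue distinction is fuzzy, a small sketch may help. It uses the typeless Stack and Queue classes from System.Collections; the class name, method names, and the "a", "b", "c" items are made up for illustration.

```csharp
using System;
using System.Collections;

public static class StackVersusQueue
{
    // Push every item, then pop until empty, recording the removal order.
    public static string StackOrder(params string[] items)
    {
        Stack stack = new Stack();
        foreach (string item in items)
        {
            stack.Push(item);
        }

        string order = "";
        while (stack.Count > 0)
        {
            order += (string)stack.Pop() + " ";
        }
        return order.Trim();
    }

    // Same items added in the same sequence, but through a queue.
    public static string QueueOrder(params string[] items)
    {
        Queue queue = new Queue();
        foreach (string item in items)
        {
            queue.Enqueue(item);
        }

        string order = "";
        while (queue.Count > 0)
        {
            order += (string)queue.Dequeue() + " ";
        }
        return order.Trim();
    }

    public static void Main()
    {
        Console.WriteLine(StackOrder("a", "b", "c")); // c b a
        Console.WriteLine(QueueOrder("a", "b", "c")); // a b c
    }
}
```

Notice the casts on the way out: both collections store and return plain objects, which is exactly the "typeless" part.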
Now it would be a lot of wasted effort to write the logic for a list of dates, and then rewrite that logic for a list of dollar amounts, and then rewrite it yet again for a list of dog breeds. List logic (or array logic, or hash logic) should be independent of what is contained in the list.
There are two common solutions for separating collection logic from content logic. One is to make the collection typeless. In .NET, for example, all types ultimately derive from the System.Object type; so the logic in System.ArrayList (the standard list class) is written to work on System.Objects. That way, if you need a list of dates, you create an ArrayList, and you put dates in it. If you need a list of dollar amounts, you create an ArrayList, and you put dollar amounts in it.
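As a rough sketch of what that looks like in practice (the class name, method names, and sample values here are invented for the example), the same ArrayList class serves both purposes, and the price is a cast every time you take something out:

```csharp
using System;
using System.Collections;

public static class TypelessListDemo
{
    public static DateTime FirstDate(ArrayList dates)
    {
        // The list hands back System.Object, so the caller has to cast,
        // and the cast is only checked when this line actually runs.
        return (DateTime)dates[0];
    }

    public static decimal Total(ArrayList amounts)
    {
        decimal total = 0m;
        foreach (object item in amounts)
        {
            total += (decimal)item;
        }
        return total;
    }

    public static void Main()
    {
        // One list class, two unrelated kinds of content.
        ArrayList dates = new ArrayList();
        dates.Add(new DateTime(2006, 3, 14));

        ArrayList amounts = new ArrayList();
        amounts.Add(19.99m);
        amounts.Add(5.01m);

        Console.WriteLine(FirstDate(dates).Year); // 2006
        Console.WriteLine(Total(amounts) == 25.00m); // True
    }
}
```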
Now the problem with this approach is it requires programmers to be perfect, and further requires that their communications with other programmers be perfect. Here's a smart business tip for you: never rely on programmers to be perfect. We're human. We make mistakes. It's very often smart to build our tools in such a way that they catch our mistakes, usually as soon as possible. (Why as soon as possible? Because if you make a mistake today and find it today, you'll remember what you were doing, and the mistake will be easier to fix. If you have to wait a month to find the mistake, you'll forget what you were doing. Besides, if the mistake is caught soon enough, there's no way to ship it to the customer.)
The most common mistake you can make with a typeless collection is to assume it contains one type, and then accidentally put the wrong type in it. Suppose you write a calendar program, and it maintains a list of dates. Now suppose I write an add-in for your program to automatically add the dates of dog shows into the calendar. And then suppose that, when your calendar calls my add-in and asks for dates, I accidentally give it a list of dog breeds. Now when your program tries to put those "dates" into the calendar, it will pull out Cocker Spaniel and say, "I don't know what date this is." Very possibly, your program will crash, and in a way that won't make it obvious that it was my bug, not yours. Or you could add code that checks everything I give you, as a form of defensive coding; but defensive coding adds overhead, and can itself have bugs.
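Here is a minimal sketch of that calendar scenario. Everything in it is hypothetical: GetShowDates stands in for my buggy add-in, and the two calendar methods stand in for the trusting and the defensive versions of your code.

```csharp
using System;
using System.Collections;

public static class CalendarAddInBug
{
    // Stands in for the buggy add-in: it was supposed to return dates.
    public static ArrayList GetShowDates()
    {
        ArrayList result = new ArrayList();
        result.Add("Cocker Spaniel"); // wrong type, yet this compiles
        return result;
    }

    // The trusting calendar: casts each item and hopes for the best.
    public static bool TryAddToCalendar(ArrayList dates)
    {
        try
        {
            foreach (object item in dates)
            {
                DateTime date = (DateTime)item; // fails at run time for a string
                Console.WriteLine(date.ToShortDateString());
            }
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    // The defensive calendar: checks every item before using it.
    public static int CountValidDates(ArrayList dates)
    {
        int valid = 0;
        foreach (object item in dates)
        {
            if (item is DateTime)
            {
                valid++;
            }
        }
        return valid;
    }

    public static void Main()
    {
        ArrayList fromAddIn = GetShowDates();
        Console.WriteLine(TryAddToCalendar(fromAddIn)); // False
        Console.WriteLine(CountValidDates(fromAddIn));  // 0
    }
}
```

The compiler is perfectly happy with all of this. The mistake only shows up when the cast runs, which is on your side of the fence, not mine.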
Wouldn't it be a lot easier if, when your calendar asked my add-in for its list, it could say, "And this list can only contain dates"? Well, that's one of many reasons that templates were added to the C++ language back in the 90s (a.k.a. The Ancient Years). Templates are a way of defining the behavior of a class while leaving some of the details unspecified; and then each time you use a template, those missing details are filled in. The classic example is the template List<T>, where T stands for an unspecified type. List<T> defines all of the list management rules, but doesn't care what T is; and then List<Date> uses those rules to make a list of dates, while List<DogBreed> is a list of dog breeds. If your calendar asks for a List<Date>, I can't give it back a List<DogBreed>: the code won't compile, so I can't ship it. Or if your calendar provided my add-in with a List<Date> and asked me to fill it, I couldn't put CockerSpaniel in that list. Again, the code wouldn't even compile.
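The same idea, sketched in C# 2.0 terms rather than C++: I'm assuming System.DateTime as the stand-in for Date, and the class and method names are invented. The interesting line is the one that's commented out, because uncommenting it turns my bug into a compile error.

```csharp
using System;
using System.Collections.Generic;

public static class TypedListDemo
{
    // The return type now promises dates, and the compiler enforces it.
    public static List<DateTime> GetShowDates()
    {
        List<DateTime> result = new List<DateTime>();
        result.Add(new DateTime(2006, 6, 10));
        // result.Add("Cocker Spaniel"); // does not compile: a string is not a DateTime
        return result;
    }

    // No casts needed: every item is already known to be a DateTime.
    public static int CountDatesInYear(List<DateTime> dates, int year)
    {
        int count = 0;
        foreach (DateTime date in dates)
        {
            if (date.Year == year)
            {
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDatesInYear(GetShowDates(), 2006)); // 1
    }
}
```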
Now everyone acknowledges that templates are a better solution than typeless collections (in almost all cases); but when .NET 1.0 was developed, the team recognized that there were complications with templates that were larger than they were willing to deal with at the time. I won't try to defend this decision, but I will point out: they were trying to build a new, managed run-time, like Java (which also did not have generics until very recently); they were trying to make it support a wide range of languages (whereas C++ templates are only a C++ concern); and they were trying to build those languages and tools and a framework for developing all sorts of applications. They couldn't do everything in 1.0; and so somebody decided templates were out.
Now in .NET 2.0, we have generics, which are very much like templates (enough so that I can't really keep the difference straight). And that leads us to the advice that we should pretty much always prefer generics over typeless collections, and we should convert our old typeless collections to generics as soon as possible. Well, I stand by the first half of that advice for new development; but for legacy code, I found that it wasn't so easy. And ultimately, that seemingly simple advice proved to be more work than I could handle for version 1.5 of Tablet UML (coming soon!).
The first problem is that I had some methods that operated on those typeless collections. For instance:
    private void SelectMembers(
        ArrayList members, ArrayList newSelection,
        ArrayList selected, ArrayList unselected)
The purpose of this method is to fill two lists, selected and unselected, based on two other lists. members contains the complete set of items, selected or unselected. newSelection contains the list of things which must end up in selected; and anything which doesn't end up in selected must end up in unselected. Note that this method doesn't care what's in each list at all, and relies on programmer discipline to ensure that all lists contain compatible types.
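To make that concrete, here is a guess at what a method like this might look like. The body is an assumption reconstructed from the description above, not the actual Tablet UML code; in particular, clearing the output lists first and matching items with Contains are my choices for the sketch, and the method is public and static here only so it can run on its own.

```csharp
using System;
using System.Collections;

public static class TypelessSelection
{
    // Assumed behavior: every member lands in exactly one output list.
    public static void SelectMembers(
        ArrayList members, ArrayList newSelection,
        ArrayList selected, ArrayList unselected)
    {
        selected.Clear();
        unselected.Clear();

        foreach (object member in members)
        {
            // Contains compares with Equals, so the method never needs
            // to know what kind of thing it is sorting.
            if (newSelection.Contains(member))
            {
                selected.Add(member);
            }
            else
            {
                unselected.Add(member);
            }
        }
    }

    public static void Main()
    {
        ArrayList members = new ArrayList(new string[] { "Class1", "Class2", "Class3" });
        ArrayList newSelection = new ArrayList(new string[] { "Class2" });
        ArrayList selected = new ArrayList();
        ArrayList unselected = new ArrayList();

        SelectMembers(members, newSelection, selected, unselected);

        Console.WriteLine(selected.Count);   // 1
        Console.WriteLine(unselected.Count); // 2
    }
}
```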
Now I could imagine converting this to a generic method, something like this:
    private void SelectMembers(
        List<T> members, List<T> newSelection,
        List<T> selected, List<T> unselected)
But the problem with this is that T has to be declared somewhere, and I took that to mean the class level, not the method level. (C# 2.0 does in fact allow a type parameter on a single method, written SelectMembers<T>.) So that meant the class itself had to be a generic class. And since the class in question was a form, that meant I had to create a generic form.
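For the record, C# 2.0 will accept the type parameter on the method alone, which keeps the containing class non-generic. This is a sketch of that alternative, not what I actually built; the body carries the same assumptions as any reconstruction of SelectMembers (clear the outputs, match with Contains), and the class name is invented.

```csharp
using System;
using System.Collections.Generic;

public static class GenericSelection
{
    // The <T> after the method name declares the type parameter here,
    // so the containing class does not need to be generic.
    public static void SelectMembers<T>(
        List<T> members, List<T> newSelection,
        List<T> selected, List<T> unselected)
    {
        selected.Clear();
        unselected.Clear();

        foreach (T member in members)
        {
            if (newSelection.Contains(member))
            {
                selected.Add(member);
            }
            else
            {
                unselected.Add(member);
            }
        }
    }

    public static void Main()
    {
        List<string> members = new List<string>(new string[] { "Class1", "Class2", "Class3" });
        List<string> newSelection = new List<string>(new string[] { "Class2" });
        List<string> selected = new List<string>();
        List<string> unselected = new List<string>();

        // T is inferred as string from the arguments.
        SelectMembers(members, newSelection, selected, unselected);

        Console.WriteLine(selected.Count);   // 1
        Console.WriteLine(unselected.Count); // 2
    }
}
```

All four lists must now agree on T, so passing a List<DateTime> alongside a List<string> is rejected by the compiler instead of by programmer discipline.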
Now all of that was possible, and in the end I got it all done; but it made the conversion jump from a simple search-and-replace to a "Now find every instance and every call, and make sure it has a generic parameter." That was a much bigger job than I signed on for.
And in the end, I tore it all out.
It actually worked pretty well, and I thought it was a nice design improvement; but when I went to save my model, the .NET serialization engine threw an exception, roughly paraphrased: "The Soap serializer does not support generics." As Richard Hale Shaw pointed out, "Soap can't support Generic anything per se, 'cause it has to remain platform-independent — Generics aren't!!" And later he added, "Why not use the BinarySerializer? It'll be smaller/faster as well."
To which my response is: that's easy for you to say! (You knew I'd get around to that eventually.) See, while my customer base for Tablet UML is small right now, I still have to think about backward compatibility for their files. And I started Tablet UML using Soap format (i.e., XML) for storing model files. That's not a mistake I'll make again (unless human readability or platform independence is a necessary feature); but it's the mistake I made, and it's the mistake I'm stuck with.
For now. At some point, I'll work on a conversion utility to go from large XML files with ArrayList to smaller binary files with List<T>.
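I haven't written that utility, but the in-memory half of it might look something like this sketch: take an ArrayList that came out of an old file and copy it into a typed list, complaining loudly about anything that doesn't belong. The class and method names are hypothetical, the file reading and writing is left out entirely, and rejecting null entries is a simplification.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class LegacyListConverter
{
    // Copies a typeless list into a typed one, failing on the first
    // item that is not a T (null entries are rejected too).
    public static List<T> ToTypedList<T>(ArrayList legacy)
    {
        List<T> typed = new List<T>(legacy.Count);
        foreach (object item in legacy)
        {
            if (!(item is T))
            {
                string found = (item == null) ? "null" : item.GetType().Name;
                throw new InvalidCastException(
                    "Expected " + typeof(T).Name + " but found " + found);
            }
            typed.Add((T)item);
        }
        return typed;
    }

    public static void Main()
    {
        ArrayList legacy = new ArrayList();
        legacy.Add(new DateTime(2006, 6, 10));
        legacy.Add(new DateTime(2006, 7, 22));

        List<DateTime> dates = ToTypedList<DateTime>(legacy);
        Console.WriteLine(dates.Count); // 2
    }
}
```

This is where the defensive checking from the typeless world gets paid for once, at load time, instead of at every use.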
Hey, if programming were easy, everyone would be doing it.