Genetic Programming

Obviously "Genetic Programming" is a specialist topic, only of interest to scientists investigating DNA structure.  WRONG!   At a recent (April 3rd) NZ Computer Society meeting, Harvey Lockie showed us how Genetic Programming can provide effective solutions to some common problems. 

 

For example: imagine that you have a wealth of sales data telling you how much you sold of each product each month for several years, and you are trying to predict sales next month.  The formula for predicting the sales in a future month is:

W1F1 + W2F2 + W3F3 + … + WnFn

where W1 is the weighting coefficient of Factor 1, etc.  F1 might be the sales last month, F2 the sales this month last year, F3 the average monthly temperature, …  whatever.   Unfortunately, you don't know which factors are important (that is, have high weighting coefficients) and which are unimportant.  How do you work out the right answer? 
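As a minimal sketch, the weighted sum above can be written in a few lines of Python. The function name, the factor values and the weights are all made up for illustration; none of them come from Harvey's presentation.

```python
def predict_sales(weights, factors):
    """Return W1*F1 + W2*F2 + ... + Wn*Fn for one forecast."""
    if len(weights) != len(factors):
        raise ValueError("need exactly one weight per factor")
    return sum(w * f for w, f in zip(weights, factors))


# Hypothetical factor values for one product:
# sales last month, sales this month last year, average monthly temperature.
factors = [120.0, 100.0, 18.0]

# Hypothetical weights: the first two factors matter equally, temperature not at all.
weights = [0.5, 0.5, 0.0]

forecast = predict_sales(weights, factors)
print(forecast)
```

The hard part is not evaluating the formula but choosing the weights, which is the problem the rest of this article is about.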

 

Computing 101: "You can't solve a problem until you understand how".  So you study the problem until you can develop an algorithm to solve it.

 

Genetic programming takes a different approach. Instead of designing the best solution up front, you try out a number of potential solutions. These candidate algorithms are automatically tested for their ability to solve the problem. Mimicking evolution, the more successful solutions breed and mutate through many generations, while the less successful die out.  Eventually the system evolves towards a good, often near-optimal, solution. 
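The breed, mutate and die-out cycle described above might look something like the following Python sketch. It is an illustration only, not Harvey's implementation: the five weight levels, the population size, the mutation rate, the one-point crossover and the synthetic sales history are all assumptions chosen to keep the example small.

```python
import random

# Assumed: each weight may take one of 5 discrete "states".
LEVELS = [0.0, 0.25, 0.5, 0.75, 1.0]


def error(weights, history):
    """Sum of squared forecast errors over (factors, actual_sales) records."""
    return sum(
        (sum(w * f for w, f in zip(weights, fs)) - actual) ** 2
        for fs, actual in history
    )


def evolve(history, n_factors, pop_size=40, generations=60,
           mutation_rate=0.1, seed=0):
    """Evolve a list of weights (n_factors >= 2) that fits the history well."""
    rng = random.Random(seed)
    population = [[rng.choice(LEVELS) for _ in range(n_factors)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Test every candidate; the less successful half dies out.
        population.sort(key=lambda w: error(w, history))
        survivors = population[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_factors)       # breed: one-point crossover
            child = mum[:cut] + dad[cut:]
            child = [rng.choice(LEVELS) if rng.random() < mutation_rate else gene
                     for gene in child]             # mutate: occasional random change
            children.append(child)
        population = survivors + children
    return min(population, key=lambda w: error(w, history))


# Synthetic demo data (an assumption): sales generated from hidden "true" weights.
data_rng = random.Random(1)
true_weights = [0.75, 0.25, 0.0, 0.5]
history = []
for _ in range(24):
    fs = [data_rng.uniform(0, 100) for _ in true_weights]
    history.append((fs, sum(w * f for w, f in zip(true_weights, fs))))

best = evolve(history, n_factors=len(true_weights))
print(best, error(best, history))
```

Because the survivors are carried over unchanged, the best candidate never gets worse from one generation to the next. Nothing in the sketch guarantees that the final answer is the true optimum.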

 

This approach sounds as if it ought to be very inefficient. In fact, it is capable of throwing up novel solutions to otherwise intractable problems, sometimes with significant savings in both programmer and machine time. Harvey reports cases of the genetic approach taking 30 seconds compared with 17 hours for an exhaustive search of a problem with 12 factors and 5 states for each factor.  With 24 factors, the figures are even more dramatic: 45 seconds compared with a calculated 500,000 years for an exhaustive search.
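The exhaustive-search figures can be sanity-checked with a little arithmetic. The sketch below assumes that every combination takes the same time to evaluate, so the rate implied by the 17-hour run carries over unchanged to the 24-factor problem; that assumption is mine, not Harvey's.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365.25 * 24 * SECONDS_PER_HOUR


def combinations(n_factors, states=5):
    """Number of candidates an exhaustive search must try."""
    return states ** n_factors


small = combinations(12)    # 5**12 = 244,140,625 candidates
large = combinations(24)    # 5**24, roughly 6 x 10**16 candidates

# Evaluations per second implied by finishing the 12-factor search in 17 hours.
rate = small / (17 * SECONDS_PER_HOUR)

# Time for the 24-factor search at the same rate, expressed in years.
years = large / rate / SECONDS_PER_YEAR
print(f"{small:,} candidates at about {rate:,.0f} per second")
print(f"24 factors: about {years:,.0f} years")
```

Under that assumption the 24-factor search comes out at roughly 470,000 years, the same order as the calculated 500,000 years quoted above. Doubling the number of factors squares the number of candidates, which is why the exhaustive approach collapses so quickly.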

 

The genetic approach is not always suitable.  It's great if you want a good (95%) solution quickly, but if you want to be certain that you have the best result there may be no alternative to the 17-hour exhaustive search.  The approach only works when it is possible to see progress towards the solution - you can't use it for decryption, for example, because an almost-right key looks no better than a completely wrong one.  However, in the right situation genetic algorithms can provide very effective solutions.  Healthsoft's software for pharmacy and retail stock management, used in some 3000 sites worldwide, uses the technique.

 

For more information, Harvey's presentation is available at http://www.nzcs.org.nz/events/documents/april_wfia%20preso.pdf, and he can be contacted at Harvey@lockie.co.nz.

SWEBOK.  

A project to codify the Body of Knowledge for Software Engineering may do for our profession what the PMBOK has done for Project Management.

 

When I started working in Information Technology, "project management training" simply meant training in the use of scheduling software. It was assumed that this was all one needed to know to be a competent project manager. Since then, project management has come to be recognized as an important profession: there is a significant difference between "A professional project manager" and "A competent MS-Project user". 

 

An important factor in the recognition of project management as a profession has been the role of the Project Management Institute, and its PMP (Project Management Professional) qualification.  PMP has become a prerequisite for being considered for senior project management positions and assignments. The PMP exam is based on the "Project Management Body of Knowledge", or PMBOK. 

 

The PMBOK is not a textbook on project management; it is simply a catalogue of the subject areas within project management.  For example, there is a chapter on Project Risk Management. Within this you will find the topics that you have to understand about managing risk.  PMBOK does not tell you what the risks of your project are, or how best to manage these risks; it just lists the issues to be understood in Risk Management. The PMBOK is not much use as a textbook (you can't get PMP by studying the PMBOK alone), but it is excellent at defining what knowledge you'd expect from a professional project manager.

 

A project is currently under way to similarly define and codify the Software Engineering Body of Knowledge.  A draft SWEBOK has been produced and can be downloaded from www.swebok.org. The current edition is "Version 0.95" (May 2001).  SWEBOK reviewers include people from around the world, including some in New Zealand.  The SWEBOK is already being used as the basis for some Masters-level degree curricula, and it seems highly probable that professional qualifications analogous to PMP (SWEMP?) will become established.  My prediction:  within 5 years, "SWEMP" will be an essential qualification for IT consultants and managers.

 

There may be some within the IT industry who fear that a PMBOK-like approach will force us into waterfall-like processes and away from the agile methodologies that are much more effective. In theory, the PMBOK allows and supports any methodology. Its section on IT development talks about there being many different accepted methodologies, and includes diagrams of Waterfall and Muench's spiral lifecycle merely as examples.   In practice, however, it tends to subtly suggest formality and leans towards the rigidity and bureaucracy of construction methodology. The PMP certification process is also deficient - you study the material, and you learn how to pass the exam, which is multi-choice (set up for automatic marking).  There were several cases of "Don't question, just learn the right answer". In spite of this, I have found the PMP a very useful qualification and tool. Personally, I have found that my PMP training has made me a more effective project manager even though I have continued to use agile methods.  I have had assignments where it was the key to success.  In other cases, having PMP has allowed me to ignore the rules with authority! PMP is no substitute for PM experience and sound common sense.  On the other hand, the issues of risk management, scope management, etc. don't go away with agile development; they are merely handled in a different (and more effective) way. 

 

SWEBOK will similarly be a list of topics, and software professionals should be familiar with all of them.  Although there will no doubt be explicit statements that SWEBOK is just presenting a topic list, I expect that there will be a similar implication that formality is a good thing, and it is probable that SWEBOK will have an air of "CMM" about it rather than of "Agile methods".  Nevertheless, it will be very useful.  I agree strongly with Steve McConnell - the best answer is neither agile methodologies, nor CMM and maturity models, but having a good toolbox and knowing which tool to use for which problem.

Inspections, Structured Walkthroughs, and Pair Programming.  

The evidence is strong that reviewing of one form or another is a very cost-effective way of improving productivity and quality, yet remarkably few development groups use reviewing as a matter of course.  Perhaps this is a consequence of our initial programming training, where collaborating on our coursework was cheating. More probably it's insecurity: if suggestions for improvement feel like personal criticism then you'll naturally want to hide your work until it's "finished".  Reviewing can easily become an ego trip for the senior programmers, and an ordeal for the juniors.  Once this happens, the value of open collaboration flies out the window.

 

In 1998 I introduced reviewing to a consulting client with a PowerPoint presentation based on a number of sources, including Chapter ?? of Steve McConnell's book "Code Complete".  At that time three main types of reviewing were distinguished: formal inspections, structured walkthroughs, and code inspection.  Since then, the eXtreme Programming movement has introduced a fourth method, pair programming.

 

How does the productivity of pair programming compare with other reviewing methods?  In pair programming, as described by Beck in his book "Extreme Programming Explained", one of the pair codes while the other watches (on the same computer).   Does this mean that productivity is halved, or even worse if there are unproductive discussions?  Yet XP advocates claim that productivity increases! 

 

I explored this topic in an email conversation with Bryan Dollery, who writes a regular column on XP in NZ Computerworld.  Bryan's experience is that pair programming doubles development speed, but with two developers in place of one this results in neither productivity gain nor loss.  Bryan believes that there are substantial downstream benefits in increased quality.  Of course this is anecdotal evidence. Like almost every other developer (including me) Bryan has not undertaken controlled experiments in which the effects of programmer skill, problem difficulty, or experience are removed in order to see how much productivity gain or loss can be attributed to pair programming. 

 

Bryan referred me to a paper from the University of Utah, "Strengthening the Case for Pair Programming", which, as far as we know, is the only real study on the productivity of pair programming.  This study found that pairs initially produced results 40% faster, for a total programming time 60% greater. After some experience with pair programming, total programming time improved to a minimum of 15% greater.  Although more time is taken, pairs produce a better product - fewer defects, better algorithms, cleaner design. 

 

In comparison, other inspection methods have been shown to increase productivity.  Most of the data is from formal inspection, probably because formal inspection is more likely to be used in an environment with a strong metrics culture.  Does this mean that formal inspection is a better approach than pair programming?  Not necessarily.  

 

The most obvious issue is that it is not clear that the benefits of the higher-quality product are properly accounted for in the productivity results reported for pair programming. If the cost of fixing the errors is included, would pair programming come out ahead? 

 

Another issue is that productivity improvements are documented in more formal development environments.  Agile developers may argue that this is hardly surprising - the potential productivity improvements are very great if you are using an inefficient methodology.

 

Whatever the answer, pair programming may have real advantages in promoting the collaborative approach that makes any inspection technique work.  Inspection techniques imposed from above without real acceptance by developers are often ineffective and soon discarded.  In contrast, pair programmers feel nervous when they are forced to work alone, missing the enjoyment of teamwork, and pointing to their experience that lower quality systems are developed when working alone.  I have a theory that the optimum approach is a mixture of inspection techniques, but that better results will be obtained when developers start with pair programming and then back off, than if they start with no inspection and introduce classical inspections.  I have no evidence whatsoever for this theory - I would be interested in others' opinions.

 

In any event, it is clear that the major success factor is not the choice of a particular technique, but the adoption of the appropriate attitude.  Pair programming's greatest contribution may be in fostering this attitude.

 

More:    My presentation on reviewing techniques. 

            For a description of classical reviewing techniques, and some productivity evidence, see Chapter 24 of "Code Complete", by Steve McConnell

            Bryan Dollery's Web page - contains copies of all his NZ Computerworld papers on XP.

Book Review - "High St@kes, no Prisoners", by Charles H. Ferguson

An insider's account of Silicon Valley, and the way it really works.  Charles Ferguson started Vermeer Technologies and turned his BIG IDEA into FrontPage, which he sold to Microsoft 13 months later for $US130m.  A fascinating insight into the politics and old-boy networks of the world of venture capital and high-tech start-ups, this book paints an unflattering picture of venture capitalists and high-flying managers in general, and several in particular.  I'm not sure how the author gets away with some of his statements without being sued - perhaps he's rich enough to not care - but it makes for entertaining reading.   Well worth reading if you want to know how "The Valley" works.   * * *

Absolute Powerpoint

Do you sometimes feel that we've lost the art of conversation and persuasion, that we no longer have discussions but instead we give and receive presentations?  You'll enjoy this!

http://www.physics.ohio-state.edu/~wilkins/group/powerpt.html

Visual Basic Programming Information

Something for the techos.  Here are a couple of web sites that we've come across recently in the development of PureBuild™.

 

Here is a link to a Windows API reference site for Visual Basic users.  It includes VB sample code and links to other sites as well.  However, the last update is March 2001, and the owner seems to have left for greener pastures.

 

Here is another link for VB APIs that is more up to date, and even better.  If their hit counter is to be believed, this site has been accessed more than 42 million times.  A goldmine!  Well organized - we were able to find what we wanted very quickly.

 

Thanks to Ian Marshall for these links.

Buzz-word Upgrade.  "The e-Business Ecosystem".

As companies extend their e-business strategies across the Web linking with customers, suppliers and partners, the result is a truly interdependent and interconnected environment.  A writer in Datamation has coined the term "The e-Business Ecosystem". 

http://cin.earthweb.com/public/article/0,,10493_1014031,00.html