Code Writing Code

I’m not sure why I hadn’t thought of this sooner, but in a very small software company, a code generator is not only a great time saver, it can also be an invaluable tool for creating vast quantities of reusable, high quality code. This is assuming, of course, that you do a good job with the templates to begin with.

One of my first experiences with code generation was working at Clearwire. The web-based ordering software I was working on was under the unfortunate constraint of reporting to the sales and marketing team. Reporting to either of these groups alone would be a nightmare, but at Clearwire it was doubly so, as they were basically the same department. Even more difficult was that I was located in Buffalo, NY while the Sales/Marketing team was located in Dallas, Texas.

I rather clearly remember one particular nightmare I ended up getting involved in. Let me give you the short story.

Week 1:
VP: “Hey Mike, can you make up some suitable text for this web page?”
Mike: “No problem.”

Later that day:
VP: “Mike, can you change that text you made up from X to Y? I’d appreciate it, and keep up the good work.”
Mike: “Sure, and thanks!”

Week 2:
VP: “Mike, can you change the text from Y to X for me?”
Mike: “Sure, no problem.” (Heh. Should have done it my way to begin with)

Week 3:
VP: “Mike, got a favor to ask. Can you change the text from X to Y? That would be great.”
Mike: “Hmm. Well, I guess so.”

Week 4:
VP: “Mike, the text on that page gives the wrong impression to our customers, can you change it from Y to X?”
Mike: “It just was X and you had me change it.”
VP: “Really? Well, I’m sorry. Can you change it please?”
Mike: “All right.”

Week 5:
VP: “Hey, you got a second? Can you change the text from X to Y?”
Mike: “It just was Y.”
VP: “Who told you to change it?”
Mike: “Ummm. You did.”
VP: “Really? Oh, I’m sorry about that. Can you change it back please?”
Mike: “No.”
VP: “What? Why not?”
Mike: “Because I’ve got too much code to write to respond to trivial requests to change meaningless text that nobody reads anyway just because you woke up on a different side of the bed than you usually do after drinking all night and snorting pixie stix with your buddies!”

Chances are that I didn’t say that last sentence, but I was pretty annoyed at that point. Basically, I insisted that I wasn’t going to make any further text changes until I had it in writing from him that it was what he really wanted, because I really had too much to do to waste time on changing text every single week.

Shortly thereafter, I received a bit of an unofficial reprimand from my boss for giving the VP of Sales a hard time. Our little chat involved a discussion about how an entry level engineer should address a VP of the company with respect, no matter how stupid he’s being at the time, after which I can bring my concerns up with my boss who will address it with his boss, who will address it with the VP in a social environment of beer and pixie stix so as not to damage his frail little ego.

Fortunately, my point was well taken, I was not formally reprimanded, and future changes were submitted in writing. Apparently, it was my tone that was ill taken. I took it as a valuable lesson about how to reprimand your superiors for excessive stupidity.

Even so, when I was tasked with rewriting the application quite some time later, I built templates that would regenerate the application UI by tweaking configuration files and then running a script. I always found it an impressive little design I had come up with. Code that writes code. Very spiffy!
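To make the idea concrete, here is a minimal sketch of template-driven generation in Python. It is not the Clearwire code; the template, the `render_page` function, and the configuration keys are all made up for illustration. The point is only that the text lives in configuration, and the page is regenerated from it.

```python
from string import Template

# Hypothetical page template. $headline and $body are filled in from
# configuration, so a text change never touches the code itself.
PAGE_TEMPLATE = Template(
    "<html>\n"
    "  <body>\n"
    "    <h1>$headline</h1>\n"
    "    <p>$body</p>\n"
    "  </body>\n"
    "</html>\n"
)


def render_page(config):
    """Return the page source generated from a configuration mapping."""
    # substitute() raises KeyError if a placeholder has no value,
    # which catches an incomplete configuration early.
    return PAGE_TEMPLATE.substitute(config)


# Stand-in for values read from a configuration file.
config = {"headline": "Order Service", "body": "Text version X"}
print(render_page(config))
```

Switching the text from X to Y and back again then becomes a one-line configuration change followed by a regeneration run.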

Of course, whenever you write any code, you’re writing code that writes code because it eventually compiles into machine code. But that’s really not quite the same thing as ‘code that writes code’. Converting to machine code is merely compiling the source code you’ve written into something the machine can understand.

On the other hand, a true code generator will operate from a set of widgets to generate structured code that is usable in other applications, either directly or as a library that is compiled after being generated.

There are a lot of subtleties that I’m intentionally glossing over. For example, would software that unrolls the loops in JavaScript to make the page load and execute faster be considered a code generator? Probably not, and definitely not in my book.

Would that really work anyway? The size of the page grows in direct proportion to the number of loops that are unrolled, so the JavaScript parser takes longer to load, parse, and execute it. So no, probably not. Don’t laugh, I’ve seen this suggested, and no, it didn’t actually work.

The fact is that a good code generator equipped with the right generation templates can save you a ton of time and development costs. How much you ask? Well, this month I’ve been doing a lot of research on code generators. So much research in fact that I’ve been seriously neglecting this blog.

As a one-person startup company (soon to be expanding! woohoo!), it’s really hard to do everything, so any edge that you can find to get more done in less time is a welcome addition. This includes everything from outsourcing to automation and, yes, code generation.

I’m the type of person who is big into databases. I’ve been writing applications to interface with databases for nearly a decade. There are literally dozens of different schemes for doing so, and so far not one of them has been proclaimed by developers to be the ‘best’. The reason is simple.

Every data layer scheme involves tradeoffs. When you abstract the database to the point that it becomes no different than the code itself, you lose bare metal access. If you lean in the other direction and rely on the database to do some of the work, you lose the flexibility to use any database you like, because not everything works the same in every database.

Sure, the basics are the same, but do you really feel like writing case statements based on how to get the database date, based on the database you’re connecting to? I didn’t think so. Writing database independent code is hard, and rightfully so.
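For a taste of what those case statements look like, here is a small sketch in Python. The dialect names and the `current_timestamp_query` function are my own labels for illustration, and the table covers only a few databases.

```python
# Expression that returns the current date and time, per database.
# A deliberately incomplete sample; real applications hit many more
# differences than this one.
CURRENT_TIMESTAMP_SQL = {
    "sqlserver": "GETDATE()",
    "oracle": "SYSDATE",
    "mysql": "NOW()",
    "postgresql": "NOW()",
}


def current_timestamp_query(dialect):
    """Build a query that asks the database for its current date/time."""
    expression = CURRENT_TIMESTAMP_SQL[dialect]  # KeyError if unsupported
    if dialect == "oracle":
        # Oracle has traditionally required a FROM clause, hence DUAL.
        return f"SELECT {expression} FROM DUAL"
    return f"SELECT {expression}"


for name in CURRENT_TIMESTAMP_SQL:
    print(name, "->", current_timestamp_query(name))
```

Multiply that by every date function, paging syntax, and identity column quirk, and the appeal of generating one data layer per database becomes clearer.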

Database vendors don’t make it easy, nor would they want to. Do you really think that Oracle wants their customers to be able to swap SQL Server, MySQL, PostgreSQL, Firebird, or anything else in as the back end database whenever the customer decides they don’t want to pay $40k/processor per year for licensing fees?

The business justification for Oracle and Microsoft avoiding this sort of database swapping becomes even more relevant as computers move towards multi-cores. Oracle has decided pricing should be based on a per-core basis, while Microsoft prices SQL Server based on a per-socket basis. While obviously performance comes into play, if it were easy to swap one database for the other, pricing becomes a much more serious consideration.

I don’t think that there’s any doubt as to whether a code generator can save time if good templates are already available. The tool I’ve been concentrating on lately is CodeSmith, from CodeSmith Tools. It’s not real expensive, and it certainly has a lot going for it. As someone who has worked with SQL Server and web applications for years though, I find it somewhat lacking in the templates that I really want.

Fortunately, there’s a decent sized user community behind it, and all of the templates can be easily customized or rewritten. So if you’re like me and you don’t like the way a set of data access layer templates are written, you can rewrite them.

But is this saving time?

There are a lot of things going against you when trying to answer this question. You have to find the tools, you need to work with them to evaluate them, then you need to build the templates you want, and hope that it’s saving you time rather than wasting it. It’s possible that given all of that, you could potentially use more time. So, how do you know whether a code generator is right for you?

Simple math, my dear Watson.

CodeSmith has a nifty little ROI calculator, which is likely more for marketing purposes on their part than anything else. The project that I’m undertaking right now has something of a large database behind it, and after generating around 80,000 lines of code, CodeSmith estimated that I had saved 1,600 hours of work and nearly $100k in development costs, assuming 50 lines of code per hour and $60/hour for salary.
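The arithmetic behind that estimate is easy to check yourself. This is only my reconstruction from the numbers above, not CodeSmith’s actual calculator, and `estimate_savings` is a name I made up.

```python
def estimate_savings(lines_generated, lines_per_hour=50, hourly_rate=60):
    """Return (hours saved, dollars saved) for a given amount of generated code."""
    hours = lines_generated / lines_per_hour
    return hours, hours * hourly_rate


hours, dollars = estimate_savings(80_000)
print(f"{hours:,.0f} hours, ${dollars:,.0f}")  # 1,600 hours, $96,000
```

Both defaults are assumptions you should swap for your own numbers. At 50 lines per hour and $60 per hour, 80,000 lines works out to 1,600 hours and $96,000, which is where the “nearly $100k” comes from.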

That alone is justification for CodeSmith, or any other code generation tool for that matter. Just don’t spend 10 months working with and learning how to use the tool before you start on your project, or you’re back where you started. Even if other tools don’t include an ROI calculator, it’s pretty trivial to throw a Perl script together to count the lines of code that were generated for you.
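A line counter really is only a few lines in any scripting language. Here is a rough sketch in Python rather than Perl. The `.cs` extension is an assumption about what the generator emits, and blank lines are skipped so they don’t inflate the count.

```python
from pathlib import Path


def count_lines(root, extensions=(".cs",)):
    """Count non-blank lines in all matching files under root."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in extensions:
            # errors="replace" keeps odd encodings from stopping the count.
            with path.open(encoding="utf-8", errors="replace") as handle:
                total += sum(1 for line in handle if line.strip())
    return total
```

Point it at the output directory of your generator, for example `count_lines("generated")`, and divide by whatever lines-per-hour figure you believe in.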

The other great thing about code generation is that when you make changes to underlying application structures, you can simply regenerate the code from the templates. Did you add a column to a table in the database? Regenerate it, and all of your collections, enumerations, business objects, and everything else are updated quickly, cleanly, and flawlessly.
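As a toy illustration of that regeneration step, the sketch below builds a class from a list of columns. The column list is hard-coded here as an assumption; a real generator would read it from the database schema, and `generate_class` is a hypothetical helper, not part of any tool mentioned in this post.

```python
def generate_class(class_name, columns):
    """Return Python source for a simple business object.

    columns is a non-empty list of (column_name, type_name) pairs.
    """
    args = ", ".join(f"{name}: {type_name}" for name, type_name in columns)
    lines = [
        f"class {class_name}:",
        f"    def __init__(self, {args}):",
    ]
    for name, _ in columns:
        lines.append(f"        self.{name} = {name}")
    return "\n".join(lines) + "\n"


# Stand-in for a table definition read from the database.
columns = [("id", "int"), ("name", "str")]
print(generate_class("Customer", columns))

# Adding a column means adding one entry and regenerating.
columns.append(("email", "str"))
print(generate_class("Customer", columns))
```

The second run picks up the new `email` field everywhere the template uses the column list, with no hand editing.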

Wait, flawlessly? Well, maybe. You should still do testing to make sure the new code works, and you should verify that none of the other source files changed. This is best accomplished with source control and differencing source files. You are using source control, right? Since you already have a tool which writes code for you, why not use it to write unit tests for the various objects you’ve created? Why not indeed.
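Your source control tool will normally do the differencing for you, but the check is simple enough to sketch with Python’s standard `difflib` module. The file name and the sample sources below are made up for illustration.

```python
import difflib


def generated_diff(old_source, new_source, filename="Customer.cs"):
    """Return a unified diff between two generations of the same file."""
    return "".join(
        difflib.unified_diff(
            old_source.splitlines(keepends=True),
            new_source.splitlines(keepends=True),
            fromfile=f"before/{filename}",
            tofile=f"after/{filename}",
        )
    )


before = "class Customer\n{\n    public int Id;\n}\n"
after = "class Customer\n{\n    public int Id;\n    public string Email;\n}\n"
print(generated_diff(before, after))
```

An empty result means the file came out identical, which is exactly what you want to see for every file that should not have been affected by the change.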

Code generators can also be used to help write documentation. One of the first tools that I ever used for this was Javadoc. While it was meant explicitly for Java, it did the job. There are other tools out there like NDoc, Doxygen, and literally dozens more. Need to document the database for your application? There are tools for that too; some are free, and some are not. The documentation doesn’t need to ship with your product, but as an internal document, it can be invaluable for new hires.

Like all things, code generators do have their downsides. These are mainly measured in the time spent developing templates and learning how to use the tools. But if I could trade 160 hours of work and $500 to get 1600 hours worth of code, you can bet your bottom dollar I’m going to do it.

To take that a step further, if your company is planning on rolling out 5 products, and you can reuse your templates, you will have accomplished 4 years of development in just one month. That’s a pretty staggering productivity rate. Not to mention that all of your code will have completely uniform coding standards, making it easy to follow for anyone.

After having said all of this, I do feel the need to attach a bit of a disclaimer. While code generators can save you a lot of time and effort, the temptation is certainly there to continue using this two ton wrecking ball to put a screw into a piece of drywall. Not only is it the wrong tool, but the scale is seriously out of whack.

Code generators are not a one size fits all solution. Nor should they be used in every situation. It’s just not always appropriate to use them. Still, they can provide some useful functionality, and are definitely worth a good look. And if one can save you a lot of time, effort, and money, and ultimately result in a higher quality product, then you owe it to yourself to explore the possibility, even if you decide not to go that route.

2 Comments

  1. V.Chan on February 10, 2009 at 3:20 am

    Have you tried using Codesmith 5.0 to generate codes from PostgreSql database ? I wonder why the schema provider is not working well with ADOX ?? Any ideas ?



  2. Mike Taber on February 16, 2009 at 4:13 pm

    No, I have not tried using Codesmith with Postgres at all. Check with the Codesmith team. Maybe they’ll be able to help. I would think they should have at least a little experience with it, although I suspect that using .NET with a Postgres back end is not a very typical arrangement.


