How Good Is ChatGPT At Advanced Programming?

It’s time to up the ante and challenge ChatGPT’s code-generating abilities. Last time, I asked OpenAI’s large language model (LLM) to write a ShoppingCart class. ChatGPT delivered and produced surprisingly clean, working code. Well done, ChatGPT!

Yet, how will ChatGPT fare with something a bit grittier and more complicated, say, programming a multi-step business logic workflow? And to make it more interesting, the requirements description will mix relevant business logic (WHAT the system does) with distracting technical specifications (HOW the system does it).

I’m thinking something like this:

New Requirement: Register A New Customer

Our mobile and web client applications should be able to securely register a new customer via our RESTful Web API.

Here are the detailed steps:

HTTP POST the customer information to the OAuth2-secured API endpoint /customers.
Validate the received customer information. In particular, check that we have the required fields like first name, last name, and email address.
If the customer already exists (by email address) in the SQL Server database, it’s an error and returns HTTP status code 409 – Conflict.
Otherwise, save the customer with a unique identifier to the database.
Return the customer, including a unique identifier, to the mobile or web clients with HTTP status code 201 – Created.

Nice To Have: Send the customer a welcome email after a successful registration.

That should test ChatGPT’s mettle!

FIRST ITERATION

Let’s forgo code generation for now and see how well ChatGPT does at separating the relevant business logic from the irrelevant techno-babble:

Here is how ChatGPT responded:

Unfortunately, that didn’t go at all well. ChatGPT simply reworded the complete technical requirements and failed to extract the business logic.

For example (not an exhaustive list):

ChatGPT mentions the goal of building a RESTful Web API, which is a mechanism—a HOW we’re doing it.
We’re to secure this Web API with OAuth2, a particular authentication framework—another mechanism.
The HTTP Status Codes are also particular ways of doing things rather than what we are trying to achieve, that is, communicating process success and failure.

I am sure ChatGPT can do better; it’s up to me to improve my prompts.

Let’s try again.

SECOND ITERATION

I try again with the same prompt, apart from one minor detail: a hint asking ChatGPT to ignore all technical points when extracting the business logic.

And what did ChatGPT produce?

Now that is more like it! The AI got the hint and dutifully ignored the technical specifics. Well, mostly.

A couple of irrelevancies slipped through:

ChatGPT’s answer still refers to particular client types—e.g. mobile and web clients. Even though these clients are not part of our system, mentioning them here hints at a web services API. The description of a business logic workflow should avoid even indirectly alluding to likely implementation details or mechanisms.
ChatGPT indicates that the system will email the customer upon registration success, as specified in the original requirements. So what’s the problem with that? This optional requirement hides a mechanism. The mechanism is ‘email’. Do we need to specify a distinct mechanism to send the customer a welcome message? Does it need to be an email? Could it not be an SMS/text message or an in-app notification? Of course, it could. Business logic cares that we send a message, but not how we do it. At least in this instance, ChatGPT had trouble generalising from a specific example. And this should come as no surprise. Humans are great at generalising from specific examples, while AI has a way to go.

THIRD ITERATION

Let’s generate some code!

In my following prompt, I asked ChatGPT to generate C# code.

And here is what it produced:

OK, so what do I like and dislike about the code? I’ll examine the most apparent successes and failings; otherwise, this post will be too long.

Let’s start from the top:

Customer entity. I dislike how it’s a highly mutable class. All those public setters mean that any property can be altered post-creation. For the most part, this mutability is unnecessary—once we’ve set the FirstName to a specific string, say ‘Fred’, it is unlikely that this would or should change for the same customer instance. So why make it changeable?
CustomerService class. A bland and nondescript moniker for a specific business logic workflow: registering a customer into the system. However, this does not come as a surprise, since tons of flawed training data would have taught ChatGPT to name it as such. In my opinion, a better name might be RegisterCustomerWorkflow or RegisterCustomerUseCase.
Asynchronous calls. I like how ChatGPT made the RegisterCustomer() method async, given that we would make network I/O calls to the database or other customer data repository.
Data field validity check. This smacks of Feature Envy. The workflow class does some validation work that could be delegated to the Customer class.
Validation checking. In addition to the last point, I am not a fan of the validation being out in the open in the RegisterCustomer() method. Input data validation is only one of the steps performed in RegisterCustomer(), so why not hide the details of the validation checking in a private helper method, say Validate()? If we want to verify how we are validating, we can navigate to Validate() and take a closer look. It’s not all bad, though: I am a raving fan of using exceptions to communicate problems we cannot handle in the current context, and ChatGPT is using exceptions for validation problems. Nice one, ChatGPT!
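To make the first and fourth points concrete, here is a minimal sketch of how the entity could look. It is my own illustration, not ChatGPT’s output: the property names follow the required fields from the requirements, while the constructor shape and the choice of ArgumentException are assumptions.

```csharp
using System;

// Hypothetical immutable Customer: every property is set once, at construction.
public sealed class Customer
{
    public Guid Id { get; }
    public string FirstName { get; }
    public string LastName { get; }
    public string Email { get; }

    public Customer(Guid id, string firstName, string lastName, string email)
    {
        Id = id;
        FirstName = firstName;
        LastName = lastName;
        Email = email;
    }

    // The entity checks its own required fields, so the workflow class
    // no longer has to reach into its data (the Feature Envy smell).
    public void Validate()
    {
        if (string.IsNullOrWhiteSpace(FirstName))
            throw new ArgumentException("First name is required.");
        if (string.IsNullOrWhiteSpace(LastName))
            throw new ArgumentException("Last name is required.");
        if (string.IsNullOrWhiteSpace(Email))
            throw new ArgumentException("Email address is required.");
    }
}
```

With this shape, RegisterCustomer() would only need a single call such as customer.Validate(), and a change to FirstName after creation would not compile.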

FOURTH ITERATION

There are three things I’d like to change about the code generation:

I’d like to see code generated that has less emphasis on particular mechanisms, like the requirement to email the customer welcome notification. I add a phrase to exclude mechanisms to the prompt. However, I am not very hopeful it will work. Fingers crossed.
Maybe ChatGPT will produce better code if I request that it employ Clean Code and Clean Architecture principles. Clean Code, for the sake of extracting the validation into a separate helper method, and Clean Architecture to more clearly express my desire to abstract away mechanisms like the emailing of welcome messages.
Lastly, the sending of a welcome message was an optional requirement, and because of that, I suspect the SendWelcomeEmail() method was a public method to be called when needed. However, the call to SendWelcomeEmail() (or, better, SendWelcomeNotification()) should have happened at the end of RegisterCustomer(). To this end, I removed the optional requirement. Let’s see what happens.

The reply:

I’ll critique it from the top:

ICustomerService interface. Yes, that’s nice for clients of the service as they can use an Inversion of Control container to instantiate CustomerService. Unfortunately, the name of the workflow class has not improved.
Services. I like how services like the database or email service are abstracted as interfaces and communicated via injection into the constructor. But what if we want to use something other than a database or email? Regarding the database, we have the same problem that we have with emailing: What if we wanted to store our customers, not in a database, but read and write them to a file? Or get them from a CRM system’s API? In those cases, naming the data repository interface IDatabase is as problematic as IEmailService: we’ve locked ourselves into a mechanism.
Synchronous calls. Since the last iteration, we’ve lost async/await. That’s disappointing. We’re bound to make networked calls if we’re using a database.
Helper methods. ChatGPT has generated several well-named helper methods hiding much detail that was previously strewn about in RegisterCustomer(). I prefer explanatory names like IsValidCustomer() and IsDuplicateCustomer(), which encapsulate the pertinent parts of the input data validation process. SaveCustomer() manages the data persistence, and SendWelcomeEmail() calls the EmailService.
EmailService.SendEmail(). Do you see what’s wrong with this call? It specifies the email subject and body within this high-level business logic workflow! Why is this a problem? It’s foreseeable that the email subject and body change frequently. However, this business logic workflow should not be concerned with such detail. Its responsibility is to orchestrate and manage high-level steps, like input data validation, saving the customer data, and kicking off the sending of a welcome message to the customer. It should delegate the details to other services as much as it can. Concerning the sending of (email) messages, RegisterCustomer() is doing too much, which is a violation of the Single Responsibility Principle. Instead, we could have had a call like this:

      CustomerNotifier.SendWelcomeMessage(customer);

Isn’t that much better?!
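Pulling this critique together, the workflow might be sketched as follows. This is an illustration under stated assumptions, not what ChatGPT produced: RegisterCustomerWorkflow and SendWelcomeMessage come from the suggestions above, while ICustomerRepository, ICustomerNotifier, DuplicateCustomerException, and the two toy implementations are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Stand-in immutable customer type for this sketch.
public sealed record Customer(Guid Id, string FirstName, string LastName, string Email);

// Mechanism-neutral abstractions: nothing here says "database" or "email".
public interface ICustomerRepository
{
    Task<bool> ExistsByEmailAsync(string email);
    Task SaveAsync(Customer customer);
}

public interface ICustomerNotifier
{
    Task SendWelcomeMessageAsync(Customer customer);
}

public sealed class DuplicateCustomerException : Exception
{
    public DuplicateCustomerException(string email)
        : base($"A customer with email '{email}' already exists.") { }
}

public sealed class RegisterCustomerWorkflow
{
    private readonly ICustomerRepository _repository;
    private readonly ICustomerNotifier _notifier;

    public RegisterCustomerWorkflow(ICustomerRepository repository, ICustomerNotifier notifier)
    {
        _repository = repository;
        _notifier = notifier;
    }

    // Orchestrates the high-level steps only; details live behind the interfaces.
    public async Task<Customer> RegisterCustomerAsync(string firstName, string lastName, string email)
    {
        Validate(firstName, lastName, email);

        if (await _repository.ExistsByEmailAsync(email))
            throw new DuplicateCustomerException(email);

        var customer = new Customer(Guid.NewGuid(), firstName, lastName, email);
        await _repository.SaveAsync(customer);
        await _notifier.SendWelcomeMessageAsync(customer);
        return customer;
    }

    private static void Validate(string firstName, string lastName, string email)
    {
        if (string.IsNullOrWhiteSpace(firstName) ||
            string.IsNullOrWhiteSpace(lastName) ||
            string.IsNullOrWhiteSpace(email))
            throw new ArgumentException("First name, last name, and email address are required.");
    }
}

// Toy implementations, showing that the storage and messaging mechanisms can be swapped.
public sealed class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly Dictionary<string, Customer> _byEmail = new(StringComparer.OrdinalIgnoreCase);

    public Task<bool> ExistsByEmailAsync(string email) =>
        Task.FromResult(_byEmail.ContainsKey(email));

    public Task SaveAsync(Customer customer)
    {
        _byEmail[customer.Email] = customer;
        return Task.CompletedTask;
    }
}

public sealed class ConsoleCustomerNotifier : ICustomerNotifier
{
    public Task SendWelcomeMessageAsync(Customer customer)
    {
        Console.WriteLine($"Welcome, {customer.FirstName}!");
        return Task.CompletedTask;
    }
}
```

The workflow keeps async/await, never mentions a subject line or message body, and would work unchanged whether customers live in SQL Server, a file, or a CRM system’s API. Translating DuplicateCustomerException into HTTP status code 409 would then be a job for the Web API layer, not the business logic.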

CONCLUSION

ChatGPT seems to be good at generating typical code, but, from what we’ve observed, not excellent, clean code. Naturally, ChatGPT is limited by the training data it has been fed, and to this end, it’s learned to regurgitate code that is mainstream, that won’t raise too many eyebrows, and that most developers, me included, will be reasonably comfortable with. Undoubtedly, over time AI like ChatGPT will improve its coding game. In the meantime, we still need human programmers to produce excellent code for the reliable, maintainable systems of today and to act as training data for the AI code generators of the future.
