There is no worse code than the one I wrote yesterday.
I’ve been a programmer for about 20 years — about 15 or so of those working with Python — and in my work experience I don’t frequently review other people’s code, but I frequently look at my own. Mainly because I have to maintain it. And when looking at it, I frequently think, “Who wrote this thing? And what where they thinking?”
Of course I know it was me. There is no such thing as a perfect code, but I strive to do my best. However, while trying to strike a balance between various aspects of good coding practices, I often find that certain elements may fall short. This article will discuss a few essential components of good code.
Effective
This is probably the most important thing in software programming. The code has to work. “Effective” means it does what it’s supposed to do without errors. It has to be consistent — I expect the right result every time, whatever the input is.
Consider the following example:
def get_file_value(filename, key):
"""Opens a file containing lines with KEY=VALUE, then returns the value
for the given key.
"""
with open(filename) as fileobj:
for line in fileobj:
parts = line.strip().split('=')
if parts[0] == key:
return parts[1]
This is the kind of implementation that works, provided input is always clean and nice, unlike what users often do. Here are a few issues with it:
The returned value is stripped of whitespace just on the right side (because the whole line is stripped once to get rid of line breaks). This may be unwanted.
# line in file: 'NAME= baz . '
>>> get_file_value(filename, 'NAME')
' baz .'
If the value contains the character “=”, anything after that is ignored.
# line in file: 'CONFIG="x=1,y=2"'
>>> get_file_value(filename, 'CONFIG')
'"x'
Spaces between the equal sign throw the function right off. The same happens with different casing.
# line in file: 'NAME = baz' or 'name=baz'
>>> get_file_value(filename, 'NAME')
None
If the file contains a line with just the key in it, it will raise an IndexError when returning the value.
# line in file: 'END'
>>> get_file_value(filename, 'END')
Traceback (most recent call last):
...
IndexError: list index out of range
The result is None when the key is not found. This may be by design but should be commented. Some people frown upon a function without a “return None” statement but I’m personally ok with it.
If the file does not exist, or is otherwise unreadable (due to permissions, locks, etc), the function will raise exceptions. This may also be by design but should be noted.
If the file is not encoded in the current locale’s encoding (the default value for the parameter encoding), and contains invalid byte combinations in it (believe me, I’ve come across this quite often), it will also fail.
In summary, I would say the function above is not production-ready. Can you fix its problems?
Readable
Believe it or not, the code you write isn’t meant to be read only for machines. Be it a collaborator, a co-worker, or even your future self, somebody will eventually have to read and comprehend your code. It’s impossible to determine their familiarity with the programming language or the specific project, so it’s crucial to keep it straightforward and easy to understand.
This includes:
Giving variables and functions good names: Sure, you can use i for a loop variable, but what are t_prv and t_gl? Wouldn’t it be better to just call them something less cryptic like total_previous and total_goal? Are you trying to save bytes or what? (I’ll talk about naming in a future article.)
Writing meaningful comments: Don’t write in English what’s already written in your programming language. Say what and why you have to have that line. I personally don’t mind having more comment lines than programming lines — as long as there’s a good reason for it.
Follow coding conventions and guidelines: Every programmer has their own style, but it is wise to use a common style guide that other people will recognize. PEP8 is gold standard for Python, and other languages have their own, but you’d be wise to follow some style guide.
Not using language hacks or at least explaining when you’re doing so. One example is when using Python’s magic methods. When used properly, they may read very nicely, but may not always be intuitive for the reader.
Logically dividing the project into reasonably-sized files: You’ve got to have a very good reason to have a file with more than 1,000 lines.
Beware of using things like regular expressions and one-liners. They are great, sometimes they are the right tool, but they can very easily get clumsy and untidy.
Reusable
As much as possible, your code should not repeat itself. Copy-paste of large amounts of code is unacceptable. Nowadays languages have all sorts of techniques — such as class inheritance, promises, first-class functions, etc — to prevent you from writing the same thing over and over again.
This includes not “reinventing the wheel”. Python for example has a robust standard library that can perform things like reading CSVs, generating random passwords, sending emails, handling HTML, providing translations, to name a few.
Efficient
Efficiency has to do with the amount of resources your code requires to be executed. These resources can be measured in terms of CPU time, memory usage and access, I/O operations, and the general complexity of your algorithms.
There are several techniques to make Python deliver a swift execution. I will elaborate on them in upcoming articles.
Secure
This is usually a project-wide architecture characteristic, defined in ways such as where your credentials to the database are stored, or how authentication tokens are passed to / from the clients, but some programming techniques should also be applied when making secure systems. A few examples are:
Never hard-code secrets.
Never trust user input, especially on the web — in particular, always escape it for SQL statements or HTML output.
Use up-to-date cryptographic and hashing algorithms (eg, sha512 instead of md5, or AES instead of DES).
Check buffer size when working directly with memory (to prevent overflow exploits).
It is also a good idea to log — or provide means to log — important actions that your code is doing. This will help debugging, testing, and post-mortem analyses.
Tested
Untested code is buggy code.
It is incredible how many times I’ve tried to make a code change, and because it was either a small change or seemingly inconsequential, I’d just do it without testing — and 9 times out of 10, it would have a bug in it.
I know there are external factors in this, such as tight schedule and lack of company culture around testing, but it is each developer’s responsibility to strive for the best quality possible in their product. You should have a reliable test suite, enhance it and run it on every modification.
Please note there is a difference between testing your code and testing your software. When a tester tests the software, all they are asserting is that the current version of the software does what they expect. Testing your code implies maintaining the test suite by running, adding, and modifying tests upon every change, including external changes.
Documented
Lastly, one aspect of your code — that may even live completely outside your code — is the documentation. Who has never considered using a library or service, but gave up after finding out the documentation was scarce / unorganized? Documentation provides means for developers and testers to know what your code does, and sometimes how it does it.
This is not restricted to comments. Comments often provide only focused insights into a part of your code, while the documentation may provide a more overall approach to the system.
Conclusion
There are many aspects to evaluate “good code”. While it is impossible to write the perfect code, especially when you have time constraints, it is important to keep these aspects on the back of your mind when writing your code: make it effective, readable, reusable, efficient, secure, and don’t forget to test it and document it!