Image: kirill_makarov/Shutterstock
This question was put to me by Motherboard Editor-in-Chief Derek Mead and I can't stop thinking about it: What is the smallest code?
It's an interesting question in large part because it can have several meanings, all pointing back toward what code is in the very first place. Is small code the actual code that we as developers and engineers write onto a screen? Or should we measure code by what it is translated into and how it actually executes on a real machine? Better, I think, is a combination of sorts: The smallest code is the smallest amount of programming language syntax that we can write to produce the largest machine-level effect.
So, let's look at all three of the above-mentioned perspectives, starting with the easiest.
In terms of characters, what's the shortest piece of valid programming language syntax I can write?
To be clear, I don't know every programming language, but I have a reasonable handle on most of the major ones and then some. So-called interpreted languages are what make this an easy question. These are languages whose syntax (what a programmer actually writes) is fed to some intermediate piece of software that functions as a translator (or interpreter) between our higher-level code and pre-built units of machine instructions. It's like executing programs within programs.
The alternative to an interpreted language is a compiled language, which is where we write out a bunch of code in a file (like a Java or C++ file) and then send that file to be converted into a whole new arrangement of machine instructions representing that input file and only that input file. The difference is a bit like building a castle out of Legos (interpreted) vs. building a castle out of a big single piece of molded plastic (compiled). Both approaches have their advantages and disadvantages.
Generally, if you're going to write a big-ass piece of software that's going to be installed onto a computer, you'll write in a compiled language.
Python and JavaScript are both interpreted languages. We are free to write big-ass, old-school-style programs in either one, but we can also just feed tiny bits of syntax directly into either language's interpreter, which exists as a command line that looks like your operating system's command line (which is also an interpreter, but for a different set of commands). That is, Python is a language but it's also a piece of software that's installed onto our system like any other piece of software.
A single numerical digit. This is probably the smallest piece of valid syntax I can write in any programming language.
I can enter a single digit into either Python or the Node.js interpreter (which is a shell that interprets JavaScript) and either interpreter will simply echo it back to me without errors or warnings. I can also enter nothing into either one and not get an error, but I'm not sure that's in the spirit of the question.
In a compiled language, a lot more is needed, relatively speaking. We at least need the shell of a function providing the operating system with an entry point into the program, so we're talking a half-dozen characters, at least. The basic C++ program skeleton looks something like this:
int main() { return 0; }
It's not much, but still more than:
0
I don't think the smallest syntax measure above is a very honest way of looking at things. To execute that "0" will actually take a whole lot of system resources, relatively speaking. According to my MacBook's activity monitor, the Node shell I used to interpret that single digit is occupying around 11 MB of system memory. A single character, however, can be represented by a single byte of memory. So, we're holding on to 11 million bytes to echo a byte of data.
#include <iostream>
int main() { std::cout << 0; return 0; }
The C++ code above, modified to output the single digit "0," occupies about 28,000 bytes of memory at its peak (according to the code profiling tool Valgrind). That's a much smaller footprint.
Still, 28,000 bytes is 28,000 bytes. I might be able to improve things by ditching "iostream," which is a standard C++ library for dealing with input/output operations. Including it means that I'm including some extra code from another file, and then more code from other files that the iostream code depends upon. The iostream library itself isn't enormous, but it has to bring in a bunch of other stuff to work. This all gets planted into system memory when the code is actually executed.
In the above program, iostream just gives us cout ("cee-out," but I'll forever say it "kout" in my head). This is just a piece of syntax useful for outputting data to the screen. We can do the same thing in a slightly different way, as in: