How to write a makefile

2019-10-31 16:17:06

make

GNU make, to give it its full title, is a program that is used to build other programs. It is ancient by the standards of computer time: the original make dates from 1976, which makes it 43 as of the time of writing. Make is frequently used to compile C programs, but can be applied to almost anything that turns input files into output files.

A sorry tale

The reason I am writing this blog is because I have finally snapped. Finally been driven over the edge by the endless abuses of this helpless build system that I have seen. Make strikes a fantastic balance between functionality and simplicity, but it seems that many online articles encourage writing makefiles the wrong way, leading to a large body of people who have never experienced the sublime bliss of a proper makefile.

A catalogue of errors

By far the most common pitfall I have seen is the temptation to treat make as a scripting language. To write out exactly the steps to build your project in one gigantic target, inevitably called build.


build: 
	gcc program.c extra.c -o program

The thing that finally drove me over the edge actually wasn't this, but is too complex to reproduce here. I found a build system that was treating make as cmake - generating many targets in a slew of recursive directory traversal.

The basics

A makefile is built from one atomic building block: the target. Each target is written as a rule, denoted (roughly) like so:


target_name: target_dependency_1 target_dependency_2
	build_command arg1 arg2 etc

The fundamental, most important point to take away is this:

A target should correspond to a file

This isn't exactly true in all cases, but by and large you should have only about 3 targets that don't correspond to a file. These are called "phony" targets, and you should mark them explicitly with a .PHONY declaration:


.PHONY: install clean

So targets should be files. How to start? The obvious choice is the final executable.


program: ???
	???

What needs doing to create the program? C can be compiled into intermediate .o files and then linked together. Hence:


program: program.o extra.o
	???

This just tells make that before it can create program, it must first create the two .o files. Now all we need to do is to link these two object files (that we know exist, because they are dependencies).


program: program.o extra.o
	gcc program.o extra.o -o program

Great! We can link the object files, once we get them. Next is to write a rule for those object files. Time for another couple of rules.


program: program.o extra.o
	gcc program.o extra.o -o program

program.o: program.c
	???

extra.o: extra.c extra.h
	???

This looks familiar, with maybe one small interesting thing: the c files are listed in the dependencies. Initially this doesn't make much sense - there's nothing we need to do to these files. This doesn't matter though. They should be listed because they are certainly dependencies - you can't build program.o without program.c, and listing them is how make knows to rebuild the object when the source changes.

Time to fill in the rules again:


program: program.o extra.o
	gcc program.o extra.o -o program

program.o: program.c
	gcc -c program.c -o program.o

extra.o: extra.c extra.h
	gcc -c extra.c -o extra.o

Take a moment now. You're all done. Makefile created.

But. I'm sure you've spotted the issue - we're compiling 2 files and we already have 3 rules. Our makefile is O(n). At this rate we're going to be writing almost as much makefile as C.

Getting clever

Can we cut this down? Of course we can. Nobody would make a build system that bad.

Automatic variables

The first thing to notice about our rules is that we are effectively just writing the same thing twice - to compile this object file you need this C file. We should be able to generalize that.


program: program.o extra.o
	gcc program.o extra.o -o program

%.o: %.c
	gcc -c $^ -o $@

There are two different things going on here. Firstly we are using a pattern - in make the wildcard is a %. The % in %.c stands for whatever the % in %.o matched, so if you want an object file, it is created from a c file of the same name.

The second is the variables used in the recipe line. There are two here, described in the GNU make manual as follows:


'$^'
    The names of all the prerequisites, with spaces between them.  For
    prerequisites which are archive members, only the named member is
    used (*note Archives::).  A target has only one prerequisite on
    each other file it depends on, no matter how many times each file
    is listed as a prerequisite.  So if you list a prerequisite more
    than once for a target, the value of '$^' contains just one copy of
    the name.  This list does *not* contain any of the order-only
    prerequisites; for those see the '$|' variable, below.

'$@'
    The file name of the target of the rule.  If the target is an
    archive member, then '$@' is the name of the archive file.  In a
    pattern rule that has multiple targets (*note Introduction to
    Pattern Rules: Pattern Intro.), '$@' is the name of whichever
    target caused the rule's recipe to be run.

You can see how this expands - it becomes:


program.o: program.c
	gcc -c program.c -o program.o

Doesn't that look familiar?
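
One caveat worth sketching, as an aside rather than a change to the makefile above: $^ expands to every prerequisite, so as soon as a header is listed as a dependency it would be handed to gcc -c as well. A common alternative is $<, which names only the first prerequisite. The snippet below assumes the same program.c, extra.c and extra.h as before.

```make
program: program.o extra.o
	gcc $^ -o $@

# $< is the first prerequisite only (the .c file), so extra prerequisites
# such as headers never end up on the compiler command line.
%.o: %.c
	gcc -c $< -o $@

# A rule with no recipe just adds prerequisites: extra.o is rebuilt
# when extra.h changes, using the pattern rule above.
extra.o: extra.h
```

This keeps the header dependency from the hand-written version without giving up the single pattern rule.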

Shell commands

This is a lot better, but it still requires the executable to list all the files it needs in its target. Maybe we could still do better.

Make allows the use of shell commands via the $(shell) syntax. It also has quite a large library of string and directory operations. Hence:


CFILES = $(shell find . -type f -name "*.c")
OFILES = $(CFILES:.c=.o)

program: $(OFILES)
	gcc $^ -o program

%.o: %.c
	gcc -c $^ -o $@

Again a few things here:

Firstly make has variables. Shock horror.

Secondly the shell command. We invoke find and tell it to only locate c files, pretty straightforward.

Thirdly the $(CFILES:.c=.o). WTF? This is just some syntactic sugar for a pattern substitution (shorthand for $(patsubst %.c,%.o,$(CFILES))), swapping the .c extensions for .o.

And finally another magic variable to capture the dependencies and we have a fully automated luxury makefile.
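
A variation on the same idea, offered as a sketch rather than a correction: a plain = is expanded every time the variable is used, so the find above can run more than once per build. Assigning with := expands it once. If all the sources live in a single directory (an assumption here), make's own $(wildcard) function avoids the shell entirely.

```make
# := expands the right-hand side once, at the point of assignment.
CFILES := $(wildcard *.c)
OFILES := $(CFILES:.c=.o)

program: $(OFILES)
	gcc $^ -o $@

%.o: %.c
	gcc -c $< -o $@
```

Unlike find, $(wildcard *.c) does not recurse into subdirectories, so which one fits depends on how the sources are laid out.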

Implicit rules?

There's more?

See, the second rule is very boring. It just tells you that you can compile object files from c files. Everybody knows this. So why do we even have that rule?


CFILES = $(shell find . -type f -name "*.c")
OFILES = $(CFILES:.c=.o)

program: $(OFILES)
	gcc $^ -o program

You would think that this wouldn't work, right? We haven't told make how to compile an object file. Make knows: it has a built-in rule for building .o files from .c files.

Surely it knows about executables too...


CFILES = $(shell find . -type f -name "*.c")
OFILES = $(CFILES:.c=.o)

program: $(OFILES)

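Yes. This is valid. It works because one of the object files, program.o, shares its name with the target: the built-in link rule builds n from n.o and pulls in the remaining prerequisites along the way. We can even do it in one less line.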


CFILES = $(shell find . -type f -name "*.c")

program: $(CFILES:.c=.o)

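One thing to note is that implicit rules use make variables such as CC to decide things like which compiler to run (make also picks these up from the environment). The default is a not-so-sane cc, so it is advisable to set them. The built-in link rule goes through CC as well; LD is set here for completeness.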


CC = gcc
LD = gcc

CFILES = $(shell find . -type f -name "*.c")

program: $(CFILES:.c=.o)
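
For the curious, the built-in rules being relied on look roughly like the following. This is an abridged paraphrase of GNU make's default rule database (running make -p prints the real thing in full), not something you need to write yourself.

```make
# Defaults as shipped with GNU make (abridged).
CC = cc
OUTPUT_OPTION = -o $@
COMPILE.c = $(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c
LINK.o = $(CC) $(LDFLAGS) $(TARGET_ARCH)

# Object file from a C source.
%.o: %.c
	$(COMPILE.c) $(OUTPUT_OPTION) $<

# Executable from an object file of the same name; any other
# prerequisites are linked in via $^.
%: %.o
	$(LINK.o) $^ $(LOADLIBES) $(LDLIBS) -o $@
```

Two things fall out of this: linking goes through CC, and libraries in LDLIBS land after the object files on the link line while LDFLAGS lands before them.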

Niceties

This is very elegant, but leaves some things to be desired. First let's tidy up our directory tree.

Source dir.


CC = gcc
LD = gcc

SRC_DIR ?= src
BIN_DIR ?= bin

DIRS = $(BIN_DIR)

vpath %.c $(SRC_DIR)
CFILES = $(notdir $(shell find . -type f -name "*.c"))

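# The built-in link rule only applies when the target sits next to a .o of
# the same name, so with the binary in $(BIN_DIR) the link step is explicit.
$(BIN_DIR)/program: $(CFILES:.c=.o) | $(BIN_DIR)
	$(LINK.o) $^ $(LDLIBS) -o $@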

$(DIRS):
	mkdir -p $@

By now you should be getting pretty good at reading makefiles, so I'm just going to point out the interesting things. Firstly we have a new type of dependency - one separated by a | character, known as an order-only prerequisite. The difference here is that make only checks that it exists and ignores its timestamp.

By default make will rebuild a target automatically if any dependency is newer than the target. This is fantastic. It means that when you edit a c file, the corresponding object (and only that object) will be rebuilt. Because that object has been rebuilt the executable will then be rebuilt. All this so transparently that you probably haven't even noticed. But we don't care about the age of the binary directory here, all we care about is that it got created at some point, so we use the alternate dependency type.

Secondly vpath. vpath acts as a directory search path for patterns. Because we now have all the c files in the src dir, the default search location (the directory the makefile is in) wouldn't contain any c files. Using this statement we tell make that it can find c files in the src dir. At the moment this is just using the src directory, but the addition of another $(shell find) would very simply allow you to use an entire tree of source directories.
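
To see the order-only behaviour in isolation, here is a tiny standalone sketch. The names out, report.txt and input.txt are made up for illustration and are not part of the makefile above.

```make
# 'out' is order-only: it must exist before report.txt is built, but a
# change to the directory's timestamp never triggers a rebuild.
out/report.txt: input.txt | out
	cp $< $@

out:
	mkdir -p $@
```

The reason this matters for directories in particular is that a directory's modification time changes whenever a file is added to or removed from it, so as a normal prerequisite it would cause needless rebuilds.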
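
As a rough sketch of that last idea, assuming the sources live in nested directories under src (the subdirectory names in the comment are hypothetical):

```make
SRC_DIR ?= src

# Every directory under src, e.g. "src src/net src/util".
SRC_DIRS := $(shell find $(SRC_DIR) -type d)

# vpath accepts a list of directories separated by spaces or colons.
vpath %.c $(SRC_DIRS)

CFILES := $(notdir $(shell find $(SRC_DIR) -type f -name "*.c"))
```

One design caveat: $(notdir) flattens the paths, so two source files with the same name in different directories would collide on a single object file.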

Customization

All these implicit rules are cool and all, but what if we want to change some of the compiler flags? Fortunately make includes some more built-in variables in these implicit rules.


CC       = gcc
LD       = gcc

CFLAGS  ?= -O3
# Libraries go in LDLIBS so they land after the objects on the link line.
LDLIBS  ?= -lm

SRC_DIR ?= src
BIN_DIR ?= bin

DIRS = $(BIN_DIR)

vpath %.c $(SRC_DIR)
CFILES = $(notdir $(shell find . -type f -name "*.c"))

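# Explicit link step: the built-in one needs a same-named .o beside the target.
$(BIN_DIR)/program: $(CFILES:.c=.o) | $(BIN_DIR)
	$(LINK.o) $^ $(LDLIBS) -o $@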

$(DIRS):
	mkdir -p $@

You might notice that some assignments use the ?= operator instead of the normal = operator. This operator can be roughly interpreted as "assign if this is not already defined". This is useful primarily if you call or include this makefile from another makefile, or set the variable in the environment, as it won't override options you have already set. In the case of CC etc. we are overriding a built-in default, so we need to use the normal = operator.
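
A small sketch to make the difference visible. The show target is a hypothetical name that exists only to give make something to run; $(origin) and $(info) are standard GNU make functions.

```make
# CC already has a built-in default (cc), so ?= leaves it alone.
CC ?= gcc
# CFLAGS has no built-in value, so ?= sets it unless the caller did.
CFLAGS ?= -O3

# $(origin VAR) reports where a value came from:
# default, file, environment, command line, ...
$(info CC=$(CC) from $(origin CC))
$(info CFLAGS=$(CFLAGS) from $(origin CFLAGS))

.PHONY: show
show: ;
```

Running make show with and without CFLAGS=-g on the command line is a quick way to see which assignments win.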

Phonies

You will note that at this point we've stayed true to our original goal - every single target we have corresponds exactly to a file. If you run:


make bin/program

It will create bin/program

Now comes the first of our phony targets. You have likely seen clean before.


CC       = gcc
LD       = gcc

CFLAGS  ?= -O3
# Libraries go in LDLIBS so they land after the objects on the link line.
LDLIBS  ?= -lm

SRC_DIR ?= src
BIN_DIR ?= bin

DIRS = $(BIN_DIR)

vpath %.c $(SRC_DIR)
CFILES = $(notdir $(shell find . -type f -name "*.c"))

.PHONY: clean

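# Explicit link step: the built-in one needs a same-named .o beside the target.
$(BIN_DIR)/program: $(CFILES:.c=.o) | $(BIN_DIR)
	$(LINK.o) $^ $(LDLIBS) -o $@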

$(DIRS):
	mkdir -p $@

clean:
	gio trash -f $(BIN_DIR) || rm -rf $(BIN_DIR)

Clean traditionally removes all built files. Note that here clean is made of two commands joined by the shell's || (or) operator. The first command moves the directory to the trash, provided that is set up. If it isn't, the second command falls back to rm. This is the default in my makefiles, because there's always that one time that you mess up the $(BIN_DIR) variable and accidentally rm -rf your master's degree dissertation.
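
For a second layer of protection against that exact mistake, one option (a sketch, not part of the makefile above) is to have make refuse to do anything when BIN_DIR is empty:

```make
BIN_DIR ?= bin

# Stop before any recipe runs if BIN_DIR is empty or only whitespace.
ifeq ($(strip $(BIN_DIR)),)
$(error BIN_DIR is empty, refusing to run)
endif

.PHONY: clean
clean:
	rm -rf -- "$(BIN_DIR)"
```

This only catches the empty case. It would not stop a BIN_DIR of . or /, so the trash-first approach is still worth keeping.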