The C++ Build Pipeline
Pre-processing
Getting your ingredients into one place. First, the pre-processor looks at the files it is given:

```shell
cpp pre.cpp > pro.i
```
For each input file (in this case `pre.cpp`), it looks for any pre-processor directives that need to be expanded:

- `#include`
- `#define`
- `#if`
- `#ifdef`
- `#ifndef`

This means macros are replaced by text from another file, or by conditional logic, depending on the circumstance. Once this has occurred, the intermediate file (`pro.i`) should contain all the ingredients needed to begin the compilation stage.
Compilation
Now that our ingredients for making the program have been gathered, we need to check they are good. The compilation stage is where the compiler goes through the actual C++ source code and checks that it makes sense. At the end of this stage an object file is created, which will ultimately be used to produce the final file to be run.
```shell
gcc -S comp.i
```

Note: `-S` produces assembly code (the output would be `comp.s`).

```shell
as -o comp.o comp.s
```

would then produce the object file; this can additionally be described as an assembly stage.
One thing to bear in mind is that you can compile code without all of the definitions being available; definitions are resolved in the next stage. The compiler therefore treats any declaration as a promise that, sometime later, you will provide the definition.

Furthermore, this partial compilation allows a codebase to skip being fully re-compiled: only the modified parts need to be recompiled and checked.
Consider a pair of header files that each promise (declare) one function. There are no definitions, but this is fine and would compile. On the other hand, if you tried to compile something with "bad" syntax, you would get a compilation error.
Linking
The final stage, where we take the gathered and prepared ingredients and actually make the meal. All of the object files that have been compiled, but might not make much sense on their own (they still contain unfulfilled promises), are combined by the linker. The linker attempts to resolve where the definition of each function lives and fulfil all the promises that have been made.

This is where errors such as "undefined symbol" or "multiple definition" come from (linker errors), as the linker is attempting to match each reference to exactly one definition.
```shell
ld -o outputfile comp.o
```

(In practice you would usually invoke the linker through the compiler driver, e.g. `gcc -o outputfile comp.o`, so that the C runtime and standard libraries are added automatically.)
The best way to think about this stage: during linking, all the promises given to the compiler need to be kept. If one is missing, or fulfilled more than once, a linker error will occur.
- Links correctly.
- Fails to link: there are multiple definitions of the same function (`add`).
- Fails to link: the definition of `subtract` is missing.
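The three outcomes above can be sketched with `gcc` directly (file names are hypothetical):

```shell
cat > add.c <<'EOF'
int add(int a, int b) { return a + b; }
EOF
cat > dup.c <<'EOF'
int add(int a, int b) { return a + b; }   /* a second definition of add */
EOF
cat > sub.c <<'EOF'
int subtract(int a, int b) { return a - b; }
EOF
cat > main.c <<'EOF'
int add(int a, int b);
int subtract(int a, int b);
int main(void) { return add(1, 2) + subtract(2, 1) == 4 ? 0 : 1; }
EOF

gcc -c add.c dup.c sub.c main.c

# 1. every promise fulfilled exactly once: links correctly
gcc -o ok main.o add.o sub.o

# 2. add defined twice: "multiple definition" linker error
gcc -o bad1 main.o add.o dup.o sub.o || echo "multiple definition"

# 3. subtract never defined: "undefined reference" linker error
gcc -o bad2 main.o add.o || echo "undefined reference"
```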
CMake
This is where other tools are worth mentioning, such as CMake. In CMake you can define static and shared libraries. A static library is essentially an archive of object files: a self-contained unit that contains everything needed for that portion of code to run/make sense. To continue the metaphor from before, to make this portion of the meal taste good!
In the previous example we were using the `add` and `subtract` functions. You might want to group these into a static library with other mathematical functions, so that they can be reused/referenced in different places.

In CMake terms this would be something like the following, where we define a library called "libint" built from our input `add.c` and `sub.c` files:

```cmake
add_library(libint STATIC add.c sub.c)
```
Throughout a CMake project you can then reuse these static libraries by linking against them with the `target_link_libraries` command:

```cmake
target_link_libraries(... libint)
```
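A minimal sketch of how these two commands fit into a full `CMakeLists.txt` (project and target names are hypothetical):

```cmake
cmake_minimum_required(VERSION 3.10)
project(maths C)

# the static library built from our two source files
add_library(libint STATIC add.c sub.c)

# an executable that consumes it; the linker pulls in what it needs
add_executable(app main.c)
target_link_libraries(app libint)
```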
During the linking process, the linker searches for all of the still-undefined symbols; if it finds a definition in the static library, it copies it into the final executable. Any code in the static library that is never referenced won't be included in the final executable.
The alternative to static libraries/linking is dynamic linking (in CMake these are called shared libraries). Here the linking doesn't happen at build time but at runtime: when the program executable runs, it attempts to resolve any symbols it still doesn't know about. Think of `.dll` files on Windows, which stands for dynamic-link library; a process might attempt to load one of these at runtime. The advantage is that the distributed code can be smaller, because it relies on a common, already-distributed file (such as a `.dll`) that is loaded when the program runs. However, if this file can't be found, or isn't in the expected location, the program will fail to execute.
The main difference when implementing this in CMake is changing the `STATIC` keyword to `SHARED`:

```cmake
add_library(libint SHARED add.c sub.c)
```
Pros and Cons

- Dynamic linking produces smaller binaries than static linking.
  - However, the shared code must be resolvable at runtime, and distributed ahead of time.
- Static linking will likely have longer build times, as all of the dependencies are baked into the one output file rather than loaded at runtime as in a dynamically linked project.
- Dynamic linking can reduce memory overhead: multiple programs can share the same library already loaded into memory, whereas statically linked code is copied into each executable and loaded into memory separately when run.
- Dynamic linking is more modular and therefore easier to update: if the library is updated, just that shared file needs replacing, whereas if a static library is updated the whole program likely needs to be relinked.
  - However, versioning can be difficult, especially if multiple programs require different versions of the same library. Statically linked programs control, and bake in, which version of the library they use.
- Dynamic linking can impact start-up performance, as symbol resolution happens at load time. Static linking avoids this, as linking is done during the build.
- Dynamic linking can require the end user to solve dependency issues specific to their machine or operating system, whereas a statically linked program comes "ready to go".
  - Static linking also allows the author of the code to control versioning and specify which libraries are used in which scenarios, so all end users should experience the same behaviour.