Tag: mindfulness

  • 4 Must-Have C++17 Features for Embedded Developers

    C++17 is the version of the C++ programming language that was standardized in 2017, and it adds a number of new features and improvements to what is considered “modern C++”. Some major new features that I have really loved in C++17 include:

    • Structured bindings: Allows you to easily extract multiple variables from a single tuple or struct, and to use them in a more readable way.
    • Inline variables: Allows for the definition of variables with the “inline” keyword in classes and elsewhere.
    • New attributes: Allows for more readable code by marking certain areas with specific attributes that the compiler understands.
    • std::shared_mutex: Allows for multiple “shared” locks and a single “exclusive” lock. This is basically a standard read/write mutex!

    In embedded systems, support for C++17 will depend on the compiler and platform you are using. Popular compilers such as GCC and Clang have supported C++17 for some time. However, support for C++17 in your project may still be limited due to a lack of vendor toolchain support and the need to maintain backwards compatibility with older systems.
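
    If you are not sure whether your toolchain is building with C++17, the standard usually has to be requested explicitly. As a quick sketch (assuming a GCC or Clang toolchain, which may not match your setup):

    # Request the C++17 standard explicitly when compiling
    g++ -std=c++17 -o app main.cpp
    clang++ -std=c++17 -o app main.cpp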

    A good summary of the new features and improvements to the language can be found on cppreference.com. They also provide a nice table showing feature support across various compilers.

    Structured Bindings

    I already wrote about my love for structured bindings in another post (and another as well), but this article would be incomplete without listing this feature!

    My main use of structured bindings is to extract named variables from a std::tuple, whether that be a tuple of my own creation or one returned to me by a function.
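
    As a quick sketch (the function and variable names here are mine, not from those posts), unpacking a tuple returned from a function looks like this:

    #include <iostream>
    #include <string>
    #include <tuple>

    std::tuple<int, std::string> getStatus()
    {
        return {42, "OK"};
    }

    int main()
    {
        // Unpack the returned tuple directly into named variables
        auto [code, message] = getStatus();
        std::cout << code << ": " << message << std::endl;
        return 0;
    }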

    But I just realized there is another use for them that makes things so much better when iterating over std::maps:

    // What I used to write
    for (const auto &val : myMap)
    {
        // Print out map data, key:value
        std::cout << val.first << ':' << val.second << std::endl;
    }
    
    // What I now write
    for (const auto &[key, value] : myMap)
    {
        // Print out map data, key:value
        std::cout << key << ':' << value << std::endl;
    }

    The second for loop is much more readable, and it is much easier for a maintainer to understand what is going on. Conscious coding at its best!

    Inline Variables

    An inline variable is a variable that is defined in a header file and is guaranteed to refer to the same entity in every translation unit that includes that header. The linker merges the definitions, so the whole program shares a single copy of the variable, as opposed to each translation unit getting its own copy (as happens with an ordinary const global defined in a header).

    One of the main benefits of inline variables is that they allow for better control over the storage duration and linkage of variables, which can be useful for creating more flexible and maintainable code.

    I find this new feature most useful when declaring static variables inside my class. Now I can simply declare a static class variable like this:

    class MyClass
    {
      public:
        static const inline std::string MyName = "MyPiClass";
        static constexpr inline double MYPI = 3.14159265359;
        static constexpr double TWOPI = 2*MYPI; // note that inline is not required here because constexpr implies inline
    };

    This becomes especially useful when you would like to keep a library to a single header, since you can avoid hacks that complicate the code and make it less readable.
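
    As an illustration (a sketch of my own, not from an actual library), a header-only library can now define its shared constants directly in the header:

    // mylib.hpp -- safe to include from any number of translation units
    #pragma once
    #include <string>

    namespace mylib
    {
        // One shared definition across the whole program; no separate .cpp file needed
        inline const std::string LibraryName = "MyLib";
        // At namespace scope, constexpr alone would give internal linkage, so keep the inline
        inline constexpr int MaxRetries = 3;
    }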

    New C++ Attributes

    C++17 introduces a few standard attributes that make annotating your code for the compiler much nicer. These attributes are as follows.

    [[fallthrough]]

    This attribute is used to allow a case body to fall through to the next one without compiler warnings.

    In the example given by C++ Core Guidelines ES.78, having a case body fall through to the next case leads to hard-to-find bugs, and it is just plain hard to read. However, there are certain instances where this is absolutely appropriate. In those cases, you simply add the [[fallthrough]]; attribute where the break statement would normally go.

    switch (eventType) {
    case Information:
        update_status_bar();
        break;
    case Warning:
        write_event_log();
        [[fallthrough]];
    case Error:
        display_error_window();
        break;
    }

    [[maybe_unused]]

    This attribute is used to mark entities that might be unused, to prevent compiler warnings.

    Normally, when writing functions that take arguments that the body does not use, the approach has been to cast those arguments to void to eliminate the warning. Many static analyzers even recommend this. The Core Guidelines suggest simply not providing a name for those arguments, which is the preferred approach.

    However, in the cases where the argument is conditionally used, you can mark the argument with the [[maybe_unused]] attribute. This communicates to the maintainer that the argument is not used in all cases, but is still required.

    RPC::Status RPCServer::HandleStatusRequest(
        const RPC::StatusRequest &r [[maybe_unused]])
    {
      if (!m_ready)
      {
        return RPC::Status();
      }
      return ProcessStatus(r);
    }

    This attribute can also be used to mark a static function as possibly unused, such as when it is only used conditionally based on whether DEBUG builds are enabled.

    [[maybe_unused]] static std::string toString(const ProcessState state);

    [[nodiscard]]

    This attribute is extremely useful when writing robust and reliable library code. When you mark a function or method with this attribute, the compiler will generate a warning (or an error, when warnings are treated as errors) whenever the return value is discarded.

    Many times, developers will discard return values by casting the function call to void, like this.

    (void)printf("Testing...");

    This is against the Core Guidelines ES.48, but how do you get the compiler to generate diagnostics for your own functions in a portable, standard way? With [[nodiscard]]. When a developer fails to check the return value of your function (i.e., they don’t store the result in some variable or use it in a conditional), the compiler will tell them there is a problem.

    // This code will generate a compiler error
    [[nodiscard]] bool checkError(void)
    {
      return true;
    }
    
    int main(void)
    {
      checkError();
      return 0;
    }
    
    // Error generated
    scratch.cpp:36:13: error: ignoring return value of ‘bool checkError()’, declared with attribute ‘nodiscard’ [-Werror=unused-result]
       36 |   checkError();
    
    
    // However, this takes care of the issue because we utilize the return value
    if (checkError())
    {
      std::cout << "An error occurred!" << std::endl;
    }

    I love how this can be used to make your users think about what they are doing!

    Shared (Read/Write) Mutex

    A shared lock allows multiple threads to simultaneously read a shared resource, while an exclusive lock allows a single thread to modify the resource.

    In many instances, it is desirable to have a lock that protects readers from reading stale data. That is the whole purpose of a mutex. However, with a standard mutex, if one reader holds the lock, then additional readers have to wait for the lock to be released. When you have many readers trying to acquire the same lock, this can result in unnecessarily long wait times.

    With a shared lock (or a read/write mutex), the concept of a read lock and a write lock are introduced. A reader must wait for the write lock to be released, but will simply increment the read lock counter when taking the lock. A writer, on the other hand, must wait for all read and write locks to be released before it can acquire a write lock. Essentially, readers acquire and hold a shared lock, while writers acquire and hold an exclusive lock.

    Here is an example of how to use a shared_mutex:

    #include <iostream>
    #include <thread>
    #include <mutex>         // std::unique_lock
    #include <shared_mutex>  // std::shared_mutex, std::shared_lock
    
    std::shared_mutex mtx;
    int sharedCount = 0;
    
    void writevalue()
    {
        for (int i = 0; i < 10000; ++i)
        {
            // Get an exclusive (write) lock on the shared_mutex
            std::unique_lock<std::shared_mutex> lock(mtx);
            ++sharedCount;
        }
    }
    
    void read()
    {
        for (int i = 0; i < 10000; ++i)
        {
            // Get a shared (read) lock on the shared_mutex
            std::shared_lock<std::shared_mutex> lock(mtx);
            std::cout << sharedCount << std::endl;
        }
    }
    
    int main()
    {
        std::thread t1(writevalue);
        std::thread t2(read);
        std::thread t3(read);
    
        t1.join();
        t2.join();
        t3.join();
    
        std::cout << sharedCount << std::endl;
        return 0;
    }
    

    Here you can see that std::shared_mutex is used to protect the shared resource sharedCount. Thread t1 increments the counter using an exclusive lock, while threads t2 and t3 read the counter using shared locks. This allows concurrent read operations alongside exclusive write operations, and it greatly improves performance when you have a high number of read operations relative to write operations.

    With C++17, this type of lock is standardized and part of the language. I have found it extremely useful, and it makes my code that much more portable when I need this type of mutex!


    C++17 offers a wide range of new features that provide significant improvements to the C++ programming language. The addition of structured bindings, inline variables, and the new attributes make code more readable and easier to maintain, and the introduction of the “std::shared_mutex” type provides performance improvements in the situations where that type of lock makes sense. Overall, C++17 provides an even more modern and efficient programming experience for developers. I encourage you to start exploring and utilizing these new features in your own projects!

  • Optimizing Boot Time on an iMX6ULL Processor

    I recently had the opportunity, or the critical need rather, to optimize the boot time of some applications on an iMX6ULL processor. For anyone unfamiliar with this processor, it is a single-core ARM processor with a Cortex-A7 running at up to 900 MHz. It provides a number of common interfaces for a microprocessor, including 16-bit DDR, NAND/NOR flash, eMMC, Quad SPI, UART, I2C, SPI, etc.

    This particular implementation is running a recent Linux kernel with multiple Docker applications to execute the business logic. The problem was that the time from power on to the system being ready and available was over 6 minutes! Obviously, this was a huge issue and not acceptable performance from a user perspective, so I was tasked with reducing that boot time. I was not given any acceptable performance numbers, so I started by just doing the best that I could with the hardware.

    TL;DR

    By optimizing the kernel configuration and boot time of my applications, I was able to cut the boot time in half. Through this process, I was also able to find some issues in the hardware design of our board that we were able to address to improve read/write speeds to our eMMC.

    In total, I was able to shave over three minutes off of our boot time. It still is not optimal, but I am waiting on the hardware rework before continuing to see how much more is required.


    Boot/Initialization Period        Original Time           Optimized Time
    Bootloader and Kernel Load        30 seconds              22 seconds
    systemd Service Initialization    5 minutes 30 seconds    2 minutes 30 seconds
    Optimization Summary

    Approach

    My typical approach to optimizing a Linux system always starts by working to optimize the kernel. This can usually save only a few seconds of boot time, but it was what I was most familiar with, so that is where I began.

    The Linux kernel is designed to allow various drivers and features to be enabled or disabled at runtime via kernel modules. However, depending on your system and requirements, having everything available in the kernel by way of modules can significantly slow down your kernel load times. My approach when working in embedded systems with custom hardware is to optimize the kernel configuration so that all the necessary drivers and all critical features are built into the kernel. Everything else that may be needed later at runtime by applications can be left as a kernel module. All features that are definitely not required are completely removed.

    Please note that there is a fine line here when optimizing your kernel. If you make everything built in, that bloats the size of your kernel, which means it will take longer to read from disk. So leaving some things as modules is okay to keep your kernel size smaller, thus optimizing load time. But if you have too many modules, they have to be loaded before the parts of the system that depend on them can start, so there is a balance to be struck.

    After I focused on the kernel, I turned my eye toward optimizing the run time and boot applications. Our system made use of systemd to manage system initialization, so it was clear that the right tool to use was going to be systemd-analyze. This is an extremely useful tool to use when you need to see how all the services interact one with another during initialization. You can get a list of how long each service took during the initialization process, view a graph of how those are all related, and even see the critical chain of services through initialization. The typical process to optimize your initialization is two-fold: a) identify what services are being run that you do not require and can turn off, and b) identify what services are taking an abnormal amount of time, so you can optimize those specific services.

    Finally, after the kernel was optimized, and I had removed all the unnecessary services, I was able to focus on the few services and applications that were simply taking forever to start up.

    Kernel Optimization

    In order to optimize the kernel, I had to make use of a few commands in the Yocto build system to manipulate the kernel configuration. In Yocto, most systems make use of kernel configuration fragments, which are a really easy and clean way to manage certain features and drivers in a hardware and kernel agnostic fashion.

    The way this works is that your primary kernel recipe in Yocto will manage a default kernel configuration that defines typical features and drivers for that version of the kernel. This primary layer typically will come from the microprocessor vendor, in this case from NXP. Your hardware layer will provide configuration fragments that override those features. Finally, you can have additional meta layers that provide additional fragments if necessary. You can read more about kernel configuration in the Yocto Project documentation here.

    bitbake linux-lmp-fslc-imx-rt -c menuconfig
    Using menuconfig from bitbake.

    When dealing with optimizations, however, you are typically overriding much of the default configuration. Rather than editing the defconfig file itself, or providing an entirely new one, I stuck with the fragment approach.

    # Create a kernel configuration from the default configuration
    # (i.e., build the kernel recipe through the configure step)
    bitbake linux-yocto -c kernel_configme -f
    
    # Make the necessary configuration changes desired for the fragment
    bitbake linux-yocto -c menuconfig
    # Exit menuconfig, saving the configuration
    
    # Create the fragment
    bitbake linux-yocto -c diffconfig
    
    # Copy the fragment to your repository and add to your recipe(s)

    You can read more about creating kernel configuration fragments in this quick tip.

    When editing the kernel configuration via menuconfig, I made all my desired changes and then generated the configuration fragment and built the kernel. This resulted in many QA warnings about certain features being defined in the configuration chain but not ending up in the final configuration. Those warnings simply mean that the build system is having a hard time verifying what the correct configuration should be. This is usually because your defconfig file contains a CONFIG_* option that is not part of the final configuration because you implicitly removed it (i.e., you removed a feature that the specific configuration depends on). To address this, simply take the CONFIG_* option mentioned in the QA warning and drop it into your optimization fragment with “=n” at the end to explicitly disable it.
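
    For reference, an optimization fragment ends up looking something like this (the option names are illustrative examples, not the exact set from my board):

    # boot-optimization.cfg -- example kernel configuration fragment
    # Build the i.MX eSDHC (eMMC/SD) driver into the kernel
    CONFIG_MMC_SDHCI_ESDHC_IMX=y
    # Leave a non-critical feature as a loadable module
    CONFIG_USB_SERIAL=m
    # Explicitly disable an option flagged by a QA warning
    CONFIG_DEBUG_INFO=n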

    Initialization Optimization via systemd-analyze

    systemd is a popular choice for managing system components in Linux systems. Most modern Linux systems make use of it in one way or another. It also provides a wide range of tools to help administrators manage their systems.

    In my case, since my system was using systemd to manage all the run time services and initialization, I was able to make use of the systemd-analyze tool. Here is a short snippet from the man page:

    systemd-analyze may be used to determine system boot-up performance statistics and retrieve other state and tracing information from the system and service manager, and to verify the correctness of unit files. It is also used to access special functions useful for advanced system manager debugging.

    For this exercise, I made use of the commands blame, plot, dot, and critical-chain.

    To start my debug process, I wanted to know at a high level what services were taking the longest to initialize. To do this, I made use of the blame and plot commands.

    systemd-analyze blame will look at all the services that systemd manages and provide a sorted list of how long each took to initialize. This was exactly the kind of information I was after and gave me a starting point in my search for what to optimize. However, when looking at this data you have to be a little careful, because services are often interdependent. The initialization time of one service could be really long simply because it cannot finish initializing until another service has completed its own initialization.

    user@localhost:~$ systemd-analyze blame
    22.179s NetworkManager-wait-online.service
    21.986s docker-vxcan.service
    19.405s docker.service
    14.119s dev-disk-by\x2dlabel-otaroot.device
    12.161s systemd-resolved.service
    10.973s systemd-logind.service
    10.673s containerd.service
     9.702s systemd-networkd.service
     9.443s systemd-networkd-wait-online.service
     6.789s ModemManager.service
     5.690s fio-docker-fsck.service
     5.676s systemd-udev-trigger.service
     5.552s systemd-modules-load.service
     5.127s btattach.service
     5.062s user@1001.service
     4.670s sshdgenkeys.service
     3.793s NetworkManager.service
     3.780s systemd-journald.service
     2.945s systemd-timesyncd.service
     2.837s bluetooth.service
     2.409s systemd-udevd.service
     2.084s zram-swap.service
     1.677s systemd-userdbd.service
     1.621s avahi-daemon.service
     1.284s systemd-remount-fs.service
     1.080s dev-mqueue.mount
     1.025s sys-kernel-debug.mount
     1.015s modprobe@fuse.service
     1.010s sys-kernel-tracing.mount
     1.004s modprobe@configfs.service
     1.004s modprobe@drm.service
      997ms kmod-static-nodes.service
      871ms systemd-rfkill.service
      832ms systemd-journal-catalog-update.service
      732ms systemd-tmpfiles-setup.service
      668ms systemd-sysusers.service
      592ms systemd-user-sessions.service
      562ms systemd-tmpfiles-setup-dev.service
      533ms ip6tables.service
      507ms iptables.service
      464ms systemd-sysctl.service
      390ms systemd-journal-flush.service
      364ms systemd-random-seed.service
      359ms systemd-update-utmp-runlevel.service
      346ms systemd-update-utmp.service
      318ms user-runtime-dir@1001.service
      232ms tmp.mount
      220ms var.mount
      219ms sys-fs-fuse-connections.mount
      210ms var-rootdirs-mnt-boot.mount
      207ms systemd-update-done.service
      201ms var-volatile.mount
      165ms sys-kernel-config.mount
      147ms docker.socket
      140ms sshd.socket
      134ms ostree-remount.service
      123ms dev-zram0.swap

    Because blame doesn’t show any dependency information, systemd-analyze plot > file.svg was the next tool in my quiver to help. This command will generate the same information as blame but will place it all in a nice plot form, so you can see what services started first, how long each took, and also pick out some dependencies between services.

    Section of plot generated with systemd-analyze plot command.

    My main use of the blame and plot commands was to identify services that were obviously taking a lot of time, but also to identify services that were simply not required. systemctl --type=service also helped with this process by simply listing all the services that systemd had enabled.

    Dependencies can still be hard to definitively spot when just looking at the plot, however. Because of that, the systemd-analyze dot '<pattern>' command is really handy. When optimizing my boot time, I would use blame and plot to identify potential culprits and then run them through dot to see how they were related. For example, I found that my network configuration was taking an abnormal amount of time, so I took a look at systemd-analyze dot '*network*.*' to see how systemd-networkd and the related services were interacting with one another. This led me to some information that helped me understand that half of the services being started in support of the network were not actually required (such as NetworkManager and the IPv6 support services). By disabling those, I was able to save over 30 seconds of boot time, simply by removing a couple of services that were not required.
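
    For reference, the commands looked roughly like this (the unit names are from my system and will differ on yours):

    # Render the dependency graph for network-related units (requires graphviz)
    systemd-analyze dot '*network*.*' | dot -Tsvg > network-deps.svg

    # Disable the units that turned out to be unnecessary
    sudo systemctl disable --now NetworkManager.service NetworkManager-wait-online.service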

    Finally, I was able to make use of the systemd-analyze critical-chain command to view just the critical chain of services. This command simply prints the time-critical chain of services that lead to a fully-initialized system. Only the services taking the most amount of time are shown in this chain.

    user@localhost:~$ systemd-analyze critical-chain
    The time when unit became active or started is printed after the "@" character.
    The time the unit took to start is printed after the "+" character.
    
    multi-user.target @1min 21.457s
    `-docker.service @1min 3.755s +17.683s
      `-fio-docker-fsck.service @26.784s +36.939s
        `-basic.target @26.517s
          `-sockets.target @26.493s
            `-sshd.socket @26.198s +258ms
              `-sysinit.target @25.799s
                `-systemd-timesyncd.service @23.827s +1.922s
                  `-systemd-tmpfiles-setup.service @23.224s +518ms
                    `-local-fs.target @22.887s
                      `-var-volatile.mount @22.596s +222ms
                        `-swap.target @22.393s
                          `-dev-zram0.swap @22.208s +113ms
                            `-dev-zram0.device @22.136s

    This information is also useful because it is more of a quick and dirty way of getting to the same information as using blame, plot, and dot separately. However, because it doesn’t show all services, it can only help you optimize the worst offenders.

    Application Optimization

    Finally, after all the kernel and service optimizations, I still was seeing that a couple application services were taking the majority of the time to start up. Specifically, these applications were Docker and a Docker plugin for managing CAN networks.

    These services were the last of the service chain to start up, so there were no other dependent services they were waiting on. Once the network was up and configured, these services would start up. Because they were not dependent on any other services, I was able to laser focus in on what was causing those applications to take so long to start and optimize them directly.

    First, Docker Compose was taking well over two minutes to start and load containers. Second, my Docker plugin for CAN networks was taking well over 20 seconds to start up as well.

    When I checked my version of Docker Compose, I found that it was still running a Python-based version of Compose, rather than the newer and faster Golang-based version. By upgrading my version of Compose, I was able to reduce my startup time from well over two minutes to about 1 minute 40 seconds.
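
    If you are not sure which implementation your image ships, a quick check looks like this (the exact output will vary by version):

    # Compose V1 is a standalone Python tool
    docker-compose version

    # Compose V2 is a Go-based plugin for the Docker CLI
    docker compose version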

    I also found that the CAN network plugin was written in Python as well. So, rather than continue using it, I rewrote that plugin in Rust, which also gave me the opportunity to fix a couple of shortcomings I found in the plugin itself. This reduced the initialization time of the plugin from over 30 seconds to under a second — a huge savings!

    488ms docker-vxcan.service

    Conclusion

    Overall, this was a great exercise for me in the steps to optimize the boot process of a Linux system. While I certainly could optimize the system some more, I believe the gains to be had will be minimal — at least until I have some new hardware with the flash memory interfaces running at full speed. Then I can revisit this and see if I can get things any faster!

    What is your experience with speeding up Linux boot and initialization on embedded systems? Any tips and tricks you would like to share? Comment them below!

  • Why Writing Good Comments Makes You a Great Developer

    Commented, xkcd.com #156

    When you think of a great developer, I’m sure someone who writes good comments often is not at the top of the list. However, writing good comments is one of the most important skills a developer can have. Good comments not only help you understand your code better, but they also make it easier for others to read and work with. In this blog post, we’ll look at why writing good comments makes you a great developer and some tips for improving your commenting style. Because if you are mindful in your commenting, it is an indication that you are mindful in your coding!

    Comments Should Be Present

    Well-written code comments are like a good road map. They provide clear direction and helpful information that can make working with code much easier. Good code comments can be incredibly useful, providing critical insights and details that might otherwise be easy to miss. Think of them as important signposts along the way that can save a developer hours of debugging.

    Here is an example of something I came across recently that was not obvious. I was writing a CMake function to add unit tests using CTest, and I was passing in a CMake string as my “TEST_COMMAND” variable. When I would call add_test with that variable as the value for the COMMAND option, the test would fail to run properly, especially if the command took command-line arguments! After spending some time digging, I learned that the COMMAND option to add_test should be a CMake list rather than a string for the arguments to be passed properly.

    I commented my CMakeLists.txt as such to ensure that was clear to the reader.

    # TRICKY: Change the command string to a CMake list for use in add_test()
    string(REPLACE " " ";" TEST_LIST ${TEST_COMMAND})
    add_test(NAME ${TEST_NAME} COMMAND ${TEST_LIST})

    Without the “TRICKY” comment, a maintainer of this code may look at this and see potential for an optimization, removing the conversion, and then they would be searching for solutions to the same problem I had already solved.

    Comments Should Use Proper Spelling, Casing, Grammar, and Full Sentences

    Good code comments are spelled correctly. They are also properly cased. This attention to detail shows that the programmer cares about their work.

    Take a look at the two examples of code below. Which one would you say is written by a mindful developer? Which one would you rather work to maintain?

    // copute ac/a+c
    double prodOverSum(int a, double c)
    
    {// git the nmeratr for the rtn
      double n = (double)a * c;
    
       // get the denomination for the value
      int d = a + (int)c;
    
    /// comput and return the quotient
      return n / (double)d;
    }

    // Compute the product over sum for
    // the provided values, a and c.
    //        (A * C)
    //   X = ---------
    //        (A + C)
    double prodOverSum(int a, double c)
    {
      double prod = (double)a * c;
      int sum = a + (int)c;
    
      // Cast sum to a double to ensure
      // the compiler does not promote prod
      // to an integer and perform integer
      // division
      return prod / (double)sum;
    }

    It’s clear that the programmer cares about their craft when they put so much effort into writing clear, readable comments. It would be almost impossible to maintain this level of detail by chance, which makes me believe it is intentional as opposed to just being accidental! That gives me confidence that the code itself is well-written, properly tested, and ready for use.

    When writing comments, it is important to use full sentences with proper grammar as well. This will help ensure that your comments are clear and easy to understand. Additionally, using proper grammar will help to give your comments a more professional appearance.

    Comments Should Be Smartly Formatted

    Comments are meant to convey a message about the surrounding code to the developer. Sometimes information is best conveyed in a particular format. So, when commenting your code, ensure that your comment is formatted in such a way that conveys your message as clearly and concisely as possible.

    Code formatters can help and hinder this. If your comments require lots of horizontal scrolling to read, then consider breaking them into multiple lines or rewording to be more concise! However, sometimes a new line in the middle of your documentation is undesirable and you will need to instruct your formatter to leave it alone by wrapping with “control comments”.

    For example, consider this method. If I was to line up all the columns in the table neatly, this would make for some very long lines of text. Most formatters would break these lines into multiple ones. Instead, make judicious use of white space to get the message across to the reader. If you have to use multiple lines, you decide where those line breaks are – don’t leave it up to your formatter!

    void Quaternion2DCM(const double * const q, double * const dcm)
    {
      // Don't do this! Your formatter will either add new lines or ignore this
      // if you add protection blocks around the table, making for really long 
      // lines that are harder to read.
      // To compute the DCM given a quaternion, the following definition is used
      //       +-------------------------------------------------------------------------------------------+
      //       | (q4^2 + q1^2 - q2^2 - q3^2)    2*(q1q2 + q3q4)                2*(q1q3 - q2q4)             |
      // dcm = | 2*(q1q2 - q3q4)                (q4^2 - q1^2 + q2^2 - q3^2)    2*(q2q3 - q1q4)             |
      //       | 2*(q1q3 + q2q4)                2*(q2q3 − q1q4)                (q4^2 - q1^2 - q2^2 + q3^2) |
      //       +-------------------------------------------------------------------------------------------+
      // clang-format off
      // Adapted from https://www.vectornav.com/resources/inertial-navigation-primer/math-fundamentals/math-attitudetran
      // clang-format on
      dcm[0] = q[3]*q[3] + q[0]*q[0] - q[1]*q[1] - q[2]*q[2];
      dcm[1] = 2*(q[0]*q[1] + q[2]*q[3]);
    ...
      dcm[7] = 2*(q[1]*q[2] - q[0]*q[3]);
      dcm[8] = q[3]*q[3] - q[0]*q[0] - q[1]*q[1] + q[2]*q[2];
    }

    void Quaternion2DCM(const double * const q, double * const dcm)
    {
      // Instead, you can do this - just simple white space, still very readable
      // by the user and it fits on a single line!
      // To compute the DCM given a quaternion, the following definition is used
      //       +-------------------------------------------------------------------+
      //       | (q4^2 + q1^2 - q2^2 - q3^2)    2*(q1q2 + q3q4)    2*(q1q3 - q2q4) |
      // dcm = | 2*(q1q2 - q3q4)    (q4^2 - q1^2 + q2^2 - q3^2)    2*(q2q3 - q1q4) |
      //       | 2*(q1q3 + q2q4)    2*(q2q3 − q1q4)    (q4^2 - q1^2 - q2^2 + q3^2) |
      //       +-------------------------------------------------------------------+
      // Adapted from https://www.vectornav.com/resources/inertial-navigation-primer/math-fundamentals/math-attitudetran
      dcm[0] = q[3]*q[3] + q[0]*q[0] - q[1]*q[1] - q[2]*q[2];
      dcm[1] = 2*(q[0]*q[1] + q[2]*q[3]);
    ...
      dcm[7] = 2*(q[1]*q[2] - q[0]*q[3]);
      dcm[8] = q[3]*q[3] - q[0]*q[0] - q[1]*q[1] + q[2]*q[2];
    }

    How Much Should I Comment?

    Good code will be somewhat self-documenting, but there is always a limit. For example, the method below is so obvious I don’t really need to comment on it, do I?

    int sum(const int a, const int b)
    {
      return a + b;
    }

    However, for something more involved, comments can clarify a lot of things for the developer and can link them to more information, as in the example of the Quaternion2DCM method described above.

    So, then, how do you define what is obvious? For me, I think in terms of my average user and/or maintainer. What sort of things do I expect them to understand? What about more junior software engineers who may need to work in this code? Basic math and logic knowledge seems okay. Syntax is a given; even more advanced syntax, such as lambdas or function pointers, I would expect them to be able to read. However, anything beyond that typically indicates the need for a detailed comment that explains things.

    It also helps me to think in terms of what will help me understand this design decision tomorrow, or 6 months from now, or even a year from now. Maybe it is obvious to me now what this conditional with multiple clauses means and why it is this way in the design, but I’ll likely forget tomorrow and want to refactor.

    Comments In Your IDE

    To make working with comments easier, look for ways to get your IDE to help you! I use VS Code for nearly all my coding right now and I found the extension Better Comments to be extremely helpful.

    Image Credit: Better Comments Plugin

    With this plugin I can add additional formatting and mark comments specially. For example, in my code I tend to leave myself reminders using TODO comments, and I will often prioritize them by using a plain TODO or prefixing it with ‘*’ or ‘!’.

    // ! TODO: This is an important TODO that needs to be taken care of immediately
    // * TODO: This is an important TODO that should be taken care of soon
    // TODO: This is a TODO that should be taken care of eventually

    With this plugin my comments are color-coded for me, making it easy to see what needs to be done first.


    In the end, your code is a reflection of you, so it only makes sense that your commenting reflects how much care and attention to detail went into what you wrote. Poor commenting sends a message of its own: people will need more time before investing any kind of faith in your code, or, even worse, they’ll just move on to another, potentially better implementation!

    What do you think? What makes a good comment in your book? Let us know in the comments below!

  • Quick Tip: Python Variables in Multithreaded Applications and Why Your Choice Matters

    When working with Python classes, you have two kinds of variables you can work with: class variables and instance variables. We won’t get deep into the similarities and differences here, nor various use cases for each kind. For that, I’ll refer you to this article. For our purposes here, just know that class variables and instance variables both have their own unique uses.

    In this post, we’ll be looking at one particular use case: using Python class and instance variables in multithreaded applications. Why does this matter? As with any programming language, choice of variable types can be critical to the success or failure of your application.

    With that in mind, let’s explore some of the options you have when working with Python variables in a multithreaded context. Knowing which option to choose can make all the difference.

    Class variables are declared at the top of your class and can be accessed by every object that belongs to that class; all instances share a single copy of each class variable. Instance variables, on the other hand, are accessed through “self” and are tied to a specific instance. Each instance of an object will have its own set of instance variables, but all instances share the class variables.

    Class variables look like this:

    class obj:
      data = [0,0,0]

    Instance variables look like this:

    class obj:
      def __init__(self):
        self.data = [0,0,0]

    I recently ran into an issue with class vs. instance variables in a multithreaded application. A colleague of mine was debugging a UI application that was communicating on an interface and shipping the received data off to a UI for display. He found it would crash randomly, so we switched up the architecture a bit to receive data on one thread and then pass it through a queue to the UI thread to consume. This seemed to resolve the crash, but the data in the UI was wrong.

    Digging into the problem, we found that the data was changing as it passed through the queue. After some more digging, my colleague realized that the class that was implemented to push the data through the queue was utilizing class variables instead of instance variables.

    This simple program illustrates the issue:

    import queue
    import threading
    import time
    
    q1 = queue.Queue()
    
    class obj:
      id = 0
      data = [0,0,0]
    
    def thread_fn(type):
        d = q1.get()
        preprocessing = d.data
        time.sleep(3)
    
        # Check data members post "processing", after modified by other thread
        if preprocessing != d.data:
          print(f"{type}: Before data: {preprocessing} != After data: {d.data}")
        else:
          print(f"{type}: Before data: {preprocessing} == After data: {d.data}")
    
    if __name__ == "__main__":
        x = threading.Thread(target=thread_fn, args=("ClassVars",))
        obj.id = 1
        obj.data = [1,2,3]
        q1.put(obj)
    
        x.start()
    
        # Update the data
        obj.id = 2
        obj.data = [4,5,6]
        q1.put(obj)
    
        x.join()

    Essentially what was happening is that the data would be received on the interface (in this case, in the main function) and put into the queue. As the UI was getting around to processing that data, new data would be received on the interface and put into the queue. Since class variables were used originally (and the class object was used directly), the old data got overwritten with the new data in the class, so the UI would have the wrong data and generate errors during processing.

    Once the underlying message class was changed to use instance variables, the “bad data” issue went away and the original problem of the crashing application was also resolved with the architecture change. Take a look at the difference in this program:

    import queue
    import threading
    import time
    
    q1 = queue.Queue()
    
    class obj:
      def __init__(self):
        self.id = 0
        self.data = ['x','y','z']
    
    def thread_fn(type):
        d = q1.get()
        preprocessing = d.data
        time.sleep(3)
    
        # Check data members post "processing", after modified by other thread
        if preprocessing != d.data:
          print(f"{type}: Before data: {preprocessing} != After data: {d.data}")
        else:
          print(f"{type}: Before data: {preprocessing} == After data: {d.data}")
    
    if __name__ == "__main__":
        x = threading.Thread(target=thread_fn, args=("InstanceVars",))
        obj1 = obj()
        obj1.id = 1
        obj1.data = [1,2,3]
        q1.put(obj1)
    
        x.start()
    
        # Update the data
        obj2 = obj()
        obj2.id = 2
        obj2.data = [4,5,6]
        q1.put(obj2)
    
        x.join()

    As you can see, using instance variables requires that we create an instance of each object to begin with. This ensures that each object created has its own data members that are independent of the other instances, which is exactly what we required in this scenario. This single change alone would likely have cleaned up the issues we were seeing, but it would not have fixed the root of the problem.

    When passing through the queue, the thread would get each instance and use the correct data for processing. Nothing in the thread function had to change; only how the data feeding it was set up.

    Python is a great language, but it definitely has its quirks. The next time you hit a snag while trying to parallelize your application, take a step back and understand the features of your programming language. Understanding the peculiarities of your chosen language is the mark of a mindful programmer! With a bit of space to gather your wits and some careful, conscious coding, you can avoid these pesky pitfalls and create fast, reliable threaded applications. What other Python nuances have bitten you in the past? Let us know in the comments below!

  • Quick Tip: It’s Time to Avoid the Frustration of Single Return Types in C++

    When designing a new API, one of the things I put a lot of thought into is how the user will know whether the API call was successful or not. I don’t want to levy large error-checking requirements on my users, but in C/C++ a function can only return a single value, so many APIs pass the real output back through a reference argument in the function prototype and a simple Boolean or error code as the return value. I find this clunky and hard to document, so I dug my heels in to find a better way.

    std::tuple and std::tie are two useful C++ features that can help you return multiple values from a function. std::tuple is a container that holds a fixed-size collection of values, while std::tie ties references to existing variables together so that a returned tuple can be unpacked into them. In this post, we’ll take a look at how to use these two features to make returning multiple values from a function easier.

    std::tuple and std::pair

    std::tuple (and std::pair) are C++ templates that allow you to combine two (std::pair) or more (std::tuple) objects and pass them around as if they were one. They are the clear choice for combining multiple outputs from a function into a single return type. This creates clean, self-documenting code that is easy for a user to understand and follow.

    For example, let’s say we were dealing with a factory that created our objects. We’d have a creation method that looks something like this:

    std::shared_ptr<Object> MyFactory::create()
    {
      return std::make_shared<Object>();
    }

    One shortcoming here is that the create function does not do any error checking whatsoever, putting the entire burden on the user.

    A (slightly) improved version of the create method could be:

    bool MyFactory::create(std::shared_ptr<Object> &p)
    {
      p = std::make_shared<Object>();
      if (!p) return false;
      return true;
    }

    The user can now easily check the return value to determine whether or not the object was created successfully and perform additional processing. However, now they have to create the shared_ptr<T> object before calling the create function; and in addition to that, they also have to understand that the argument p is not an input parameter, but rather the output parameter they are after in the first place.

    Instead, let’s make use of a std::pair to return both the created object as well as whether creation was successful.

    std::pair<std::shared_ptr<Object>, bool> MyFactory::create()
    {
      auto p = std::make_shared<Object>();
      return std::make_pair(p, !!p); // !!p same as static_cast<bool>(p) or 'if (p)'
    }

    How is this better? You may look at this and think that the user still has to grab the success value from the pair and that is absolutely correct. In this trivial case, creation is just a couple of lines. However, in a real-world scenario your function will likely be much more complex with many more errors to handle. Now, instead of levying that requirement on your user, you have captured all the error handling logic (and maybe reporting) internally. The user just has to check whether the returned data is valid or not via the Boolean in the pair.

    You can also just as easily extend this to return multiple values in a std::tuple:

    std::tuple<UUID, std::string, bool> createMsg(const std::string &msg, const int id)
    {
      UUID uuid = makeNewUUID();
      std::string outputmsg = std::to_string(id) + ": " + msg;
      return std::make_tuple(uuid, outputmsg, isUUIDValid(uuid));
    }

    Using std::tie to Access Return Values

    To access multiple return values, std::tie comes to the rescue. std::tie “ties” variable references together into a std::tuple. Accessing your multiple return values becomes straightforward at this point:

    // Factory Example
    std::shared_ptr<Object> obj;
    bool objvalid{false};
    std::tie(obj, objvalid) = MyFactory::create();
    if (objvalid) obj->work();
    
    // Message Example
    UUID lUuid;
    std::string msg;
    bool msgvalid{false};
    std::tie(lUuid, msg, msgvalid) = createMsg("Test message", 73);
    if (msgvalid) std::cout << lUuid << ": " << msg << std::endl;

    Conclusion

    The C++ Core Guidelines make it clear that returning a std::tuple (or std::pair) is the preferred way to return multiple values. They also make it clear that if your return values have specific semantics, a class or struct is the better choice.
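
    As a rough sketch of that struct-based alternative (the CreateResult name is mine, not from the Guidelines or the factory example above), the fields now document what each value means:

    struct CreateResult
    {
      std::shared_ptr<Object> object;
      bool valid{false};
    };

    CreateResult MyFactory::create()
    {
      auto p = std::make_shared<Object>();
      // Named fields replace the anonymous pair/tuple elements
      return {p, static_cast<bool>(p)};
    }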

    std::tuple and std::pair provide a nice way to return multiple values from a function without having to resort to ugly workarounds. By using std::tie, we can make receiving the return value a breeze. What do you think? How will you use this in your next project?

  • 10 Easy Commands You Can Learn To Improve Your Git Workflow Today

    If you’re a developer, coder, or software engineer and have not been hiding under a rock, then you’re probably familiar with Git. Git is a distributed version control system that helps developers track changes to their code and collaborate with others. While Git can be a bit complex (especially if used improperly), there are some easy commands you can learn to improve your workflow. In this blog post, we’ll walk you through 10 of the most essential Git commands.

    TL;DR

    The commands we address in this post are:

    1. git config
    2. git clone
    3. git branch / git checkout
    4. git pull
    5. git push
    6. git status / git add
    7. git commit
    8. git stash
    9. git restore
    10. git reset

    It is assumed that you have basic knowledge of what the terms like branch, commit, or checkout mean. If not, or you really want to get into the nitty-gritty details, the official Git documentation book is a must-read!

    Setup and Configuration

    First things first – to get started with Git you need to get it installed and configured! Any Linux package manager today is going to have Git available:

    # APT Package Manager (Debian/Ubuntu/etc.)
    sudo apt install git
    
    # YUM Package Manager (RedHat/Fedora/CentOS/etc.)
    sudo yum install git
    
    # APK Package Manager (Alpine)
    sudo apk add git

    If you happen to be on Windows or Mac, you can find a link to download Git here.

    Once you have Git installed, it’s time to do some initial configuration using the command git config. Git will store your configuration in various configuration files, which are platform dependent. On Linux distributions, including WSL, it will set up a .gitconfig file in your user’s home directory.

    There are two things that you really need to set up first:

    1. Who you are
    2. What editor you use

    To tell git who you are so that it can tag your commits properly, use the following commands:

    $ git config --global user.name "<Your Name Here>"
    $ git config --global user.email <youremail>@<yourdomain>

    The --global option tells git to store the configuration in the global configuration file, which is stored in your home directory. There are times when you might need to use different email addresses for your commits in different repositories. To set that up, you can run the following command from the git repository in question:

    $ git config user.email <your-other-email>@<your-other-domain>

    To verify that your configuration is set up properly for a given repo, run the following command:

    $ git config --list --show-origin

    Finally, to set up your editor, run the following command:

    $ git config --global core.editor vim

    Working With Repositories

    In order to work with repositories, there are a few primary commands you need to work with — clone, branch, checkout, pull, and push.

    Cloning

    git clone is the command you will use to pull a repository from a URL and create a copy of it on your machine. There are a couple protocols you can use to clone your repository: SSH or HTTPS. I always prefer to set up SSH keys and use SSH, but that is because in the past it wasn’t as easy to cache your HTTPS credentials for Git to use. Those details are beyond the scope of this post, but there is plenty of information about using SSH and HTTPS here.

    To clone an existing repository from a URL, you would use the following command:

    $ git clone https://github.com/jhaws1982/zRPC.git

    This will reach out to the URL, ask for your HTTPS credentials (if anonymous access is not allowed), and then download the contents of the repository to a new folder entitled zRPC. You can then start to work on the code!
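
    If you prefer SSH, as I do, the same clone looks like this (assuming your SSH keys are already set up with the server):

    $ git clone git@github.com:jhaws1982/zRPC.git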

    Sometimes a repository may refer to other Git repositories via Git submodules. When you clone a repository with submodules, you can save yourself a separate step to pull those by simply passing the --recursive option to git clone, like so:

    $ git clone --recursive https://github.com/jhaws1982/zRPC.git

    Branches

    When working with Git repositories, the most common workflow is to make all of your changes in a branch. You can see a list of branches using the git branch command and optionally see what branches are available on the remote server:

    $ git branch         # list only your local branches
    $ git branch --all   # list all branches (local and remote)

    To check out an existing branch, simply use the git checkout command:

    $ git checkout amazing-new-feature
    Switched to branch 'amazing-new-feature'
    Your branch is up to date with 'origin/amazing-new-feature'.

    You can also create and switch to a new branch in one step by passing the -b option to git checkout:

    $ git checkout -b fix-problem-with-writer
    Switched to a new branch 'fix-problem-with-writer'

    Interacting with the Remote Server

    Let’s now assume that you have a new bug fix branch in your local repository, and have committed your changes to that branch (more on that later). It is time to understand how to interact with the remote server, so you can share your changes with others.

    First, to be sure that you are working with the latest version of the code, you will need to pull the latest changes from the server using git pull. This is best done before you start a branch for work and periodically if other developers are working in the same branch.

    $ git pull

    This will reach out to the server and pull the latest changes to your current branch and merge those changes with your local changes. If you have files that have local changes and the pull would overwrite those, Git will notify you of the error and ask you to resolve it. If there are no conflicts, then you are up-to-date with the remote server.

    Now that you are up-to-date, you can push your local commits to the remote server using git push:

    $ git push

    git push will work as long as the server has a branch that your local one is tracking. git status will tell you whether that is the case:

    $ git status
    On branch master
    Your branch is up to date with 'origin/master'.
    
    nothing to commit, working tree clean
    $ git status
    On branch fix-problem-with-writer
    nothing to commit, working tree clean

    If you happen to be on a local branch with no remote tracking branch, you can use git push to create a remote tracking branch on the server:

    $ git push -u origin fix-problem-with-writer

    Working with Source Code

    Git makes it very easy to work with your source code. There are a few commands that are easy to use and make managing code changes super simple. Those commands are: status, add, commit, stash, restore, and reset.

    Staging Your Changes

    To stage your changes in Git means to prepare them to be added in the next commit.

    In order to view the files that have local changes, use the git status command:

    $ git status
    On branch fix-problem-with-writer
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   CMakeLists.txt
            modified:   README.md
    
    no changes added to commit (use "git add" and/or "git commit -a")

    Once you are ready to stage your changes, you can stage them using git add:

    $ git add README.md

    What if README.md has a lot of changes and you want to separate them into different commits? Just pass the -p option to git add to stage specific pieces of the patch.

    $ git add -p README.md
    $ git status
    On branch fix-problem-with-writer
    Changes to be committed:
      (use "git restore --staged <file>..." to unstage)
            modified:   README.md
    
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   CMakeLists.txt

    To commit these changes you have staged, you would use the git commit command:

    $ git commit

    Git commit will bring up an editor where you can fill out your commit message (for a good commit message format, read this; you can also read this for details on how to set up your Git command line to enforce a commit log format).
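
    As a rough illustration (the content here is made up), a commit message following the common summary-line-plus-body convention looks something like this:

    Fix writer hang when the output queue is full

    The writer previously blocked forever once the queue filled up. Add a
    bounded wait and return an error to the caller so the service can retry
    instead of hanging.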

    You can also amend your last commit if you forgot to include some changes or made a typo in your commit message. Simply stage your new changes, then issue:

    $ git commit --amend

    Storing Changes For Later

    Git has a fantastic tool that allows you to take a bunch of changes you have made and save them for later! This feature is called git stash. Imagine you are making changes in your local branch, fixing bug after bug, when your manager calls you and informs you of a critical bug that they need you to fix immediately. You haven’t staged all your local changes, nor do you want to spend the time to work through them to write proper commit logs.

    Enter git stash. git stash simply “stashes” your local, uncommitted changes off to the side, leaving you with a pristine branch. Now you can switch to a new branch for the critical bug fix, make the necessary changes, push to the server, and jump right back into what you were working on before. That sort of flow looks like this:

    <working in fix-problem-with-writer>
    $ git status
    On branch fix-problem-with-writer
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   CMakeLists.txt
    
    no changes added to commit (use "git add" and/or "git commit -a")
    
    $ git stash
    Saved working directory and index state WIP on fix-problem-with-writer
    
    $ git status
    On branch fix-problem-with-writer
    nothing to commit, working tree clean
    
    $ git checkout fix-problem-with-reader
    Switched to branch 'fix-problem-with-reader'
    
    <make necessary changes>
    $ git add <changes>
    $ git commit
    $ git push
    
    $ git checkout fix-problem-with-writer
    Switched to branch 'fix-problem-with-writer'
    
    $ git status
    On branch fix-problem-with-writer
    nothing to commit, working tree clean
    
    $ git stash pop
    On branch fix-problem-with-writer
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   CMakeLists.txt
    
    no changes added to commit (use "git add" and/or "git commit -a")
    Dropped refs/stash@{0} (5e3a53d36338f1906e871b52d3c97236f139b75e)

    There are a couple of things to understand about git stash:

    • The stash is a stack – you can push as many sets of changes onto it as you like, and when you pop, you get back the most recently stashed set (see the sketch after this list)
    • git stash pop will try to apply everything in the most recent stash entry; if a conflict occurs, it notifies you and leaves the entry on the stack
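
    As a quick sketch of that stack behavior (the branch name, commit hash, and stash messages shown are illustrative):

    <working in fix-problem-with-writer>
    $ git stash                      # stash the current changes
    <make more local changes>
    $ git stash                      # stash those too; the first entry moves down to stash@{1}
    $ git stash list
    stash@{0}: WIP on fix-problem-with-writer: 345cd79 fix(writer): upgrade writer to v1.73
    stash@{1}: WIP on fix-problem-with-writer: 345cd79 fix(writer): upgrade writer to v1.73
    $ git stash pop                  # applies and drops stash@{0}, the most recent entry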

    I run into the second bullet quite often, but it isn’t hard to fix; the conflicts are usually simple and easy to address manually. Resolve the conflicts in the file, use git restore --staged to pull the file back out of the index (the failed git stash pop leaves it staged or marked as unmerged), and then drop the stash entry yourself, since a conflicting pop leaves it on the stack:

    $ git stash pop
    Auto-merging CMakeLists.txt
    CONFLICT (content): Merge conflict in CMakeLists.txt
    The stash entry is kept in case you need it again.
    
    $ git status
    On branch fix-problem-with-writer
    Unmerged paths:
      (use "git restore --staged <file>..." to unstage)
      (use "git add <file>..." to mark resolution)
            both modified:   CMakeLists.txt
    
    no changes added to commit (use "git add" and/or "git commit -a")
    
    $ vim CMakeLists.txt   # manually edit and resolve the conflicts
    $ git status
    On branch fix-problem-with-writer
    Unmerged paths:
      (use "git restore --staged <file>..." to unstage)
      (use "git add <file>..." to mark resolution)
            both modified:   CMakeLists.txt
    
    no changes added to commit (use "git add" and/or "git commit -a")
    
    $ git restore --staged CMakeLists.txt
    
    $ git stash drop
    Dropped refs/stash@{0} (6c7d34915b38e5d75072eacee856fb427f916aa8)

    Undoing Changes or Commits

    There are times when I need to undo the previous commit, or I have accidentally added the wrong file to the staging area. When this happens, it is useful to know that you have ways to back up and try again.

    To remove files from your staging area, you would use the git restore command, like so:

    $ git restore --staged <path to file to unstage>

    This will remove the file from your staging area, but your changes will remain intact. You can also use restore to revert a file back to the version in the latest commit. To do this, simply omit the --staged option:

    $ git restore <path to file to discard all changes>

    You can do similar things with the git reset command. One word of caution with the git reset command — you can truly and royally mess this up and lose lots of hard work — so be very mindful of your usage of this command!

    git reset allows you to undo commits from your local history — as many as you would like! To do this, you would use the command like so:

    $ git reset HEAD~n
    
    # For example, to remove 3 commits
    $ git reset HEAD~3
    Unstaged changes after reset:
    M       CMakeLists.txt
    M       tests/unit.cpp

    The HEAD~n indicates how many commits you want to back up, replacing n with the number you want. With this version of the command, all the changes present in those commits are placed in your working copy as unstaged changes.

    You can also undo commits and discard the changes:

    $ git reset --hard HEAD~n
    
    # For example, to discard 1 commit
    $ git reset --hard HEAD~1
    HEAD is now at 345cd79 fix(writer): upgrade writer to v1.73

    So there you have it – our top 10 Git commands to help improve your workflow. As we have mentioned before, when you take the time to understand your language and tools, you can make better decisions and avoid common pitfalls! Improving your Git workflow is a conscious decision that can save you a lot of time and headaches! Do you have a favorite command that we didn’t mention? Let us know in the comments below!

  • 6 Tips for an Absolutely Perfect Little Code Review

    Code reviews are an important part of the software development process. They help ensure that code meets certain standards and best practices, and they can also help improve code quality by catching errors early on. However, code reviews can also be a source of frustration for developers if they’re not done correctly.

    Image credit: Manu Cornet @ Bonker’s World

    As a code reviewer, your job is to help make the code better. This means providing clear and concise feedback that helps the developer understand what works well and what needs improvement. A mindful, conscious approach to code reviews can yield incredible dividends down the road as you build not only a solid, reliable codebase, but also strong relationships of trust with your fellow contributors.

    Here are some initial guidelines or best practices for performing a great code review:

    • Read the code thoroughly before commenting. This will help you get a better understanding of what the code is supposed to do and how it works.
    • Be specific in your comments. If there’s something you don’t like, explain why. Simply saying “this doesn’t look right” or “I don’t like this” isn’t helpful.
    • Offer suggestions for how to improve the code. If you have a suggestion for how something could be done differently, provide details about the suggestion and even some links and other material to back it up.
    • Be respectful. Remember that the code you’re reviewing is someone else’s work. Criticizing someone’s work can be difficult to hear, so try to be constructive with your feedback. Respectful, polite, positive feedback will go a long way in making code review a positive experience for everyone involved.
    • Thank the developer for their work. Code reviews can be tough, so make sure to thank the developer for their efforts.

    Following these practices will help ensure that code reviews are a positive experience for both you and the developer whose code you’re reviewing.

    In addition to these, here are a few specifics I look for when performing a code review:

    0. Does the code actually solve the problem at hand?

    Sometimes when I get into code reviews I forget to check if the written software meets the requirements set forth. As you do your initial read-through of the code, this should be your primary focus. If the code does not solve the problem properly or flat out misses the mark, the rest of the code review is pointless as much of it will likely be rewritten. There’s nothing worse than spending an hour reviewing code only to find out later that it doesn’t work. So save yourself the headache and make sure the code compiles and does what it’s supposed to do before you start.

    1. Is the code well written and easy to read? Does the code adhere to the company’s code style guide?

    It’s important to format code consistently so that code reviews are easier to perform, and standard formatting tools can handle that for you. Additionally, the code should be reviewed against a standard set of guidelines. Many formatters will format the code per a chosen (or configured) style, handling all the whitespace for you. Other stylistic aspects to look for are naming conventions on variables and functions, the casing of names, and proper usage of standard types.
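
    As one concrete example, if the project uses clang-format with a checked-in .clang-format file (the file path below is just an example), the developer can normalize formatting before the review even starts:

    $ clang-format -i --style=file src/writer.cpp    # reformat in place using the project's .clang-format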

    Structural and organizational standards are important as well. For example, checking for the use of global variables, const-correctness, file name conventions, etc. are all things to look out for and address at the time of the code review.

    “Last of the color coding” by juhansonin is licensed under CC BY 2.0.

    2. Is the code well organized?

    Well-organized code is very subjective, but it is still something to look at. Is the structure easy to follow? As a potential maintainer of this code, are you able to find declarations and definitions where you would expect them? Is the module part of one monolithic file or broken down into digestible pieces that are built together?

    In addition, be sure to look out for adequate commenting in the code. Comments should explain what the code does, why it does it, and how it works, especially to explain anything tricky or out of the ordinary. Be on the lookout for spelling errors as well because a well-commented codebase rife with spelling errors looks unprofessional.

    3. Is the code covered by tests? Are all edge cases covered? What about integration testing?

    Anytime a new module is submitted for review, one of the first things I look for is unit tests. Without a unit test, I almost always reject the merge/pull request, because I know that at some point down the road a small, insignificant change will lead to a broken module that cascades through countless applications. A simple unit test that checks the basic functionality of the module, striving for 100% coverage, can save so much time and money in the long term. Edge cases are tricky, but if you think outside the box and ensure that the unit test checks even the “impossible” scenarios, you’ll be in good shape.

    Integration tests are a different matter. In my line of work, integration testing must be done with hardware-in-the-loop, and that quickly becomes cost-prohibitive. However, once integration tests are developed and a test procedure is in place, any and all integration tests must be performed before a change is accepted, especially if an integration test was modified in the change!

    4. Are there any code smells?

    Common code smells I look out for are:

    • Code bloat: long functions (> 100 lines), huge classes
    • Dispensable code: duplication (what I look for the most), stray comments, unused classes, dead code, etc.
    • Complexity: cyclomatic complexity greater than 7 for a function is prime fodder for refactoring
    • Large switch statements: could you refactor to polymorphic classes and let the type decide what to do?

    Many other smells exist – too many to check in detail with each code review. For a great deep-dive I refer you to the Refactoring Guru. Many static analysis tools will check for various code smells for you, so be sure to check the reports from your tools.
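
    For instance, a static analysis pass with cppcheck (just one example of many such tools) can flag a number of these smells automatically:

    $ cppcheck --enable=style,unusedFunction src/    # report style issues, unused functions, and similar smells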

    The presence of a code smell does not mean that the code must be changed. In some cases, a refactor would lead to more complex or confusing code. However, checking for various smells and flagging them can lead to discussions with the developer and produce a much better product in the end!

    5. Would you be happy to maintain this code yourself?

    One of the last things I check for during a code review is whether I would be okay to maintain this code on my own in the future. If the answer is no, then that turns into specific feedback to the developer on why that is the case. Maybe the structure is unclear, the general approach is questionable, or there are so many smells that it makes me nervous. Typically in cases like this, I find it best to give the developer a call (or talk with them in person) and discuss my misgivings. An open, honest discussion about the potential issues often leads to added clarity for me (and it’s no longer an issue) or added clarity for them and they can go fix the problem.

    These are just a few of the specific things I look for, but following the practices above will help make code reviews a positive experience for everyone involved. What are some best practices or tips you have found to lead to a good code review? Thanks for reading!