Category: Learning and Development

Strategies for continual learning and training in software development, techniques for identifying and resolving bugs and other issues, and effective approaches for solving problems and overcoming obstacles.

  • Quick Tip: Generating Kernel Configuration Fragments with Bitbake

    Generating a kernel configuration fragment is a common task in kernel development when working with the Yocto Project. Configuration fragments are extremely useful for defining groups of kernel configuration options that you can then reuse between projects simply by adding the fragment to your kernel bbappend file.

    For example, if I wanted to enable USB to serial UART device drivers via a kernel configuration fragment, I’d run through the following steps:

    1. Configure the kernel to set up the configuration baseline to work from
    2. Run menuconfig to enable/disable the desired options
    3. Generate the fragment by running the diffconfig command of bitbake
    4. Copy the fragment to my recipe overlay directory
    5. Add a reference to the fragment to my kernel bbappend file and rebuild

    1. Default Kernel Configuration

    Almost all Yocto kernels will have a defconfig file that defines the default options for the kernel. When you run the kernel_configme task with bitbake, this defconfig is copied into the kernel build directory as the .config for the build.

    # Create a kernel configuration from the default configuration
    # (i.e., build the kernel recipe through the configure step)
    bitbake linux-lmp-fslc-imx-rt -c kernel_configme -f

    2. Make Configuration Changes for Fragment

    Once you have configured your kernel with bitbake, you can edit the kernel configuration using menuconfig. Simply make all the necessary changes to the kernel configuration required for your device and save the configuration by exiting menuconfig.

    # Make the necessary configuration changes desired for the fragment
    bitbake linux-lmp-fslc-imx-rt -c menuconfig
    Use menuconfig to make the desired changes to the kernel configuration.

    3. Generate the Kernel Configuration Fragment

    Now that you have saved the kernel configuration, the .config file in your build folder is updated with your changes. However, these only reside in your build directory and will not persist. To keep these changes around, you need to generate the configuration fragment with the diffconfig command of bitbake.

    # Create the fragment
    bitbake linux-lmp-fslc-imx-rt -c diffconfig

    The output of this command will tell you where the fragment was stored. In my case, it was stored in:

    /build/lmp/_bld/tmp-lmp/work/imx6ullwevse-lmp-linux-gnueabi/linux-lmp-fslc-imx-rt/5.10.90+gitAUTOINC+ec9e983bd2_fcae15dfd5-r0/fragment.cfg

    4. Copy Fragment to Kernel Overlay

    Now, I can copy that configuration fragment to my directory with my recipe and add it to my recipe overlay via the bbappend.

    cp /build/lmp/_bld/tmp-lmp/work/imx6ullwevse-lmp-linux-gnueabi/linux-lmp-fslc-imx-rt/5.10.90+gitAUTOINC+ec9e983bd2_fcae15dfd5-r0/fragment.cfg ../layers/meta-consciouslycode/recipes-kernel/linux/linux-lmp-fslc-imx-rt/prolific-pl2303.cfg

    5. Add Fragment to Kernel Recipe and Rebuild Kernel

    Finally, add the fragment to the kernel recipe’s bbappend and rebuild the kernel!

    Kernel recipe bbappend file containing the SRC_URI addition with the configuration fragment.
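    For reference, here is a minimal sketch of what that bbappend might contain (e.g., a linux-lmp-fslc-imx-rt_%.bbappend alongside the directory from step 4). The exact syntax depends on your Yocto release; older releases use the FILESEXTRAPATHS_prepend form instead of :prepend.

    # Make files next to this recipe visible to the kernel build
    FILESEXTRAPATHS:prepend := "${THISDIR}/${PN}:"

    # Add the configuration fragment to the kernel build
    SRC_URI += "file://prolific-pl2303.cfg"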
    # Build the new kernel!
    bitbake linux-lmp-fslc-imx-rt

    Conclusion

    You can use this method to capture any group of kernel configuration options you want in a fragment. That fragment can then be reused across many projects to easily enable and disable certain kernel features.

  • Using Bitbucket Pipelines to Automate Project Releases

    Generating releases for your project shouldn’t be a chore, yet it often proves to be a pain. If you don’t release very often or on a regular schedule, you have to go back and remember how to do it each time. This can result in inconsistencies in your releases, which makes it harder on your users. Read on to learn how you can define steps in your Bitbucket Pipelines to automate your project releases.

    I recently set up automated releases for a C++ project in Bitbucket. This process automates the steps to create a release branch, bump the version per Semantic Versioning, generate a changelog according to Conventional Commits, and push the release back to git, fully tagged and ready to go. Here is how I did it.

    Define the common steps in your pipeline

    # Default image to use - version 3.x
    image: atlassian/default-image:3
    definitions:
      services:
        docker:
          memory: 7128
      steps:
          - step: &Build-Application
              name: Build Application
              image: rikorose/gcc-cmake
              size: 2x
              script:
                # Update the submodules
                - git submodule update --recursive --init
                # Install the dependencies
                - apt-get update && export DEBIAN_FRONTEND=noninteractive
                - apt-get -y install --no-install-recommends uuid-dev libssl-dev libz-dev libzmq5 libzmq3-dev
                # Print the Linux version.
                - uname -a
                # Print the gcc version.
                - gcc --version
    
                # Print the CMake version.
                - cmake --version
                # Setup the build
                - mkdir _bld && cd _bld
                # Call CMake
                - cmake -DCMAKE_BUILD_TYPE=Debug ..
                # Build project
                - make -j10
          - step: &Build-Container
              name: Test Container Build
              size: 2x
              script:
                # Update the submodules
                - git submodule update --recursive --init
                # Build the container
                - docker build --file ./Dockerfile .
              services:
                - docker

    In this snippet, I define which image to use by default for all the steps. I chose to use the default Atlassian image, but specified version 3. If you do not specify a version here (with the :3) you wind up with a really old version of the image that is kept around for backwards compatibility.

    I also define two common build steps, called Build-Application and Build-Container, which I use later on in my pipeline.

    Define the pipeline(s)

    pipelines:
      custom:
        generate-release:
          - step:
              name: Generate release branch
              script:
                - git checkout master
                - git pull --ff-only
                - git checkout -b release/next
                - git push -u origin release/next
      pull-requests:
        '**': # all PRs
          - step: *Build-Application
          - step: *Build-Container
      branches:
        master:
          - step: *Build-Application
          - step: *Build-Container

    This snippet defines three pipelines: one that runs each time the master branch is updated on the server, one that runs for every pull request created, and one custom pipeline that must be run manually.

    The master branch and pull-requests pipelines are identical and simply utilize the steps defined in the previous snippet. The custom pipeline, however, has a single role: create a new branch called release/next and push it back to the server. As you’ll see in the next section, this push will trigger another branch pipeline.

    Define the release generation pipeline

      branches:
        # master branch defined here previously
        release/next:
          - step:
              name: Generate Release
              script:
                # Configure npm to work properly as root user in Ubuntu
                - npm config set user 0
                - npm config set unsafe-perm true
                # Install necessary release packages and generate release, pushing back to repo
                - npm install -g release-it @release-it/conventional-changelog @j-ulrich/release-it-regex-bumper --save-dev
                - release-it --ci
          - parallel:
            - step:
                name: Publish to External Continuous Delivery System
                script:
                  - export APP="name_of_app"
                  - git clone --recursive https://url.of.your.cd.com/your-cd-repo.git
                  - cd containers
                  - git checkout testing
                  - git submodule update --recursive --init
                  - cd ${APP} && git checkout master
                  - git pull
                  - export VERSION=$(git tag | sort -V | tail -1)
                  - >
                    echo "Updating ${APP} to Release Version: ${VERSION}"
                  - git checkout ${VERSION}
                  - cd ../
                  - git add ${APP}
                  - >
                    git -c user.name='Bitbucket Pipeline' -c user.email='bitbucket-pipeline@witricity.com' commit -m "${APP}: update to version ${VERSION}"
                  - git push
            - step:
                name: Create Pull Request
                caches:
                  - node
                script:
                  - apt-get update
                  - apt-get -y install curl jq
                  - export DESTINATION_BRANCH="master"
                  - export CLOSE_ME="true"
                  - >
                    export BB_TOKEN=$(curl -s -S -f -X POST -u "${BB_AUTH_STRING}" \
                      https://bitbucket.org/site/oauth2/access_token \
                      -d grant_type=client_credentials -d scopes="repository" | jq --raw-output '.access_token')
                  - >
                    export DEFAULT_REVIEWERS=$(curl https://api.bitbucket.org/2.0/repositories/${BITBUCKET_REPO_OWNER}/${BITBUCKET_REPO_SLUG}/default-reviewers \
                      -s -S -f -X GET \
                      -H "Authorization: Bearer ${BB_TOKEN}" | jq '.values' | jq 'map({uuid})' )
                  - >
                    curl https://api.bitbucket.org/2.0/repositories/${BITBUCKET_REPO_OWNER}/${BITBUCKET_REPO_SLUG}/pullrequests \
                      -s -S -f -X POST \
                      -H 'Content-Type: application/json' \
                      -H "Authorization: Bearer ${BB_TOKEN}" \
                      -d '{
                            "title": "Release '"${BITBUCKET_BRANCH}"'",
                            "description": "Automated PR release :)",
                            "source": {
                              "branch": {
                                "name": "'"${BITBUCKET_BRANCH}"'"
                              }
                            },
                            "destination": {
                              "branch": {
                                "name": "'"${DESTINATION_BRANCH}"'"
                              }
                            },
                            "close_source_branch": '"${CLOSE_ME}"',
                            "reviewers": '"${DEFAULT_REVIEWERS}"'
                          }'

    This is a rather large block, but it is pretty straightforward.

    The first step, called “Generate Release”, is where the release magic happens. It uses the npm tool release-it to generate the release. Basically, release-it reads a configuration file in the repository named .release-it.json and, based on that file, will automatically do the following:

    • Bump the version, based on how you define it in .release-it.json
    • Generate and update a changelog
    • Git commit, tag, and push
    • And much more if you so choose…

    Since this is run in the release/next branch, the version, changelog and all other changes are made and pushed here. At that point, I wanted to do two things: first, publish this new release to my external continuous delivery system; and second, automatically generate a pull request in Bitbucket to get the release back in the master branch.

    Note that when installing release-it on Ubuntu 22.04 (or older), you may run into issues with older versions of nodejs. To remedy this, run these commands:

    # Remove old version of nodejs
    sudo apt-get purge nodejs
    sudo apt-get autoremove # remove any lingering dependencies
    
    # Install updated nodejs (20 is latest at time of this writing)
    curl -sL https://deb.nodesource.com/setup_20.x | sudo -E bash -
    sudo apt-get install -y nodejs
    
    # Finally, install release-it
    npm install -g release-it @release-it/conventional-changelog @j-ulrich/release-it-regex-bumper --save-dev

    My .release-it.json file looks like this:

    {
      "git": {
        "commitMessage": "[skip ci] ci: release v${version}"
      },
      "plugins": {
        "@release-it/conventional-changelog": {
            "preset": {
                "name": "conventionalcommits",
                "commitUrlFormat": "{{host}}/{{owner}}/{{repository}}/commits/{{hash}}",
                "compareUrlFormat": "{{host}}/{{owner}}/{{repository}}/compare/{{currentTag}}..{{previousTag}}",
                "types": [
                  {
                    "type": "feat",
                    "section": "Features"
                  },
                  {
                    "type": "fix",
                    "section": "Bug Fixes"
                  },
                  {
                    "type": "perf",
                    "section": "Performance Improvements"
                  }
                ]
            },
            "infile": "CHANGELOG.md"
        },
        "@j-ulrich/release-it-regex-bumper": {
            "out": [
                {
                    "file": "CMakeLists.txt",
                    "search": "VERSION {{semver}}",
                    "replace": "VERSION {{versionWithoutPrerelease}}"
                },
                {
                    "file": "Dockerfile",
                    "search": "Version={{semver}}",
                    "replace": "Version={{versionWithoutPrerelease}}"
                }
            ]
        }
      }
    }

    At this point, to generate a release, just go to the pipelines page for your repository and select Run pipeline. Then choose what branch you want to use for the basis of your release (I typically release from master) and choose the ‘custom: generate-release’ pipeline and off you go!

    Conclusion

    This process greatly simplifies my life when it comes to releasing a new version of my projects. Could this be fully automated? Absolutely, I’m just not there quite yet.

    I hope you find this useful!


  • Why Writing Good Comments Makes You a Great Developer

    Commented, xkcd.com #156

    When you think of a great developer, I’m sure someone who writes good comments is not at the top of the list. However, writing good comments is one of the most important skills a developer can have. Good comments not only help you understand your code better, but they also make it easier for others to read and work with. In this blog post, we’ll look at why writing good comments makes you a great developer and share some tips for improving your commenting style. Because if you are mindful in your commenting, it is an indication that you are mindful in your coding!

    Comments Should Be Present

    Well-written code comments are like a good road map. They provide clear direction and helpful information that can make working with code much easier. Good code comments can be incredibly useful, providing critical insights and details that might otherwise be easy to miss. Think of them as important signposts along the way that can save a developer hours of debugging.

    Here is an example of something I came across recently that was not obvious. I was writing a CMake function to add unit tests using CTest, passing in a CMake string as my “TEST_COMMAND” variable. When I would call add_test with that variable as the value for the COMMAND option, the test would fail to run properly, especially if the command took command-line arguments! After spending some time digging, I learned that the COMMAND option to add_test should be a CMake list rather than a string for the arguments to be passed properly.

    I commented my CMakeLists.txt as such to ensure that was clear to the reader.

    # TRICKY: Change the command string to a CMake list for use in add_test()
    string(REPLACE " " ";" TEST_LIST ${TEST_COMMAND})
    add_test(NAME ${TEST_NAME} COMMAND ${TEST_LIST})

    Without the “TRICKY” comment, a maintainer of this code may look at this and see potential for an optimization, removing the conversion, and then they would be searching for solutions to the same problem I had already solved.

    Comments Should Use Proper Spelling, Casing, Grammar, and Full Sentences

    Good code comments are spelled correctly. They are also properly cased. This attention to detail shows that the programmer cares about their work.

    Take a look at the two examples of code below. Which one would you say is written by a mindful developer? Which one would you rather work to maintain?

    // copute ac/a+c
    double prodOverSum(int a, double c)
    
    {// git the nmeratr for the rtn
      double n = (double)a * c;
    
       // get the denomination for the value
      int d = a + (int)c;
    
    /// comput and return the quotient
      return n / (double)d;
    }
    versus:
    // Compute the product over sum for
    // the provided values, a and c.
    //        (A * C)
    //   X = ---------
    //        (A + C)
    double prodOverSum(int a, double c)
    {
      double prod = (double)a * c;
      int sum = a + (int)c;
    
      // Cast sum to a double to ensure
      // the compiler does not promote prod
      // to an integer and perform integer
      // division
      return prod / (double)sum;
    }

    It’s clear that the programmer cares about their craft when they put so much effort into writing clear, readable comments. It would be almost impossible to maintain this level of detail by chance, which makes me believe it is intentional as opposed to just being accidental! That gives me confidence that the code itself is well-written, properly tested, and ready for use.

    When writing comments, it is important to use full sentences with proper grammar as well. This will help ensure that your comments are clear and easy to understand. Additionally, using proper grammar will help to give your comments a more professional appearance.

    Comments Should Be Smartly Formatted

    Comments are meant to convey a message about the surrounding code to the developer. Sometimes information is best conveyed in a particular format. So, when commenting your code, ensure that your comment is formatted in such a way that conveys your message as clearly and concisely as possible.

    Code formatters can help and hinder this. If your comments require lots of horizontal scrolling to read, then consider breaking them into multiple lines or rewording to be more concise! However, sometimes a new line in the middle of your documentation is undesirable and you will need to instruct your formatter to leave it alone by wrapping with “control comments”.

    For example, consider this method. If I were to line up all the columns in the table neatly, this would make for some very long lines of text. Most formatters would break these lines into multiple ones. Instead, make judicious use of white space to get the message across to the reader. If you have to use multiple lines, you decide where those line breaks are – don’t leave it up to your formatter!

    void Quaternion2DCM(const double * const q, double * const dcm)
    {
      // Don't do this! Your formatter will either add new lines or ignore this
      // if you add protection blocks around the table, making for really long 
      // lines that are harder to read.
      // To compute the DCM given a quaternion, the following definition is used
      //       +-------------------------------------------------------------------------------------------+
      //       | (q4^2 + q1^2 - q2^2 - q3^2)    2*(q1q2 + q3q4)                2*(q1q3 - q2q4)             |
      // dcm = | 2*(q1q2 - q3q4)                (q4^2 - q1^2 + q2^2 - q3^2)    2*(q2q3 - q1q4)             |
      //       | 2*(q1q3 + q2q4)                2*(q2q3 − q1q4)                (q4^2 - q1^2 - q2^2 + q3^2) |
      //       +-------------------------------------------------------------------------------------------+
      // clang-format off
      // Adapted from https://www.vectornav.com/resources/inertial-navigation-primer/math-fundamentals/math-attitudetran
      // clang-format on
      dcm[0] = q[3]*q[3] + q[0]*q[0] - q[1]*q[1] - q[2]*q[2];
      dcm[1] = 2*(q[0]*q[1] + q[2]*q[3]);
    ...
      dcm[7] = 2*(q[1]*q[2] - q[0]*q[3]);
      dcm[8] = q[3]*q[3] - q[0]*q[0] - q[1]*q[1] + q[2]*q[2];
    }
    versus:
    void Quaternion2DCM(const double * const q, double * const dcm)
    {
      // Instead, you can do this - just simple white space, still very readable
      // by the user and it fits on a single line!
      // To compute the DCM given a quaternion, the following definition is used
      //       +-------------------------------------------------------------------+
      //       | (q4^2 + q1^2 - q2^2 - q3^2)    2*(q1q2 + q3q4)    2*(q1q3 - q2q4) |
      // dcm = | 2*(q1q2 - q3q4)    (q4^2 - q1^2 + q2^2 - q3^2)    2*(q2q3 - q1q4) |
      //       | 2*(q1q3 + q2q4)    2*(q2q3 − q1q4)    (q4^2 - q1^2 - q2^2 + q3^2) |
      //       +-------------------------------------------------------------------+
      // Adapted from https://www.vectornav.com/resources/inertial-navigation-primer/math-fundamentals/math-attitudetran
      dcm[0] = q[3]*q[3] + q[0]*q[0] - q[1]*q[1] - q[2]*q[2];
      dcm[1] = 2*(q[0]*q[1] + q[2]*q[3]);
    ...
      dcm[7] = 2*(q[1]*q[2] - q[0]*q[3]);
      dcm[8] = q[3]*q[3] - q[0]*q[0] - q[1]*q[1] + q[2]*q[2];
    }

    How Much Should I Comment?

    Good code will be somewhat self-documenting, but there is always a limit. For example, the method below is so obvious I don’t really need to comment on it, do I?

    int sum(const int a, const int b)
    {
      return a + b;
    }

    However, for something more involved, comments can clarify a lot of things for the developer and can link them to more information, as in the example of the Quaternion2DCM method described above.

    So, then, how do you define what is obvious? For me, I think in terms of my average user and/or maintainer. What sort of things do I expect them to understand? What about more junior software engineers who may need to work in this code? Basic math and logic knowledge seems okay. Syntax is a given; even more advanced syntax, such as lambdas or function pointers, I would expect them to be able to read. However, anything beyond that typically indicates the need for a detailed comment that explains things.

    It also helps me to think in terms of what will help me understand this design decision tomorrow, or 6 months from now, or even a year from now. Maybe it is obvious to me now what this conditional with multiple clauses means and why the design requires it, but I’ll likely forget tomorrow and want to refactor.
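    As a purely hypothetical illustration (the names Status, kMaxRetries, failFast, and retryWrite are made up for this example), a comment like this can save future-me from “simplifying” a deliberately strict guard:

    // Retry only when the failure is transient: the device reported busy,
    // we still have retries left, and the caller did not request fail-fast
    // behavior. All three clauses are deliberate; removing any one of them
    // changes the retry behavior.
    if ((status == Status::Busy) && (retries < kMaxRetries) && !failFast)
    {
      retryWrite();
    }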

    Comments In Your IDE

    To make working with comments easier, look for ways to get your IDE to help you! I use VS Code for nearly all my coding right now and I found the extension Better Comments to be extremely helpful.

    Image Credit: Better Comments Plugin

    With this plugin I can add additional formatting and mark comments specifically. For example, in my code I tend to leave myself reminders using TODO comments, and I prioritize them with a plain TODO for the lowest priority and the ‘*’ and ‘!’ prefixes for higher priorities.

    // ! TODO: This is an important TODO that needs to be taken care of immediately
    // * TODO: This is an important TODO that should be taken care of soon
    // TODO: This is a TODO that should be taken care of eventually

    With this plugin my comments are color-coded for me, making it easy to see what needs to be done first.


    In the end, your code is a reflection of you, so it only makes sense that your commenting reflects how much care and attention to detail there really is in what’s being written. Poor commenting sends a message of its own: people will be able to tell they need more time before investing any kind of faith in your code, or even worse, they’ll just move on to another, potentially better implementation!

    What do you think? What makes a good comment in your book? Let us know in the comments below!

  • Quick Tip: Python Variables in Multithreaded Applications and Why Your Choice Matters

    When working with Python classes, you have two kinds of variables you can work with: class variables and instance variables. We won’t get deep into the similarities and differences here, nor various use cases for each kind. For that, I’ll refer you to this article. For our purposes here, just know that class variables and instance variables both have their own unique uses.

    In this post, we’ll be looking at one particular use case: using Python class and instance variables in multithreaded applications. Why does this matter? As with any programming language, choice of variable types can be critical to the success or failure of your application.

    With that in mind, let’s explore some of the options you have when working with Python variables in a multithreaded context. Knowing which option to choose can make all the difference.

    Class variables are declared at the top of your class and can be accessed by every object that belongs to that class (i.e., all instances share a single copy of each class variable). Instance variables, on the other hand, are attached to “self” and are tied to a specific instance. Each instance of an object will have its own set of instance variables but will share the class variables.

    Class variables look like this:

    class obj:
      data = [0,0,0]

    Instance variables look like this:

    class obj:
      def __init__(self):
        self.data = [0,0,0]

    I recently ran into an issue with class vs. instance variables in a multithreaded application. A colleague of mine was debugging a UI application that was communicating on an interface and shipping the received data off to a UI for display. He found it would crash randomly, so we switched up the architecture a bit to receive data on one thread and then pass it through a queue to the UI thread to consume. This seemed to resolve the crash, but the data in the UI was wrong.

    Digging into the problem, we found that the data was changing as it passed through the queue. After some more digging, my colleague realized that the class that was implemented to push the data through the queue was utilizing class variables instead of instance variables.

    This simple program illustrates the issue:

    import queue
    import threading
    import time
    
    q1 = queue.Queue()
    
    class obj:
      id = 0
      data = [0,0,0]
    
    def thread_fn(type):
        d = q1.get()
        preprocessing = d.data
        time.sleep(3)
    
        # Check data members post "processing", after modified by other thread
        if preprocessing != d.data:
          print(f"{type}: Before data: {preprocessing} != After data: {d.data}")
        else:
          print(f"{type}: Before data: {preprocessing} == After data: {d.data}")
    
    if __name__ == "__main__":
        x = threading.Thread(target=thread_fn, args=("ClassVars",))
        obj.id = 1
        obj.data = [1,2,3]
        q1.put(obj)
    
        x.start()
    
        # Update the data
        obj.id = 2
        obj.data = [4,5,6]
        q1.put(obj)
    
        x.join()

    Essentially what was happening is that data would be received on the interface (in this case, the main function) and put into the queue. As the UI was getting around to processing that data, new data would be received on the interface and put into the queue. Since class variables were used originally (and the class object was used directly), the old data got overwritten with the new data in the class, so the UI would have the wrong data and generate errors during processing.

    Once the underlying message class was changed to use instance variables, the “bad data” issue went away and the original problem of the crashing application was also resolved with the architecture change. Take a look at the difference in this program:

    import queue
    import threading
    import time
    
    q1 = queue.Queue()
    
    class obj:
      def __init__(self):
        self.id = 0
        self.data = ['x','y','z']
    
    def thread_fn(type):
        d = q1.get()
        preprocessing = d.data
        time.sleep(3)
    
        # Check data members post "processing", after modified by other thread
        if preprocessing != d.data:
          print(f"{type}: Before data: {preprocessing} != After data: {d.data}")
        else:
          print(f"{type}: Before data: {preprocessing} == After data: {d.data}")
    
    if __name__ == "__main__":
        x = threading.Thread(target=thread_fn, args=("InstanceVars",))
        obj1 = obj()
        obj1.id = 1
        obj1.data = [1,2,3]
        q1.put(obj1)
    
        x.start()
    
        # Update the data
        obj2 = obj()
        obj2.id = 2
        obj2.data = [4,5,6]
        q1.put(obj2)
    
        x.join()

    As you can see, using instance variables requires that we create an instance of each object to begin with. This ensures that each object created has its own data members that are independent of the other instances, which is exactly what we required in this scenario. This single change alone would have likely cleaned up the issues we were seeing, but would not have fixed the root of the problem.

    When passing through the queue, the thread would get each instance and use the correct data for processing. Nothing in the thread function had to change; only how the data feeding it was set up.

    Python is a great language, but it definitely has its quirks. The next time you hit a snag while trying to parallelize your application, take a step back and understand the features of your programming language. Understanding the peculiarities of your chosen language is the mark of a mindful programmer! With a bit of space to gather your wits and some careful, conscious coding, you can avoid these pesky pitfalls and create fast, reliable threaded applications. What other Python nuances have bitten you in the past? Let us know in the comments below!

  • Quick Tip: It’s Time to Avoid the Frustration of Single Return Types in C++

    When designing a new API, one of the things I put a lot of thought into is how the user will know if the API call was successful or not. I don’t want to levy large error-checking requirements on my users, but in C/C++ a function can only return a single value, so many APIs will pass the real output back through a reference argument in the function prototype and a simple Boolean or error code as the return value. I find this clunky and hard to document, so I dug in to find a better way.

    std::tuple and std::tie are two useful C++ features that can help you return multiple values from a function. std::tuple is a container that holds a fixed-size collection of values, while std::tie ties references to existing variables together so that a returned tuple can be unpacked into them. In this post, we’ll take a look at how to use these two features to make returning multiple values from a function easier.

    std::tuple and std::pair

    std::tuple (and std::pair) are C++ templates that allow you to combine two or more objects (exactly two, in the case of std::pair) and pass them around as if they were one. They are the clear choice for combining multiple outputs from a function into a single return data type. This creates clean, self-documenting code that is easy for a user to understand and follow.

    For example, let’s say we were dealing with a factory that created our objects. We’d have a creation method that looks something like this:

    std::shared_ptr<Object> MyFactory::create()
    {
      return std::make_shared<Object>();
    }

    One shortcoming here is that the create function does not do any error checking whatsoever, putting the entire burden on the user.

    A (slightly) improved version of the create method could be:

    bool MyFactory::create(std::shared_ptr<Object> &p)
    {
      p = std::make_shared<Object>();
      if (!p) return false;
      return true;
    }

    The user can now easily check the return value to determine whether or not the object was created successfully and perform additional processing. However, now they have to create the shared_ptr<T> object before calling the create function; and in addition to that, they also have to understand that the argument p is not an input parameter, but rather the output parameter they are after in the first place.

    Instead, let’s make use of a std::pair to return both the created object as well as whether creation was successful.

    std::pair<std::shared_ptr<Object>, bool> MyFactory::create()
    {
      auto p = std::make_shared<Object>();
      return std::make_pair(p, !!p); // !!p same as static_cast<bool>(p) or 'if (p)'
    }

    How is this better? You may look at this and think that the user still has to grab the success value from the pair and that is absolutely correct. In this trivial case, creation is just a couple of lines. However, in a real-world scenario your function will likely be much more complex with many more errors to handle. Now, instead of levying that requirement on your user, you have captured all the error handling logic (and maybe reporting) internally. The user just has to check whether the returned data is valid or not via the Boolean in the pair.

    You can also just as easily extend this to return multiple values in a std::tuple:

    std::tuple<UUID, std::string, bool> createMsg(const std::string &msg, const int id)
    {
      UUID uuid = makeNewUUID();
      std::string outputmsg = std::to_string(id) + ": " + msg;
      return std::make_tuple(uuid, outputmsg, isUUIDValid(uuid));
    }

    Using std::tie to Access Return Values

    To access multiple return values, std::tie comes to the rescue. std::tie “ties” variable references together into a std::tuple. Accessing your multiple return values becomes straightforward at this point:

    // Factory Example
    std::shared_ptr<Object> obj;
    bool objvalid{false};
    std::tie(obj, objvalid) = MyFactory::create();
    if (objvalid) obj->work();
    
    // Message Example
    UUID lUuid;
    std::string msg;
    bool msgvalid{false};
    std::tie(lUuid, msg, msgvalid) = createMsg("Test message", 73);
    if (msgvalid) std::cout << lUuid << ": " << msg << std::endl;

    Conclusion

    The C++ Core Guidelines make it clear that passing back a std::tuple (or std::pair) is the preferred way to return multiple values. They are also clear that if your return value has specific semantics, a class or structure object is best.
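    As a quick sketch of that struct-based alternative (reusing the hypothetical factory from above), naming the fields gives the return value self-documenting semantics:

    // A named result type makes the meaning of each value explicit
    struct CreateResult
    {
      std::shared_ptr<Object> object;  // the created object (null on failure)
      bool valid{false};               // whether creation succeeded
    };

    CreateResult MyFactory::create()
    {
      auto p = std::make_shared<Object>();
      return {p, static_cast<bool>(p)};
    }

    // Usage: the field names document themselves
    auto result = MyFactory::create();
    if (result.valid) result.object->work();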

    std::tuple and std::pair provide a nice way to return multiple values from a function without having to resort to ugly workarounds. By using std::tie, we can make receiving the return value a breeze. What do you think? How will you use this in your next project?

  • Quick Tip: Improve Code Readability By Using C++17’s New Structured Bindings

    C++17 introduced a language feature called “structured bindings” which allows you to bind names to elements of another object easily. This makes your code more concise and easier to read, and also drives down maintenance costs. In this quick tip, we’ll take a look at how structured bindings work and give some examples of how you might use them in your own programs.

    Accessing std::tuple

    std::tuple is an extremely useful way to quickly combine multiple objects into a single object. I have used this often to combine various items that I want to serialize into a single byte stream for transmission somewhere (typically using MessagePack, for example see my code in the zRPC library). You can also use them effectively to return multiple values from a function (similar to std::pair, which is basically just a tuple of two objects).

    When using std::tuple, the canonical way of gaining access to the members of the tuple is to use std::get like so:

    // Given std::tuple<int, std::string, ExampleObject>
    const int i = std::get<0>(tpl);
    const std::string s = std::get<1>(tpl);
    const ExampleObject o = std::get<2>(tpl);

    This always felt clunky to me, yet the benefits of tuples were tremendous, so I just dealt with it.

    Structured Binding Approach

    Fast-forward to the C++17 standard and the ability to use structured bindings. These allow you to tie names to elements of any object, std::tuple included! Now, your access to the tuple becomes a single line:

    // Given std::tuple<int, std::string, ExampleObject>
    const auto [i,s,o] = tpl; // decltype(i) = int
                              // decltype(s) = std::string
                              // decltype(o) = ExampleObject

    So much cleaner and easier for the developer to read and follow!
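    Structured bindings are not limited to tuples, either. Here is a small, self-contained sketch iterating a std::map, where each element is a std::pair of key and value:

    #include <iostream>
    #include <map>
    #include <string>

    int main()
    {
      const std::map<std::string, int> ages{{"alice", 34}, {"bob", 28}};

      // Bind readable names to each element's key and value
      for (const auto &[name, age] : ages)
      {
        std::cout << name << " is " << age << '\n';
      }
    }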

    You can also get fancy when dealing with multiple return values from a function (see C++ Core Guideline F.21):

    // Use structured binding to get the object and success value.
    // The binding declares obj and success itself, so no separate
    // declarations are needed. If creation succeeds, then process it.
    if (auto [obj, success] = createObject(); success) processObject(obj);

    Structured bindings are a great new feature in C++17. They make your code more readable and maintainable, and they’re easier to parse for humans. I think you’ll find that they make your life a lot easier. What are some ways you see yourself using them in your own code?

  • 10 Easy Commands You Can Learn To Improve Your Git Workflow Today

    If you’re a developer, coder, or software engineer and have not been hiding under a rock, then you’re probably familiar with Git. Git is a distributed version control system that helps developers track changes to their code and collaborate with others. While Git can be a bit complex (especially if used improperly), there are some easy commands you can learn to improve your workflow. In this blog post, we’ll walk you through 10 of the most essential Git commands.

    TL;DR

    The commands we address in this post are:

    1. git config
    2. git clone
    3. git branch / git checkout
    4. git pull
    5. git push
    6. git status / git add
    7. git commit
    8. git stash
    9. git restore
    10. git reset

    It is assumed that you have basic knowledge of what the terms like branch, commit, or checkout mean. If not, or you really want to get into the nitty-gritty details, the official Git documentation book is a must-read!

    Setup and Configuration

    First things first – to get started with Git you need to get it installed and configured! Any Linux package manager today is going to have Git available:

    # APT Package Manager (Debian/Ubuntu/etc.)
    sudo apt install git
    
    # YUM Package Manager (RedHat/Fedora/CentOS/etc.)
    sudo yum install git
    
    # APK Package Manager (Alpine)
    sudo apk add git

    If you happen to be on Windows or Mac, you can find a link to download Git here.

    Once you have Git installed, it’s time to do some initial configuration using the command git config. Git will store your configuration in various configuration files, which are platform-dependent. On Linux distributions, including WSL, it will set up a .gitconfig file in your user’s home directory.

    There are two things that you really need to set up first:

    1. Who you are
    2. What editor you use

    To tell git who you are so that it can tag your commits properly, use the following commands:

    $ git config --global user.name "<Your Name Here>"
    $ git config --global user.email <youremail>@<yourdomain>

    The --global option tells git to store the configuration in the global configuration file, which is stored in your home directory. There are times when you might need to use different email addresses for your commits in different repositories. To set that up, you can run the following command from within the git repository in question:

    $ git config user.email <your-other-email>@<your-other-domain>

    To verify that you have your configuration set up properly for a given repo, run the following command:

    $ git config --list --show-origin

    Finally, to set up your editor, run the following command:

    $ git config --global core.editor vim
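    After running the commands above, your global ~/.gitconfig might look something like this (an illustrative sketch; the values are whatever you configured):

    [user]
            name = Your Name Here
            email = youremail@yourdomain
    [core]
            editor = vim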

    Working With Repositories

    In order to work with repositories, there are a few primary commands you need to work with — clone, branch, checkout, pull, and push.

    Cloning

    git clone is the command you will use to pull a repository from a URL and create a copy of it on your machine. There are a couple of protocols you can use to clone your repository: SSH or HTTPS. I always prefer to set up SSH keys and use SSH, but that is because in the past it wasn’t as easy to cache your HTTPS credentials for Git to use. Those details are beyond the scope of this post, but there is plenty of information about using SSH and HTTPS here.

    To clone an existing repository from a URL, you would use the following command:

    $ git clone https://github.com/jhaws1982/zRPC.git

    This will reach out to the URL, ask for your HTTPS credentials (if anonymous access is not allowed), and then download the contents of the repository to a new folder named zRPC. You can then start to work on the code!

    Sometimes a repository may refer to other Git repositories via Git submodules. When you clone a repository with submodules, you can save yourself a separate step to pull those by simply passing the --recursive option to git clone, like so:

    $ git clone --recursive https://github.com/jhaws1982/zRPC.git

    Branches

    When working with Git repositories, the most common workflow is to make all of your changes in a branch. You can see a list of branches using the git branch command and optionally see what branches are available on the remote server:

    $ git branch         # list only your local branches
    $ git branch --all   # list all branches (local and remote)

    To check out an existing branch, simply use the git checkout command:

    $ git checkout amazing-new-feature
    Switched to branch 'amazing-new-feature'
    Your branch is up to date with 'origin/amazing-new-feature'.

    You can also create and switch to a new branch that does not yet exist by passing the -b option to git checkout:

    $ git checkout -b fix-problem-with-writer
    Switched to a new branch 'fix-problem-with-writer'

    Interacting with the Remote Server

    Let’s now assume that you have a new bug fix branch in your local repository, and have committed your changes to that branch (more on that later). It is time to understand how to interact with the remote server, so you can share your changes with others.

    First, to be sure that you are working with the latest version of the code, you will need to pull the latest changes from the server using git pull. This is best done before you start a branch for work and periodically if other developers are working in the same branch.

    $ git pull

    This will reach out to the server and pull the latest changes to your current branch and merge those changes with your local changes. If you have files that have local changes and the pull would overwrite those, Git will notify you of the error and ask you to resolve it. If there are no conflicts, then you are up-to-date with the remote server.

    Now that you are up-to-date, you can push your local commits to the remote server using git push:

    $ git push

    git push will work as long as the server has a branch that your local one is tracking. git status will tell you whether that is the case:

    $ git status
    On branch master
    Your branch is up to date with 'origin/master'.
    
    nothing to commit, working tree clean
    $ git status
    On branch fix-problem-with-writer
    nothing to commit, working tree clean

    If you happen to be on a local branch with no remote tracking branch, you can use git push to create a remote tracking branch on the server:

    $ git push -u origin fix-problem-with-writer

    Working with Source Code

    Git makes it very easy to work with your source code. There are a few commands that are easy to use and make managing code changes super simple. Those commands are: status, add, commit, stash, and reset.

    Staging Your Changes

    To stage your changes in Git means to prepare them to be added in the next commit.

    In order to view the files that have local changes, use the git status command:

    $ git status
    On branch fix-problem-with-writer
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   CMakeLists.txt
            modified:   README.md
    
    no changes added to commit (use "git add" and/or "git commit -a")

    Once you are ready to stage your changes, you can stage them using git add:

    $ git add README.md

    What if README.md has a lot of changes and you want to separate them into different commits? Just pass the -p option to git add to stage specific hunks of the patch.

    $ git add -p README.md
    $ git status
    On branch fix-problem-with-writer
    Changes to be committed:
      (use "git restore --staged <file>..." to unstage)
            modified:   README.md
    
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   CMakeLists.txt

    To commit these changes you have staged, you would use the git commit command:

    $ git commit

    git commit will bring up an editor where you can fill out your commit message (for a good commit message format, read this; you can also read this for details on how to set up your Git command line to enforce a commit log format).
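    As a hypothetical example of the shape many guides recommend (a short imperative subject line, a blank line, then a body explaining the why; the change described here is made up):

    fix(writer): handle partial writes on slow links

    The writer assumed a single send() call always wrote the full
    buffer. Loop until all bytes are written and report unrecoverable
    errors to the caller instead of silently dropping data.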

    You can also amend your last commit if you forgot to include some changes or made a typo in your commit message. Simply stage your new changes, then issue:

    $ git commit --amend

    Storing Changes For Later

    Git has a fantastic tool that allows you to take a bunch of changes you have made and save them for later! This feature is called git stash. Imagine you are making changes in your local branch, fixing bug after bug, when your manager calls you and informs you of a critical bug that they need you to fix immediately. You haven’t staged all your local changes, nor do you want to spend the time to work through them to write proper commit logs.

    Enter git stash. git stash simply “stashes” all your local, unstaged changes off to the side, leaving you with a pristine branch. Now you can switch to a new branch for this critical bug fix, make the necessary changes, push to the server, and jump right back into what you were working on before. That sort of flow would look like this:

    <working in fix-problem-with-writer>
    $ git status
    On branch fix-problem-with-writer
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   CMakeLists.txt
    
    no changes added to commit (use "git add" and/or "git commit -a")
    
    $ git stash
    Saved working directory and index state WIP on fix-problem-with-writer
    
    $ git status
    On branch fix-problem-with-writer
    nothing to commit, working tree clean
    
    $ git checkout fix-problem-with-reader
    Switched to branch 'fix-problem-with-reader'
    
    <make necessary changes>
    $ git add <changes>
    $ git commit
    $ git push
    
    $ git checkout fix-problem-with-writer
    Switched to branch 'fix-problem-with-writer'
    
    $ git status
    On branch fix-problem-with-writer
    nothing to commit, working tree clean
    
    $ git stash pop
    On branch fix-problem-with-writer
    Changes not staged for commit:
      (use "git add <file>..." to update what will be committed)
      (use "git restore <file>..." to discard changes in working directory)
            modified:   CMakeLists.txt
    
    no changes added to commit (use "git add" and/or "git commit -a")
    Dropped refs/stash@{0} (5e3a53d36338f1906e871b52d3c97236f139b75e)

    There are a couple of things to understand about git stash:

    • The stash is a stack – you can stash as much as you want on it and when you pop, you’ll get the last thing stashed (see the git stash list example below)
    • The stash will try to apply all the changes in the stash, and in the event of a conflict, will notify you of the conflict and leave the stash on the stack
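    For example, you can inspect the current stack with git stash list (illustrative output; your hashes and subjects will differ):

    $ git stash list
    stash@{0}: WIP on fix-problem-with-writer: 1a2b3c4 previous commit subject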

    I run into the second bullet quite often, but it isn’t hard to fix. When it happens, it is usually simple conflicts that are easily addressed manually. Manually address the conflicts in the file, restore all staged changes from the git stash pop, and then drop the last stash.

    $ git stash pop
    Auto-merging CMakeLists.txt
    CONFLICT (content): Merge conflict in CMakeLists.txt
    The stash entry is kept in case you need it again.
    
    $ git status
    On branch fix-problem-with-writer
    Unmerged paths:
      (use "git restore --staged <file>..." to unstage)
      (use "git add <file>..." to mark resolution)
            both modified:   CMakeLists.txt
    
    no changes added to commit (use "git add" and/or "git commit -a")
    
    $ vim CMakeLists.txt   # manually edit and resolve the conflicts
    $ git status
    On branch fix-problem-with-writer
    Unmerged paths:
      (use "git restore --staged <file>..." to unstage)
      (use "git add <file>..." to mark resolution)
            both modified:   CMakeLists.txt
    
    no changes added to commit (use "git add" and/or "git commit -a")
    
    $ git restore --staged CMakeLists.txt
    
    $ git stash drop
    Dropped refs/stash@{0} (6c7d34915b38e5d75072eacee856fb427f916aa8)

    Undoing Changes or Commits

    There are often times when I need to undo the previous commit or I accidentally added the wrong file to my stage. When this happens it is useful to know that you have ways to back up and try again.

    To remove files from your staging area, you would use the git restore command, like so:

    $ git restore --staged <path to file to unstage>

    This will remove the file from your staging area, but your changes will remain intact. You can also use restore to revert a file back to the version in the latest commit. To do this, simply omit the --staged option:

    $ git restore <path to file to discard all changes>

    You can do similar things with the git reset command. One word of caution with the git reset command — you can truly and royally mess this up and lose lots of hard work — so be very mindful of your usage of this command!

    git reset allows you to undo commits from your local history — as many as you would like! To do this, you would use the command like so:

    $ git reset HEAD~n
    
    # For example, to remove 3 commits
    $ git reset HEAD~3
    Unstaged changes after reset:
    M       CMakeLists.txt
    M       tests/unit.cpp

    The HEAD~n notation indicates how many commits you want to back up; replace n with the number you want. With this version of the command, all the changes present in those commits are placed in your working copy as unstaged changes.

    You can also undo commits and discard the changes:

    $ git reset --hard HEAD~n
    
    # For example, to discard 1 commit
    $ git reset --hard HEAD~1
    HEAD is now at 345cd79 fix(writer): upgrade writer to v1.73

    So there you have it – our top 10 Git commands to help improve your workflow. As we have mentioned before, when you take the time to understand your language and tools, you can make better decisions and avoid common pitfalls! Improving your Git workflow is a conscious decision that can save you a lot of time and headaches! Do you have a favorite command that we didn’t mention? Let us know in the comments below!

  • 6 Tips for an Absolutely Perfect Little Code Review

    Code reviews are an important part of the software development process. They help ensure that code meets certain standards and best practices, and they can also help improve code quality by catching errors early on. However, code reviews can also be a source of frustration for developers if they’re not done correctly.

    Image Credit: Manu Cornet @ Bonkers World

    As a code reviewer, your job is to help make the code better. This means providing clear and concise feedback that helps the developer understand what works well and what needs improvement. A mindful, conscious approach to code reviews can yield incredible dividends down the road as you build not only a solid, reliable codebase; but strong relationships of trust with your fellow contributors.

    Here are some initial guidelines or best practices for performing a great code review:

    • Read the code thoroughly before commenting. This will help you get a better understanding of what the code is supposed to do and how it works.
    • Be specific in your comments. If there’s something you don’t like, explain why. Simply saying “this doesn’t look right” or “I don’t like this” isn’t helpful.
    • Offer suggestions for how to improve the code. If you have a suggestion for how something could be done differently, provide details about the suggestion and even some links and other material to back it up.
    • Be respectful. Remember that the code you’re reviewing is someone else’s work. Criticizing someone’s work can be difficult to hear, so try to be constructive with your feedback. Respectful, polite, positive feedback will go a long way in making code review a positive experience for everyone involved.
    • Thank the developer for their work. Code reviews can be tough, so make sure to thank the developer for their efforts.

    Following these practices will help ensure that code reviews are a positive experience for both you and the developer whose code you’re reviewing.

    In addition to these, here are a few specifics I look for when performing a code review:

    0. Does the code actually solve the problem at hand?

    Sometimes when I get into code reviews I forget to check if the written software meets the requirements set forth. As you do your initial read-through of the code, this should be your primary focus. If the code does not solve the problem properly or flat out misses the mark, the rest of the code review is pointless as much of it will likely be rewritten. There’s nothing worse than spending an hour reviewing code only to find out later that it doesn’t work. So save yourself the headache and make sure the code compiles and does what it’s supposed to do before you start.

    1. Is the code well written and easy to read? Does the code adhere to the company’s code style guide?

    Consistently formatted code is much easier to review, and standard tooling can enforce that consistency for you: many formatters will apply a chosen (or configured) style, handling all the white space automatically. Beyond formatting, the code should be reviewed against a standard set of guidelines. Other stylistic aspects to look for are naming conventions on variables and functions, the casing of names, and proper usage of standard types.

    Structural and organizational standards are important as well. For example, checking for the use of global variables, const-correctness, file name conventions, etc. are all things to look out for and address at the time of the code review.

    “Last of the color coding” by juhansonin is licensed under CC BY 2.0.
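
    If the project uses a formatter like clang-format, you can verify the changed files mechanically instead of eyeballing white space. A minimal sketch, assuming a .clang-format file at the repository root and a main branch to diff against:

    # List the C/C++ files changed on this branch and dry-run clang-format;
    # -Werror turns any formatting violation into a hard error
    git diff --name-only origin/main... | grep -E '\.(c|cc|cpp|h|hpp)$' \
      | xargs -r clang-format --dry-run -Werror
    Code language: Bash (bash)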

    2. Is the code well organized?

    Well-organized code is very subjective, but it is still something to look at. Is the structure easy to follow? As a potential maintainer of this code, are you able to find declarations and definitions where you would expect them? Is the module part of one monolithic file or broken down into digestible pieces that are built together?

    In addition, be sure to look for adequate commenting in the code. Comments should explain what the code does, why it does it, and how it works, especially for anything tricky or out of the ordinary. Be on the lookout for spelling errors as well; a well-commented codebase rife with spelling errors looks unprofessional.
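
    A tool can handle the tedious part of the spelling pass. One option, assuming the codespell package is installed (e.g., via pip), is to scan the source tree directly (the src/ and include/ paths are placeholders for your own layout):

    # Scan code and comments for common misspellings
    codespell src/ include/
    Code language: Bash (bash)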

    3. Is the code covered by tests? Are all edge cases covered? What about integration testing?

    Anytime a new module is submitted for review, one of the first things I look for is unit tests. Without a unit test, I almost always reject the merge/pull request, because I know that at some point down the road a small, insignificant change will lead to a broken module that cascades through countless applications. A simple unit test that checks the basic functionality of the module, striving for 100% coverage, can save so much time and money in the long term. Edge cases are tricky, but if you think outside the box and ensure that the unit test checks even the “impossible” scenarios, you’ll be in good shape.
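
    How you check coverage depends entirely on your toolchain. As one illustrative sketch for a CMake/CTest project instrumented with gcov (the build directory name and the 100% threshold are assumptions matching the goal above):

    # Run the unit tests, then fail if line coverage falls below 100%
    ctest --test-dir build --output-on-failure
    gcovr --fail-under-line 100 build
    Code language: Bash (bash)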

    Integration tests are a different matter. In my line of work, integration testing must be done with hardware-in-the-loop, and that quickly becomes cost-prohibitive. However, once integration tests are developed and a test procedure is in place, any and all integration tests must be performed before a change will be accepted, especially if the integration test was modified in the change!

    4. Are there any code smells?

    Common code smells I look out for are:

    • Code bloat: long functions (> 100 lines), huge classes
    • Dispensable code: duplication (what I look for the most), stray comments, unused classes, dead code, etc.
    • Complexity: cyclomatic complexity greater than 7 for a function is prime fodder for refactoring
    • Large switch statements: could you refactor to polymorphic classes and let the type decide what to do?

    Many other smells exist – too many to check in detail with each code review. For a great deep-dive I refer you to the Refactoring Guru. Many static analysis tools will check for various code smells for you, so be sure to check the reports from your tools.
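
    For the complexity check in particular, a command-line tool does the counting for you. A small sketch using pmccabe, assuming it is installed (its first output column is the modified McCabe complexity, so 7 matches the threshold above):

    # Print any function in the C sources with complexity greater than 7
    pmccabe src/*.c | awk '$1 > 7'
    Code language: Bash (bash)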

    The presence of a code smell does not mean that the code must be changed. In some cases, a refactor would lead to more complex or confusing code. However, checking for various smells and flagging them can lead to discussions with the developer and produce a much better product in the end!

    5. Would you be happy to maintain this code yourself?

    One of the last things I check during a code review is whether I would be comfortable maintaining this code myself in the future. If the answer is no, then that turns into specific feedback to the developer on why that is the case. Maybe the structure is unclear, the general approach is questionable, or there are so many smells that it makes me nervous. Typically in cases like this, I find it best to give the developer a call (or talk with them in person) and discuss my misgivings. An open, honest discussion about the potential issues often leads to added clarity for me (and it’s no longer an issue) or added clarity for them, and they can go fix the problem.

    These are just a few of the specific things I look for, but following the practices above will help make code reviews a positive experience for everyone involved. What are some best practices or tips you have found to lead to a good code review? Thanks for reading!

  • Setting Up Your git Environment for the CLI

    Updated 2023-03-10: The regular git-pre-commit-format hook script did not work properly with submodules. I have fixed this in my version, and it is uploaded to my repository. I have updated the link to point to this version, which is still 100% based on the original from barisione.

    Setting up your git environment in Linux may seem straightforward and not a big deal; and while getting git installed and running is indeed super easy, there are some tricks that will certainly make your life easier as a developer!

    Install git

    Every package manager for Linux that I am aware of is going to have a git package you can install. Depending on the package maintainers, this will usually be a fairly recent version of git (v2.36.1 is the latest release as of this writing), but anything 2.26 or newer should be just fine.

    # APT Package Manager (Debian/Ubuntu/etc.)
    sudo apt install git
    
    # YUM Package Manager (RedHat/Fedora/CentOS/etc.)
    sudo yum install git
    
    # APK Package Manager (Alpine)
    apk add git
    Code language: Bash (bash)
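
    Once installed, confirm the version you got:

    # Verify the installed version meets the 2.26 minimum
    git --version
    Code language: Bash (bash)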

    Basic git Configuration

    First off, you need to set up your basic git configuration, such as your name and your email address. This can be done on a per-repository basis or globally. I typically have my work email set up globally and then configure a different email address for other repositories as required (i.e., my open source projects on GitHub).

    # Global Configuration
    git config --global user.name "Your Name"
    git config --global user.email "your_email@work.com"
    
    # Per Repository Configuration
    git config user.email "your_email@personal.com"
    
    # Or your GitHub no-reply email address
    git config user.email "YourID+username@users.noreply.github.com"
    Code language: Bash (bash)
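
    You can verify which values are in effect – and which file each one came from – at any time:

    # Show the effective configuration and where each value is defined
    git config --list --show-origin
    
    # Check a single value
    git config user.email
    Code language: Bash (bash)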

    Storage Folders and Hooks

    I prefer to keep all my git repositories in one location, typically in my home directory somewhere like ~/git. When working in Windows (under WSL), I make sure that I clone my repositories inside the WSL environment; otherwise, working with the source (building, editing, etc.) suffers performance losses in the interaction between WSL and Windows.
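
    In other words, keep the working tree on the Linux side of the boundary (the repository URL below is just a placeholder):

    # Good: clone into the WSL filesystem (e.g., under ~/git)
    git clone git@github.com:youruser/yourproject.git ~/git/yourproject
    
    # Avoid: cloning under /mnt/c, which pays the WSL/Windows translation
    # cost on every file access
    Code language: Bash (bash)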

    One of the first things I do when setting up my git environment is make sure I am set up with all the hooks I will need. As a C/C++ developer primarily, there really are only two that are must-haves for me – my pre-commit format hook and my Conventional Commits hook (with Commitizen). To make setup of these easier, I wrote a script to easily install these hooks, which you can find here.

    mkdir -p ~/git/_globalhooks && cd ~/git/_globalhooks
    wget https://raw.githubusercontent.com/jhaws1982/git-hook-installer/master/git-hook-installer.sh
    
    chmod a+x git-hook-installer.sh
    Code language: Bash (bash)

    Auto Code-formatting Before Committing

    This is very useful to enforce a particular style for your repository.

    First, make sure clang-format is installed:

    sudo apt install clang-format
    Code language: Bash (bash)

    Pull the necessary scripts into your _globalhooks directory:

    wget https://raw.githubusercontent.com/barisione/clang-format-hooks/master/apply-format
    wget https://raw.githubusercontent.com/jhaws1982/git-hook-installer/master/git-pre-commit-format
    
    chmod a+x apply-format
    chmod a+x git-pre-commit-format
    Code language: Bash (bash)

    Finally, install the hook:

    ./git-hook-installer.sh git-pre-commit-format pre-commit <path-to-repository>
    Code language: Bash (bash)

    Now, every time you commit, your source code will be checked for proper formatting and fixed automatically if you approve.
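
    If you ever need to skip the check – say, for a quick work-in-progress commit on a personal branch – git can bypass pre-commit hooks entirely:

    # Commit without running the pre-commit hook
    git commit --no-verify -m "wip: formatting to be fixed later"
    Code language: Bash (bash)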

    Conventional Commit Enforcement

    By setting up a git hook to walk you through writing a proper commit log, you can make sure that your history is very readable! Commitizen makes this super easy and you can customize the template as you see fit. I prefer to follow Conventional Commits, and the setup for that is described here:

    # Install npm, commitizen, and template for Conventional Commits
    sudo apt install npm
    npm install -g commitizen
    npm install -g cz-conventional-changelog
    echo '{ "path": "cz-conventional-changelog" }' > ~/.czrc
    
    # Create the git hook
    cat > ~/git/_globalhooks/commitizen-commit-msg << 'EOF'
    #!/bin/bash
    exec < /dev/tty && $HOME/.npm-global/bin/cz --hook || true
    EOF
    
    # Make the new hook executable
    chmod a+x ~/git/_globalhooks/commitizen-commit-msg
    Code language: Bash (bash)

    Finally, install the hook:

    ./git-hook-installer.sh commitizen-commit-msg prepare-commit-msg <path-to-repository>
    Code language: Bash (bash)

    Please note that the above example assumes that you have cleared up any NPM permissions issues by following the manual steps as found here.
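
    The gist of those manual steps is to give npm a prefix inside your home directory so that global installs no longer require root – which is also why the hook above looks for cz in $HOME/.npm-global/bin:

    # Point npm's global prefix at your home directory
    mkdir -p ~/.npm-global
    npm config set prefix "$HOME/.npm-global"
    
    # Make sure the new bin directory is on your PATH
    echo 'export PATH="$HOME/.npm-global/bin:$PATH"' >> ~/.profile
    source ~/.profile
    Code language: Bash (bash)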

    Global Hooks

    With the hooks installed in the ~/git/_globalhooks directory, you can easily copy the hooks to a new machine. This enables you to get set up rather quickly. You can also use these hooks globally for all repositories if you so choose. I usually don’t, because most repos don’t follow Conventional Commits or the same formatting rules. To apply hooks globally, specify the global flag to the script when installing the hook. This will set up the hook in a “global” folder of your choosing (I would choose ~/git/_globalhooks). It also adds the necessary configuration line to your git config to specify that path as the hookspath.

    ./git-hook-installer.sh -g git-pre-commit-format pre-commit ~/git/_globalhooks/
    
    git config --list
        core.hookspath=$HOME/git/_globalhooks/
    Code language: Bash (bash)

    Hopefully some of the tips I have found will help you out as you set up your environment from scratch or just update it with new settings!