Automated detection and control of unexpected third party dependencies

In this post I explore how dependency analysis, using technology from our partner Lattix, can detect and control third party library usage.

It’s rare these days for any reasonably sized development not to take advantage of third party functionality. The fundamental value is the saving in development cost from not having to “reinvent the wheel” (or maintain it). There are many tangible benefits for your customers too: a familiar user interface appearance (Qt for C++, Swing for Java, Mono for C#), interoperability (OpenSSL, Xerces) and database support (Oracle, Postgres), to name just three. And anything your customers like is another advantage for you. In fact, so much third party functionality is just a few clicks away via a Google search that it is going to be super tempting to take advantage of it, or super difficult to avoid, depending on which side of the fence you sit. Whichever view you take, the case for using third party libraries is unequivocal; they just have to be well managed, especially given today’s distributed development teams and how easily code can be submitted to the central automated build system without raising a whimper.

Still, why worry, you may ask?

Besides the obvious and potentially thorny issues of cost, license restrictions and compliance, there are some more fundamental engineering considerations.

Firstly, what did the initial design mandate? Good time and money are spent on this opening development phase, so its requirements should probably be honoured. Relatedly, the architectural relationships within a codebase can evolve over the project lifetime. It’s therefore important to be able to define, and easily control, what’s allowed.

Secondly, how a third party library is used matters. The original design will no doubt have specified some layering or modularity that limits the use of certain third party libraries to particular corners of your codebase. In the case of a GUI library, for example, it may be fairly safe to assume that only the parts of your codebase that should interact with the library will do so. However, for a super handy tools or utilities library such as Boost, it may be equally safe to assume that your codebase and Boost will be conjoined.

Still, you might be wondering, what’s the harm? Well, what happens when that third party library needs removing from your codebase, or a new version arrives? All of a sudden, the deeply intertwined relationship between your codebase and the library becomes a big maintenance problem, otherwise known as technical debt, with a major cost impact. And that cost could be worse when you don’t even know how coupled things are, or where. At this point, hindsight is shouting that you should have kept a tight rein on your developers’ invisible and distributed free hands. But how?

Introducing Lattix

Let me introduce Lattix to you. In overview, Lattix is a tool for:

  1. Revealing the existing architecture within a codebase (C/C++, .NET, Java, Oracle, UML, to name some), which you can compare with the original design intent.
  2. Improving that architecture using Lattix tools and features.
  3. Defining user specified relationship rules to enforce that structure so it can’t degrade again going forwards.
  4. Immediately detecting and reporting violations automatically as an extra build process step.

You’d be correct if that final feature sounded suspiciously relevant to this blog.

Getting Lattix to automatically report unexpected dependencies on third party libraries takes some straightforward setup. First, you have to acquire the dependencies that exist within the codebase. Lattix offers two approaches: it either loads dependencies extracted by a third party parser, or it uses its own inbuilt parsing abilities. For my simple C based example, I will use an inbuilt parser based on Clang. Alternative external parsers include SciTools Understand and the open source tool Doxygen. At this point, for the sake of brevity, I will resist the urge to go off on tangents about this or that feature; otherwise we’ll be here a long time. Suffice to say, Lattix is highly configurable and capable. Here is the initial configuration for my codebase:

Lattix Configuration

I have pointed Lattix to the root directory of my codebase and, even though the sub directory “THIRD_PARTY” is a required component in my build, I have removed it from the tree, hence the strike through. This could represent a third party library that I already know I don’t want to depend on. Clicking OK sets Lattix parsing all the files in the hierarchy with Clang, which brings us to the next view, the extracted model:

There’s a lot to absorb here, even for a codebase of only 4 files (a conscious choice: after years of presenting tools, I find that a simple example makes the point much more clearly for the casual observer), but I’m just going to focus on the areas of relevance. You can see in the usage pane on the right that the main() routine in prime.c depends on an external function called third_party(). This simply confirms the earlier decision to limit the initial parse to just the code under the “PRIME” directory. Already you can see how external items are neatly itemised. That’s all well and good, but that “external” was manually identified. How do we get Lattix to do the hard work?
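To make the idea of an “external” item concrete, here is a minimal Python sketch of the underlying check. It is an illustration only, not how Lattix works internally: it assumes the dependencies have already been extracted as (caller, callee) pairs, and the `is_prime` symbol is a hypothetical stand-in for whatever else lives under “PRIME”.

```python
def find_unresolved(defined, dependencies):
    """Map each callee that is not defined in the loaded codebase
    to the sorted list of symbols that use it."""
    unresolved = {}
    for caller, callee in dependencies:
        if callee not in defined:
            unresolved.setdefault(callee, set()).add(caller)
    return {symbol: sorted(callers) for symbol, callers in unresolved.items()}


# Symbols defined in the parsed code (is_prime is a hypothetical helper).
defined = {"main", "is_prime"}

# Extracted (caller, callee) pairs; third_party() lives outside the parsed tree.
dependencies = [("main", "is_prime"), ("main", "third_party")]

externals = find_unresolved(defined, dependencies)
print(externals)  # {'third_party': ['main']}
```

Anything that is used but not defined inside the loaded tree is reported along with its users, which is the information the usage pane presents for third_party().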

Here I bring the rules pane into view. Lattix defines some default rules, all of which can be removed, edited or augmented. These rules are currently very open about what our codebase is allowed to use (all 4 rules say that things can be used), so let me tighten one up:

I’m about to change the fourth rule to say that our codebase cannot use any unresolved items. As you can see, we are placing this rule on the root of the hierarchy, meaning that nothing in the main body of code we load into Lattix can use anything that is not part of that loaded codebase. There is space for an explanatory comment, and even scope to limit the rule to particular types of dependency. Once the rule is applied, red flags appear in our model, each indicating that the item or dependency is involved in a rule violation. Clicking on any red flag brings the violations pane into view:

This violations report can be made available online via a web based repository or shared in an exportable form (all the typical formats: HTML, XML, etc.). So, at this point, we’ve still only got Lattix complaining about something we already knew about. Before I proceed with something more interesting, it’s worth reiterating that these rules are applied at a user defined point in the hierarchy. In other words, I could define a whole set of restrictive rules at the root level of our model, for example preventing our codebase from using anything external, but then apply an overriding “allow” rule at some lower point in the model hierarchy. In the subsequent screen shots, I’ve done this by adding a rule to allow prime.c to use unresolved externals.
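The override behaviour can be sketched in a few lines of Python. This is an assumption about the general idea (the most specific rule on the path from an item up to the root wins), not a description of Lattix’s actual rule engine; the `util.c` file and the “allow” fallback are hypothetical.

```python
def effective_rule(rules, path):
    """Return the rule set on the deepest ancestor of `path` that has one.

    `rules` maps a hierarchy path (a tuple of names, () for the root)
    to "allow" or "deny" for the use of unresolved externals.
    """
    for depth in range(len(path), -1, -1):
        rule = rules.get(path[:depth])
        if rule is not None:
            return rule
    return "allow"  # assumed fallback when no rule is set anywhere


rules = {
    (): "deny",                     # root: nothing may use unresolved externals
    ("PRIME", "prime.c"): "allow",  # override at a lower point in the hierarchy
}

print(effective_rule(rules, ("PRIME", "prime.c")))  # allow
print(effective_rule(rules, ("PRIME", "util.c")))   # deny (util.c is hypothetical)
```

A restrictive rule at the root therefore covers everything by default, while a single exception stays visible and deliberate at the exact point where it applies.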

For this next image, I’ve updated my codebase to include an extra library which, as it’s not part of the defined codebase, has caused a red flag to appear automatically. The explanation can again be seen in the violations tab:

It’s probably useful to be able to jump into the code to see these problems in the flesh. Assuming you have configured a suitable editor, you can open the file at the offending line, as you can also see.

The final and most useful step is to automate the discovery of unexpected dependencies. Lattix permits this through a couple of command line calls placed into your automated build system. Their job is to take a snapshot of the code just built and compare it to the rules in our saved model. If any rule violations are discovered, it’s possible to detect the creation of Lattix violations reports during the build and attach some behaviour to it, for example halting the build or generating notifications. The benefit of bringing these violations to someone’s attention immediately upon their discovery is twofold:

1. There will be fewer things to fix than if problems had been left to accumulate until the next manual assessment.
2. The changes will be fresher in the developers’ minds, so they can address them more easily.
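As a rough illustration of attaching behaviour to the appearance of a violations report, the Python sketch below turns “a report exists and is non-empty” into a build exit code. The report location and format are assumptions made for the example, and the Lattix command line calls that would produce the report are deliberately not shown.

```python
import os
import sys


def violations_found(report_path):
    """True if a violations report exists at report_path and is non-empty."""
    return os.path.isfile(report_path) and os.path.getsize(report_path) > 0


def gate(report_path):
    """Return a process exit code: 1 to halt the build, 0 to let it continue."""
    if violations_found(report_path):
        print(f"Architecture violations reported in {report_path}", file=sys.stderr)
        return 1
    return 0
```

A build step could then end with something like `sys.exit(gate(path_to_report))`, so that most build systems treat a non-zero exit as a failed step; sending a notification instead of halting would be a small change to `gate`.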
