python utility-method requirements-management

Where to put a small utility function that I would like to use across multiple packages/projects that I develop?


Right now I have one function that would be useful in a number of distinct packages that I work on. The function is only a handful of lines, but I would like to use it in several packages/projects that I develop and deploy, and I would like it to be version controlled, etc. There isn't, for example, one package that all the other packages already have as a requirement; otherwise I could put this code inside that one and import it that way.

During my time coding I've come across this issue a couple times. For concreteness some functions f that have this characteristic might be:

  • A wrapper or context manager which times a block of code with some log statements
  • A function which divides a range of integers into evenly spaced strides that are as small as possible while not exceeding a maximum number of steps
  • A function which converts a center value and a span into a lower and upper limit
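To make the scale concrete, here is a minimal sketch of two of these helpers. The names `timed` and `center_span_to_limits` are made up for illustration, and the sketch assumes that "span" means the full width of the interval (so each limit sits half a span from the center).

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger(__name__)


@contextmanager
def timed(label):
    """Log how long the wrapped block took, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        logger.info("%s took %.3f s", label, elapsed)


def center_span_to_limits(center, span):
    """Convert a center and a full-width span into (lower, upper).

    Assumption: span is the total width, not the half-width.
    """
    half = span / 2
    return center - half, center + half
```

Usage would look like `with timed("load data"): ...` around any block of code, or `lower, upper = center_span_to_limits(10, 4)`. Each helper is short enough that any of the options below is technically workable, which is what makes the choice hard.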

Basically the options I see are:

  • Put the code f in one of the existing packages R and then make any package A that wants to use f have R as a requirement. The downside here is that A may require nothing from R other than f in which case most of the requirement is wasteful.
  • Copy the code f into every package A that requires it. One question that comes up is then where should f live within A, because the functionality of f is really outside the scope of what package A does. This is sort of a minor problem; the bigger problem is that if an improvement is made to f in one package, it would be challenging to maintain uniformity across the multiple packages that have f in them.
  • I could make an entire package F dedicated to functions like f and import it into each package that needs f. This seems like technically the best approach from a requirements management and separation of responsibility management perspective. But like I said, right now this would be an entire package dedicated to literally one function with a few lines of code.
  • If there is a stdlib function that has the functionality I want I should definitely use that. If there is not, I may be able to find a 3rd party package that has the functionality I want, but this brings about a zoo of other potential problems that I'd prefer to avoid.

What would be the suggested way to do this? Are there other approaches I haven't mentioned?


Solution

  • The entire packaging system is designed to solve exactly this problem - sharing code between different applications. So yes, ideally you'd want to create a package out of this and add it as a dependency to all the other packages that use this code. There are a few upsides to this option:

    • Package management is clean
    • Future repositories can also include this code
    • Any changes to this code can still be handled with proper versioning and version pinning - thus not breaking code in other places
    • Future such functions f2, f3, etc. can potentially be added to this package, allowing you to share them across packages too

    But this also comes with some (potential) downsides:

    • You now have to maintain an additional package, complete with its deployment pipeline and versioning - this however should not be too much of a hassle if there is already a pipeline in place
    • Poorly managed versioning can cause systems to collapse rather quickly whenever breaking changes are introduced - and this kind of failure is typically hard to trace

    Having said that, copying the code into each of the packages that use f is still a valid option. Consider these points:

    • Often, such code is also tweaked over time to adapt it to the requirements of the parent package. In such cases sharing it between packages no longer makes sense, and attempts to generalize it more often than not lead to bad abstractions. If adhering to DRY is your concern, do check out this talk from Dan Abramov on the 'WET codebase'
    • Regarding maintaining uniformity - you may not have to do so all the time, depending on the use case. Package A could be using updated code, while package B could be using the older one. Regardless, whatever approach you use, you'd still need to update every package to maintain uniformity - for example, if you go with a dedicated package, you'd still need to update the version used everywhere.
    • Regarding where this code will reside in each package's codebase - if f does something very specific, it can reside in an appropriately named file of its own. If nothing else, there is always the notoriously overused util.py ¯\_(ツ)_/¯

    Recommendation

    • Begin by copying the code into every package that needs it. Update each copy individually as that package requires.
    • Over time, if you observe that every update to f ends up being propagated to all the other packages, then put f in a package of its own and replace the code in the other packages with an import from this new package.
    • Finally, don't fret the small things. Most things in software are reversible. Pick one approach and change to the other if it does not work out. Just remember not to drag out the decision - delay too much and you'd be left with a huge mountain of tech debt over time.
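    The second step (moving from a local copy to an import) can be done gradually. Below is one possible transitional shape, not something prescribed above: the module in package A prefers the shared package and falls back to its old local copy. The package name `shared_f_utils`, the module path, and the function `center_span_to_limits` are all hypothetical placeholders.

```python
# package_a/util.py  (hypothetical module inside package A)
try:
    # "shared_f_utils" is a made-up name for the dedicated package F.
    from shared_f_utils import center_span_to_limits
except ImportError:
    # Fallback: the local copy that package A carried before f was extracted.
    def center_span_to_limits(center, span):
        """Convert a center and a full-width span into (lower, upper)."""
        half = span / 2
        return center - half, center + half
```

    Call sites keep importing from `package_a.util`, so nothing else in A changes. Once F is declared as a pinned dependency of A, the fallback branch is dead code and can be deleted, leaving only the import.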

    PS: Someone may recommend using a git submodule for sharing this code - DO NOT do it. Managing versions that way isn't clean and will soon get out of hand - you're better off just creating a new package instead.