index

Installing cuda on Ubuntu 18.04 for pytorch or tensorflow

I recently needed to update some servers running an old Ubuntu LTS (Xenial, 16.04) to a slightly less old Ubuntu LTS (Bionic, 18.04). I had been putting it off for some time, mostly due to the noise I heard about problems installing the Nvidia CUDA toolkit. But that …


Three reasons to use Shapley values

Last time, we discussed Shapley values and how they are defined, mathematically. This time, let's turn our attention to how to use them.1

We discussed how explainable artificial intelligence (XAI) is focused around taking models which have high predictive power (high variance, or high VC models) and providing an …


How Shapley values work

A common concern in machine learning (ML) solutions is that apparent predictive power is coming from a problematic source.1 For example, a model might learn to predict burrito quality from latitude and longitude. In this case, the actual signal is likely coming from a particular city or neighborhood having …


A multi-file torchtext data loader

To help models generalize, it's common to use some form of data augmentation. This is where the original training data are modified in some way that preserves their semantics while changing their values. The model is fitted to the original training data, plus one or more augmented versions of it …


A faster way to generate lagged values

At Novi Labs, we spend a lot of time working with timeseries data. Generically speaking, these data are formatted something like this:

id  time  value
---------------
a     0      1
a     1      2
a     2      3
b     0      4
b     1      5
c     0      6

where we have individual sensors represented by …