We form mental models of the various phenomena happening around us. These mental models help us arrive at heuristics to quickly evaluate situations. The heuristics mostly work, except when they don't. And when they don't, it is prudent to look deeper, change the heuristic, and update one's mental model.
The areas for which we form these heuristics vary widely: evaluating whether a work item is worth pursuing, whether an article is worth reading, or, as in this case, whether some analysis is correct. Here it took me a while to accept that my heuristic was wrong, and even longer to update it. The heuristic in question concerns approximating numerical calculations when the deviations involved are small.
First, a bit of background. At Simpl, we have "TPV" (Total Payment Volume), which is the money transacted through Simpl, and "delinquent TPV", i.e. the TPV that we could not recover. Delinquent TPV divided by TPV is the "delinquency rate".
A few weeks ago, Ashish Jain told me the results of an experiment involving a purportedly improved version of a model. They were something like the following (the numbers are not factual):
| | Current Model | Proposed Model |
|---|---|---|
| TPV | 100 | 99.75 |
| Delinquent TPV | 2.1 | 2.08 |
| Delinquency rate (= Delinquent TPV / TPV) | 2.1000% | 2.0852% |
Looking at the results, I immediately concluded that there was a calculation error, since $$\frac{2.1}{100} - \frac{2.08}{99.75} \approx \frac{2.1 - 2.08}{100} = \frac{0.02}{100} = 0.0002 = 0.02\%$$
and 0.02% is very different from the 0.0148% implied by the table (2.1000% - 2.0852%). There had to be some error in how Ashish arrived at the delinquency difference.
At the time of doing this computation, I was sure that it did not matter whether I took 100 or 99.75 as the denominator, since the two differ by less than 1%. Clearly, however, it does matter, and it changes the answer disproportionately.
After a reasonable amount of thought, I understood that when the numerator and the denominator are both changing by small amounts, you cannot disregard the change in the denominator in approximate calculations. In particular
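To make the discrepancy concrete, here is a minimal Python sketch that computes both quantities from the illustrative figures in the table. The variable and function names are my own, chosen only for this example.

```python
# Illustrative figures from the table above (not real data).
tpv_current, tpv_proposed = 100.0, 99.75
delinquent_current, delinquent_proposed = 2.1, 2.08


def delinquency_rate(delinquent_tpv, tpv):
    """Delinquent TPV divided by TPV."""
    return delinquent_tpv / tpv


# Exact difference: each rate uses its own denominator.
exact_diff = (
    delinquency_rate(delinquent_current, tpv_current)
    - delinquency_rate(delinquent_proposed, tpv_proposed)
)

# Naive shortcut: pretend both rates share the denominator 100.
naive_diff = (delinquent_current - delinquent_proposed) / tpv_current

print(f"exact difference: {exact_diff:.4%}")
print(f"naive shortcut:   {naive_diff:.4%}")
print(f"shortcut / exact: {naive_diff / exact_diff:.2f}")
```

The exact difference comes out at roughly 0.0148%, in agreement with the table, while the shortcut gives 0.02%. A change of only 0.25% in the denominator moves the answer by about a third.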
$$\frac{D}{T} - \frac{D-\Delta D}{T - \Delta T}$$
$$= \frac{D}{T} - \frac{D(1-\frac{\Delta D}{D})}{T(1 - \frac{\Delta T}{T})}$$
$$= \frac{D}{T} - \frac{D}{T} (1 - \frac{\Delta D}{D}) (1 - \frac{\Delta T}{T})^{-1}$$
$$= \frac{D}{T} - \frac{D}{T} (1 - \frac{\Delta D}{D}) (1 + \frac{\Delta T}{T} + (\frac{\Delta T}{T})^2 + (\frac{\Delta T}{T})^3 + \dots)$$
$$\approx \frac{D}{T} - \frac{D}{T} (1 - \frac{\Delta D}{D}) (1 + \frac{\Delta T}{T})$$
$$= \frac{D}{T} - \frac{D}{T}(1 - \frac{\Delta D}{D} + \frac{\Delta T}{T} - \frac{\Delta D\Delta T}{DT})$$
$$\approx \frac{D}{T} - \frac{D}{T}(1 - \frac{\Delta D}{D} + \frac{\Delta T}{T})$$
$$= \frac{D}{T} - \frac{D}{T} + \frac{\Delta D}{T} - \frac{D\Delta T}{T^2}$$
$$= \frac{\Delta D}{T} - \frac{D\Delta T}{T^2}$$
and the second term cannot be disregarded: relative to the first term it is smaller only by the factor $\frac{\Delta T / T}{\Delta D / D}$, the ratio of the two relative changes, which need not be small.

The mistake I made was to assume that
$$\frac{D}{T} - \frac{D-\Delta D}{T - \Delta T}$$
$$\approx \frac{D}{T} - \frac{D-\Delta D}{T}$$
$$= \frac{\Delta D}{T}$$
which is incorrect.
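As a sanity check on the derivation, the sketch below evaluates the first-order result term by term for the illustrative numbers and compares it with the exact difference. The function `first_order_rate_drop` is a hypothetical helper written for this example, not part of any library.

```python
def first_order_rate_drop(d, t, delta_d, delta_t):
    """First-order approximation of D/T - (D - dD)/(T - dT).

    Returns both terms of the derived expression, dD/T and D*dT/T^2,
    along with their difference.
    """
    numerator_term = delta_d / t            # effect of the change in D
    denominator_term = d * delta_t / t**2   # effect of the change in T
    return numerator_term, denominator_term, numerator_term - denominator_term


# Illustrative figures from the table (not real data).
d, t = 2.1, 100.0
delta_d, delta_t = 0.02, 0.25

num_term, den_term, approx = first_order_rate_drop(d, t, delta_d, delta_t)
exact = d / t - (d - delta_d) / (t - delta_t)

print(f"numerator term:   {num_term:.5%}")
print(f"denominator term: {den_term:.5%}")
print(f"first-order drop: {approx:.5%}")
print(f"exact drop:       {exact:.5%}")
```

With these inputs the denominator term is about a quarter of the numerator term, so dropping it is what produced 0.02% instead of roughly 0.0148%. The small remaining gap between the first-order and exact values comes from the higher-order terms discarded in the derivation.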
So, the moral is that when two variables are both changing by small amounts, you need to be very careful when approximating how a function of them changes. And when the real world disagrees with your mental model, you adjust your mental model!