Saturday, May 30, 2020

BMI Table with Undue Precision

The undue precision will be on my part because this post isn't really about your health, it's about doing a little mathematics. In a spreadsheet, because sometimes it's more fun there.

I have a little spreadsheet with a few examples of interesting formulas in Excel. The thing that is a bit pitiful about it is that it isn't an inline formula but a dependency which makes it hard to copy and paste. I did a piece-meal copy and paste job from here (NIH) into a table. 

If you just want a spreadsheet that does this stuff for you and don't care about my blathering, download it here.

The thing that makes this table a beast is that the general usage requires you to first choose only one row of the table based on your height and then find which two weights you fall between and look up to find the BMIs you are between. And since this is a post about interpolation, you do straight interpolation to get your BMI. So, if I'm 5' - 10 1/4", and I'm dead sure to rights I'm not a 1/4" shorter, how do I proceed to get a more accurate value.

Of course, it is eye-roll worthy to care about such precision when the BMI number is a ballpark notion, anyway. But, from the stand point of data and formulas, can we do it anyway? (And, what might we learn if we do?) Here is what I have right now in all of its glorious ridiculousness:


Fig. 1 - Interpolation on steroids. Or maybe, interpolation without boundaries. Well, ok, interpolation without reason.

The simplest way of combining linear interpolations I could think of was to determine boundaries and look up the corresponding values. So we start with finding the low and high lines, in this case the lines for 70 and 71. Looking up the BMI weight included array formulas on the right hand side of the table that reference the given weight.

On the far right of the table I have formulas like the following in columns AL to AO:
  • =MAX(IF(B4:AK4<=B$25,B4:AK4, 0))
    • find the largest value in B4:AK4 that is still smaller than the weight in B25
  • =MIN(IF(B4:AK4>=B$25,B4:AK4, 100000))
    • find the smallest value in B4:AK4 that is still bigger than the weight in B25
  • =LOOKUP(AL4, $B4:$AK4, $B$2:$AK$2)
  • =LOOKUP(AM4, $B4:$AK4, $B$2:$AK$2)
It is important to use the <ctrl> + <shift> + <enter> key combination to make the first two of these formulas work as this is how you create array formulas out of formulas that are not inherently array based. (Formulas that are expecting ranges by default are inherently array based.) Taking the first formula as an example, the meaning of the syntax is something like this: "For every value in the list  B4:AK4, apply this formula." The formula being applied is the IF.  Anywhere you see B4:AK4 you can read, "any x that is in B4:AK4." If you C#'d this, you have something like:

     (B4:AK4).select(x => if(x < B25, x, 0)).max()

LOOKUP is listed as available for backward compatibility. Sounds like it is not really desired and probably the VLookup and HLookup functions are mostly preferred, but LOOKUP has a leg up on them in one respect. I don't have to modify the layout of my table to accommodate my present purpose. I need the search value to be the current row and the return value to be the top value of a column. If I use HLookup, I need the searched value to be the top row (as far as I can see, at least). If I didn't have LOOKUP, I still have a way, of course. Copy the BMI to the bottom and return the bottom row. To my mind, the design of LOOKUP communicates purpose much more effectively: "Here is what I'm searching--shown first, Here is what I want returned from, respectively to the matching item in the search space." 

The rest is just repeated application of linear interpolation. Do linear interpolation for 70 inches (in my case) and then for 71 inches. Then interpolate between those resulting BMIs to get a final BMI. In some cases I omit denominators and/or calculations where I know the result will give me 1 due to the way the spreadsheet and data set is constructed and so these get simplified out of the formulas.