# The Flaw of Averages (2020)

https://causal.app/blog/forecasting-with-uncertainty

## Automated Summary

The article 'The Flaw of Averages' discusses the importance of considering uncertainty in decision-making models. When creating a budgeting model, for example, it is not enough to consider average expenses, income, and rate of return, as these averages may not accurately represent the probable outcomes. Instead, it is important to consider best-case and worst-case scenarios, as well as the likelihood of these scenarios occurring. The article also discusses the limitations of humans in reasoning about probabilities and the benefits of using tools like Monte Carlo simulation to account for uncertainty. The overall message of the article is that uncertainty should be considered in decision-making models in order to make more informed and accurate decisions.

The mean of a set of measures is only characteristic of those measures if the error in each is random, i.e., if positive and negative errors are equally probable.

We teach means as something somehow fundamental and commonly useful; but they're rarely useful and highly derivative.
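A minimal sketch of the flaw of averages, in Python with made-up numbers: a compounding model fed its *average* input produces a plan that sits above the *median* simulated outcome, because the model is nonlinear in its inputs.

```python
import random

random.seed(0)

def final_balance(returns, start=10_000):
    """Balance after compounding one return per year."""
    balance = start
    for r in returns:
        balance *= 1 + r
    return balance

# Plan built on the average return only (5% every year).
plan_with_average = final_balance([0.05] * 10)

# Simulate many futures: each year's return is uniform on
# [-15%, +25%], whose mean is still exactly 5%.
outcomes = [
    final_balance([random.uniform(-0.15, 0.25) for _ in range(10)])
    for _ in range(20_000)
]
outcomes.sort()
median_outcome = outcomes[len(outcomes) // 2]

# median_outcome lands noticeably below plan_with_average:
# plugging the average input into the model does not give the
# probable output.
```

The gap is the "volatility drag" of compounding: the average of the outcomes matches the plan, but the typical (median) outcome does not.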

Goldratt suggests in *Critical Chain* that you convince people to give accurate estimates (as in 50% confidence) instead of building in safety. He's asserting that this fixes your problem with estimates only ever being under instead of over. But I just finished the book a couple of days ago and I'm still trying to wrap my head around how on earth you'd pull that off in software. It doesn't work with Scrum at all, and management has to be 100% on board, or the betrayal when people get punished for "bad" estimates will destroy the company. I've only had that kind of trust one time, and maybe half that level a couple of other times.

Among other things, the reasoning is this: if you give someone two weeks to finish something that takes "about a week," they will start it a week late and chew up all their safety. And if they complete four-day tasks in three days, they'll delay delivery, because they get punished for asking for more time than they need. Further, a lot of these delays aren't just procrastination, or funny accounting to work on tech debt off the books (though that does happen IME). Instead they're the consequence of multiple chains having steps that require the same constrained individual or machine, and of trying to reserve that resource when the start dates are up in the air.

Discussed at the time:

Forecasting with uncertainty - https://news.ycombinator.com/item?id=22842847 - April 2020 (20 comments)

> but it's very rare for everything to go wrong all of the time.

I think in reality the inputs to these models are a lot less independent than they would like. Even in the example at the start of the article, I suspect that for most folks, income from a side hustle and investment rate of return are both highly correlated with overall economic conditions. Attempting to add statistical rigor to something as open-ended as predicting the future feels quixotic.

I like to say "it's not the variance that gets you, it's the correlation." In my experience, "fat tail" risks are really about incorrect modeling of correlation.
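The correlation point can be sketched in a few lines of Python (the distributions and numbers are made up): two risk sources with identical means and variances produce a much fatter tail when a single shock drives both than when they move independently.

```python
import random
import statistics

random.seed(1)
N = 100_000

# Two loss sources, each ~ Normal(100, 30).
# Case 1: independent draws.
independent = [
    random.gauss(100, 30) + random.gauss(100, 30) for _ in range(N)
]

# Case 2: perfectly correlated -- the same shock hits both.
correlated = []
for _ in range(N):
    shock = random.gauss(0, 30)
    correlated.append((100 + shock) + (100 + shock))

def percentile(xs, p):
    xs = sorted(xs)
    return xs[int(p * len(xs))]

# The means are (statistically) identical, but the 99th-percentile
# total loss is noticeably higher in the correlated case.
mean_gap = abs(statistics.mean(independent) - statistics.mean(correlated))
tail_gap = percentile(correlated, 0.99) - percentile(independent, 0.99)
```

A model that assumes independence here would report the right average while badly understating the tail, which is exactly the "fat tail as mis-modeled correlation" point above.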

This is kind of rude in that it uses the term "The Flaw of Averages" as well as some ideas and even an image from the book by the same name by Sam Savage and Jeff Danziger but gives them no credit. If you can't come up with your own expressions and drawings then you should credit the source that you end up using.

At the end he gives two ways of running Monte Carlo simulations: spreadsheet plugins and his own app, Causal. Surely there are more. Are there other tools for Monte Carlo sims that HN'ers use and like? Preferably local FOSS apps?

You can do a Monte Carlo simulation in a spreadsheet without a plugin, too. For each of the inputs, create a column where the cell draws a random value according to the probability distribution you pick. Then drag down 1,000 or 10,000 rows or whatever, and finally calculate percentiles from the outputs of each row.
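The same row-by-row recipe translates directly to Python's standard library; this is a sketch with invented inputs and distributions (a toy yearly-savings model), not anyone's actual budget.

```python
import random

random.seed(42)
ROWS = 10_000

# Each "row" samples every input from its chosen distribution,
# then computes the model output for that row.
outputs = []
for _ in range(ROWS):
    income = random.triangular(4_000, 6_500, 5_000)  # low, high, mode
    expenses = random.gauss(3_500, 400)
    rate = random.uniform(0.00, 0.08)                # annual return
    savings = (income - expenses) * 12 * (1 + rate)
    outputs.append(savings)

# Read off percentiles instead of a single "average" answer.
outputs.sort()
p05 = outputs[int(0.05 * ROWS)]
p50 = outputs[int(0.50 * ROWS)]
p95 = outputs[int(0.95 * ROWS)]
```

The p05/p95 spread is the honest answer the article is asking for; the mean alone would hide it.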

A few years ago, when I met my now-wife, I was paying a mortgage and she was paying rent, and we wanted to get an idea of what we could afford if we stuck with the budget we were already comfortable with. So the inputs were the possible sale price of my house, the possible interest rate of the new mortgage, etc. We experienced what the article describes: while we were quite conservative about each of those elements, the 99th-percentile outcome was a lot less pessimistic than compounding the worst case of each and every input we plugged in, and it gave us an accurate target for what kind of home price we felt we could actually afford.

I use the @Risk tool he references; it's handy because it's easy to spit out tables that upper management can sagely nod their heads at and ask what you're doing about the top two drivers of uncertainty. That, and those with finance backgrounds are also familiar with the tool and like to see it.

Stan is an industrial-grade Monte Carlo implementation.

Thanks!

https://github.com/stan-dev

https://mc-stan.org/docs/reference-manual/mcmc.html#hamilton...

PyMC3

Thanks!

https://www.pymc.io/