Value at Risk (VaR) to Determine Emergency Fund Size

A few weeks ago, I posted an article on why I am using double entry accrual bookkeeping for my personal finances. In that article, I mentioned that the main reason why I started doing it, and driving my wife nuts begging and pleading for receipts to all the things she spends money on, was to have a better understanding of where our finances sit by collecting pristine data. Particularly, I wanted to figure out how big our emergency fund ought to be by using Value at Risk (VaR) calculations.

Therefore, the main benefit of double-entry accrual accounting is that you have clean data. To do my data entry (which let’s be honest, is what bookkeeping really is), I used gnuCash. I didn’t give a whole lot of reasons for it other than it is free to download and use. Thank you open source community!

Here’s the deal though, gnuCash has a really cool functionality. I can connect up with a postgres server, and have all of my data at my fingertips via SQL queries. I know, that is so fantastic right! My inner nerd is squealing right now because that means that there are infinite ways to slice and dice my personal data other than the simple canned balance sheet and or accounting reports that come prepackaged in gnuCash.

One of my main reasons for doing this is to get a sense of how big my emergency fund needs to be. Experts use the rule of thumb for 3 to 6 months of expenses. Indeed, following this advice would make for a very solid emergency fund. However, I want to introduce you to one of my favorite concepts. It is Value at Risk, or VaR for short. This isn’t the first time that I have talked about VaR. That is to say that I think that we can use this concept. It will be a more sophisticated with our personal finances than just blindly following this rule of thumb.

How VaR Applies to Personal Finance

Here’s the thing, my expenses tend to be variable. It is variable in the sense that some months I may have more transactions in a certain expense category than others, let’s say for concreteness, car repairs. But those car repair transactions are also variable in their size. Some of them might be a few dollars to change a tail light that has gone out, and others might be absolutely huge like replacing a transmission.

The idea with VaR is to get a sense of how much money I need to have on hand to cover losses (in this case expenses) such that I will be able to cover some reasonable percentage of all possible scenarios. Perhaps we want to cover 99% of all scenarios with our emergency fund. Doing this analysis on a personal level is something that the average person usually lacks the data for, and hence the need to do personal double-entry accounting.

It is important though. Think about it like this, you will never be able to save enough money to cover every possible expense. A judge could order you to pay billions of dollars in restitution because you accidentally started a wildfire. You would immediately need to declare bankruptcy, unless you happen to be a billionaire…

However, doing this sort of analysis will allow you to self-insure against typical issues, and buy insurance against the other percentage of larger cases. Furthermore, it will tell you how often in the time period that you are looking at that you will exceed your emergency fund. Then you can pick the amount that feels the most comfortable to you.

The Emergency Fund and VaR

For example, if you want to take on more risk, maybe you self-insure with a VaR at 95%. What does it mean? It means in 1 out of every 20 possible futures, you will run out of money.

If you want to be more conservative, you could self-insure with a VaR at 99.99%. This would mean that out of every 10,000 possible futures, you would have enough money in 9,999 of them.

This simply means that you will have a bigger emergency fund. I am a big advocate of developing an analysis that is based entirely on your own data. Here’s the thing, your essentially holding provisions when you do this against a rainy day. You should hold provisions. If you had massive risk of fraud occurring in your business or loans defaulting, you would hold aside a certain amount of money to cover those losses.

The issue is finding that right balance. The more provisions that you carry, the worse your return on equity is going to be. Which means that your net worth will grow slower because of the provisions that you hold. The way to think about this is that if you have too big of an emergency fund, you lose out on the opportunity to invest those funds and earn a return on them, or to use them to pay down debt.

I think that for most people, somewhere in the neighborhood of 90% VaR and 95% VaR would feel good. Imagine that you knew that 90-95% of every 6-month periods you could experience were covered with savings. It would feel good, and you would be able to sleep at night.

Here’s the thing. If I looked at a typical month for me and my house, following the rule of thumb that I should have 3 to 6 months worth of expenses saved up, I would be under funded. I would be pretty close to my 50% VaR number. Which means that about half the time I would be able to cover my expenses for 6 months, the other half of the time, I would be looking for a handout.

Pulling Data From gnuCash

As I said at the top of this article, the cool thing about using an open source tool like gnuCash is that I have complete access to all of my data. The double entry bookkeeping is all about keeping my data clean so that it is useful. gnuCash can be set up to run on a database locally (or remotely for that matter). That means we can pull it and do some analysis on the data.

It turns out that this is very simple to do:

import pandas as pd
import psycopg2
from sqlalchemy import create_engine
import matplotlib.pyplot as plt
from sklearn.neighbors import KernelDensity
import numpy as np

pd.set_option("display.max_rows", 101)
pd.set_option("display.max_columns", 101)

conn = psycopg2.connect(
    host="localhost",
    database="gnucash",
    user="postgres",
    password="not_my_real_password")

With this block of code, I’ve imported everything that I need, and established a connection to my local postgres database. Next we want to pull in some actual data into a dataframe that we can manipulate.

trans = """
SELECT 
    b.name as category,
    cast(a.quantity_num as float)/cast(a.quantity_denom as float) as amount,
    c.post_date as dt
FROM 
    splits a 
    inner join accounts b on a.account_guid=b.guid
    inner join transactions c on c.guid=a.tx_guid
WHERE 
    account_type='EXPENSE';
"""

df = pd.read_sql_query(trans,conn)

This is a pretty straightforward SQL query where we pull down individual transactions out of expense accounts. Note that since we are going to be doing double-entry accounting, we just need to grab the part of a transaction that affects the expense accounts. Since transactions are composed of a debit and a credit, the transaction date is held in a separate table.

I close my books monthly. That means that I’m going to have transactions moving balances from these expense accounts to the equity account. We’ll deal with that. We also have pesky income tax expenses in here as well that we will need to deal with. At any rate, this is what my dataframe looks like:

categoryamountdt
0Miscellaneous86.812020-09-01 10:59:00
1Groceries72.842020-09-01 10:59:00
2Miscellaneous74.652020-09-02 10:59:00
3Miscellaneous44.922020-09-02 10:59:00
4Miscellaneous29.572020-09-02 10:59:00
208Books64.952020-10-05 10:59:00
209Recreation57.622020-10-05 10:59:00
210Sales Tax2.932020-10-05 10:59:00
211Sales Tax3.142020-10-07 10:59:00
212Groceries93.922020-10-07 10:59:00

So now, we’ll deal with the fact that we have to drop the transactions that we occur when we close our books. Fortunately, these are all going to show up as negative numbers in the table. We’ll also exclude income taxes, which admitedly we could have done in our SQL query, but I only realized that they were in here after I pulled the data. We’ll also clean up our dates.

df = df[df['amount']>0]
df = df[df['category']!='Federal']
df = df[df['category']!='Medicare']
df = df[df['category']!='State/Province']
df = df[df['category']!='Social Security']
df['dt'] = df['dt'].dt.date 
kde_amount = KernelDensity(kernel='gaussian', bandwidth=2).fit(np.array(df[df['amount']>0]['amount']).reshape(-1, 1))
X = np.linspace(0,1000,1001).reshape(-1, 1)
plt.plot(X,np.exp(kde_amount.score_samples(X)))
plt.show()
kde_count = KernelDensity(kernel='gaussian', bandwidth=0.01).fit(np.array(df.groupby('dt').count()['amount']).reshape(-1,1))

Finally, we fit a model. This model is a kernel density estimate. It is going to allow us to develop our very own, custom, non-parametric probability distributions. This is useful because we can use those probability distributions to take draws from in our simulation months. We do 1000 runs of the simulation.

efs = []
for r in range(1000):
    emergency_fund = 0
    for i in range(6*30):
        tx_count = np.round(kde_count.sample(),0).astype('int')[0][0]
        for j in range(tx_count):
            emergency_fund += np.round(kde_amount.sample(),2)[0][0]
    efs.append(emergency_fund)
plt.hist(efs,bins=50)
plt.show()

This code will produce a histogram of possible future expenses.

Here’s what I learned. I’m going to spend on average about $32,000 over a six month period. Which seems pretty reasonable. It does seem a tad high, but when you consider that this includes health insurance costs, life insurance costs for both me and my wife, my mortgage, etc. That doesn’t seem too far off to me. It does seem quite a bit higher than I would have liked, maybe I could cut back in some areas of my life (perhaps in another blog post). I mean this does imply that I spend about $64k just to run my household per year. With the average household spending about $20k per year on housing costs, I guess my $64k doesn’t seem too crazy. It does exclude taxes though.

Can you see me justifying my extravagent life-style in real-time? I did find this article that suggests that I am pretty typical though. It looks like my expenditures are pretty close to the average american’s.

Anyway, this histogram is the useful bit. We can use it to determine how much money I need to put away for a rainy day. To keep my financial house held together for 6-months no income. I will need to have about $36k stashed away. That will give me a 90% VaR. You can calculate your own by using this little code snippet:

np.percentile(efs,90)

Actually, going down to the penny, I need to have $36,326.67 in reserves to cover 90% of all possible future 6-month periods. A 3-month 90% VaR would mean that I would only need to have $19,380.17 (I did a 3 and a 6 month simulation for my own benefit).