## Tax and Cognitive Dissonance

For years I’ve held two beliefs fairly confidently:

1. We should not have a flat tax (in which everyone’s income is taxed at a constant percentage), because the relationship between losing some percentage of one’s income and the resulting reduction in happiness changes depending on income level.
Tax rates should instead vary with income, so that the experienced burden of maintaining the country is felt equally by all taxpayers.
2. The effect that income has on happiness is logarithmic, following the form
$h(x) = a + b \log(x)$,
where $x$ is income and $h(x)$ is happiness as a function of income. There is actually good evidence for this, as can be seen in the plot below.
A University of Michigan study showed that this relationship more or less holds for each of the 13 countries surveyed.

For the US in particular, we can refer to the Gravity and Levity blog for this very nice fit:

which holds for the range of data collected (incomes from 15k to 115k per year).

The problem is that these two beliefs are, in fact, mutually contradictory.

Let’s use the following conventions:

x: Income per year
h(x): Happiness resulting from that income
t(x): The tax paid for each income
Δ: The uniform happiness reduction we want our taxes to enact

Suppose $h(x) = a + b \log(x)$. What kind of tax should we have to make sure that, regardless of income $x$, everyone’s happiness is reduced equally?
(i.e. we all ‘feel’ taxes equally, regardless of income)

We can determine t(x) as follows:

$\Delta = h(x) - h(x - t(x))$

$\Delta = (a + b \log(x)) - (a + b \log(x - t(x)))$

$-\Delta/b = \log\left(1 - \frac{t(x)}{x}\right)$

$1 - \frac{t(x)}{x} = e^{-\Delta/b}$

$t(x) = x\left(1 - e^{-\Delta/b}\right)$

In other words, a constant fraction, $(1 - e^{-\Delta/b})$, of everyone’s income is taken.
This is a flat tax.
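As a sanity check, this result is easy to verify numerically. A minimal sketch, with placeholder constants (the values of $a$, $b$, and $\Delta$ below are arbitrary, not fitted values):

```python
import math

# Arbitrary placeholder constants: a, b from h(x) = a + b*log(x);
# delta is the uniform happiness reduction the tax should produce.
a, b, delta = 2.0, 0.5, 0.1

def happiness(x):
    return a + b * math.log(x)

def tax(x):
    # t(x) = x * (1 - exp(-delta/b)), as derived above
    return x * (1 - math.exp(-delta / b))

for income in [15_000, 50_000, 115_000]:
    drop = happiness(income) - happiness(income - tax(income))
    rate = tax(income) / income
    print(f"income={income:>7}  rate={rate:.4f}  happiness drop={drop:.4f}")
```

Every income level pays the same rate and suffers the same happiness drop, which is exactly what a flat tax means.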

First of all, the Michigan data only covers an annual income of 15k to 115k. Clearly the a + b Log(x) trend can’t extend indefinitely — happiness ratings only go up to 10 — so at some point it must level off.

In both limits, it becomes unclear how to set taxes. At the high end, the tax code would inevitably need to become progressive: absurdly wealthy people, whose happiness (on this curve) can barely increase further, would have to pay enormous percentages for their happiness to drop by the same $\Delta$.

I’m also sure we wouldn’t want a society in which people are taxed into misery, one that insists on collecting taxes even from people who are, on average, at 1/10 happiness. How the happiness-versus-income curve behaves in the low-income limit, and what to do with that information, is something we would have to think about.

### What’s the point?

In the end, this short calculation doesn’t really say anything about how we ‘should’ tax at all.

If you believe that a completely progressive tax is best, you can still think that. You don’t have to believe either of the premises, the data I showed, or even the simple arithmetic I did.

What’s important (to me), and the reason that I’m writing any of this down, is that from the start I believed both of the premises, and I would have agreed with the math I did to compare them. Yet I had no idea that they were in conflict with each other.

Making things worse, as beliefs go these are on the quantitative, more verifiable end. We all have an unlimited supply of beliefs that are infinitely more vague than “well-being follows a logarithmic relationship with income”.

The Boolean Satisfiability Problem (also called ‘SAT’) deals with statements built from a set of true/false variables connected by boolean operations (AND, OR, NOT).

For example, suppose I hold n beliefs (each of which can be either true or false) all connected by AND, OR, and NOT.

(belief_1 AND belief_2 AND belief_3) OR (belief_3 AND belief_1) OR … (belief_n AND NOT(belief_1) AND belief_(n-1))

The SAT problem is to determine whether there is an assignment of values to the variables (the beliefs) that makes the entire statement true.
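To make this concrete, here is a minimal brute-force sketch: with $n$ beliefs there are $2^n$ possible assignments, and in the worst case we have to try them all (the formula below is just an illustrative example):

```python
from itertools import product

def sat_brute_force(formula, n):
    """Try all 2**n True/False assignments; return the first
    satisfying one, or None if the formula is unsatisfiable."""
    for assignment in product([True, False], repeat=n):
        if formula(assignment):
            return assignment
    return None

# Example: (belief_1 AND belief_2) OR (belief_3 AND NOT belief_1)
formula = lambda b: (b[0] and b[1]) or (b[2] and not b[0])
print(sat_brute_force(formula, 3))  # → (True, True, True)
```

Each additional belief doubles the number of assignments, which is where the exponential worst case comes from.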

Checking the truth value of a long boolean statement composed of True, False, AND, OR, and NOT actually sounds pretty simple. In particular, it sounds infinitely simpler than checking whether all of the uncountable and potentially nebulous beliefs of a single human being agree with one another.

Unfortunately, SAT happens to be the first problem in computer science ever proven NP-complete. In the worst case, the work required grows exponentially with the number of variables, which makes the problem untenable at scale. It is the low-hanging fruit of incredibly hard problems.

We’re all inevitably walking around in a mess of cognitive dissonance, with some huge set of our beliefs in contradiction with one another.

With all that in mind, we should probably all be dramatically less confident in what we think is true.

## A Dumb Battleship Algorithm

A while ago my friend Chad suggested that we each write an algorithm to play Battleship on sets of random boards and see whose was better. This was all inspired by a blog post from DataGenetics that Chad had read, in which Nick Berry applied Battleship algorithms to random boards and compared how well they performed.

I only had a few free hours, so I decided that I wanted to make an algorithm that was as good as possible, under the strict constraint that it was also as simple and dumb as possible. I wanted to avoid complicated math, difficult combinatorics, and any algorithm that would require walking down some huge decision tree I would have to manually populate.

As basic as the underlying idea was, the project raised a number of interesting ideas and put the pros and cons of neural networks (NNs) in sharp relief.

## I Before E, also ‘Their’

Every once in a while, I see the following claim made:

> There are more exceptions to ‘I before E except after C’ than there are words that follow the rule, so it doesn’t count as a useful rule of English.

I first read this in a fact book in high school, and — holding a grudge from an especially low grade I received on a middle school spelling test about that very rule — I immediately believed the criticism.

But now that I’m a mature adult, I’ve gained some distance from that completely unfair and really dumb 6th grade spelling test (that really shouldn’t have counted ’cause I was sick the day before but whatever).  And I find myself wondering how ineffective this rule actually is.

[Note: There is a longer version of this rule, but it’s less often quoted, and so I’m going to ignore it]

This raises the question: how do we measure the accuracy of such a rule?

The common criticism (expressed in the quote) is that there are more words which break the rule than follow it. Let’s name this metric $M_{stupid}$, for reasons that will soon become obvious if they aren’t already.

$M_{stupid} \equiv (\text{Words Following Rule}) - (\text{Exceptions})$

Then if $M_{stupid}$ is negative, we can definitively say that this is a bad rule for English spelling.

After half a second’s thought, however, we can see why this metric is called $M_{stupid}$: not all words are used at equal frequencies. For all we know, most of the words which break this rule are obscure Latin names of chemicals that nobody ever really comes across but that are technically English. (This isn’t the case, but you get the point.)

What we really care about is how accurate this rule is in practice. That is, each time we deploy the rule in written language, how often does it give us the correct answer?

Let’s assume we know how to spell everything in English, except that when we come across an “ei” or “ie” in a word, we have no specific information about the arrangement; we can only rely on the rule “I before E except after C”. How accurate is our writing?

Simple math gets us most of the way there:

$Accuracy = \frac{\text{Number of Correct Rule Applications}}{\text{Total Number of Rule Applications}}$

The total number of rule applications is simply the number of instances of either “ie” or “ei” that we write. Using $ie$, $ei$, $cie$, and $cei$ to denote the (frequency-weighted) counts of each substring, we have:

$Total = ie + ei$

The total number of correct applications of the rule is the number of times where words in language follow the rule (obviously). See the image below for a highly technical representation of this formula.

We can also write this as follows:

$Correct = ((ie - cie) + cei)$

Putting this all together, we have

$Accuracy = \frac{(ie - cie) + cei}{ei + ie}$
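In code, this is just four substring counts per word, weighted by frequency. A sketch (the word counts below are toy values, not real corpus data):

```python
def accuracy(word_freqs):
    """word_freqs: dict mapping word -> usage count.
    Returns the accuracy of 'I before E except after C'."""
    ie = ei = cie = cei = 0
    for word, freq in word_freqs.items():
        w = word.lower()
        ie += w.count("ie") * freq
        ei += w.count("ei") * freq
        cie += w.count("cie") * freq
        cei += w.count("cei") * freq
    return ((ie - cie) + cei) / (ei + ie)

# Toy example frequencies, not real data:
toy = {"believe": 5, "receive": 3, "their": 4, "science": 2}
print(round(accuracy(toy), 3))  # → 0.571
```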

At this point, we have a metric, but we still haven’t addressed what data set to apply it to. Ideally, we would be able to search through some corpus of correctly spelled words, weighted by their use frequency.

I’m not a computational linguist, so at this point only a few options occurred to me.

The first was to use Google Ngram. This would have been great, but I’m looking for substrings within words, and Ngram searches for the prevalence of whole words.

There are lovely corpora of words with frequency data attached (http://norvig.com/ngrams/) which are completely free, but these contain misspellings, which renders them virtually useless on their own.

Finally, there are the beautiful top tier word frequency data sets (http://www.corpusdata.org/) with perfectly parsed data from the entirety of Wikipedia. This, however, costs money, so I’m ruling it out.

[Note: I could have downloaded all of Wikipedia and turned it into a text file on my own, but this seemed hard, and I have better things to do with my life]

My solution was to combine the free options: use a dictionary package, remove the misspelled words from the free frequency data, and perform the calculation on that (here’s a link to the data for the curious: http://norvig.com/ngrams/count_1w.txt).
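A sketch of that filtering step, assuming the count_1w.txt format (one word, a tab, and a count per line); `is_real_word` here is a hypothetical stand-in for whatever dictionary package you use:

```python
def load_filtered_counts(path, is_real_word):
    """Parse a frequency file in count_1w.txt format
    ('word<TAB>count' per line), dropping any word the
    dictionary check rejects as a misspelling."""
    counts = {}
    with open(path) as f:
        for line in f:
            word, _, count = line.strip().partition("\t")
            if is_real_word(word):
                counts[word] = int(count)
    return counts
```

The returned dict plugs straight into the frequency-weighted accuracy formula.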

Finally, after all this, I found an accuracy of ….

I before E except after C: 76.4%

Now this doesn’t seem like an entirely terrible rule after all.

This is not to say that this is a particularly desirable score, but it’s markedly better than the “over half of cases wrong” accusation that is thrown around so often.

Something interesting (which has been noticed by others) is that the rule “I before E” with no reference to C at all has a higher accuracy:

I before E: 80.99%

Let’s see if we can do better. Is there a letter that can take C’s place to improve accuracy?

C comes in near the bottom, in 21st place. That’s not good. You could literally choose a letter at random to insert into “I before E except after _” and have a ~77% chance of getting a better spelling rule.
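Ranking the candidates only requires generalizing the accuracy formula, replacing the C-specific counts with counts for an arbitrary letter. A sketch with toy frequencies (not the real corpus):

```python
def rule_accuracy(word_freqs, letter):
    """Accuracy of 'I before E except after <letter>'."""
    ie = ei = after_ie = after_ei = 0
    for word, freq in word_freqs.items():
        w = word.lower()
        ie += w.count("ie") * freq
        ei += w.count("ei") * freq
        after_ie += w.count(letter + "ie") * freq
        after_ei += w.count(letter + "ei") * freq
    return ((ie - after_ie) + after_ei) / (ei + ie)

# Toy frequencies, not real corpus data:
toy = {"their": 4, "believe": 5, "receive": 3}
ranking = sorted("abcdefghijklmnopqrstuvwxyz",
                 key=lambda c: rule_accuracy(toy, c),
                 reverse=True)
```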

On the positive side, H seems to be the clear winner here, at 86.8% accurate. But a lot of this advantage is simply due to the existence of the word ‘their’. If I incorporate this fact, I can produce the most accurate variation I’ve seen yet.

Summing up, I would like to propose a permanent replacement for ‘I before E except after C’ (which yielded a measly 76.4% accuracy):

I before E, also ‘their’: 87.9%