The bigger the effect, the fewer the number of observations necessary to see it. You only need to touch a hot stove once to realise that it’s dangerous. You may need to drink coffee thousands of times to determine whether it tends to give you a headache.
What is the book about?
Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are is written by Seth Stephens-Davidowitz, a New York Times op-ed contributor and former Google data scientist. He is a Stanford graduate and holds a PhD in economics from Harvard.
This book is not about the TV series ‘House’, even though it uses the same catchphrase. Everybody Lies covers the power and implications of Big Data. Seth Stephens-Davidowitz calls his book the next level of Freakonomics. Some of the insights from this book are as follows:
We are inherently racist even in this day and age. We say the right things but act differently. Using data from the Trump election, he demonstrates that while Americans told pollsters and surveys that they opposed his policies, this was not reflected in their voting patterns.
The data tells us that a man has a significantly better chance of reaching the NBA if he is born into a reasonably well-off middle-class family in a wealthy county. This goes against the usual assumption that people from economically disadvantaged backgrounds are more likely to make it in big-league sports like cricket or basketball.
Violent movies actually bring down crime.
The best educational institutions do not make it any easier to succeed. Rather, people who tend to succeed join them. Cause and effect run in the opposite direction to what we usually assume.
People who invoke God are more likely to default on loans.
Everybody lies, especially on Facebook.
Immigration accelerates success.
Support Digital Amrit by Buying Everybody Lies at Amazon
What does this book cover?
Everybody Lies has three sections.
The first section (actually a single chapter) explains the need for Big Data. It also explains why Google searches are valuable: the assumption is that while people might lie to their friends, relatives and surveys, they do not lie (much) when searching on their own. On this view, Google search data is the single most important dataset from the perspective of sociology.
The second section dives into the details of the power of Big Data. Some of the topics covered are sex, child abuse, truth, behaviour and buying patterns, among others. For example: did you know that child abuse goes up significantly when there is an economic downturn? The increase in Google searches for phrases like ‘my mom beat me’ or ‘my dad hit me’ during these times supports this assertion. Similarly, Google search activity suggests that the proportion of gay people in the population is around 4% and not 2%, which means that even in liberal states like California, society still attaches some stigma to being gay. The number of Google searches about self-induced abortion went up sharply when abortion clinics closed down in states hostile to abortion. This suggests that reducing the funding for abortion clinics or closing them is counter-productive.
The section also talks about using Big Data to slice and dice audiences and to run A/B tests.
The third section talks about the ethics and morality of Big Data. Corporations and governments use Big Data to predict your dispositions. How much freedom should they have? How much privacy should we have?
What did I like?
Seth Stephens-Davidowitz gives us four important powers of Big Data along with examples. These are:
The value of Big Data is not its size; it’s that it can offer you new kinds of information to study.
It provides us honest data. It allows us to see what people really want and really do, not what they say they want and say they do.
Big Data allows us to zoom in on small subsets of people.
It makes randomized experiments easier to conduct.
These four points, which form the crux of the book, are well written and well explained, with examples that tend to stick in your mind.
What did I not like?
There are a couple of sticking points for me. The first one is the obsession with sex. I understand why the author chose it, since, you know, sex sells. But this becomes a bit distasteful after a while. The second point is that more global examples could have been used. There is nothing wrong with a US-centric approach, but other than Facebook and sex, this book became too WEIRD-centric (Western, Educated, Industrialized, Rich and Democratic).
My Recommendation
I strongly recommend this book.
If you want to know about Big Data and its implications, this is the book for you. It is a good spiritual successor to Freakonomics.