Distribution of an ordered pile of rubble

Imagine a pile of rubble (X) whose separate elements are stones (x_i). By picking n stones we form a sample that we can sort by weight: the sequence x_1,x_2,...,x_n becomes x_{(1)},x_{(2)},...,x_{(m)},...,x_{(n)}, where (m) is called the "rank".

Pretend that we do the following. Upon picking a sample and sorting it, we put the stones into n drawers and mark each drawer by rank. Now repeat the procedure again and again (pick a sample, sort it, put the stones into the drawers). After several repetitions we find that drawer #1 contains the lightest stones, whereas drawer #n contains the heaviest. An interesting observation: by repeating the procedure indefinitely we would put the whole parent set (the whole pile, i.e. the whole range of the parent distribution) into the drawers; later we could do the opposite and take all stones from all drawers and mix them to get the parent set back. (The fact that the distributions (and moments) of stones of a particular rank and the parent distribution are related is probably the most thought-provoking part.)
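The drawer procedure is easy to simulate. A minimal sketch, assuming (for concreteness only) a pile of uniformly distributed stone weights; the pile size, sample size and repetition count are arbitrary choices:

```python
import random

random.seed(0)

# The "pile" (parent set): 100,000 stone weights drawn from uniform(0, 1).
pile = [random.random() for _ in range(100_000)]

n = 5             # sample size (number of drawers)
repeats = 20_000  # how many times we pick a sample and sort it
drawers = [[] for _ in range(n)]  # drawers[m] collects stones of rank m+1

for _ in range(repeats):
    sample = sorted(random.choices(pile, k=n))
    for m, stone in enumerate(sample):
        drawers[m].append(stone)

# Drawer #1 holds the lightest stones on average, drawer #n the heaviest.
means = [sum(d) / len(d) for d in drawers]
assert all(means[i] < means[i + 1] for i in range(n - 1))

# Mixing all drawers back together recovers the parent distribution:
mixed = [stone for d in drawers for stone in d]
print(f"mean of the remixed drawers: {sum(mixed) / len(mixed):.3f} (parent mean = 0.5)")
```

For a uniform(0,1) pile the mean of drawer m settles at m/(n+1), and the remixed drawers reproduce the parent mean of 1/2.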

Now let us consider the drawers. Obviously, the weights of the stones in a given drawer (i.e. of a given rank) are not all the same. They are random, governed by some distribution; in other words, they form, in turn, a random variable, called an order statistic. Let us label this random variable X_{(m)}, where m is the rank. Thus a sorted sample looks like this


Its element X_{(m)} (the set of elements (stones) x from the general set X (the pile) with rank m (the drawer)) is called the m-th order statistic.


The elements X_{(1)} and X_{(n)} are called "extreme". If n is odd, the value with index m=\frac{(n+1)}{2} is central. If m is of order \frac{n}{2}, the statistic is called "m-central". A curious question is how to define "extreme" elements as n \to \infty: if n increases, then m increases as well.


Let us derive the density function of the m-th order statistic for a sample of size n. Assume that the parent distribution F(x) and density f(x) are continuous everywhere. We will be dealing with a random variable X_{(m)} that shares its range with the parent distribution (if a stone comes from the pile, it cannot be bigger than the biggest stone in that pile).


The figure shows F(x), f(x) and the function of interest \varphi_n (\cdot); the index n indicates the sample size. On the x axis are the values x_{(1)},...,x_{(m)},...,x_{(n)} that belong to a particular realization of X_{(1)},X_{(2)},...,X_{(m)},...,X_{(n)}.

The probability that the m-th order statistic X_{(m)} falls in the neighborhood of x_{(m)} is by definition (recall the identity dF = F(x + dx) - F(x) = \frac{{F(x + dx) - F(x)}}{{dx}} \cdot dx = f(x) \cdot dx):

dF_{n}(x_{(m)})=p[x_{(m)}<X_{(m)}<x_{(m)}+dx_{(m)}]=\varphi_n (x_{(m)})dx_{(m)}

We can express this probability in terms of the parent distribution F(x), thus relating \varphi_n (x_{(m)}) and F(x).

(This bit was a little tricky for me; read it twice with a nap in between.) Consider a realization x_1,...,x_i,...,x_n as a sequence of trials (generated by the parent distribution rather than the order statistics; remember that the range is common), where a "success" is observing a value X<x_{(m)} and a "failure" is observing X>x_{(m)} (if still necessary, return to the pile-and-stones metaphor). Obviously, the probability of a success is F(x_{(m)}) and of a failure is 1-F(x_{(m)}). The number of successes equals m-1 and the number of failures equals n-m, because the m-th value x_{(m)} in a sample of size n is such that m-1 values are less than it and n-m values are greater.

Clearly, the process of counting successes follows a binomial distribution. (Recall that the probability of getting exactly k successes in n trials is given by the pmf p(k;n,p) = P(X = k) = \binom{n}{k}{p^k}{(1 - p)^{n - k}}. In words, k successes occur with probability p^k and n-k failures occur with probability (1-p)^{n-k}; however, the k successes can occur anywhere among the n trials, and there are \binom{n}{k} different ways of distributing k successes in a sequence of n trials. A little more about it.)
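In code, the pmf is one line (using the standard-library `math.comb` for the binomial coefficient):

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Sanity check: the probabilities over k = 0..n sum to one.
n, p = 10, 0.3
assert abs(sum(binomial_pmf(k, n, p) for k in range(n + 1)) - 1) < 1e-12
```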

The probability that the parent distribution takes a value close to x_{(m)} is the element dF(x_{(m)})=f(x_{(m)})dx.

The probability that the sample falls around x_{(m)} in such a way that one element lies in the neighborhood of x_{(m)} (it can be any of the n elements, which gives a factor of n), m-1 elements lie to the left of it, and n-m lie to the right, is equal to:

n\,C_{n - 1}^{m - 1}{[F({x_{(m)}})]^{m - 1}}{[1 - F({x_{(m)}})]^{n - m}}f({x_{(m)}})dx

Note that this is exactly p[x_{(m)}<X_{(m)}<x_{(m)}+dx_{(m)}], thus:

\varphi_n (x_{(m)})dx_{(m)}=n\,C_{n - 1}^{m - 1}{[F({x_{(m)}})]^{m - 1}}{[1 - F({x_{(m)}})]^{n - m}}f({x_{(m)}})dx

Furthermore, if when switching from f(x) to \varphi_n (x_{(m)}) we maintain the scale of the x axis, then

\varphi_n (x_{(m)})=n\,C_{n - 1}^{m - 1}{[F({x_{(m)}})]^{m - 1}}{[1 - F({x_{(m)}})]^{n - m}}f({x_{(m)}})

The expression shows that the density of an order statistic depends on the parent distribution, the rank and the sample size. Note the distributions of the extreme values, when m=1 and m=n.

The rightmost (maximal) element has the distribution F^{n}(x) and the minimal one 1-[1-F(x)]^n. As an example, observe the order statistics for ranks m=1,2,3 with sample size n=3 for the uniform distribution on the interval [0,1]. Applying the last formula with f(x)=1 (and thus F(x)=x) we get the density of the smallest element

\varphi_3 (x_{(1)})=3(1-2x+x^2);

the middle element

\varphi_3 (x_{(2)})=6(x-x^2)

and the maximal

\varphi_3 (x_{(3)})=3x^2.

In full accordance with intuition, the density of the middle value is symmetric with respect to the parent distribution, whereas the densities of the extreme values are bounded by the range of the parent distribution and increase toward the corresponding bound.
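The density formula is easy to verify by brute force. A sketch for the uniform(0,1) parent (so F(x)=x, f(x)=1), comparing \varphi_n(x_{(m)}) = n\,C_{n-1}^{m-1} x^{m-1}(1-x)^{n-m} against the empirical frequency with which the m-th smallest of n draws lands in a small window (the particular n, m, x and window width are just illustrative choices):

```python
import random
from math import comb

random.seed(1)

def order_stat_pdf(x: float, m: int, n: int) -> float:
    """Density of the m-th order statistic of n uniform(0,1) draws:
    n * C(n-1, m-1) * F(x)^(m-1) * (1-F(x))^(n-m) * f(x), with F(x)=x, f(x)=1."""
    return n * comb(n - 1, m - 1) * x**(m - 1) * (1 - x)**(n - m)

# Empirical check: the fraction of sorted samples whose m-th element lands in
# a small window around x should approximate pdf(x) * window width.
n, m, x, h = 5, 2, 0.3, 0.02
trials = 200_000
hits = sum(
    x < sorted(random.random() for _ in range(n))[m - 1] < x + h
    for _ in range(trials)
)
empirical = hits / (trials * h)
print(f"formula: {order_stat_pdf(x, m, n):.3f}, simulated: {empirical:.3f}")
```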

Note another interesting property of order statistics. Summing the densities \varphi_3 (x_{(1)}), \varphi_3 (x_{(2)}), \varphi_3 (x_{(3)}) and dividing the result by their number gives:

\frac{1}{3}\sum\limits_{m = 1}^3 {\varphi _3}({x_{(m)}}) = \frac{1}{3}(3 - 6x + 3{x^2} + 6x - 6{x^2} + 3{x^2}) = 1 = f(x)

on the interval [0,1]

The normalized sum of the order-statistic densities turns out to equal the parent density f(x). It means that the parent distribution is a mixture of the order statistics X_{(m)}, just as mentioned above: after sorting the general set into ranks, we can mix everything back together to recover the general set.
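The mixture property is not special to n = 3. A quick check for the uniform parent at a few sample sizes and points:

```python
from math import comb

def phi(x, m, n):
    """Density of the m-th of n order statistics for a uniform(0,1) parent."""
    return n * comb(n - 1, m - 1) * x**(m - 1) * (1 - x)**(n - m)

# The equally weighted mixture of the n order-statistic densities
# recovers the parent density f(x) = 1 at every point.
for n in (3, 5, 10):
    for x in (0.1, 0.25, 0.5, 0.9):
        mixture = sum(phi(x, m, n) for m in range(1, n + 1)) / n
        assert abs(mixture - 1.0) < 1e-9
print("the normalized sum of order-statistic densities equals f(x) = 1")
```

The assertion is just the binomial theorem in disguise: summing the densities over m sums a binomial pmf over all counts, which gives 1.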

Further reading: Ефимов-1980; Arnold-Balakrishnan-2008.

Math is the extension of common sense

What makes math? Isn’t it just common sense?

Yes. Mathematics is common sense. On some basic level, this is clear. How can you explain to someone why adding seven things to five things yields the same result as adding five things to seven? You can’t: that fact is baked into our way of thinking about combining things together. Mathematicians like to give names to the phenomena our common sense describes: instead of saying, “This thing added to that thing is the same thing as that thing added to this thing,” we say, “Addition is commutative.” Or, because we like our symbols, we write:

For any choice of a and b, a + b = b + a.

Despite the official-looking formula, we are talking about a fact instinctively understood by every child.

Multiplication is a slightly different story. The formula looks pretty similar:

For any choice of a and b, a × b = b × a.

The mind, presented with this statement, does not say “no duh” quite as instantly as it does for addition. Is it “common sense” that two sets of six things amount to the same as six sets of two?

Maybe not; but it can become common sense. Eight groups of six were the same as six groups of eight. Not because it was a rule I'd been told, but because it could not be any other way.

We tend to teach mathematics as a long list of rules. You learn them in order and you have to obey them, because if you don’t obey them you get a C-. This is not mathematics. Mathematics is the study of things that come out a certain way because there is no other way they could possibly be.

Now let’s be fair: not everything in mathematics can be made as perfectly transparent to our intuition as addition and multiplication. You can’t do calculus by common sense. But calculus is still derived from our common sense—Newton took our physical intuition about objects moving in straight lines, formalized it, and then built on top of that formal structure a universal mathematical description of motion. Once you have Newton’s theory in hand, you can apply it to problems that would make your head spin if you had no equations to help you. In the same way, we have built-in mental systems for assessing the likelihood of an uncertain outcome. But those systems are pretty weak and unreliable, especially when it comes to events of extreme rarity. That’s when we shore up our intuition with a few sturdy, well-placed theorems and techniques, and make out of it a mathematical theory of probability.

The specialized language in which mathematicians converse with each other is a magnificent tool for conveying complex ideas precisely and swiftly. But its foreignness can create among outsiders the impression of a sphere of thought totally alien to ordinary thinking. That’s exactly wrong.

Math is like an atomic-powered prosthesis that you attach to your common sense, vastly multiplying its reach and strength. Despite the power of mathematics, and despite its sometimes forbidding notation and abstraction, the actual mental work involved is little different from the way we think about more down-to-earth problems. I find it helpful to keep in mind an image of Iron Man punching a hole through a brick wall. On the one hand, the actual wall-breaking force is being supplied, not by Tony Stark’s muscles, but by a series of exquisitely synchronized servomechanisms powered by a compact beta particle generator. On the other hand, from Tony Stark’s point of view, what he is doing is punching a wall, exactly as he would without the armor. Only much, much harder.

To paraphrase Clausewitz: Mathematics is the extension of common sense by other means.

Without the rigorous structure that math provides, common sense can lead you astray. That’s what happened to the officers who wanted to armor the parts of the planes that were already strong enough. But formal mathematics without common sense—without the constant interplay between abstract reasoning and our intuitions about quantity, time, space, motion, behavior, and uncertainty—would just be a sterile exercise in rule-following and bookkeeping. In other words, math would actually be what the peevish calculus student believes it to be.

A quotation from Ellenberg's book "How Not To Be Wrong…". Kinda liked it.

A conjecture on mating

What is dating and why do we even need it? Here is my theory. I have not cross-referenced it with the existing sciency literature, so it could lack originality or could be just nuts (it's really just some random thoughts). The theory naturally follows from several observations, so I start with those. Medical science has a good understanding of what a perfect, textbook human body looks like. In reality, a perfect body does not exist; it is just an idea that is useful for understanding what is right and what is wrong with a patient. A deviation from this conceptual body can help in classification. Noteworthy is that there has been a considerable change in classifying deviations into right and wrong: many conditions that were previously classified as requiring treatment are today left to run their course.

Nature never creates perfect bodies, because it is not sure what a perfect body looks like. The process of creating a human can be understood as follows: design a perfect, textbook body and then introduce disturbances into some system of that body. The disturbances are known as mutations, and the whole process is known as evolution. Put differently, nature randomizes human bodies and then the environment trims the randomization that was not useful. It is conceptually indistinguishable from how a programmer develops a program: there is core functionality, and over time the programmer introduces features and sees if they make the program better. The key difference is that the programmer controls the "trimming" process. He knows very well which features came through testing and which require further testing because they are promising even though the initial tests were not very successful. Well, here is some amazingly awesome news: there is a conceptual analog of the programmer. A woman. Nature is agnostic about which features become successful and which do not. Let's start with the counterposition. If there were no women, then the progress of medical science (with its motto "no pain is great") would generate this:

Ok, that might be obvious. So, having women improves the sorting process and trims unsuccessful mutations. But how exactly? This is the best bit. The process of a woman picking a man has exactly the same characteristics as a patient picking a doctor or a firm picking a worker. When you come to see a doctor, you would like him to know medical stuff better than you do. When you come to see a surgeon, you would like him to make the right choices during surgery, while you are asleep and unavailable for consultation. The problem is that when you see a doctor, you see a head, two legs and two arms. These observables are not very useful for inferring the unobserved characteristics of a person that actually matter to you. That is why you use potentially useless and silly observables as proxies for the unobservables that matter.

Let me start with "a flip of a coin in a vacuum". Imagine all people have perfect, textbook bodies; they are exactly the same. Then people can form a group to achieve economies of scale (to hunt elephants, to produce iphones, or to raise healthy, fat, well-nourished kids) with anyone. Then one does not need friends, family or anyone really. There is no need to designate anyone as special. If you feel like having a beer or sex, you just talk to the person next to you, ask if he or she doesn't mind, and just do it. The same happens with kids: you have kids, and if you need anyone to babysit, you just hand the kid to the next person on the street. One does not even have to go home to the same place every night; just crash in the closest bed. This is the benchmark.

Now imagine nature intentionally introduces noise into every person, sort of introducing random features, after which the environment needs to test each feature by killing the versions that are no good. Now people are different, and they possess characteristics that could be useful in the current environment or could be useless. Now it matters who is in your group. To form groups quickly, our brain can classify people into bad persons (immoral people) and good persons (moral people). There are even whole institutions that people created to facilitate the sorting: reputation and even… church. The church is kind of like education; it helps to send signals about types. A religious person is an unconditional contributor (speaking the language of the public goods provision game). Religious people are usually intense, so for them it is a computational shortcut (this requires extra commentary; don't worry about it at this point. In short, people usually spend tons of time sorting others into bad and good and trying to come out as good themselves; that's what people's brains are hardwired to do. Some of us decide not to spend too much time strategizing and simply contribute all the time (hard-working people, like productive scientists), but they expect others to contribute when it is crucial for them).

A family is a special case of a group. A man possesses some properties that are unobserved, so a woman chooses observables as proxies. A good physique is good, but it is not a sufficient indicator of skills. Money is better. Both already serve as better proxies; they convey more information. Those observables are more likely to indicate the strong providing properties of a person. A woman also wants a man to be responsive to incentives, so non-cognitive skills also matter; she wants someone with good social skills. This approach refines the sorting process and makes it very intelligent: you could be narcoleptic, so you would probably lose a fight with a crocodile and wouldn't survive a day in a forest, but you could still do fine as a scientist. In this manner the narcoleptic gene persists in the population even though it manifests in the really weird behavior of passing out randomly during the day. It is not designated for trimming; on the contrary, it is designated as a potential feature that might over the years become part of the "perfect" textbook body.

It could be shown that if the world consisted of identical people, a woman would interact with whoever is closest; thus the total amount of time women interact with men (in aggregate) is:

S=M \times T

S is given; it could be that interaction is needed due to a physical property of the environment (a group is necessary because there are many dinosaurs, or food is scarce and several people need to search for it to get a healthy, fat kid). For the reasoning at hand it is given. M is the number of men in the social fragment. If all men are the same, then a woman is indifferent, and the whole stock of men is used because the time per man (T) is high. If the environment is too risky, women are too cautious and socialize less and less per man; thus, for a given stock of men in a given environment, more and more men are designated for trimming.

I think this conjecture naturally follows the idea advocated by the famous social scientists (e.g. Hayek, Friedman): people have evolved to construct social structures, and those structures work with fantastic efficiency. States, markets, mating: all of these are examples of social structures that existed way before scientists had any say in them. People should interfere with them as little as possible, and any interference has to be very gentle. A woman has to be free in her choices, because if she is not, then tons of terrible men are not trimmed.

Some interesting manifestations of it: a ban on divorce produces massive suicide rates and violence; a ban on abortions produces massive criminalization.

If you haven't lost interest in this topic by then, don't forget these articles: 1, 2.

Game theory is ridiculous

Game theory is ridiculous. A first acquaintance with the main "solution concepts" usually produces the question "wtf?!" in a person with good common sense.

Good economics approximates essentials with assumptions to overcome limitations of verbal reasoning. Assumptions in game theory mostly exist to confuse readers without really saying anything that matters.

I believe those are not assumptions but conventions, and the only question that a person with good common sense should be asking is: why do so many individually silly things, when put together, tell so many astonishingly amazing stories?!

Why so many long lines at terrible restaurants…

It must be a good restaurant, since the line is so long. Hm… you likely just failed to update your beliefs in a rational way.

Imagine you are in a classroom with an urn containing three balls in front of everyone. You don't see the colour of the balls, but you do know that with equal probability the urn is either majority blue (2 blue, 1 red) or majority red (1 blue, 2 red). Since you don't know which urn is there (the true state of the world), you need some evidence before making a guess. Now every person in the class, one by one, comes and picks one ball from the urn and, without showing it, announces his guess. Believe it or not, this is your restaurant-choice situation.

The two possibilities for the urn are an analogue of whether the restaurant is good or bad. A person who comes to make a choice has several pieces of information to combine. Taking one ball from the urn is the same as reading some reviews of the restaurant beforehand. The information is not perfect; the reviews could be biased or unrepresentative of your taste. However, you also observe the choices of the people before you. You do not know their private signal (what ball they picked from the urn, i.e. what conclusion they drew after studying the restaurant reviews), but you do know their choices.

Claiming that the restaurant must be good because the line is long would be true only if all the people who came sequentially followed only their private signals. Then, when your time comes to make a choice, the line indicates independent draws of balls from the urn: if the true state of the world were that the urn is majority blue, many more people would say so.

The thing is that those draws are clearly not independent. At some point a person whose private signal says the urn is majority blue might see too many people choosing majority red, abandon his private signal, and follow the crowd. So when it is your turn to make a choice and you observe a line (i.e. heaps of people announcing their choices), it does not necessarily mean that the restaurant is good. Put differently, you fail to account for the correlation between public beliefs (beliefs based on the observed choices, formed before seeing your private signal) and private signals.
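The story can be simulated. Below is my own toy sketch of the sequential-guessing setup (the signal precision q = 2/3 matches drawing one ball from the 2-of-3 urn): each agent combines the public count of revealed signals with his private draw, and observers learn nothing from a guess once a cascade has started.

```python
import random

random.seed(42)

def run_cascade(n_agents, q=2/3):
    """Sequential guessing about the urn. The true state is +1 ("majority
    blue"); each private signal equals the state with probability q.
    Returns the public guesses (+1 = blue, -1 = red)."""
    state = +1
    public = 0   # net count of signals that observers could infer so far
    guesses = []
    for _ in range(n_agents):
        signal = state if random.random() < q else -state
        total = public + signal
        # Follow the weight of evidence; on a tie, follow your own signal.
        guess = signal if total == 0 else (1 if total > 0 else -1)
        # While |public| <= 1 the guess reveals the private signal; once
        # |public| reaches 2, agents herd and guesses become uninformative.
        if abs(public) <= 1:
            public += guess
        guesses.append(guess)
    return guesses

guesses = run_cascade(30)
print(guesses)
```

Typically, after a couple of same-colour guesses pile up, everyone herds, including agents whose private signal disagrees, which is exactly why a long line is weak evidence.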

Well that is herding. And here is a presentation about it….

If that stuff sounded crazy awesome then read this and in the very very end this

It is obviously not about restaurants at all; it could be the choice of a major for a college degree. Is being a doctor a good choice or not? There is no way to know for sure; you just have to combine your private signal with the public belief. If you don't have a strong private signal, it will be overwhelmed by the public belief and you will just follow the crowd. It could also explain why in Russia or Germany during those times aaalll people would put out Nazi flags or hang Stalin's portrait on the wall at home and in the office. Or pretty much anything that involves guessing the state of the world by combining information from your own signal with the choices of others.

Practical advice on non-parametric density estimation.

Always start from the histogram; any non-parametric density estimation method is essentially a fancier version of a histogram.

Compare the problem of choosing an optimal bin size in a histogram with the choice of the bandwidth h in a kernel estimator.

The number of bins is too small. Important features of this distribution, such as the mode, are not revealed.
Optimal number of bins (optimal according to Sturges' rule, but the rule is beside the point).
The number of bins is too large. The distribution is overfitted.

The point of the exercise is to reveal all the features of the data; that is what is important to keep in mind.

The bandwidth h is too large. Local features of this distribution are not revealed.
The bandwidth h is selected by a rule of thumb called the normal reference bandwidth.
The bandwidth h is too small. The distribution is overfitted.



While a histogram takes an average within a bin, kernel estimation naturally extends this idea and takes a fancier version of an average around a given point. How much information around the point to use is governed by the bandwidth. Conceptually, a bandwidth and a bin are identical.
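The bin/bandwidth analogy can be made concrete with a hand-rolled histogram estimator and a Gaussian kernel estimator (standard library only; the bimodal sample and the choice h = 0.5 are purely illustrative assumptions):

```python
import math
import random

random.seed(7)

# Bimodal sample: a histogram with too few bins would hide the two modes.
data = ([random.gauss(-2, 0.5) for _ in range(500)]
        + [random.gauss(+2, 0.5) for _ in range(500)])

def hist_density(x: float, data: list, bin_width: float) -> float:
    """Histogram estimate: average of the points in the bin containing x."""
    left = math.floor(x / bin_width) * bin_width
    count = sum(left <= d < left + bin_width for d in data)
    return count / (len(data) * bin_width)

def kde(x: float, data: list, h: float) -> float:
    """Gaussian kernel estimate: a smooth, weighted average around x."""
    k = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)
    return sum(k((x - d) / h) for d in data) / (len(data) * h)

# Both estimators see the two modes at -2 and +2 and the gap at 0;
# the kernel estimate is just a soft version of the bin.
for x in (-2.0, 0.0, 2.0):
    print(f"x={x:+.0f}  hist≈{hist_density(x, data, 0.5):.3f}  kde≈{kde(x, data, 0.5):.3f}")
```

Both are local averages: the histogram averages over a hard bin containing x, while the kernel averages over a soft window of width h centred at x.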


And now take a look at a perfect application of the idea in

Nissanov, Zoya, and Maria Grazia Pittau. “Measuring changes in the Russian middle class between 1992 and 2008: a nonparametric distributional analysis.” Empirical Economics 50.2 (2016): 503-530.

Comparison between income distributions in the period 1992–2008. Authors' calculation on weighted household income data from RLMS. Kernel density estimates are obtained using adaptive bandwidth.

Going back to the advice: keep in mind that you are doing this to reveal features of the data, and it has to be strictly more informative than a histogram; otherwise the computational costs are not justified.

Spatial competition… and what science is really about.

Check my presentation on an empirical model of firm entry with endogenous product-type choices. (here)

A normal reaction to the presentation's topic would be: "whaat? why would anyone want to do this stuff for a living?" It is a great question, and I don't have an answer to it. It is indeed viciously technical and deadly boring.

But I do have something really cool to share. Back home I was driving my 15-year-old niece to a museum and failed to find a humanly understandable combination of words to explain what science is. So now you get to check this combination of words; I think it is a really good fit….

The human eye is able to capture a quite limited portion of the light spectrum (the visible spectrum). We are unable to travel in time or to reach most of the planets in the galaxy. Yet there is no need to be able to physically see the whole spectrum in order to actually "see" it. And you do not need to be able to travel in time to "see" the past, just like you do not need to travel to another planet to "see" that planet. Here is a cool angle on it: the information integration theory of consciousness, an exceptionally creative idea that, if appreciated properly, will blow your mind.

Human bodies have an enormous number of systems, like no other living being. We feel temperature and objects, we see and hear, we feel emotions like fear, shame, happiness, etc. Our brain integrates all of this information from all the systems into a sense of reality. Put differently, reality as seen by a person is nothing but the aggregated sensations from a set of systems that continuously register information. Think about the feeling of pain. Pain is your body's language: if your body needs attention from you, it sends a signal. However, the signal has only one dimension; it is kind of like a baby's cry. A baby can only change the intensity of the cry, and it is your job to give that cry an interpretation. Your brain does the same. (To be more precise, you do it yourself but unconsciously; it is one of those automatic processes, kind of like intuition.) Consciousness, or the capacity to separate yourself from other things, is just another trick of your brain. Instead of giving you the raw information from the systems that systematically aggregate information, it gives you an interpretation. Instead of overwhelming you with tonnes of sensations, the brain gives you their meaning. Reality is the brain's interpretation of the aggregated information from a number of systems that supply raw data.

Holy bologna!! But is this not what science is? Yes, indeed. Science is nothing but a natural extension of a process that your body performs almost automatically: aggregating information from systems that continuously register it and assigning meaning to it. (There is also the thesis that mathematics is nothing but common sense, quite dense at times; I'll see if I can make that post compact and readable enough, and if I do, I'll give you that idea as well.)

It is also interesting to look at people's temperaments. The system integrator (our brain, our consciousness) assigns different weights to the different systems from which it gets information. That is why we sometimes observe people who are always scared or calm, sympathetic or cold. Of course, there are other things that define character, or a predilection for specific kinds of decisions, such as upbringing and genetics, yet the system integrator has the last word.

Ok. Your brain has the capacity to integrate information from systems that systematically aggregate information and to assign meaning to it; one product of this process is consciousness, or a sense of reality. But the systems do not have to be physiological; they do not necessarily have to be attached to your brain through the nervous system. It just has to be something that contains information. Let's go back to the very beginning of this post. Yes, indeed, people see quite a narrow spectrum of light waves; however, there are devices that can capture those waves. Cameras, for example, continuously aggregate information that could never be captured if we limited ourselves to physiological systems. For your brain, the information captured by a camera has the same value as the information captured by your eyes. The only difference is that your brain has to readjust itself to be able to aggregate information from it. That is why, in the beginning, when you look at some figure that contains information, you will be confused, but with time you realign the integration process. In other words, you become able to incorporate this new information and combine it with information from other systems. When you do mathematics, it is very important at some point to stop and think about the meaning of the equations in front of you. You have to integrate this information with the other information your brain has and assign meaning to it. That is, in fact, a process of co-integration of information from different sources. It is very costly for your brain to do, which is why it is so annoying. Another example from the beginning is our incapacity to travel across time. The physical world, unfortunately, has this dimension which goes only one way, and the speed of going cannot normally be changed. But all of us have some videotapes from the past.
Imagine there is a probe that is able to capture some information from the past and keep it (pictures, videotapes, documentary movies). Such a system even allows us to travel through time: for our brain it is identical to travelling to the past ourselves; you just have to put in some effort to integrate the information from the new systems. People who study history or work on documentary movies immerse themselves in systems that continuously register information from the past, and their brains are trained well enough to easily incorporate this knowledge and assign a meaning to it. Another example: to get information about faraway planets, one does not have to physically travel there; astronomical spectroscopy allows us to systematically capture information about the planets, and you can then realign this knowledge so that your brain incorporates and integrates it into a perception of reality just as it would from your eyes. And the final example is statistical work. If you have some data sets, you can do some statistics to draw some conclusions. Most often, to do statistical work a person has to merge two data sets. Those two different data sets are nothing but systems that continuously capture information about some object. Put differently, there are two independent systems that continuously register information about some object (it is other people who put down a number; in theory, instead of a number they could have used words, but then we are back to the crying-baby case, where the signal is not rich enough). They look at the same place, and what people can do is combine this knowledge to assign some meaning to it.

The point is that our brain is capable of aggregating information from many more systems than our physiological limits dictate.

In some sense, our brain is a prisoner of our physiological systems, so one could say that science is setting your brain free. Seeing and thinking are the same thing when your eyes are closed. Put differently, the things that we physically see or feel are just a little fraction of what we could potentially see if we allowed our brain to aggregate information and assign meanings from much wider systems that continuously register information. The sense of reality, consciousness, is a computational shortcut; without it your brain would be overwhelmed with information.

In fact, any meaning is a computational shortcut that only your brain requires. The objective reality exists as an enormous, mostly meaningless set of data. Life exists only because it can; asking for the meaning of life is the most idiotic question of all. Meaning itself is senseless: it is nothing but a trick of your brain to aggregate information more easily (It sounds really weird… hm… I probably should wrap up with this one, better do another post).

P.S. To survive, people developed a capacity to form groups very quickly (morality) and to make decisions under uncertainty very quickly. A sense of reality, or consciousness, is sort of a “sufficient statistic”. For the decision at hand (to survive) we can form one parameter, a meaning, that contains all the useful information from the data that surround us. It economizes on computational requirements and minimizes the risk of a mistake (sometimes the cost of a mistake is your life)


The key difference between developed and undeveloped countries

To overcome their physical vulnerability, ants developed a unique way to navigate under uncertainty. The chemical trace allows an ant and all its fellows to find the way from food to home (it is very close to how people use market prices to send information; the pencil example by Friedman). Spiders have developed the web to catch insects the same size as the spiders themselves (morality is an evolved part of human nature, much like the tendency to weave nets is an evolved part of spiders’ nature. See the figures with “gossiping” here). What’s so special about humans? There is no better way to demonstrate it than with the movie Allied. What do you choose: allegiance to your family or to your country? The choice evokes a range of thoughts, feelings, emotions, and intuitions about what to do, what is the right thing to do, what one ought to do—what is the moral thing to do. Nobody except humans possesses morality, but why, over millions of years of evolution, did nature decide to develop such a peculiar attribute? Morality is what makes people come together and play non-zero-sum games; it was an evolutionarily necessitated device that ensured survival. The feeling of “right” and “wrong”, “good” and “bad” is nothing but your brain figuring out how to act in groups and use groups to its advantage. (Next time you go to the park and see many groups of people, keep in mind that this is happening because the act of cooperating is remunerated with oxytocin (the brain uses hormones like carrot and stick to incentivize a particular form of behaviour, the one that proved to increase the chances of survival))

What are these moral thoughts and feelings, where do they come from, how do they work, and what are they for? There is a scientific answer to these questions. It is possible to use the mathematical theory of cooperation—the theory of nonzero-sum games—to transform this commonplace observation into a precise and comprehensive theory, capable of making specific testable predictions about the nature of morality. (Curry 2016)

A little experiment called the Public Good Game (aka the n-player prisoner’s dilemma; imagine you have a baby, and you and your partner each have to do something very important for yourselves, so that each would like the other one to sit with the baby. But if both bail on sitting with the baby… then both suffer, because the little one might fall, choke or something. It is individually rational to defect in providing the public good and “free-ride”. With many players, this becomes the Public Good Game) captured a feature that is unique in the animal world – “reciprocal altruism”. People trust strangers if they see that they are eager to cooperate. Only humans possess this.
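The free-riding logic can be sketched with a minimal payoff computation. This is an illustrative toy model, not taken from any particular experiment; the multiplier value and group size are my own assumptions:

```python
# Public Good Game: each of n players either contributes 1 or free-rides (0).
# Contributions are multiplied by r (with 1 < r < n) and split equally,
# so each unit you contribute returns only r/n < 1 to you personally.

def payoff(my_contribution, others_contributions, r=1.6):
    n = 1 + len(others_contributions)
    pot = (my_contribution + sum(others_contributions)) * r
    # You keep whatever you did not contribute, plus your share of the pot.
    return (1 - my_contribution) + pot / n

others = [1, 1, 1]                # three cooperating partners
print(payoff(1, others))          # everyone contributes -> 1.6 each
print(payoff(0, others))          # you free-ride -> 2.2 for you
print(payoff(0, [0, 0, 0]))      # everyone free-rides -> only 1.0 each
```

Defecting always pays more individually (2.2 > 1.6), yet universal defection leaves everyone worse off than universal cooperation (1.0 < 1.6) — which is exactly why reciprocal altruism is remarkable.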

This feature manifests itself in technologies of trust (exchange and reciprocity) such as money, written contracts, ‘mechanical cheater detectors’ such as ‘[c]ash register tapes, punch clocks, train tickets, receipts, accounting ledgers’, handcuffs, prisons, electric chairs, CCTV, branding of criminals, and criminal records. And this very feature allows humans to create social structures such as markets, political elections and… states. People had money, laws and elections way before political science and economics had anything to say about them. All these social structures – markets, elections and states – allow strangers, not genetically related individuals, to beneficially coexist.

Ok, but what does this have to do with the key difference between developed and underdeveloped countries? Well, everything.

The underdeveloped countries are simply unable to form social structures effectively. They cannot fairly elect political leaders; they cannot maintain a market economy without the terrible abuses that potentially come with market economies. If people generally do not follow laws, a country practically does not have any laws. Financial technologies are a pure manifestation of “reciprocal altruism”, where the complexity and richness of financial instruments are based on nothing but a piece of paper that has power only if people trust it. The problem of underdeveloped countries is that people in these countries are unable to cooperate effectively. They are unable to play a non-zero-sum game. In the US strangers came together and created the iPhone; in Russia, people fail to organise themselves into homeowner associations (another interesting example is how Russians treat the national currency: everybody ditches it whenever the opportunity arises, which leads to volatility and a self-fulfilling prophecy that the currency had to be ditched). In general, the breakdown of cooperation in games such as the Public Good Game or the Minimum Effort Game is called coordination failure.
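The Minimum Effort Game mentioned above shows how coordination failure works even without any temptation to cheat. A minimal sketch, with payoff constants chosen purely for illustration:

```python
# Minimum Effort Game: your payoff is a * (lowest effort in the group)
# minus c * (your own effort). Any common effort level is an equilibrium,
# so groups can get stuck at low effort: a coordination failure.

def me_payoff(my_effort, others_efforts, a=2.0, c=1.0):
    return a * min([my_effort] + others_efforts) - c * my_effort

# If everyone else works hard, matching them is best:
print(me_payoff(7, [7, 7]))   # 2*7 - 7 = 7
print(me_payoff(1, [7, 7]))   # 2*1 - 1 = 1
# But if others slack, high effort is pure waste:
print(me_payoff(7, [1, 1]))   # 2*1 - 7 = -5
print(me_payoff(1, [1, 1]))   # 2*1 - 1 = 1
```

Both “everyone at 7” and “everyone at 1” are stable, but the low one is much worse for all — beliefs about what others will do decide which equilibrium a society lands in.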

What is curious is that playing non-zero-sum games is a naturally evolved tendency in any human. In the absence of interference, people will eventually form effective cooperation. They will come up with the sets of rules and beliefs that allow for an effective non-zero-sum game. My favourite example is a lovely place called Russia, where the government does practically everything possible to break down effective cooperation by systematically taking actions that induce negative beliefs.

Hm… I started the post with morality. Morality is what makes you feel like punishing defectors in the Public Good Game (you say “this is wrong”), makes you contribute if everyone else contributes (you say “I would feel bad not doing the right thing”), and makes you feel offended if you contribute but most did not (you say “I feel like an idiot for doing this”). All people say these things in their heads, and that is what makes them come together and do great things. Or, if you live in some underdeveloped country, never do anything great.

Some reports that I need on this topic: 1, 2

P.S. Check this awesome quotation from here:

Cooperation depends on trust, which in turn requires evaluating individuals and groups as potential cooperation partners. Oxytocin, a neuropeptide known for its role in social attachment and affiliation in mammals appears to be important for both kinds of decisions. Intranasal administration of oxytocin increases investment in a “trust game”, but also biases judgment and behavior toward ingroup members and against outgroup members. Likewise, genetic variants associated with oxytocin are associated with increased prosocial behavior, particularly when the world is seen as threatening. From an evolutionary perspective, the double-edged sword of human morality comes as no surprise. Morality evolved, not as device for universal cooperation, but as a competitive weapon, as a system for turning Me into Us, which in turn enables Us to outcompete Them. Morality’s dark, tribalistic side is powerful, but there’s no reason why it must prevail. The flexible thinking enabled by our enlarged prefrontal cortices may enable us to retain the best of our moral impulses while transcending their inherent limitations.

How to catch market collusion using a bit of algebra and public data

A little report on a paper about collusion in the electricity market in the UK.

In the late 1990s, the combination of game theory and econometrics produced new techniques for collusion detection. The advantage of this approach is that you just need readily available public data and a few simple equations that reasonably capture firm behaviour.

The big picture is that if you know the costs of the firm you can already tell if the prices are way too high.
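The simplest version of that idea is a markup (Lerner index) check: under competition, price should track marginal cost, so a large, sustained gap is a red flag. A back-of-the-envelope sketch — the prices, costs, and the 20% threshold below are made-up numbers for illustration, not figures from the paper:

```python
# Lerner index: (price - marginal cost) / price.
# Near zero under competition; persistently large values suggest
# market power or collusion. All data here are hypothetical.

def lerner(price, marginal_cost):
    return (price - marginal_cost) / price

prices = [30.0, 31.5, 33.0, 34.0]   # observed prices per period (made up)
costs  = [28.5, 29.0, 21.0, 20.5]   # estimated marginal costs (made up)

indices = [lerner(p, c) for p, c in zip(prices, costs)]
suspicious = [i for i, l in enumerate(indices) if l > 0.2]
print(suspicious)  # periods where the markup exceeds 20%  -> [2, 3]
```

In the real exercise the marginal costs are themselves estimated from public data (fuel prices, plant characteristics), and the behavioural equations from game theory say what markup a competitive or collusive equilibrium would imply.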

Some other examples of this approach: 1, 2.

The papers are essentially identical. The new technique is applied, and then the results are compared with more conventional methods, e.g. using cross-market variation (which by definition requires way more data). The bottom line is that the technique works. Hurray.

… and a little aside, as per usual. A market is only one case of a social structure where strangers interact; there are many others, e.g. elections, law enforcement. (These social structures are all trust-based technologies. Trusting a stranger, or a piece of paper, is a unique evolutionary feature observed exclusively in people. It allows us to play non-zero-sum games (cooperate, build states and stuff) and kick butts even if we are physically weaker than most predators in the animal world.) What’s nice about markets is that here things are sort of black and white: everyone knows what they are doing. Yet almost any concept that has been designed to capture interaction in a market can be generalised to any other social structure. A few examples. The idea of being small, so that you take the environment as given, as in, you can’t do anything about it: when you vote for president your single vote is indeed too small to influence the outcome, but when you vote within the local community your vote matters a lot and the environment is not exogenous at all. The idea of elasticity translates directly to, for example, the relationship between men and women. If the market is inelastic then you can abuse it. Just like you can abuse a woman who doesn’t have anywhere to go (cool kinda related paper). Yet if there are many more “men” among which a “woman” can choose, the market becomes very elastic and one cannot abuse it. Indeed, a lot that happens in the market can be generalised to any other social structure.

The black-and-whiteness of markets comes from the fact that everybody is kinda aware what game is being played. The problem with other social structures is that people don’t really know what game they are really playing. (Crooked politicians, for example, will do everything they can to make sure that people are clueless about what they are actually choosing among.) People’s minds have morality (another evolutionarily developed feature that allows people to form groups), which plays a very, very important role in social structures, i.e. the notions of right and wrong and their boundaries (more technically, they affect beliefs about whether your “high” effort will be supported by others and not taken advantage of). Think about a country where it is customary for men to have way more rights at the expense of the rights of women; it would be very typical to see that women, in fact, are happy to give those rights to men, because they truly believe that their place is in the kitchen or at the lower-paid job or something like that. Put differently, social norms very often prevent players from realising what game exactly is being played. Markets in this sense are way less “contaminated” by those social norms, yet they are still very much affected. General notions of right and wrong play important roles in markets, just like they do in any other social structure. (Check this experiment that says that economists are more “rational” (read: selfish).) The American winner-takes-all culture leads to very aggressive corner solutions by the corporate world; naturally, to offset those the US has a very strong regulatory body.

Think about it: Russians before the 1990s didn’t have any market experience. And when markets were introduced in the 1990s, the rules were taken quite literally. Of course, there was a lot of influence from the so-called market fundamentalists from the IMF, which reinforced the idea that since this is capitalism and these are markets, you can do everything that is not directly prohibited, and even if it is prohibited, it is in the rules of the game to break the rules if you can.

Great news for Alexey Navalny

Great news for Alexey Navalny came out recently by means of this little experimental paper. The results are very intuitive and do not really expand our knowledge too much, but it is always nice to confirm a commonplace observation with something sciency. The abstract: Social movements are critical agents of change that vary greatly in both tactics and popular support. Prior work shows that extreme protest tactics – actions that are highly counter-normative, disruptive, or harmful to others, including inflammatory rhetoric, blocking traffic, and damaging property – are effective for gaining publicity. However, we find across three experiments that extreme protest tactics decreased popular support for a given cause because they reduced feelings of identification with the movement. Though this effect obtained in tests of popular responses to extreme tactics used by animal rights, Black Lives Matter, and anti-Trump protests (Studies 1-3), we found that self-identified political activists were willing to use extreme tactics because they believed them to be effective for recruiting popular support (Studies 4a & 4b). The activist’s dilemma – wherein tactics that raise awareness also tend to reduce popular support – highlights a key challenge faced by social movements struggling to affect progressive change.

This paper says that violence against Alexey actually gives him support. The paper’s results apply, of course, only if people actually know about the violence and if they are not constantly bombarded by propaganda. But, you know, other things equal, this zelenka accident is good for Alexey’s career. Symmetrically, the paper’s result implies that Alexey’s leadership in what most consider the excessive violence of the 2011–13 protests turned many people away from him.

The results of the paper also explain why people in Russia become disillusioned about protests so quickly. Very soon the protest movements become contaminated by radicals, “normal” people just leave the movement, and eventually the movements boil down to a collection of crazy people.