Hindi- And the Deep History of India.

Ten years back, a friend, who was from West Bengal, said casually:

“Hey, many of the ordinary people in my state resemble you guys from the south.”

Hmm… She had used the qualifier ‘ordinary’. That was interesting.

“You mean, compared to the North-West of India?” I asked.

“Yeah. But our language, Bengali, doesn’t resemble your language at all,” she said.

Intense curiosity about most things in the world has always driven me. I don’t know why; it is a bug in my brain.

But curiosity can kill cats. Truth can definitely be dangerous.

It can also be incredibly difficult. This particular question posed by my friend can never be answered factually. It is very vague.

But the deep history of India is fascinating. Probable answers do exist.

“I can conjecture. But it will be controversial.” I had evaded the answer then.

But the last six to seven years have seen an explosion in the field of pre-history, driven by a new discipline called population genetics.

In my earlier post, ‘the formulophobe manifesto’, I had written that any deep belief or grand idea, like Christian belief or Thamizh or Malayalam language fanaticism, can lead someone to hold ludicrous fringe ideas like ‘the world is only 6000 years old’ or ‘my language is the oldest in the world’.

Similarly, Indian pre-history, though we had a fair idea of it, was hotly contested. Linguistic and archaeological findings had become play-dough that interested parties were bending into this shape and that.

But it need not be contested anymore. Now the picture is getting very clear. The DNA analysis data that have come out as recently as 2018 and 2019 provide clinching evidence.

All languages are equal. It is accidents of history that make languages unequal. Someone like me is far better off learning Malayalam, Hindi and English to survive in this world. There is no doubt about that. But let us see how it came to this.

‘Thamasoma Jyothirgamaya’: the truth has to come out. Our cultural superiority should be our ability to look at it without blinking. If we have to close our eyes and walk, what is the point?

If we set aside the Austroasiatic and Tibeto-Burman languages, which are marginal, India has two major language families: the ‘Indo-European’ and the ‘Dravidian’. Hindi, Gujarati, Marathi and Bengali are all Indo-European, derived from Sanskrit.

William Jones was among the first to notice the close affinity that Sanskrit had with Latin and Greek, the precursors of many European languages.

Dravidian languages are derived from an unknown mother, Proto-Dravidian. Thamizh, Malayalam, Kannada and Telugu, the languages of the South, derive from it.

But there is a strange fact. Scattered across mid-India, there are some tribal languages that are Dravidian. Gondi, spoken by the Gonds of Madhya Pradesh, Chhattisgarh and neighbouring states, is a case in point. There is even a single group of people in Baluchistan speaking ‘Brahui’, buried in a sea of Indo-European language speakers.

Archaeologically, we have the enormous Indus Valley civilization, covering a million square kilometres at its height, around 2500 to 2000 BC. This was as big as the Babylonian and Egyptian civilizations combined. Remember that India is only around three million square kilometres.

Who were the Indus Valley people? It was difficult to tell. There is a stupendous amount of archaeology, but no words. We cannot decipher their script.

The language we know from pre-history is that of the Vedas: Old Sanskrit. The oldest of them are dated by linguists to between 2000 and 1500 BC. A rich oral tradition follows in continuity from that time, with changes in the language reflecting the passage of time. For centuries the texts were never written down, but passed orally from generation to generation by Brahmins. It is an astonishing cultural feat, unparalleled anywhere else in the world.

It is on this conundrum that population genetics has shone like a powerful flashlight in the darkness.

The basic principles are:

  • DNA analysis of present populations can shed light on migrations of people.
  • DNA can be extracted from the temporal bones of excavated skeletons and, when compared with the present population, can describe the timing and direction of migration accurately.
  • Mitochondrial DNA analysis can show the passage of heredity along the female line.
  • Y chromosomal DNA can show the same through the male line.
  • Whole-genome analysis completes the picture. The rate at which mutations accumulate in introns (non-coding stretches within genes) can also be used to time the migration pattern.
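For readers who like to see the arithmetic, the dating idea in the last principle can be sketched in a few lines of Python. This is only a toy version of the ‘molecular clock’, not the method used in the papers cited below. It assumes that mutations pile up at a steady rate on both lineages after they separate, so the time since separation is roughly the per-site difference divided by twice the mutation rate. The function name and every number here are my own illustrative assumptions, not data from any study.

```python
# Toy molecular-clock estimate. The inputs below are hypothetical and
# chosen only to make the arithmetic easy to follow.

def divergence_time_years(differences: int,
                          sites_compared: int,
                          mutation_rate_per_site_per_year: float) -> float:
    """Estimate the years since two DNA lineages shared an ancestor.

    Mutations accumulate independently on both lineages, so the
    per-site difference d is about 2 * mu * T, which gives
    T = d / (2 * mu).
    """
    if sites_compared <= 0:
        raise ValueError("sites_compared must be positive")
    if mutation_rate_per_site_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    per_site_difference = differences / sites_compared
    return per_site_difference / (2 * mutation_rate_per_site_per_year)


# Hypothetical example: 60 differences found across one million
# compared sites, with an assumed rate of 5e-10 mutations per site
# per year.
estimate = divergence_time_years(
    differences=60,
    sites_compared=1_000_000,
    mutation_rate_per_site_per_year=5e-10,
)
print(f"Estimated separation: about {estimate:,.0f} years ago")
```

Real studies are far more careful than this: they calibrate the rate, work with many stretches of DNA at once, and report ranges rather than single numbers. But the core logic is the same simple division.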

And what do we see?

I will be extremely brief. Anyone can go to the references for further reading.

Modern humans originated in Africa, 200,000 to 300,000 years ago. All humans outside Africa are descended from a group that migrated out of Africa around 60,000 years ago (from the region of Ethiopia and Eritrea to present-day Yemen, just to show how precise we can be). Some, by a sort of coastal migration across hundreds of generations, reached India. From around 40,000 years ago, India has been a thickly populated place.

The Andaman tribes are largely unmixed descendants of these original inhabitants.

All other populations of India are completely mixed. But there are patterns to this mixing.

The original inhabitants left their mark on our DNA. Some 60 to 70 percent of it comes from them, though the share is highest in the south and lowest in the North-West.

There is a peculiar gender mismatch here. Going by the maternal line, 70 to 90 percent is contributed by these original out-of-Africa migrants, while the paternal line shows only 20 to 40 percent. This suggests that, with each wave of migrants, the mixing was mostly between migrant males and resident females.

It is extremely probable that the Indus Valley people spoke Proto-Dravidian. Genetically, they seem to have been predominantly original migrants, with some mixing from the Iranian region. (The data point to Zagros farmers from Iran, though a single skeleton from the Indus site of Rakhigarhi showed only local and Iranian hunter-gatherer ancestry. This skeleton does not show any Indo-European mixture.)

Dravidian is possibly related to Elamite, an ancient extinct language of Iran.

As the Indus Valley culture slowly declined between 2000 and 1500 BC, mostly due to drought, these people migrated in a south-easterly direction and mixed with the rest of India.

If the Elamite connection is true, Proto-Dravidian itself came in from the Iranian side, and that means Dravidian languages replaced all the others spoken till then! Just think about that.

Now comes the climax. Between 2000 and 1500 BC, Indo-European speakers arrive, in waves from the North-West. They came originally from the Russian Steppes. A thousand years earlier, their kin had overrun Europe and spread their languages there (the ancestors of Latin and Greek). They were among the first people to domesticate the horse.

All Indians show all these mixtures. But the Indo-European mixture shows a definite pattern:

  • A gradient from North to South: maximum in the North and decreasing towards the South. (The Indian Cline)
  • Predominantly from Indo-European males to resident females.
  • Higher castes show a greater degree of Indo-European mixing. Lower castes show less. This is evident in South India too.
  • At around 200 AD, the mixing almost completely stopped. The castes were frozen. We are a large number of small populations. The genetic difference between neighbours in the same village could be more than that between a Northern and a Southern European of today.

Now, these extremely compelling findings can disturb some people, even though they need not. We are not responsible for our pre-history! It just was. I beg your pardon if I have disturbed you. But the truth can do us good.

Our diversity is real.

But unity is also real. There are no ‘pure’ original populations among us.

Our tolerance is essentially a product of intense mixing of different cultures.

We forget the struggles of the remote past. Recent battles are clear history. But all of them are over. Yes, over.

Some of the bad aspects of our past lie buried, but they show up in our genetic signatures. We can progress only if we know what we are. How can an amnesic human plan the future? (Jimmy Mathew)

Some References (There are many more):

  1. Which of Us Are Aryans?, Romila Thapar, Michael Witzel and three more.
  2. Who We Are and How We Got Here, David Reich.
  3. Early Indians, Tony Joseph.

Papers are numerous. Selected:

  1. Reconstructing Indian population history, Nature, 2009, David Reich, Kumarasamy Thangaraj, et al.
  2. The formation of human populations in South and Central Asia, Science, 2019, 120 authors including David Reich.
  3. An ancient Harappan genome lacks ancestry from Steppe pastoralists or Iranian farmers, Cell, 2019, Shinde, et al. (This is the paper that was erroneously reported in recent media as ‘debunking the Aryan migration theory’. It actually supports the theory, as the Indus Valley individual showed no evidence of Steppe ancestry, only Iranian hunter-gatherer and ancient south Indian ancestry!)

Dr Jimmy

I am a doctor, writer and science communicator. I am a member of Info-Clinic, and have written a few books. This site features my blog posts and stories. Thank you for visiting. I am a doctor who likes to write. Thank you for your interest.