
“The single biggest threat to the credibility of a presentation is cherry-picked data.” – Edward Tufte
Short attention spans are a bane of today’s mediatised world. The eagerness to repost sensational news on social media, combined with a reluctance to spend time understanding viral posts, has often led people to fall for false narratives. Visual data, in particular, attracts a lot of eyes because it is easier to comprehend. In this piece, I write about how blindly following such posts can deceive us if we are not cautious.
On September 7, there was a tweet (a post on ‘X’) sharing an India map (see below) showing survey-based data on inter-caste marriage rates in Indian states with a comment, “Interestingly, Tamil Nadu performs worse in intercaste marriages when compared to states like UP and Bihar. While the national average is 9.9%, the figure for TN is 2.6%. UP is 8.6%, and Bihar is 4.7%”. At the time of writing this piece, the post had 104k+ views, 916 likes and 284 RTs.

Again, on September 8, another account tweeted the same India map. This post (when the piece was being written) got 44k+ views, 350 likes and 155 RTs, and the account holder commented, “Bihar has been ruled by Lohiaite socialists for the last 50 years, and yet they have not only kept Bihar utterly backward, but they have shamelessly led Bihar to an economically failed state with an exodus of distress migration. They have also brazenly perpetuated the caste system and brahmanical patriarchy. 😡” I later realised that this account had shared this map in another post earlier with the comment, “Dravidian politics is a failed project,” by quoting the inter-caste marriage rates of states of Maharashtra, UP and TN, and had garnered 31k+ views, 528 likes and 207 RTs.
Looking at the map above, the data visualisation might make one believe that TN is one of the worst-performing states in India, way below the national average, when it comes to inter-caste marriages. Now, let us understand why not only such tweets and visual data can be reductionist, but even the research they rely on needs to be read with caution.
Unpacking the map and the research paper behind it
The issue with the map is its silence on the research methodology used in the paper: the sample size, the sample selection criteria, and the definition of ‘inter-caste marriage’. To explore these aspects, one needs to dive into the paper and the research methodology it employs. If one looks closer, the creator of the map has mentioned the source of the data. It is a freely accessible research paper published in 2011 on the website of the Population Association of America (PAA), based in the United States. This study, presented in a Poster Session at the 2011 Annual Meeting of the PAA in Washington DC, is authored by Kumudini (not Kumudin) Das, Kailash Chandra Das, Tarun Kumar Roy, and PK Tripathy. It has been cited in the past by Scroll (which, surprisingly, misidentified the authors as ‘four scholars at Princeton University’) in 2014 and by Feminism in India (FII) in 2021.
The abstract of the paper clearly states that “The study uses the data of third round of National Family Health Survey (2005-06), i.e., NFHS-3.” There is a ‘Data and Methods’ section providing details of the research methodology. The sample in this paper would be considered representative of India, as the NFHS-3 was a nationally representative sample survey conducted in all 29 Indian states. Further, information about the caste of both the husband and the wife was collected during the survey.
The paper analyses caste information of 32,160 Hindu couples. It does not include couples belonging to other religious groups, nor does it include people from the Scheduled Tribes (ST) communities, on the ground that an ST identity reflects a community rather than a caste. The information collected on caste in the paper is grouped into three categories: Scheduled Caste (SC), Other Backward Classes (OBC) and Others (which includes all the so-called ‘higher castes’). The paper, interestingly, recognises the ascending order of ‘class’, rather than ‘caste’, hierarchy in India as SC, OBC and then ‘Others’ at the top of the hierarchy.
The paper argues that the logic that inter-caste marriage should be more common in the southern region, as it is socio-economically more developed than other parts of India, fails. To support this argument, the paper shows that the inter-caste marriage rate is only 9.71 percent in the southern part of India, while it is highest in the western region (17 percent). The paper further concludes that while inter-caste marriage rates in Goa, Meghalaya, Punjab, Kerala, and Karnataka are very high, i.e., 26.67, 25, 22.36, 21.35, and 16.47 percent respectively, the rates in Bihar (4.60 percent), Madhya Pradesh (3.57 percent), and Tamil Nadu (2.59 percent) are very low. These numbers have surprised many.
Analysis and the ‘catch’
So, the question arises: why is that? The answer lies in the method this paper adopts, which sorts people into the three caste groups, rather than identifying their individual castes, to determine whether a marriage is inter-caste or not. The paper defines inter-caste marriage as a woman belonging to a higher caste marrying a man belonging to a lower caste, or vice versa, and supplements this definition by saying that if a woman marries a man from a caste other than her own, then it would be an inter-caste marriage. There is a catch here that raises a few questions. If a Brahmin caste woman marries a Rajput caste man or a Vaishya caste man, i.e., all from higher castes, will it qualify as an inter-caste marriage for the purpose of this paper? And what about a Yadav caste woman marrying a Kurmi or a Koeri caste man, all from the OBC in Bihar, or say a woman from one SC caste marrying a man from another SC caste? Similarly, will a Valaiyar caste woman marrying a Gowda caste man, both from OBC in Tamil Nadu, be considered inter-caste? The method of the paper suggests that none of these marriages would be called inter-caste, even though in all these scenarios a woman has married a man outside her own caste. What this means is that the data shows the rates of ‘inter-caste group’ marriages rather than the rates of ‘inter-caste’ marriages. This is also evident from Table 2 of the paper, which reports two rates as zero percent: that of SC women marrying men of a caste lower than their own, and that of women from ‘Others’, i.e., higher castes, marrying men of a caste higher than their own. Such zeros are only possible if caste is measured at the level of the three groups, where nothing lies below SC or above ‘Others’.
Although the paper offers valuable insights into the extent, pattern, spatial distribution, and determinants of inter-caste and inter-religious marriages in India, the title ‘inter-caste marriages in India’ thus appears to be misleading. So, coming back to the case of Tamil Nadu, where OBCs make up 68 percent and SCs 20 percent of the population, the inter-caste marriage rate would go up if marriages between OBCs from different castes were counted. Since caste produces a system of graded social inequality, marriages where both parties are from the SC or the OBC or ‘Others’, but from different castes, should ideally be considered inter-caste as well. Had this been the method employed in the paper, the rates in all states would have gone up, thereby giving more clarity and a better comparative understanding of the rates of inter-caste marriages in India.
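To see how much the choice of definition can matter, here is a minimal sketch in Python. The five couples are hypothetical, made up from the examples in the paragraph above purely for illustration; they are not NFHS-3 data, and the function names are my own rather than anything from the paper. The sketch counts the same marriages twice: once by comparing the three broad groups (my reading of the paper's method), and once by comparing the individual castes.

```python
from typing import NamedTuple


class Spouse(NamedTuple):
    caste: str  # individual caste, e.g. "Yadav"
    group: str  # broad category: "SC", "OBC" or "Others"


# Hypothetical (wife, husband) pairs for illustration only, NOT survey data.
couples = [
    (Spouse("Brahmin", "Others"), Spouse("Rajput", "Others")),
    (Spouse("Yadav", "OBC"), Spouse("Kurmi", "OBC")),
    (Spouse("Valaiyar", "OBC"), Spouse("Gowda", "OBC")),
    (Spouse("Yadav", "OBC"), Spouse("Yadav", "OBC")),
    (Spouse("Brahmin", "Others"), Spouse("Yadav", "OBC")),
]


def rate_by_group(pairs):
    """Percentage of marriages that cross the three broad groups."""
    crossing = sum(wife.group != husband.group for wife, husband in pairs)
    return 100 * crossing / len(pairs)


def rate_by_caste(pairs):
    """Percentage of marriages between two different individual castes."""
    crossing = sum(wife.caste != husband.caste for wife, husband in pairs)
    return 100 * crossing / len(pairs)


group_rate = rate_by_group(couples)
caste_rate = rate_by_caste(couples)
print(f"Counted by group: {group_rate:.1f}%")
print(f"Counted by caste: {caste_rate:.1f}%")
```

With this made-up list, only one of the five marriages crosses a group boundary, while four of the five are between different castes. The real gap in any state would depend on the actual survey records, which this sketch does not attempt to reproduce.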
Shailesh Kumar is a lecturer in law at the Department of Law and Criminology, Royal Holloway, University of London. He tweets at @shailesh262.
Well explained! Indeed, transparency in research methodology will not only make data more ethical but also increase its utility.
Where can I find the real numbers? I desperately need them
Fantastic analysis and good catch on the scope and data of the study.
The intercaste marriage research paper has been doing the rounds for the last 15 years. It is worse than an opinion poll, with a sample size of a few hundred for most states and a few thousand for a state. The questions selected, too, are chosen in an irrational way, and the high-figure states have very few samples.
There will be a huge difference between village and cities and even within villages between a village with a single dominant community and others in lesser numbers in comparison with a village with many castes in equal numbers and educational, job opportunities remaining constant among them coupled with female literacy.
The papers quoted are presented in various conferences by slightly altering certain parameters.
http://epc2010.princeton.edu/papers/100157
I discussed the glaring errors in these studies 10 years back in this post:
Yayathi Says:
February 2, 2014 at 21:54
1. Three of the authors of the paper are identified to be with the International Institute of Population Sciences (IIPS), jointly sponsored by GOI, Tata Trust and UN. IIPS is supposedly the nodal agency for coordinating the NHS. IIPS is a “Deemed University” and has full-fledged teaching and research programs. They provide studies and reports as part of the consulting and research services provided to the Health ministry.
2. I do see the point of 70% OBC in one major group, made by Mr. Poovannan and Mr. Venkatesan. But please see the spatial distribution in TN:
a. Chennai is only one major cosmopolitan area, and the Coimbatore-Tirupur industrial belt comes distant next.
b. The major castes in the OBC (Vanniyar, Thevar, Kongu Vellalar, Nadar) are all distributed and they do not overlap each other for the most part. Vanniyars: Ariyalur to Thiruvallur; Thevar: Thanjavur to Tirunelveli; Kongu Vellalar: Coimbatore to Salem; Nadar: Kanyakumari, Toothukudi.
c. Places like Madurai have some mixed population but OBCs like Kongu vellalars and Vanniyars are not in large numbers there.
d. On the other hand, Dalits are spread all over the state – even on that, there are some regional variations depending on their sub-caste, which may also lessen the chance of inter-marriage among the sub-castes.
e. In the absence of clear-cut data substantiating Mr. Poovannan’s claim, I find it difficult to expect a high number of inter-caste marriages among OBCs, just on the basis of the 70% claim.
https://othisaivu.in/2014/01/26/post-326/?fbclid=IwY2xjawKtdWZleHRuA2FlbQIxMABicmlkETF6ME5QMGdVWVNxU1YwMUx2AR5YlAQM4H5TziLLKoTtFV_uo1IXNNxm9LGLS9IE1so2F0iOQckXLS5v6EX8CQ_aem_mRV0keu3LK6SLR2kevBNjg
Thanks!