Data.gov, Open Government Platform, and Cancer data sets

After attending a lecture at the University of San Francisco by Jonathan Reichental (@Reichental) on the use of open data in the public sector, I started poking around some data sets available at Data.gov.

Data.gov is pretty impressive. The site was established in 2009 by Vivek Kundra, the first person with the title “Federal CIO” of the United States, appointed by Barack Obama.  It is rapidly adding data sets; sixty-four thousand data sets have been added just in the last year.

Interestingly, there is an open-source version of data.gov itself, called the Open Government Platform. It is built on Drupal and available on GitHub. The initiative is spearheaded by the US and Indian governments to help promote transparency and citizen engagement by making data widely and easily available. Awesome.

The Indian version is: data.gov.in. There is also a Canadian version, a Ghanaian version, and many other countries are following suit.

I started mucking around and produced a plot of the Age-adjusted Urinary Bladder cancer occurrence, by state.

  • The data was easy to find. I downloaded it without declaring who I am or why I’m downloading the data, and I didn’t have to wait for any approval.
  • The data was well-formatted and trivially easy to digest using Python pandas.
  • IPython notebook and data source available below.
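For a sense of what "trivially easy to digest" looks like, here is a minimal sketch of the load, rank, and plot steps. It is not the original notebook: the column names (`state`, `age_adjusted_rate`) and the inline sample rows are hypothetical placeholders, and the real Data.gov file will have its own headers and values.

```python
import io

import pandas as pd

# Hypothetical stand-in for the downloaded file. The column names and the
# numbers are placeholders for illustration, not real cancer rates.
SAMPLE_CSV = """state,age_adjusted_rate
State A,21.5
State B,18.2
State C,24.9
"""


def load_rates(source):
    """Read a CSV of per-state rates into a Series indexed by state."""
    frame = pd.read_csv(source)
    return frame.set_index("state")["age_adjusted_rate"]


def rank_states(rates):
    """Sort states from highest to lowest rate."""
    return rates.sort_values(ascending=False)


def plot_rates(rates):
    """Draw a horizontal bar chart of the ranked rates (needs matplotlib)."""
    ax = rank_states(rates).plot(kind="barh")
    ax.set_xlabel("Age-adjusted rate (placeholder values)")
    return ax


rates = load_rates(io.StringIO(SAMPLE_CSV))
ranked = rank_states(rates)
print(ranked)
```

With a real download, `load_rates` would be given the path to the CSV file instead of the in-memory sample, and `plot_rates(rates)` would produce the by-state chart.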




If you’re interested in this data, you should also check out http://statecancerprofiles.cancer.gov/ , which I didn’t know existed until I started writing this post. I was able to retrieve this map from there:



The map on China’s new passport

Interesting. China’s new passport design includes a country map that controversially includes disputed territory like Taiwan and parts of India.



The controversial map on China’s new passport (via Washington Post)

India’s response is clever, in my opinion — it’s impractical for India to refuse to stamp the passports like some other countries, so instead they’ve changed the visa stamp itself to include a modified version of the controversial map.

The MGNREGA program

The National Rural Employment Guarantee Act (NREGA) is one of the largest programs of its kind in the world. It is a flagship program of the Indian government, and a major talking point for the UPA (center-left) political parties. It makes frequent appearances in rallying cries during political campaigns, and accounts for a significant portion of the Indian annual budget. NREGA’s annual spend is about INR48,000 crore (roughly US$9 billion), amounting to more than 11% of the 2011 Union budget expenditure.

Annual spend on NREGA in US Dollars.
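For readers unfamiliar with the crore, the conversion behind those figures can be sketched in a few lines. The only inputs are the two numbers quoted above; the exchange rate is not looked up anywhere, it is simply what the post's rupee and dollar figures imply when taken together.

```python
CRORE = 10_000_000  # one crore is ten million (10**7)

annual_spend_crore = 48_000  # rupee figure quoted in the post, in crore
annual_spend_usd = 9e9       # the post's rough dollar equivalent

# Convert crore to plain rupees, then back out the implied exchange rate.
annual_spend_inr = annual_spend_crore * CRORE
implied_rate = annual_spend_inr / annual_spend_usd  # rupees per dollar

print(f"INR {annual_spend_inr:,}")
print(f"Implied exchange rate: {implied_rate:.1f} INR per USD")
```

In other words, INR48,000 crore is 480 billion rupees, and the dollar figure corresponds to a rate of a little over 53 rupees to the dollar.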

So what is it?

The Official MGNREGA logo

The National Rural Employment Guarantee Act (NREGA) is a long name, but a descriptive one: it is a large-scale program by the government of India to provide guaranteed paid labor to households in rural parts of India. More than 17 million families have been provided employment through this program.

The program, enacted in 2005, offers guaranteed labor for 100 days in a year, and if suitable work cannot be found for registered workers then an unemployment allowance is paid instead.

There are some pretty intriguing arguments in favor of this program, which, from what I can tell, was the brain-child of economists including one Jean Drèze.

The most obvious argument, and a natural motivator to put something like this together, is that there’s plenty of work to be done in India, and plenty of people that need work.

The Federal government pays the wages and much of the other costs (materials, administration) of putting people to work. The State and local governments are responsible for finding the projects and managing the employment. I thought this was pretty clever: although wages come from the Federal budget, the unemployment allowance, if necessary, must come out of the State’s budget. Therefore the state has a strong incentive to find suitable work for all registrants.

These and other checks and balances are meant to make sure that NREGA functions as intended, which is important in a piece of legislation of this size and scope.

Corruption and unintended consequences

While traveling in India, I got into several discussions on this topic. Everyone seemed to have an opinion about NREGA. Some admire an ambitious plan to make changes, others disagree with the principle of guaranteed labor. But everyone I spoke to agreed that in execution, the plan has countless problems.

A big feature of these problems, of course, is corruption. Corruption in India is a scourge that takes countless smart ideas and good intentions, and renders them useless. It holds back progress in the country in order to provide immediate perquisites to people who are sometimes rotten souls, but often just ordinary people who eventually succumbed to a seemingly inescapable system.

In any multi-layered, hierarchical program in India, money leaks out at each administrative step. NREGA is riddled with these types of problems. People need to pay bribes to register for programs, bureaucrats make fake registrations and pocket the funds, the selection of projects is subject to the whims of corrupt officials, and so on.

Okay, yes, corruption ruins everything, and everybody I spoke to about NREGA rightfully mentioned corruption. But I find it perhaps more interesting to look at the unintended consequences of the program, failure points at a more fundamental level. I mean, guaranteed labor? Doesn’t that just scream unintended consequences to you?

For example, the productivity of workers that are provided guaranteed labor is very low — no surprise there. And the desirability of a government-paid and low-stress employment trumps that of any competing options, and so laborers previously employed at other projects (like private construction and agricultural labor) have an incentive to quit their jobs in favor of this more attractive option. Consequently, the government has to actually suspend the program during peak farming periods to counteract resulting labor shortages.

I heard one person argue that, as little as it sounds (INR120 or about $2 a day), getting paid at about minimum wage for 100 days each year is pretty decent for many households. So decent, in fact, that some may decide that they don’t need to work (or work very hard) for the remaining 265 days. That’s one reason, she said, that finding people for household work is getting pretty difficult.

Any program as large as this will have unintended consequences like these. How significant are they, relative to the benefits of the project?  To what extent should they be treated like casualties of war, to be minimized, but largely unavoidable, and to what extent do they make a convincing argument against the program as a whole? 

Changing the name, by the way, to the Mahatma Gandhi National Rural Employment Guarantee Act, doesn’t seem to have helped alleviate any of these issues, but it is fantastic political rhetoric.

The MGNREGA program stipulates that a third of the workforce must be women. In 2011, about half of the 45 million people employed by the program were women. This was a whole other topic of conversation, which I’ll post about later.


Quora Answer: The Must-Hear A.R. Rahman Song

This feels like double-dipping, but it took so long to write this Quora answer it must be acceptable to post it on my blog as well.

“Which is the best AR Rahman song that one must hear for sure?”
My answer on Quora (along with other answers, of course) is at http://qr.ae/7Qvh9, and I’ve reposted it here:

Rahman’s music for the 1995 movie Bombay is one of the most commercially successful albums in his home country, India, and the best-selling film soundtrack of all time. It has been recognized on various must-listen lists, including the UK Guardian’s Top albums to Hear Before You Die.

But since you asked for a song, not an album, I’ll nominate one of the songs from this album: कहना. ही. क्या. / kehna hi kya.

Kehna hi kya is one of its most memorable and popular songs, and found its own independent success on the radio. It got additional recognition from the UK Guardian, beyond the other tracks on the album, on their Top Songs to Hear (misspelled as “kehma hi kya”). Anecdotally speaking, it was ubiquitous in India in the 90s, and is still heard frequently today.

Kehna hi kya wins my vote also because it includes a vocal solo by A.R. Rahman himself, so one gets a taste of him both as a composer and a singer.

The movie Bombay is the story of the love between a Hindu man and a Muslim woman. The story leads up to a turbulent period in the early 1990s, escalating to the inter-religious Bombay Riots, which followed the destruction of the Babri Masjid (mosque) in Ayodhya and in which hundreds of people were killed.

The song carefully uses language (Urdu-esque) and invokes images (like the veil in the excerpt below) that the audience associates with Muslim culture, but without any actual religious content. The images themselves are beautiful and passionate.


Sharm thodi thodi humko aaye to nazarein jhuk jaayen
Sitam thoda thoda humpe shok hawa bhi kar jaaye
Aisi chali, aanchal ude, dil mein ek toofaan uthe
Hum to lut gaye khade hi khade

Translated (my own translation; I’m no language scholar but I disliked others I found):

A little shyness came and caused my eyes/gaze to fall downward
(But) even the wind tortures/teases me,
Blowing in such a way as to throw off my veil,
— and (likewise) a storm blows/rises in my heart,
Just standing there my heart was stolen.

The lyrics were written by Mehboob Kotwal.

Tagore on Translations

The following is a taste of Rabindranath Tagore’s thoughts on language and translations, taken from a speech he gave when traveling in China in 1924:

“Languages are jealous. They do not give up their best treasures to those who try to deal with them through an intermediary belonging to an alien rival. You have to court them in person and dance attendance on them. Poems are not like gold or other substantial things that are transferable. You cannot receive the smiles and glances of your sweetheart through an attorney, however diligent and dutiful he may be.”

Tagore describes his experiences studying translated works of European authors, and his efforts at learning German. He says that he was cursed in that he understood meaning too quickly: once he was able to understand enough to infer the author’s intent, he was able to skip over the nuances and details of the language. In this way he read Heine “like a man walking in sleep crossing unknown paths with ease.”

His limited understanding of German was insufficient for other works like Faust.

“I believe I found my entrance to the palace, not like one who has keys for all the doors, but as a casual visitor who is tolerated in some general guest room, comfortable but not intimate.”

In the end, Tagore says, Goethe and others remained unknown to him.

“This is as it should be. Man cannot reach the shrine, if he does not make the pilgrimage.”

Read prose and appreciate it in its original language, insists Tagore. Read my poetry in the original Bengali and only then can you judge it truly.

“So you must not hope to find anything true from my own language in translation… I am gratified to hear from you that you are convinced that I am a poet because I have beautiful grey beard. But my vanity will remain unsatisfied until you know me from my voice that is in my poems.

“I hope that this may make you want to learn Bengali some day.”


Rabindranath Tagore, Talks in China. Rabindra Rachanavali Series, Rupa 2002.

Spotted on the Mangala Express

Spotted while traveling on the Mangala Express, from Ernakulam in Kerala to Panvel near Mumbai.

Mangala Express

Mangala Lakshadweep Express

From the window

A continual backdrop of green, of coconut trees, of bill- and graffiti- covered walls, small colorful buildings with peeling paint, and lines of drying laundry. Signs and messages in indecipherable Malayalam.

A woman in a sari drinking a soda. A four-foot-tall pile of discarded drinking water bottles. Endless forests of coconut trees. Kids playing badminton. A group of women being lectured by a large fat woman in a black top. A sign for RSS, the first I’d spotted. Rivers and rivers, passing periodically. Two peeling towers looming over a train station. Kids playing cricket. Long endless fences separating residential neighborhoods from the railroad tracks, too far between stations to have any bills or graffiti. Regularly spaced rows of planted coconut trees. A large sign for CPI(M): a red hammer and sickle alongside a square-shouldered man in white and two hands shaking.

Clothes hung up to dry, everywhere and on everything, including a railroad crossing liftgate. Crowds of cars waiting for the train to pass, spilling across lanes to take up the entire road, awaiting a furious and frustrating traffic jam with the cars coming from the other direction. Four kids in the middle of a long endless dirt field, waving happily at the train. Huge piles of coconuts, piles of gravel, and piles of lumber logs, trash, uprooted shrubs, large rocks, and stacked 10-gallon drums of mystery. A lady washing clothes by a river, pausing briefly to watch the train pass.

At station stops

Vendors running with goods to load onto the train; groups of men stretching their legs. A man selling halva in small clear-plastic wrapped blocks. Small kiosks selling water, soda, dried banana chips, and packaged snacks. Several men standing together with perfectly matching moustaches. A group of women in burkas, chattering.

Inside the train

Salesmen walking the aisles, hawking their goods: vada, daal vada, chai, coffee, samosa, pakora, bread pakora, water, soda, an omelette lunch, egg biryani dinner, pav-wara, bhel masala, cheap necklaces and trinkets. A man with a tall stack of books: children’s coloring books, political books, religious books, mostly in Malayalam; he leaves a pile in each area, and comes back a long while later after they’ve been perused, hoping for a sale. A man selling large cloth sheets, for use as bedsheets or perhaps even dhotis.

A young woman struggling awkwardly with the climb down from the top berth, facing the wrong way but finding footholds with her husband’s help. A couple in a berth: the woman’s head on his lap, his head nodding rhythmically in sleep, their fingers intertwined. A lady with the air of academia, reading a book in Malayalam with stern gold spectacles low on her nose. A smelly young boy mildly escorted out of sleeper class to coach chair.

Observed entire conversations held in hand gestures and head motions, and then, eventually, joined in.

Discovering the ‘Argumentative Indian’ in me

From “India: Government & Politics in a Developing Nation” by Hardgrave, Jr. and Kochanek:

In the aftermath of decolonization following World War II, theorists and statesmen saw the problems of poverty, economic stagnation, accelerated socioeconomic change, ethnic upheaval, and the need to create and sustain political order and legitimacy as a unique set of challenges that confronted the new states of Asia and Africa on their way to modernization and development. By the early 1970s, however, the advanced industrial societies of Europe, North America, and Japan were themselves convulsed by similar challenges as rapid technological change, global energy crises, raw material shortages, and a deteriorating environment found governments straining to satisfy rising expectations in a world of diminishing resources. It became increasingly evident that the problems of change and institutional adaptation were not the product of some isolated process of transformation from traditional to modern, agrarian to industrial, or developing to developed, but a continuous process of social, political, economic, and psychological adjustment to persistent pressures and challenges generated by alterations in the internal and external environments. There was no final social or political order that somehow would be reached by a magical process of “development” or “modernization,” but a constant set of challenges that would continue to test human ingenuity in adapting to changing political, social, economic, and institutional imperatives.

With a trip to India imminent, my interest in Indian foreign policy, current affairs, and politics is revived. This book, which contains the above as one of its opening passages, has a direct style which I find informative. Yet there’s something about it that I find slightly disturbing.

To take one example, the description of Hinduism as having “a quality of resignation, of passiveness and fatalism…  [a] religious belief that has manifested itself in the political attitude of the many Indians who simply accept the government they have as the one they deserve” may be interesting, but it makes me feel defensive. Both theologians and historians could take issue. All the Hindus that I personally know fall far outside this generalization. But it is just a generalization, and the authors know that, and other readers must realize it, too. I’m also aware that my sample of Hindus is not representative of the population, and so on and so forth. So okay, I need to relax and let my hackles down.

Or should I? “Despite the creation of Pakistan, partition did not solve India’s communal problem. India still has one of the largest Muslim populations in the world.” Wait, is that saying that the population of Muslims in India is a problem? That’s as abrasive to me as if someone discussing US history were to claim the black population was a problem. Communal conflict due to diversity (any type of diversity) is a result of attitudes and prejudices, not the presence of the minority population. Was this an accidental misphrasing? Perhaps, but not one you would ever catch Amartya Sen making.

In contrast to Sen’s portrayal of India, this book (at least as I read each of them) is far less confident in the country’s future, using words that betray a wary skepticism that its challenges can and will be met. “India’s masses are an awakening force that has yet to find coherence and direction”, it notes, and “the image of spiritual, Gandhian India pales before continuous agitation, intermittent rioting, and a rising level of violence.” Especially considering my copy is an old edition (1993, ancient history in the fast-moving sub-continent), there is nothing in these words that I could fairly contest. But…

… But even Edward Luce (“In Spite of the Gods”) alternated his doubts and pessimism about India with a sense of awe; he seemed ultimately to feel that the worst of it all was temporary turbulence in the rise of a great nation. Perhaps Hardgrave and Kochanek will do the same later in the book, but they certainly haven’t yet. [I’ll check back in when I’m further in.]

I should perhaps be reading the latest edition (2007, I think), but I find it provocative (in multiple ways) to read this one. I like reading a book and stopping frequently to argue with it in my head. It keeps me on my toes. Sure, a lot has changed since 1993, but history is as interesting as current affairs, and it is true that many of the “old” problems in India remain problems today. Perhaps I’ll get even more out of this older edition than the 2007, and I can always read the 2011 when it comes out.