Machine Desi Hip Hop: A Fun Experiment with RNN

There have been a lot of exciting developments in the use of Recurrent Neural Networks lately. Andrej Karpathy’s post on the Unreasonable Effectiveness of RNNs was followed by a lot of cool experiments, including TEDRnn, Find Your Dream Job, RNN Bible and DrumpfRNN. All of them use LSTM, a special kind of RNN that can connect previous information to the present task even when the gap between the relevant information and the point of prediction is large. You can find more about LSTMs in these amazing blogs written by Christopher Olah and Nikhil Buduma.

I was aware that LSTM is pretty popular for generating text that sounds grammatically accurate, albeit not very meaningful. This made me wonder: would it be any good for transliteration? I had to scratch the itch. Having been heavily influenced by Bollywood, a monkey in my head wanted to try testing it with Desi Hip-Hop party songs, probably because it thinks most of the lyrics are meaningless anyway. Will non-native speakers figure out right away that it is gibberish? Maybe not.

Machine Desi Hip Hop Poster

Transliteration is the conversion of text from one script to another. For instance, the English transliteration of the Hindi term ‘नमस्ते’ is ‘NAMASTE’, while its translation is ‘HELLO’.

I managed to scrape just over 100 song lyrics, transliterated into English, by various Bollywood hip-hop artists. The data consists of only about 157,000 characters (certainly not a huge dataset). In case you are not very familiar with this genre, most of the songs, although written in Hindi, are heavily influenced by Punjabi, with many English terms creeping in every now and then.

The model generates lyrics by predicting one character after another, and this is where LSTM’s long-term context memory comes into play. The LSTM model was built using Keras with Theano as the backend. I rented an Amazon AWS GPU-based g2x instance (with a GRID K520) and ran about 120 iterations over the data, which took approximately 6 hours. I also tried running the algorithm on my local machine with 4 cores and compared the time taken per epoch: it was approximately 10 times slower than the GPU.
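To make the character-by-character idea concrete, here is a minimal sketch of the two pieces around the network: slicing the corpus into (window, next character) training pairs, and extending a seed one sampled character at a time. This is not the code used for the experiment. The function names, the window of 20, the step of 3 and the temperature parameter are all assumptions for illustration, and the trained LSTM is replaced by a toy `predict` callable that returns uniform probabilities.

```python
import math
import random

def make_windows(text, window=20, step=3):
    """Slice text into (input window, next character) training pairs."""
    inputs, targets = [], []
    for start in range(0, len(text) - window, step):
        inputs.append(text[start:start + window])
        targets.append(text[start + window])
    return inputs, targets

def sample_char(probs, chars, temperature=1.0, rng=random):
    """Draw one character from predicted probabilities, reshaped by temperature."""
    # Lower temperature sharpens the distribution; higher temperature flattens it.
    weights = [math.exp(math.log(max(p, 1e-12)) / temperature) for p in probs]
    return rng.choices(chars, weights=weights, k=1)[0]

def generate(predict, chars, seed, length=200, temperature=1.0, rng=random):
    """Extend the seed one character at a time using a predict(window) callable."""
    window = len(seed)
    out = seed
    for _ in range(length):
        probs = predict(out[-window:])  # one probability per entry in chars
        out += sample_char(probs, chars, temperature, rng)
    return out

# Toy corpus and a uniform stand-in for the trained model (both hypothetical).
corpus = "disco vich ghaa pe gaya\n" * 4
chars = sorted(set(corpus))
inputs, targets = make_windows(corpus, window=20, step=3)
uniform = lambda window_text: [1.0 / len(chars)] * len(chars)
sample = generate(uniform, chars, seed=corpus[:20], length=30, rng=random.Random(0))
```

With a real model, `predict` would wrap the network's forward pass on the encoded window; everything else stays the same.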


I set the seed text (required to start text generation) to something like “Ish your boy Ierr”, hoping that it would pick up some rapper’s pattern and generate something interesting. The model started learning, and for each iteration it generated lyrics, character by character. I ignored the first few iterations since their output was undecipherable. I waited and checked the outputs after 30 iterations, then 60 and then 100, only to find garbage text that sounded like no language at all. I began to wonder whether LSTM is really not that good for transliteration or whether it was me doing something funny.

Sure enough, it was the latter. I realized that the seed given to the model contained only English words, while the already sparse training data consisted mainly of Hindi/Punjabi words with very few English terms. I decided to give it one more try with a randomly chosen seed (of 20 characters) from the training data. As expected, the first few iterations had results like:


Seed: “r rani, teri jawani”
r rani, teri jawani
e r  tuit
uioner b6 hltuh
a¸aned ieyeuaoutuvd t. dua.s m6 pa irvendnaij
n epneaalwa. btt raamqe m kdhinaa

d  ynea  ahhd a teaa
iapa ch
ao nhaoeyant  o i  imo
ih  h lihl mioy
¢   i
t  ?t
urkho?,s  ny a a .h s u bra_y
n ga c l ecmo
aa np  ghdoeuaana oimarhaq, aat
egooanoyhnaaa rutan.hto  srd
ie e a
unojo  efmadkevenh a
wpoadao mwruuh
animy soht godedeiaasu2ouaaaakzm  tubehmee oagk d
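Picking the seed from the training data, as described above, can be sketched as follows. This is an illustrative sketch, not the original code: `pick_seed` and `unseen_chars` are hypothetical helper names, and the toy `training_text` merely stands in for the scraped lyrics. The character check is only a crude proxy for the real problem, which was a seed made of English words the model had rarely seen.

```python
import random

def pick_seed(text, length=20, rng=random):
    """Return a random contiguous slice of `length` characters from text."""
    if len(text) < length:
        raise ValueError("training text is shorter than the seed length")
    start = rng.randrange(len(text) - length + 1)
    return text[start:start + length]

def unseen_chars(seed, text):
    """Characters in the seed that never occur in the training text."""
    return sorted(set(seed) - set(text))

# Toy stand-in for the scraped lyrics (hypothetical data).
training_text = "r rani, teri jawani\nmujhko toh na saanu jaane\n" * 3
seed = pick_seed(training_text, length=20, rng=random.Random(42))
missing = unseen_chars("Ish your boy Ierr", training_text)
```

A seed drawn this way is guaranteed to consist of character sequences the model has already seen, which a hand-written English seed is not.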

I checked the results after the 10th iteration; still nothing looked distinctly like Hindi. It was at the 19th iteration that I could see a ray of hope. The text did sound very much like a Hindi/Punjabi sentence and, as expected, it made absolutely no sense.

Iteration 19:

Seed: “ata been mujhko toh”

ata been
mujhko toh na saanu jaane lakh te hai
te hain bab saate hai
aas konge gara pe gava paana hai

aaja je baban meri hoos
mera baal laghi hai
pehi paar te peeth pata loogl na chaladi
ka sunda mere saan da main te saani naar gaana main taa karang

mere maan rati, mera naal te ni labni
mere laal na waalh kao
tere yaar main taan pakh li lai laaye
ni chaalo te lo khar de sapte ho saaye
na lo tu main to


As the iterations progressed, the output sounded more and more like what was intended. As noticed earlier, though, for iterations whose seeds contained mostly English terms, the words in the generated lyrics were difficult to read. The outputs of all the iterations can be found here. Having seen the results, and considering that this was run on a scanty dataset, it is reasonable to say that LSTM could very well be utilized for transliteration problems.


The generated lyrics were then taken one step further. A few rhyming couplets (in aabb and abab form) were randomly chosen and manually ordered. I now had machine-generated Desi Hip-Hop song lyrics ready. To do justice to the lyrics, I collaborated with some talented people and came up with a Desi Hip-Hop music video. Don’t understand the lyrics? Don’t panic, nobody does…!
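The couplets were chosen and ordered by hand, but a rough first pass over the generated lines could be automated. The sketch below is only an assumption about how one might do that: it treats the last three letters of a line's final word as a crude rhyme key, which is a guess rather than a real phonetic rule, and the helper names are hypothetical.

```python
from collections import defaultdict

def rhyme_key(line, tail=3):
    """Crude rhyme key: the last `tail` letters of the line's final word."""
    words = [w.strip(".,!?") for w in line.lower().split()]
    words = [w for w in words if w]
    return words[-1][-tail:] if words else ""

def rhyming_groups(lines, tail=3):
    """Group lines that share a rhyme key; keep groups with at least two lines."""
    groups = defaultdict(list)
    for line in lines:
        key = rhyme_key(line, tail)
        if key:
            groups[key].append(line)
    return {k: v for k, v in groups.items() if len(v) >= 2}

# A few lines taken from the generated lyrics.
generated = [
    "naal ni bhaam bara dhoon",
    "hai aan hooj shaub de khoon",
    "hi sanna tunne saar laar hai",
    "kubi keri annihon phori na choon",
]
groups = rhyming_groups(generated)
```

Candidate groups like these still need a human ear to turn them into aabb or abab verses.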

Machine Desi Hip Hop Lyrics:

main hoon meri naat hai
mere saa mujhe di lagaar
hai manna se magaa jaaya
jam san meri gala maar

disco vich ghaa pe gaya
disco vich ghaa pe gaya
disco vich ghaa pe gaya
disco vich ghaa pe gaya

bad te mere gee main sebaati
hai apna dil tohaaroor beti
par jaane weh laalu kich jar
oh seakh i’ta saapa raho kar
mera baa mera naal mera jaan nahi naar
nachre challe nikh tere nari tere jaar
sboni main tune mainu na baan hai
meri gal meri nahi rani da jaan hai

naal ni bhaam bara dhoon
hai aan hooj shaub de khoon
hi sanna tunne saar laar hai
mundeya de saad bab yehar hai

choda co paari hai, palle khila
apre kara hoon kishi jide gila
khwaban vich khona ni, kinni raata soya ni
tere ghar ke hoya ni

kubi keri annihon phori na choon
niscon te boon la di kon ku joon
rako kar de vod mera naah, nahi,
dal ee lyee kichune saan hai nahi

khir tere choori nachre phoom lega saye
chaki samriyaale kehon lond, nakhaye
mujhke kudiya makhle ni kare ishare
nakhre dikhaave ji main ke main laare

disco vich ghaa pe gaya
disco vich ghaa pe gaya
disco vich ghaa pe gaya
disco vich ghaa pe gaya

The model over-fit on the line ‘disco vich ghaa pe gaya’ from the training dataset, a line from the existing song ‘Take your sandals off’, which is why it kept popping up repeatedly in the generated output.
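One simple way to spot this kind of memorization is to count generated lines that also appear verbatim in the training text. The sketch below assumes both are available as plain strings; `memorized_lines` is a hypothetical helper, and the second training line is a placeholder rather than real data.

```python
from collections import Counter

def memorized_lines(generated_text, training_text):
    """Count generated lines that also appear verbatim in the training text."""
    training_lines = {
        line.strip() for line in training_text.splitlines() if line.strip()
    }
    counts = Counter(
        line.strip() for line in generated_text.splitlines() if line.strip()
    )
    return {line: n for line, n in counts.items() if line in training_lines}

# Toy inputs: one real training line plus a placeholder line.
training_text = "disco vich ghaa pe gaya\nsome other training line\n"
generated_text = (
    "main hoon meri naat hai\n"
    "disco vich ghaa pe gaya\n"
    "disco vich ghaa pe gaya\n"
)
hits = memorized_lines(generated_text, training_text)
```

Lines with a high count are good candidates for text the model copied rather than composed.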

7 thoughts on “Machine Desi Hip Hop: A Fun Experiment with RNN”

  1. This is awesome dude. Try out on full bollywood data and/or all language songs. Or composing music using LSTM and create the new AI Musician….

  2. Very cool man! Bravo for sticking to the idea and executing it!

  3. If training vanilla neural nets is optimization over functions, training recurrent nets is optimization over programs. The takeaway is that even if your data is not in form of sequences, you can still formulate and train powerful models that learn to process it sequentially. You’re learning stateful programs that process your fixed-sized data.
