Using Hadoop/Python/HBase for processing 2.6 million crash reports per day | A Mozilla story…

Every computer-literate person has, at some point or other, faced an application crash followed by a message similar to this:

“xxxx application crashed. Would you like to send the details to the XYZ company to help make the product better…” 

I will be honest: I have seldom hit the ‘OK’ button, but I have at times wondered how these crash reports get processed.

Now, here is a great insight into how Mozilla processes the browser crash reports it receives. The project is called Socorro and uses Hadoop, Python, HBase, etc.

You can read a wonderful article here. It is a must-read for all techies and has a short demo video and a PoC showing how Mozilla could integrate Hazelcast into Socorro to cache and process 2TB of crash reports with a 50-node Hazelcast cluster.

Mozilla receives ~2.5K crash reports per minute during peak traffic and stores 2.6 million crash reports per day!
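
To put those two numbers side by side, here is a quick back-of-the-envelope sketch in Python. It assumes the 2.6 million reports stored per day are roughly everything received in that day, which the figures quoted here do not confirm, and the variable names are my own.

```python
# Figures quoted in the post.
PEAK_REPORTS_PER_MINUTE = 2_500
REPORTS_STORED_PER_DAY = 2_600_000

MINUTES_PER_DAY = 24 * 60

# Assumption: stored reports per day is about the same as received per day.
avg_per_minute = REPORTS_STORED_PER_DAY / MINUTES_PER_DAY
avg_per_second = avg_per_minute / 60

# How much busier the peak minute is compared with an average minute.
peak_to_avg = PEAK_REPORTS_PER_MINUTE / avg_per_minute

print(f"average per minute: {avg_per_minute:.0f}")
print(f"average per second: {avg_per_second:.1f}")
print(f"peak / average:     {peak_to_avg:.2f}")
```

Under that assumption the daily total works out to about 1,800 reports per minute on average, or about 30 every second, so the peak rate is only around 1.4 times the average.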

PS: I am wondering how many crash reports Microsoft gets for its applications :-))