Using Hadoop/Python/HBase for processing 2.6 million crash reports per day | A Mozilla story…


Every computer-literate person has, at some point or other, faced an application crash followed by a message something like this:

“xxxx application crashed. Would you like to send the details to the XYZ company to help make the product better…” 

I will be honest: I have seldom hit the ‘OK’ button, but I have at times wondered how these crash reports get processed.

Now, here is a great insight into how Mozilla processes the browser crash reports it receives. The project is called Socorro, and it uses Hadoop, Python, HBase and more.

You can read a wonderful article here. It is a must-read for all techies: it includes a short demo video and a PoC showing how Mozilla could integrate Hazelcast into Socorro to cache and process 2TB of crash reports with a 50-node Hazelcast cluster.

Caching and Processing 2TB Mozilla Crash Reports in memory with Hazelcast

Mozilla receives ~2.5K crash reports per minute during peak traffic and stores 2.6 million crash reports per day!
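To put those two numbers side by side, here is a quick back-of-the-envelope sketch in Python. It only uses the figures quoted above and assumes the stored count is a fair proxy for the daily volume, which the post does not confirm. The constant and function names are my own, purely for illustration.

```python
# Figures quoted in the post (approximate).
PEAK_PER_MINUTE = 2_500        # ~2.5K crash reports per minute at peak
STORED_PER_DAY = 2_600_000     # 2.6 million crash reports stored per day

MINUTES_PER_DAY = 24 * 60


def average_per_minute(reports_per_day: int) -> float:
    """Spread a daily total evenly over every minute of the day."""
    return reports_per_day / MINUTES_PER_DAY


# Assumption: the stored daily count approximates the daily volume.
avg = average_per_minute(STORED_PER_DAY)
peak_to_avg = PEAK_PER_MINUTE / avg

print(f"Average rate: about {avg:.0f} reports per minute")
print(f"Peak is about {peak_to_avg:.2f}x the average")
```

Under that assumption, the daily total works out to roughly 1,800 reports per minute on average, so the quoted peak is a bit under one and a half times the average rate.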

PS: I wonder how many crash reports Microsoft gets for its applications :-))