Customer Complaint Analyzer

Overview: 2-3 minute read

Consumer feedback plays an important role in improving the services a company provides. In this project, I worked on a public dataset of feedback from US-based consumers about various companies and their products. Using the data, I looked for interesting patterns and visualised what went wrong over the years and what has improved since.

The dataset contained a lot of missing data as well as a lot of random, unusable records, which had to be removed because they caused many errors. Rows with missing values were not removed; the missing values were replaced by NA or left blank so that no data was lost, since only a few columns were missing and the rest could still be used. Approximately 1/6 of the data was missing. After cleaning the data, I reformatted it so that it could be read accurately by Spark.
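The cleaning rule described above can be sketched in plain Scala. This is only an illustration, not the project's actual code: the column count, the comma delimiter, the sample rows, and the names `CleanSketch` and `cleanRow` are all assumptions, and the naive split does not handle quoted fields. On Spark, the same function could be applied to each line of an RDD.

```scala
object CleanSketch {
  // Hypothetical column count; the real schema is not given in this README.
  val ExpectedColumns = 4

  /** Returns None for rows with the wrong number of fields (treated as unusable),
    * otherwise the row with empty fields replaced by "NA". */
  def cleanRow(line: String): Option[Seq[String]] = {
    // A limit of -1 keeps trailing empty fields instead of dropping them.
    val fields = line.split(",", -1).map(_.trim)
    if (fields.length != ExpectedColumns) None
    else Some(fields.map(f => if (f.isEmpty) "NA" else f).toSeq)
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample rows, for illustration only.
    val raw = Seq(
      "2015-03-01,Mortgage,Example Bank,CA",
      "2015-03-02,,Example Bank,",
      "not a valid record"
    )
    // Unusable rows are dropped; rows with gaps are kept with NA placeholders.
    val cleaned = raw.flatMap(line => cleanRow(line))
    cleaned.foreach(row => println(row.mkString(",")))
  }
}
```

Keeping rows with gaps and marking the gaps explicitly means later steps can still count or group by the columns that are present.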

I used Scala / Spark to do all the heavy lifting of the project and then used R for the visualisations. All the cleaning was performed on Spark. To get the data for visualisation, I applied RDD transformations to extract the required data and stored the results in separate text files, each approximately 2 KB in size. The data was then read into R and visualised as bar graphs.
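As a rough sketch of the kind of aggregation that feeds a bar graph, the following plain-Scala example counts rows per value of one column and writes a small tab-separated text file. It uses ordinary collections rather than Spark so that it runs on its own; the names `CountSketch` and `countBy`, the sample rows, and the output file name are hypothetical. On an RDD, the map-to-pairs and sum-per-key steps would typically be expressed with `map` and `reduceByKey`.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object CountSketch {
  /** Counts rows per distinct value of the given column, largest count first. */
  def countBy(rows: Seq[Seq[String]], column: Int): Seq[(String, Int)] =
    rows
      .map(row => (row(column), 1))          // one (key, 1) pair per row
      .groupBy { case (key, _) => key }      // group the pairs by key
      .map { case (key, pairs) => (key, pairs.map(_._2).sum) }
      .toSeq
      .sortBy { case (key, n) => (-n, key) } // descending count, then key

  def main(args: Array[String]): Unit = {
    // Made-up cleaned rows: (year, product). Illustration only.
    val rows = Seq(
      Seq("2014", "Mortgage"),
      Seq("2014", "Credit card"),
      Seq("2015", "Mortgage")
    )
    val counts = countBy(rows, 1)
    // One "key<TAB>count" line per bar in the eventual graph.
    val text = counts.map { case (key, n) => s"$key\t$n" }.mkString("\n")
    Files.write(Paths.get("product_counts.txt"), text.getBytes(StandardCharsets.UTF_8))
  }
}
```

A small two-column file like this can be loaded in R with a reader such as `read.delim` and passed to a bar plot.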


More details can be found in the Report.