Tryst with T-SNE

I received an article on what Google is up to with AI. It discussed how to visualise high-dimensional data in two-dimensional space using a technique called T-SNE (t-distributed stochastic neighbour embedding). Strictly speaking, T-SNE is a dimensionality-reduction technique rather than a clustering one, although groups of similar points tend to show up as visual clusters in the resulting plot. It was an interesting read until I asked how I would go about using it on my own data and use case. I found one site that gave a great step-by-step explanation of why you should be wary of this technique. Great: I hadn't even begun to understand the technique, and now I had to understand the caveats as well. But I ploughed on. I wanted to take a sample dataset of hotel feedback and see how this technique works on unstructured text. I have tried sentiment analysis and topic modelling before, and I wanted to check whether T-SNE would give me a different perspective. I used the RTM, RGL and Rtsne packages in R to clean up and transform the data and to run the technique.
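The clean-up and transformation step can be pictured roughly as follows. This is a minimal Python sketch, not the R code used for the analysis: the sample sentences, the stopword list and the function names `tokenize` and `term_document_matrix` are all made up for illustration. It turns free text into a term-by-document count matrix, so that each word becomes a numeric vector that a technique such as T-SNE could then embed.

```python
import re
from collections import Counter

# Hypothetical stand-ins for hotel feedback; NOT taken from the author's dataset.
SAMPLE_FEEDBACK = [
    "The hotel spa was fantastic",
    "Chef served a great dish from the kitchen",
    "Fantastic environment at the hotel",
]

# A tiny illustrative stopword list (an assumption); real pipelines use a fuller one.
STOPWORDS = {"the", "a", "was", "at", "from", "and", "of"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, drop stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term vocab[i] in docs[j]."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


vocab, tdm = term_document_matrix(SAMPLE_FEEDBACK)
for term, row in zip(vocab, tdm):
    print(f"{term:12s} {row}")
```

Each row is one word's vector, with one entry per piece of feedback. With a real dataset those vectors have as many dimensions as there are reviews, which is the kind of input a dimensionality-reduction step is meant to flatten.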

Result of the analysis

I tried a perplexity setting (yep, it's a parameter) of 40, which I picked as a middle-of-the-road value. I got the results on a 3D graph, a snapshot of which is attached. I could see a number of words that appear together a lot in the dataset. For example, "hotel" and "spa" appeared together. Another group of words was "kitchen", "dish" and "chef". "Environment" and "fantastic" formed a recurring theme. On a simple graph, we can scan through, zoom in and out, and twist and turn the 3D cloud to get a sense of the aspects of hospitality that are deemed important. While sentiment analysis tells you whether a tweet is happy or not, here I was able to get a sense of the underlying themes that customers consider important.
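For readers wondering what the perplexity parameter actually controls, here is a small, hedged Python sketch using only the standard library. It is not the Rtsne implementation. It illustrates the usual textbook description: for each point, t-SNE turns distances to the other points into Gaussian-weighted probabilities, and tunes the Gaussian width (`sigma`) until the perplexity of that distribution, defined as 2 raised to its entropy in bits, matches the requested value. The function names and the toy distances are assumptions chosen for illustration.

```python
import math


def conditional_probs(sq_dists, sigma):
    """Gaussian similarities of one point to the others, given squared distances."""
    d_min = min(sq_dists)  # shift by the minimum so the exponentials cannot all underflow
    weights = [math.exp(-(d - d_min) / (2.0 * sigma ** 2)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def perplexity(probs):
    """Perplexity = 2 ** (Shannon entropy in bits)."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2.0 ** entropy


def sigma_for_perplexity(sq_dists, target, lo=1e-3, hi=1e3, iters=100):
    """Bisection search on sigma; a wider Gaussian gives a higher perplexity."""
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if perplexity(conditional_probs(sq_dists, mid)) > target:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


# Toy squared distances from one point to five others (illustrative values only).
toy_sq_dists = [1.0, 4.0, 9.0, 16.0, 25.0]
sigma = sigma_for_perplexity(toy_sq_dists, target=3.0)
print(sigma, perplexity(conditional_probs(toy_sq_dists, sigma)))
```

Read this way, perplexity is a soft count of how many neighbours each point pays attention to, so a setting of 40 asks each word to be placed with roughly 40 effective neighbours in mind. Lower values emphasise very local structure, and higher values pull in more of the global picture.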

Hotel Feedback Data Clustering using T-SNE technique

Blog disclaimer:
This is a professional weblog, and we have invited experts to share their thoughts, expertise, perspectives and knowledge. The opinions expressed here represent purely their personal views and not those of any institution, employer or company.
