Hi friends, this is part 2 of the NLP analysis based on Amazon reviews. Here we will describe the machine learning stage that we will apply to our data, specifically the Random Forest model.

In part 1 we obtained the data, cleaned it and built a bag of words after tokenizing the reviews and applying a vectorizer.
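
As a quick reminder of what a bag of words looks like, here is a minimal sketch in plain Python. It is not the code from part 1: the two sample reviews, the `tokenize` and `vectorize` helpers and the simple letters-only tokenizer are all assumptions made for illustration, and a library vectorizer would do the same job with more options.

```python
from collections import Counter
import re

# Hypothetical sample reviews, not taken from the real dataset.
reviews = [
    "Great product, works great",
    "Bad product, stopped working",
]


def tokenize(text):
    # Lowercase the text and keep runs of letters only (assumed tokenizer).
    return re.findall(r"[a-z]+", text.lower())


tokenized = [tokenize(review) for review in reviews]

# Vocabulary: every distinct token, sorted so the column order is stable.
vocabulary = sorted({token for tokens in tokenized for token in tokens})


def vectorize(tokens):
    # One count per vocabulary word, in vocabulary order.
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


# One row per review, one column per vocabulary word.
bag_of_words = [vectorize(tokens) for tokens in tokenized]

print(vocabulary)
for row in bag_of_words:
    print(row)
```

Each row is the numeric representation of one review, and this kind of matrix is what the model will receive as input.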

Let’s start by explaining what a Random Forest is. To begin with, I will say that it is an ensemble of decision trees. So what is a decision tree?

