Lauren Writes


My Data Science Adventure

Three Visualization Techniques

It’s true, a picture tells a thousand words!


NLP: Random Forest & Neural Network Classifiers

After cleaning and exploring the dataset for my NLP project, I wanted to model the data with both a Random Forest Classifier and a Neural Network Classifier. Preparing the data for these models took a couple of different methods. After a lot of googling, I thought it would be helpful to describe these methods in one cohesive blog post!


From Actuary to Data Scientist

Here I sit, about to complete my 5-month immersive course in Data Science with the Flatiron School this week! Before I finish the course, I want to talk about why I decided to enter this field and change my profession at age 36.


ROC Curve / Multiclass Predictions / Random Forest Classifier

While working through my first modeling project as a Data Scientist, I found that a ROC Curve was an excellent way to compare my models! However, I ran into a bit of a glitch: for the first time, I had to create a ROC Curve from a dataset with multiclass predictions instead of binary predictions. I also had to learn how to create a ROC Curve for a Random Forest Classifier. Since it took me an entire afternoon of googling to figure these things out, I thought I would blog about them to hopefully help someone in the future, that being you!
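To give a flavor of the multiclass idea, here is a minimal sketch of a one-vs-rest ROC Curve: each class takes a turn as the "positive" class while all the others are lumped together as "negative". The labels and probabilities below are made-up toy values, and the helper names `roc_points` and `auc` are my own for illustration. With scikit-learn, the probabilities would typically come from a fitted Random Forest's `predict_proba`; the sketch uses plain Python so it runs on its own.

```python
# Toy one-vs-rest ROC computation. All labels and probabilities are
# made-up illustrative values, not results from a real model.

def roc_points(y_binary, scores):
    """Return (fpr, tpr) lists for one binary problem, sweeping the threshold from high to low."""
    pairs = sorted(zip(scores, y_binary), key=lambda p: -p[0])
    n_pos = sum(y_binary)
    n_neg = len(y_binary) - n_pos
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all samples tied at this score before recording a point.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


classes = ["a", "b", "c"]
y_true = ["a", "b", "c", "a", "b", "c"]
# One row per sample, one column per class (hypothetical probabilities).
y_proba = [
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],
    [0.4, 0.35, 0.25],
    [0.2, 0.2, 0.6],
]

roc_by_class = {}
for j, cls in enumerate(classes):
    # One-vs-rest: this class is positive (1), every other class is negative (0).
    y_binary = [1 if y == cls else 0 for y in y_true]
    scores = [row[j] for row in y_proba]
    fpr, tpr = roc_points(y_binary, scores)
    roc_by_class[cls] = (fpr, tpr, auc(fpr, tpr))

for cls, (_, _, area) in roc_by_class.items():
    print(f"class {cls}: AUC = {area:.3f}")
```

Each entry in `roc_by_class` holds the points for one curve, so plotting one line per class on the same axes gives a side-by-side comparison.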


ARIMA Modeling and Train/Test Split

When looking at a time series and considering fitting an ARIMA model to your data, it’s important, as always, to develop train/test splits of your data. However, the process is a bit different for time series. Rather than using a random sample, as you might when fitting a regression model, you’ll want to split the data chronologically based on its datetime, so the model trains on earlier observations and is tested on later ones.
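As a minimal sketch of a datetime-based split, the snippet below cuts a series at a chosen date instead of sampling rows at random. The daily series, its values, and the cutoff date are all hypothetical, picked only for illustration. It uses plain Python; with pandas, the same idea would usually be expressed by slicing on a datetime index.

```python
from datetime import date, timedelta

# Hypothetical daily series: 10 consecutive days of made-up values.
start = date(2020, 1, 1)
series = [(start + timedelta(days=i), float(i)) for i in range(10)]

# Make sure observations are in chronological order before splitting.
series.sort(key=lambda row: row[0])

# Everything before the cutoff is training data; the cutoff and later is test data.
cutoff = date(2020, 1, 8)  # assumed cutoff, chosen for this example
train = [row for row in series if row[0] < cutoff]
test = [row for row in series if row[0] >= cutoff]

print(f"train: {train[0][0]} to {train[-1][0]} ({len(train)} rows)")
print(f"test:  {test[0][0]} to {test[-1][0]} ({len(test)} rows)")
```

Because every test date comes after every training date, the model is never evaluated on observations that precede what it was fit on.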