Happy (belated) Thanksgiving! This year, my family drove down to Houston for the holiday, and I hosted Thanksgiving for the first time. We played lots of games and ate well - my fridge is still stocked full of leftovers. Knowing we’d be busy with hosting, I planned ahead and scheduled a lighter post - this week, I thought I’d highlight some technical books that I’ve referenced for modeling work, that have been recommended to me, or that I’ve heard about and would like to read:
- The Elements of Statistical Learning is referred to as the Bible of Machine Learning by Josh Starmer and provides a robust and deeply technical foundation for a wide array of machine learning models. It’s considered a must-have among both machine learning theorists, who look for new model structures, and practitioners (like myself!).
- An Introduction to Statistical Learning with Applications in R is a companion to The Elements of Statistical Learning, written as a broader and less technical treatment of the same key topics. Each chapter also includes lab lessons walking through the implementation of the statistical learning methods from that chapter (Emil Hvitfeldt is also working on a companion site for completing the labs with tidymodels).
- Tidy Modeling with R is a guide to using the tidymodels framework and has been an excellent reference in both personal and professional projects.
- Text Mining with R: A Tidy Approach serves as an introduction to text mining and other methods for dealing with unstructured, non-rectangular data. In my current role as a Consumer Experience Analyst, I have to interact with unstructured data (in the form of patient comments) daily - this book, along with the tidytext package, has been incredibly useful for analyzing and visualizing text data.
- Supervised Machine Learning for Text Analysis in R picks up where Text Mining with R left off by exploring (as the title suggests) supervised machine learning methods with text data. While I haven’t done extensive text modeling, this is one area that I’d like to explore further in 2022.
- Feature Engineering and Selection: A Practical Approach for Predictive Models is a guidebook offering methods for feature engineering (transforming and creating new predictor variables to improve predictive model performance). While I’ve used basic feature engineering in some of my work, I’m interested in adding more robust tools to my feature-engineering toolkit!
- Introduction to Empirical Bayes is David Robinson’s book compiling a series of blog posts on Bayesian estimation, credible intervals, A/B testing, mixture models, and a host of other methods, all through the example of baseball batting averages.
- Build a Career in Data Science is, as the name suggests, a book about building a career in data science. I generally feel that most career-help books are either too broad to be useful or offer little that’s new to people already working in the industry they cover. Given, however, that I don’t have an academic or professional background in the field and that I’d like to eventually move from analytics to data science, I’d like to add this to the collection to pick up on some best practices.