The Math of Machine Learning

Matrix multiplication diagram

One of the challenges of data science in general is that it is a multi-disciplinary field. For any given problem, you may need skills in data extraction, data transformation, data cleaning, math, statistics, software engineering, data visualization, and the problem domain itself. And that list likely isn’t exhaustive.

One of the first questions when it comes to machine learning specifically is “how much math do I need to know?”

This is where I would recommend you start, to get the most value for your time:

  • Matrix Multiplication (Subject: Linear Algebra)
  • Probability (Subject: Statistics)
  • Normal Distributions (Subject: Statistics)
  • Bayes’ Theorem (Subject: Statistics)
  • Linear Regression (Subject: Statistics)

Of course you will run across other math needs, but I think the above list represents the foundation.
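If it helps to see a few of these topics in code, here is a minimal sketch in plain Python covering matrix multiplication, Bayes’ Theorem, and simple linear regression. The function names (matmul, bayes_posterior, fit_line) and all of the numbers are my own illustrative assumptions, not from any particular library, and in practice you would likely reach for a numerical package instead.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows (illustrative helper)."""
    if len(a[0]) != len(b):
        raise ValueError("inner dimensions must match")
    # Entry (i, j) is the dot product of row i of a and column j of b.
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def bayes_posterior(prior, likelihood, false_positive_rate):
    """Bayes' Theorem: P(H | E) from P(H), P(E | H), and P(E | not H)."""
    # Total probability of seeing the evidence at all.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Made-up inputs, purely for illustration.
    print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    print(bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05))
    print(fit_line([0, 1, 2], [1, 3, 5]))
```

The Bayes example uses a hypothetical test for a condition that 1% of people have: even with a 90% detection rate, a positive result leaves the probability of having the condition at only about 15%, because false positives from the other 99% dominate.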

If you need places to get started with those topics, check out Khan Academy, Coursera, or your local library.

For more on machine learning, check out other posts such as ML in R, Linear Algebra in R, and ML w/XGBoost.

Flatland

I finished reading Flatland. It’s a parable-style story about dimensions and geometry. As exciting as that sounds, it’s really good. I recommend it. You don’t have to be a math nerd to read it.

I read the annotated version, but pretty much read the story straight through, only turning to the notes when something struck a chord.

Check it out at Amazon…