Category: Data

  • Machine Learning & Gradient Boosting w/xgboost

    I presented at the Cleveland R User Group on using xgboost in R. Slides are available here. Code (Jupyter notebooks) is here. Feedback welcome. Enjoy!

  • Installing TensorFlow on CentOS

    Google released TensorFlow as open source for community use and improvement. From the site: “TensorFlow™ is an open source software library for numerical computation using data flow graphs.” The instructions on tensorflow.org are aimed at Ubuntu and OS X. I needed to install it on CentOS, so I documented the steps in a […]

  • Presentation on Linear Algebra in R

    At our January meeting, I presented on Linear Algebra basics in R. I have been taking Andrew Ng’s Stanford Machine Learning course. That course primarily uses Matlab (or Octave, an open source equivalent), and machine learning involves manipulating and calculating with matrices. Naturally, being an R person, I have been working with some of […]

  • Data-Driven vs the Dashboard 

    It is common for technical product companies to call themselves “data-driven” these days. The idea is that metrics are used to drive decisions. Sounds easy enough, and compatible with a technology landscape that is enamored with data science, etc. But something didn’t always feel right to me. Strange, right? If you follow this blog or […]

  • Analyzing Spread Football Picks With R

    I’ve been making an effort to learn R for about a year. I have experimented with it on and off over the years, but this is the first serious effort I’ve made. Whenever I am learning something, rather than just focusing on book examples, I try to come up with an example that is relevant […]