I had the chance to speak at StirTrek 2018 about Machine Learning in R. I have been to StirTrek before, but it’s been a few years. The conference has really grown, and there are now over 2000 attendees.
I was in the 3:30 timeslot. I spoke in a full theater, and the talk was broadcast to two other theaters. I don’t know what attendance was like in the overflow rooms. Most of the follow-up questions were from developers looking for resources to get started, tutorials, etc. That seemed like a sign that attendees were interested in going further, which was the point of the talk.
The organizers did a great job. I had a helpful proctor who kept me notified about time and made sure I was set up and informed.
The talk will go up later this month on YouTube, and I’ll add it to the blog. Thanks to all who attended, and a big thanks to all who helped organize, sponsor, and volunteer for the conference.
I’ve worked with various alternate file handlers in Python before and wanted to explore the options in R. I was pleasantly surprised to find prebuilt handlers for tasks like compressing data. In addition, a pipe function lets you run less common commands on your file, like gpg for encryption.
I put together a quick video demo of how to use these functions, and it’s available on YouTube:
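As a rough illustration of the idea (not the code from the video), here is a minimal sketch using base R connections. gzfile() is one of the built-in compression handlers, alongside bzfile() and xzfile(). The temporary file name and the use of the mtcars data set are assumptions made for the example, and the gpg lines are left commented out because they depend on gpg being installed.

```r
# Write a small data frame through a gzip-compressing connection.
path <- tempfile(fileext = ".csv.gz")  # hypothetical temporary file
con <- gzfile(path, open = "wt")
write.csv(head(mtcars), con)
close(con)

# Read it back. read.csv opens the connection and closes it when done.
restored <- read.csv(gzfile(path), row.names = 1)
print(restored)

# pipe() hands the stream to an external command instead.
# Untested sketch, assuming gpg is installed and on the PATH:
# con <- pipe("gpg --output data.csv.gpg --symmetric", open = "w")
# write.csv(head(mtcars), con)
# close(con)
```

The calling code is the same whichever handler is used; only the function that creates the connection changes.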
The mean of means (of state e) is close to .36. If you take .3 * .36 + .4 * (1-.36), you get .364, so this seems to make sense. Note that each probability of moving to e is weighted by the share of time spent in the state it starts from: .3 by the .36 spent in e, and .4 by the remaining .64.
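The same check can be written out in a few lines of R. This is only a sketch under an assumption inferred from the arithmetic above: that .3 is the chance of being in e next given the chain is in e now, and .4 is the chance of moving to e from anywhere else. The variable names are made up for the example.

```r
# Assumed two-way split: "e" versus every other state lumped together.
p_e_given_e     <- 0.3  # assumed: probability of staying in e
p_e_given_other <- 0.4  # assumed: probability of entering e from elsewhere

# One step of the chain, starting from the simulated estimate of 0.36.
one_step <- p_e_given_e * 0.36 + p_e_given_other * (1 - 0.36)

# Long-run share of time in e: solve
#   pi_e = p_e_given_e * pi_e + p_e_given_other * (1 - pi_e)
pi_e <- p_e_given_other / (1 - p_e_given_e + p_e_given_other)

print(one_step)
print(pi_e)
```

Under that assumption the long-run share works out to 4/11, which is why an estimate near .36 barely moves when pushed through one more step.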
Note that I’m using the tm package, which is the traditional way to work with a document collection in R. Newer approaches like tidytext are gaining popularity. I may do a follow-up talk on that.
The Monty Hall Problem is famous in the world of statistics and probability. For those struggling with the intuition, simulating the problem is a great way to get at the answer. Randomly choose a door for the prize, randomly choose a door for the user to pick first, play out Monty’s role as host, and then show the results of both strategies.
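A minimal sketch of such a simulation in base R follows. It is an illustration of the steps just described rather than the original code from the post; the function name, the seed, and the number of games are arbitrary choices for the example.

```r
set.seed(42)  # arbitrary seed so a run can be repeated

simulate_game <- function() {
  doors <- 1:3
  prize <- sample(doors, 1)  # door hiding the prize
  pick  <- sample(doors, 1)  # contestant's first choice

  # Monty opens a door that is neither the pick nor the prize.
  # Guard against sample() treating a single number n as 1:n.
  openable <- setdiff(doors, c(pick, prize))
  opened <- if (length(openable) == 1) openable else sample(openable, 1)

  # Switching means taking the one door that is still closed.
  swapped <- setdiff(doors, c(pick, opened))

  c(stay = pick == prize, swap = swapped == prize)
}

n_games <- 10000
results <- replicate(n_games, simulate_game())  # 2 x n_games logical matrix
win_rates <- rowMeans(results)                  # share of wins per strategy
print(win_rates)
```

Because exactly one of the two strategies wins each game, the two win rates always sum to 1.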
The numeric output will vary, but will look something like: