Graham Wheeler's Random Forest

Stuff about stuff

Managing Engineering and Data Science Agile Teams

It is very common in modern software engineering organizations to use agile approaches to managing teamwork. At both Microsoft and eBay, teams I have managed have used Scrum, which is a reasonably simple and effective approach that offers a number of benefits, such as timeboxing, regular deployments (not necessarily continuous but at least periodic), a buffer between the team and unplanned work, an iterative continuous improvement process through retrospectives, and metrics that can quickly show whether the team is on track or not.

Basic Machine Learning with SciKit-Learn

This is the fourth post in a series based off the [Python for Data Science bootcamp](https://github.com/gramster/pythonbootcamp) I run at eBay occasionally. The other posts are: a Python crash course, using Jupyter, and exploratory data analysis. In this post we will look into the basics of building ML models with Scikit-Learn. Scikit-Learn is the most widely used Python library for ML, especially outside of deep learning (where there are several contenders and I recommend using Keras, which is a package that provides a simple API on top of several underlying contenders like TensorFlow and PyTorch).

Exploratory Data Analysis with NumPy and Pandas

This is the third post in a series based off the Python for Data Science bootcamp I run at eBay occasionally. The other posts are: a Python crash course, using Jupyter, and introductory machine learning. This is an introduction to the NumPy and Pandas libraries that form the foundation of data science in Python. These libraries, especially Pandas, have a large API surface and many powerful features. There is no way to cover every topic in a short amount of time; in many cases we will just scratch the surface.

Using Jupyter

This is the second post in a series based off the Python for Data Science bootcamp I run at eBay occasionally. The other posts are: a Python crash course, exploratory data analysis, and introductory machine learning. Jupyter is an interactive computing environment that allows users to create heterogeneous documents called notebooks that can mix executable code, markdown text with MathJax, multimedia, static and interactive charts, and more. A notebook is typically a complete and self-contained record of a computation, and can be converted to various formats and shared with others.