Graham Wheeler's Random Forest

Stuff about stuff

Creating Type Stubs for Scientific Python (Part 4)

The Story Thus Far It’s been a while since the last post, mainly because I hit a speed bump along the way, which I have since addressed. It’s worth recapping what was covered before. Scientific Python packages like matplotlib don’t have much in the way of inline type annotations, nor do they have good type stubs available, but those would be very useful to improve the experience of using them in code editors. They do have a standard form of docstrings, numpydoc format, and that includes parameter and return value descriptions that most of the time include descriptions of the types (albeit in an informal way). I decided to build a tool to extract these, try to convert them to formal type annotations, and generate stubs. The extraction part, and the ‘insert converted annotations to make stubs’ part, are reasonably straightforward, thanks in particular to Instagram’s LibCST library for concrete syntax tree visiting and transforming. I write the extracted descriptions plus my best effort at turning these into formal types into ‘.

Creating Type Stubs for Scientific Python (Part 3)

Generating Output for a Whole Package The approach we took in the last post to finding the files for a package is not strictly correct. We imported the package, then looked at the file associated with the package, and if it was an __init__.py file, added all the other .py files in the same directory. This works in many cases but not all. It specifically did work for matplotlib.axes which is the example I have used until now.
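The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the post: the function name `package_files` is hypothetical, and it only uses the standard library's `importlib` and `pathlib`. It looks at a single directory and does not descend into subpackages.

```python
import importlib
from pathlib import Path


def package_files(name: str) -> list[Path]:
    """Return candidate source files for a module or package.

    Hypothetical helper: imports the target, looks at its __file__,
    and, if that is an __init__.py, adds the other .py files that
    live in the same directory.
    """
    module = importlib.import_module(name)
    origin = getattr(module, "__file__", None)
    if origin is None:
        # Built-in or namespace packages have no single source file.
        return []
    path = Path(origin)
    if path.name == "__init__.py":
        # A regular package: take every .py file next to __init__.py.
        return sorted(path.parent.glob("*.py"))
    # A plain module: just the one file.
    return [path]


if __name__ == "__main__":
    for f in package_files("json"):
        print(f.name)
```

Running it against the standard library's `json` package lists `__init__.py` alongside sibling modules such as `decoder.py`; a single-file module yields just that one file.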

Creating Type Stubs for Scientific Python (Part 2)

Welcome back to this series on creating type stubs for scientific Python. In the last post we looked at using LibCST to generate skeleton type stubs, with a little bit of inference from value assignments. In this post we will dive into the process of using numpydoc-format docstrings. An Intro to numpydoc The easiest way to get a feel for numpydoc-format docstrings is to look at an example: def legend_elements(self, prop="colors", num="auto", fmt=None, func=lambda x: x, **kwargs): """ Create legend handles and labels for a PathCollection.

Creating Type Stubs for Scientific Python (Part 1)

This is part 1 of what will be two or three posts. In this post, I cover building a basic type stub generator; in the next post, I’ll get into handling scientific Python packages specifically. Why I Care About Python Type Stubs One of the teams I manage in my day job is the Pylance team, who build the Python language server for Visual Studio and Visual Studio Code. Pylance is built on top of the static type checker pyright, but where pyright focuses on finding errors, Pylance is focused on providing a great editor experience (as well as finding errors, but the editor experience is paramount).

GitHub reports for backlog management

A couple of years ago we decided we wanted to make sure that we were responding promptly to issues that users created on our GitHub repos. In particular, a 3-business day SLA was what we thought would be appropriate. Making sure that we did that day after day could be a bit tedious, so I thought it made sense to automate it. We had a vast trove of GitHub data in our Kusto data warehouse for every repository owned by Microsoft.

Moving my blog to Hugo

I have been using Nikola for about the past 8 years for my blog, but have been eyeing the development of Hugo and thinking I might want to migrate. There’s nothing wrong with Nikola; I think it’s actually less work than Hugo because it handles .ipynb Jupyter notebooks very seamlessly, but Hugo is super-fast so you can work in a ‘live-reload’ mode, which I like. So this weekend I finally did it.

Unit Tests that Don't Suck

Introduction This post is based on a talk I gave to my team in an effort to establish a common approach to thinking about unit tests. The existing code base we had suffered from a number of problems relating to how tests were being written; despite good intentions, it can be easy to do testing badly. In particular, here are some of the things I observed: a massive overuse of dependency injection: pretty much all dependencies of all classes were being set up using DI.

Prioritization, Estimating and Planning

This post came out of a talk I gave to a group of mentees, prompted by questions they had around how to do estimation and how to know they were working on the right priorities. These are complex questions to which there are no single answers, but I aimed to give them some tools that could help. Prioritizing “If it’s a priority you’ll find a way. If it isn’t, you’ll find an excuse.

Flow

“A bad system will beat a good person every time” - Edwards Deming This post is based on a tech talk I gave at eBay in early 2018. eBay had gone through a company-wide transformation to agile processes (where before this had been team-specific), and the main point I wanted to make was that it is important to make the hidden things that consume people’s time visible, explicit, and properly prioritized if we want to improve throughput or flow.

Personality Patterns

The last post in this series covered the Five Factor Model of personality. In this post we’ll dig into personality patterns that people can exhibit. Everyone has some combination of the five factors, but how does that combination manifest as a personality type? There are many different models of personality types, but one used in psychology and psychoanalysis is the categorization in the DSM - the Diagnostic and Statistical Manual of Mental Disorders.