Python Bytes

by Michael Kennedy and Brian Okken

Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.


Latest Episodes

#126 WebAssembly comes to Python

Sponsored by DigitalOcean:

Special guest: Cecil Philip

Brian #1: Python Used to Take Photo of Black Hole

  • Lots of people talking about this. The link I’m including is a quick write up by Mike Driscoll.
  • From now on these conversations can happen:
    • “So, what can you do with Python?”
    • “Well, it was used to help produce the world’s first image of a black hole. Your particular problem probably isn’t as complicated as that, so Python should work fine.”
  • Projects listed in the paper: “First M87 Event Horizon Telescope Results. III. Data Processing and Calibration”:

Cecil #2: Wasmer - Python Library for executing WebAssembly binaries

  • WebAssembly (Wasm) lets high-level languages target a portable format that runs on the web
  • Tons of languages compile down to Wasm, but Wasmer lets you consume Wasm from Python
  • This opens up an interesting use case: using Wasm as a way to share code between languages

Michael #3: Cooked Input

  • cooked_input is a Python package for getting, cleaning, converting, and validating command line input.
  • Name comes from input / raw_input (unvalidated) and cooked input (validated)
  • Beginners can use the provided convenience classes to get simple inputs from the user.
  • More complicated command line application (CLI) input can take advantage of cooked_input’s ability to create commands, menus and data tables.
  • All sorts of cool validators and cleaners
  • Examples
    import cooked_input as ci

    cap_cleaner = ci.CapitalizationCleaner(style=ci.ALL_WORDS_CAP_STYLE)
    ci.get_string(prompt="What is your name?", cleaners=[cap_cleaner])
    ci.get_int(prompt="How old are you?", minimum=1)

    How old are you?: abc
    "abc" cannot be converted to an integer number
    How old are you?: 0
    "0" too low (min_val=1)
    How old are you?: 67

Brian #4: JetBrains and PyCharm officially collaborating with Anaconda

  • PyCharm 2019.1.1 has some improvements for using Conda environments.
    • Fixed various bugs related to creating Conda envs and installing packages into them.
  • Special distribution of PyCharm: PyCharm for Anaconda with enhanced Anaconda support.
  • I’m using PyCharm Pro with vim emulation this week to edit a notebook-based presentation. I might run it in Jupyter, or just run it in PyCharm, but editing with all my normal keyboard shortcuts is awesome.

Cecil #5: Building a Serverless IoT Solution with Python Azure Functions and SignalR

  • Interesting blog post on using serverless, IoT, real-time messaging to create a live dashboard
  • Shows how to create a serverless function in Python to process IoT data
  • There are tons of DIY applications for using this technique at home
  • The Dashboard is a static website using D3 for charting.

Michael #6: multiprocessing.shared_memory — Provides shared memory for direct access across processes

  • New in Python 3.8
  • This module provides a class, SharedMemory, for the allocation and management of shared memory to be accessed by one or more processes on a multicore or symmetric multiprocessor (SMP) machine.
  • The ShareableList looks nice to use.
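A minimal sketch of what using ShareableList can look like, assuming Python 3.8 or newer. The variable names are made up for illustration, and the second handle is attached in the same process only to keep the example short; in real use it would be another process attaching by name.

```python
from multiprocessing import shared_memory

# Create a list stored in a new shared memory block.
original = shared_memory.ShareableList([1, 2.5, "three"])
try:
    # Attach a second handle by name, the way another process would.
    attached = shared_memory.ShareableList(name=original.shm.name)
    attached[0] = 42           # write through the second handle
    first_item = original[0]   # the change is visible through the first one
    attached.shm.close()
finally:
    # Always release the block: close this handle, then free the memory.
    original.shm.close()
    original.shm.unlink()

print(first_item)
```

The close() and unlink() calls matter: shared memory blocks outlive the objects that refer to them unless they are released explicitly.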



  • Getting ready for PyCon with STICKERS. Yeah, baby. Come see us at PyCon. I’ll also be bringing some copies of Python Testing with pytest, if anyone doesn’t already have a copy.
  • Lots of interviews going on for Test & Code, and some will happen at PyCon.




Brian: To understand recursion you must first understand recursion.

Michael: A programmer was found dead in the shower. Next to their body was a bottle of shampoo with the instructions 'Lather, Rinse and Repeat'.

Audio Download

Posted on 19 April 2019 | 8:00 am

#125 Will you conquer the deadlock empire?

Sponsored by Datadog:

Brian #1: My How and Why: pyproject.toml & the 'src' Project Structure

  • Brian Skinn
  • pyproject.toml
    • but with setuptools, instead of flit or poetry
    • with a src dir
    • and tox and black
  • all the bits and pieces to make all of this work

Michael #2: The Deadlock Empire: Slay dragons, master concurrency!

  • A game to test your thread safety and skill!
  • Deadlocks can occur when two threads try to acquire the same two (or more) locks in opposite order (RLocks please!)
  • Consider lock_a and lock_b
  • Thread one enters lock_a and will soon enter lock_b
  • Thread two enters lock_b and will soon enter lock_a
  • Imagine transferring money between two accounts, each with a lock, and each thread does this in opposite order.
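A minimal sketch of the money-transfer scenario, assuming a hypothetical Account class and transfer function (neither comes from The Deadlock Empire). Instead of each thread taking the two locks in its own order, both threads take them in one fixed order, sorted by a made-up account_id, so the circular wait cannot form.

```python
import threading


class Account:
    """Hypothetical account with its own lock (illustrative only)."""

    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.RLock()


def transfer(source, target, amount):
    # Acquire both locks in a fixed global order (lowest account_id first),
    # no matter which direction the money is moving.
    first, second = sorted((source, target), key=lambda acct: acct.account_id)
    with first.lock:
        with second.lock:
            source.balance -= amount
            target.balance += amount


account_a = Account(1, 100)
account_b = Account(2, 100)

# Half the threads move money a -> b, the other half b -> a.
threads = [threading.Thread(target=transfer, args=(account_a, account_b, 1)) for _ in range(50)]
threads += [threading.Thread(target=transfer, args=(account_b, account_a, 1)) for _ in range(50)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(account_a.balance, account_b.balance)
```

If transfer simply locked source and then target, the two groups of threads would take the locks in opposite order, which is exactly the lock_a / lock_b situation described above.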

Brian #3: Cog 3.0

  • Ned Batchelder’s cog gets an update (last one was a few years ago).
  • “Cog … finds snippets of Python in text files, executes them, and inserts the result back into the text. It’s good for adding a little bit of computational support into an otherwise static file.”
  • Development moved from Bitbucket to GitHub.
  • Travis and Appveyor CI.
  • The biggest functional change is that errors during execution now get reasonable tracebacks that don’t require you to reverse-engineer how cog ran your code.
  • mutmut mutation testing added. Cool.
  • What I want to know more about is this statement: “…now I use it for making all my presentations”. Very cool idea.

Michael #4: StackOverflow 2019 Developer Survey Results

Brian #5: Cuv’ner: “A commanding view of your test-coverage”

  • Coverage visualizations on the console.

Michael #6: Mobile apps launched

  • The tech (sadly only 50% Python)
    • Xamarin, Mono, and C# on the device-side
    • Python, Pyramid, and MongoDB on the server-side
  • 90% code sharing or higher
  • Native applications
  • Built the prototype myself on Windows
  • Hired Giorgi via TopTal
  • Dear mobile app developers: You have my sympathy!
  • Try the app. It comes with 2 free courses for anyone who logs in.
  • Android only at the moment but not for long





  • “When your hammer is C++, everything begins to look like a thumb.”
  • “Why don't jokes work in octal? Because 7 10 11”
    • Over explained: Why is 6 afraid of 7? Cuz 7 8 9.
    • Follow on: Why did 7 eat 9? He was trying to eat 3^2 meals.
  • I've been using Vim for a long time now, mainly because I can't figure out how to exit.

Audio Download

Posted on 13 April 2019 | 8:00 am

#124 This is not the None you're looking for

Sponsored by DigitalOcean:

Brian #1: pytest 4.4.0

  • Lots of amazing new features here (at least for testing nerds)
  • testpaths displayed in output, if used.
    • pytest.ini setting that allows you to specify a list of directories or tests (relative to test rootdir) to test. (can speed up test collection).
  • Lots of goodies for plugin writers.
  • Internal changes to allow subtests to work with a new plugin, pytest-subtests.
  • Just started playing with it, but I’m excited already. Planning on a full Test & Code episode after I play with it a bit more.
    # unittest example:
    import unittest

    class T(unittest.TestCase):
        def test_foo(self):
            for i in range(5):
                with self.subTest("custom message", i=i):
                    self.assertEqual(i % 2, 0)
    # pytest example:
    def test(subtests):
        for i in range(5):
            with subtests.test(msg="custom message", i=i):
                assert i % 2 == 0

Michael #2: requests-async

  • async-await support for requests
  • Just finished talking with Kenneth Reitz: native async is coming to requests, but it’s a while off
  • Nice interim solution
  • Requires modern Python (3.6+)
  • Interesting Flask, Quart, Starlette, etc. framework wrapper for testing

Brian #3: Reasons why PyPI should not be a service

  • Dustin Ingram’s article: PyPI as a Service
  • “Layoffs at JavaScript package registry raise questions about fate of community resource” - The Register article
  • Apparently PyPI gets requests for a private form of their service regularly, but there are problems with that.
  • Currently a non-profit project under the PSF. That may be hard to maintain if they have a for-profit part.
  • Donated services and infrastructure of more than $1M/year would be hard to replace.
  • There are already other package repository options. Although there is probably room for others to compete.
  • Currently run by volunteers for the most part. (<1 employee). Don’t think they would stick around to volunteer for a for-profit enterprise.
  • Conclusion: not impossible, but probably not worth it.

Michael #4: Jupyter in the cloud

  • Six easy ways to run your Jupyter Notebook in the cloud by Kevin Markham
  • six services you can use to easily run your Jupyter notebook in the cloud. All of them have the following characteristics:
    • They don't require you to install anything on your local machine.
    • They are completely free (or they have a free plan).
    • They give you access to the Jupyter Notebook environment (or a Jupyter-like environment).
    • They allow you to import and export notebooks using the standard .ipynb file format.
    • They support the Python language (and most support other languages as well).
  • Binder is a service provided by the Binder Project, which is a member of the Project Jupyter open source ecosystem. It allows you to input the URL of any public Git repository, and it will open that repository within the native Jupyter Notebook interface.
  • Kaggle is best known as a platform for data science competitions. However, they also provide a free service called Kernels that can be used independently of their competitions.
  • Google Colaboratory, usually referred to as "Google Colab," is available to anyone with a Google account. As long as you are signed into Google, you can quickly get started by creating an empty notebook, uploading an existing notebook, or importing a notebook from any public GitHub repository.
  • To get started with Azure Notebooks, you first sign in with a Microsoft or Outlook account (or create one). The next step is to create a "project", which is structured identically to a GitHub repository: it can contain one or more notebooks, Markdown files, datasets, and any other file you want to create or upload, and all of these can be organized into folders.
  • CoCalc, short for "collaborative calculation", is an online workspace for computation in Python, R, Julia, and many other languages. It allows you to create and edit Jupyter Notebooks, Sage worksheets, and LaTeX documents.
  • Datalore was created by JetBrains, the same company who makes PyCharm (a popular Python IDE). Getting started is as easy as creating an account, or logging in with a Google or JetBrains account. You can either create a new Datalore "workbook" or upload an existing Jupyter Notebook.

Brian #5: Jupyter Notebook tutorials

  • These are from Dataquest
  • Jupyter Notebook for Beginners: A Tutorial
    • Incredibly gentle, concise, useful tutorial to get started quickly.
    • Installation, creating, and running with server and browser.
    • Discussion of .ipynb files
    • Overview of interface, cells, shortcuts, markdown.
    • Kernels
    • Starting with data. Importing appropriate libraries, loading data.
    • Save and checkpoint
    • looking at data, graphing/plotting data
    • Sharing notebooks: exporting, using GitHub and gists, nbviewer.
  • Tutorial: Advanced Jupyter Notebooks
    • shell commands
    • basic magics
    • autosaving
    • matplotlib inline
    • debugging in Jupyter
    • (Brian: Gak! Maybe switch to PyCharm for debugging)
    • using timeit
    • rendering HTML, LaTeX, other languages in cells.
    • logging, extensions
    • charts with seaborn
    • macros
    • loading, importing and running external code and snippets.
    • scripted execution, even on the command line
    • parametrization with env variables
    • styling, hiding cells, working with databases

Michael #6: Unique sentinel values, identity checks, and when to use object() instead of None

  • By Trey Hunner
  • Sometimes in Python (and in programming in general), you’ll need an object which can be uniquely identified. Sometimes this unique object represents a stop value or a skip value and sometimes it’s an initial value.
  • Often this is None, but there are plenty of gotchas packed in there.
  • Nice example of re-implementing min.
  • Make sure to use is rather than == when checking for the sentinel
    initial = object()
    # ...
    if minimum is not initial:
       return minimum
    # ...
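To fill in the elided parts, here is a rough sketch of a min-like function built around that sentinel. It is not Trey Hunner’s code: the function name my_min and the default parameter are assumptions made for illustration. The point is that object() lets None stay a perfectly valid value, both as an item and as a default.

```python
initial = object()  # unique sentinel: no other object can be identical to it


def my_min(iterable, default=initial):
    """Return the smallest item, or default if the iterable is empty."""
    minimum = initial
    for item in iterable:
        # Identity check, not equality: only the sentinel itself matches.
        if minimum is initial or item < minimum:
            minimum = item
    if minimum is not initial:
        return minimum
    if default is not initial:
        return default  # the caller supplied a default, possibly None
    raise ValueError("my_min() arg is an empty iterable")


print(my_min([3, 1, 2]))
print(my_min([], default=None))
```

Had None been used as the "nothing yet" marker, my_min([], default=None) could not tell "no default given" apart from "the default is None".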





Audio Download

Posted on 5 April 2019 | 8:00 am

#123 Time to right the py-wrongs

Sponsored by Datadog:

Brian #1: Deconstructing
  • Brett Cannon
  • Breakdown of the infamous xkcd comic poking fun at the author’s Python environment on his computer.
    • The interpreters listed
    • Homebrew description
    • binaries
    • A discussion of pip, easy_install
    • The paths and the $PATH and $PYTHONPATH
  • Actually quite an educational history lesson, and the abuse some people put their computers through.
  • “So the next time someone decides to link to this comic as proof that Python has a problem, you can say that it's actually Randall's problem.”
Michael #2: Python package as a CLI option
  • Wanted to make this little app available via a CLI as a dedicated command. Really tired of python3 or ./
  • Turns out, pip and Python already solve this problem, if you structure your package correctly
  • Thanks to everyone on Twitter!
  • The trick turns out to be to have entry points in your package
    # passed as an argument to setup(...) in setup.py
    entry_points = {
      "console_scripts": ["bootstrap = bootstrap.bootstrap:main"]
    }

This should even register it with pipx install package ;)

Brian #3: pyright
  • a Microsoft static type checker for the Python language.
  • “Pyright was created to address gaps in existing Python type checkers like mypy.”
  • 5x faster than mypy
  • meant for large code bases
  • written in TypeScript and runs within node.
Michael #4: Refactoring Python Applications for Simplicity
  • If you can write and maintain clean, simple Python code, then it’ll save you lots of time in the long term. You can spend less time testing, finding bugs, and making changes when your code is well laid out and simple to follow.
  • Is your code complex?
  • Metrics for Measuring Complexity
    • Lines of Code
    • Cyclomatic complexity is the measure of how many independent code paths there are through your application.
    • Maintainability Index
  • Refactoring: The technique of changing an application (either the code or the architecture) so that it behaves the same way on the outside, but internally has improved.
  • Nice overview of tooling (PyCharm, VS Code plugins, etc)
  • Anti-patterns and ways out of them (best part of the article IMO)
Brian #5: FastAPI
  • Thanks Colin Sullivan for suggesting the topic
  • “FastAPI framework, high performance, easy to learn, fast to code, ready for production”
  • “Sales pitch / key features:
    • Fast: Very high performance, on par with NodeJS and Go (thanks to Starlette and Pydantic). One of the fastest Python frameworks available.
    • Fast to code: Increase the speed to develop features by about 200% to 300%. (estimated)
    • Fewer bugs: Reduce about 40% of human (developer) induced errors. (estimated)
    • Intuitive: Great editor support. Completion everywhere. Less time debugging.
    • Easy: Designed to be easy to use and learn. Less time reading docs.
    • Short: Minimize code duplication. Multiple features from each parameter declaration. Fewer bugs.
    • Robust: Get production-ready code. With automatic interactive documentation.
    • Standards-based: Based on (and fully compatible with) the open standards for APIs: OpenAPI (previously known as Swagger) and JSON Schema.”
  • uses:
  • documents REST APIs with both
    • Swagger
    • ReDoc
  • looks like quite a fun contender in the “put together a REST API quickly” set of solutions out there.
  • Just the front page demo is quite informative. There’s also a tutorial that seems like it might be a crash course in API best practices.
Michael #6: Bleach: stepping down as maintainer
  • by Will Kahn-Greene
  • Bleach is a Python library for sanitizing and linkifying text from untrusted sources for safe usage in HTML.
  • A retrospective on OSS project maintenance
  • Picked up maintenance of the project because
    • I was familiar with it
    • current maintainer really wanted to step down
    • Mozilla was using it on a bunch of sites
    • I felt an obligation to make sure it didn't drop on the floor and I knew I could do it.
  • Never really liked working on Bleach
  • [He] did a bunch of work on a project [he doesn’t] really use, but felt obligated to make sure it didn’t fall on the floor, and that has a pain-in-the-ass problem domain. Did that for 3+ years.
  • Is [he] getting paid to work on it? Not really.
  • Does [he] like working on it? No.
  • Seems like [he] shouldn't be working on it anymore.




Audio Download

Posted on 29 March 2019 | 8:00 am

#122 Give Me Back My Monolith

Sponsored by DigitalOcean:

Brian #1: Combining and separating dictionaries
    d = d1.copy()
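Only the first line of the example survives in these notes, so the following is a hedged sketch of the general idea rather than the article’s own code. The dictionaries d1 and d2 and the wanted key set are made up for illustration.

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 20, "c": 3}

# Combine: copy first so d1 is left untouched, then let d2 win on shared keys.
d = d1.copy()
d.update(d2)

# The same result in one expression with unpacking (Python 3.5+).
merged = {**d1, **d2}

# Separate: split one dictionary in two based on a set of keys.
wanted = {"a", "c"}
kept = {key: value for key, value in d.items() if key in wanted}
rest = {key: value for key, value in d.items() if key not in wanted}

print(d, kept, rest)
```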
Michael #2: Why I Avoid Slack
  • by Matthew Rocklin
  • I avoid interacting on Slack, especially for technical conversations around open source software.
  • Instead, I encourage colleagues to have technical and design conversations on GitHub, or some other system that is public, permanent, searchable, and cross-referenceable.
  • Slack is fun, but internal real-time chat systems are, I think, bad for productivity generally, especially for public open source software maintenance.
  • Prefer GitHub because I want to
    • Engage collaborators that aren’t on our Slack
    • Record the conversation in case participants change in the future.
    • Serve the silent majority of users who search the web for answers to their questions or bugs.
    • Encourage thoughtful discourse. Because GitHub is a permanent record it forces people to think more before they write.
    • Cross reference issues. Slack is siloed. It doesn’t allow people to cross reference people or conversations across Slacks.
Brian #3: Hunting for Memory Leaks in Python applications
  • Wai Chee Yau
  • Conquering memory leaks and spikes in Python ML products at Zendesk.
  • A quick tutorial of some useful memory tools
  • The memory_profiler package and matplotlib to visualize memory spikes.
  • Using muppy to heap dump at certain places in the code.
  • objgraph to help memory profiling with object lineage.
  • Some tips when memory leak/spike hunting:
    • strive for quick feedback
    • run memory intensive tasks in separate processes
    • debugger can add references to objects
    • watch out for packages that can be leaky
      • pandas? really?
Michael #4: Give Me Back My Monolith
  • by Craig Kerstiens
  • Feels like we’re starting to pass the peak of the hype cycle of microservices
  • We’ve actually seen some migrations from micro-services back to a monolith.
  • Here is a rundown of all the things that were simple that you now get to re-visit
  • Setup went from intro chem to quantum mechanics
    • Onboarding a new engineer, at least for an initial environment, would be done in the first day. As we ventured into micro-services, onboarding time skyrocketed
  • So long for understanding our systems
    • Back when we had monolithic apps if you had an error you had a clear stacktrace to see where it originated from and could jump right in and debug. Now we have a service that talks to another service, that queues something on a message bus, that another service processes, and then we have an error.
  • If we can’t debug them, maybe we can test them
  • All the trade-offs are for a good reason. Right?
Brian #5: Famous Laws Of Software Development
  • Tim Sommer
  • 13 “laws” of software development, including
    • Hofstadter’s Law: “It always takes longer than you expect, even when you take into account Hofstadter's Law.”
    • Conway’s Law: “Any piece of software reflects the organizational structure that produced it.”
    • The Peter Principle: “In a hierarchy, every employee tends to rise to his level of incompetence.”
    • Ninety-ninety rule: “The first 90% of the code takes 10% of the time. The remaining 10% takes the other 90% of the time”
Michael #6: Beer Garden Plugins
  • A powerful plugin framework for converting your functions into composable, discoverable, production-ready services with minimal overhead.
  • Beer Garden makes it easy to turn your functions into REST interfaces that are ready for production use, in a way that’s accessible to anyone that can write a function.
  • Based on MongoDB, Rabbit MQ, & modern Python
  • Nice docker-compose option too



  • From Derrick Chambers

    “What do you call it when a python programmer refuses to implement custom objects? self deprivation! Sorry, that joke was really classless.”

  • via pyjokes: I had a problem so I thought I'd use Java. Now I have a ProblemFactory.

Audio Download

Posted on 22 March 2019 | 8:00 am

#121 python2 becomes self-aware, enters fifth stage of grief

Sponsored by Datadog:

Brian #1: Futurize and Auto-Futurize
  • Staged automatic conversion from Python2 to Python3 with futurize from
    • pip install future
  • Stages:
    • 1: safe fixes:
      • exception syntax, print function, object base class, iterator syntax, key checking in dictionaries, and more
    • 2: Python 3 style code with wrappers for Python 2
      • more risky items to change
      • separating text from bytes, quite a few more
    • very modular, and you can be more aggressive or more conservative with flags.
  • Do that, but between each step, run tests, and only continue if they pass, with auto-futurize from Timothy Hopper.
    • a shell script that uses git to save staged changes and tox to test the code.
Michael #2: Tech blog writing live stream
  • via Anthony Shaw
  • Live stream on "technical blog writing"
  • Talking about how I put articles together, research, timing and other things about layouts and narratives.
  • Covers “Modifying the Python language in 6 minutes”, deep article
  • Listicles, “5 Easy Coding Projects to Do with Kids”
  • A little insight into what is popular.
  • Question article: Why is Python Slow?
  • Tourist’s guide to the CPython source code
Brian #3: Try out walrus operator in Python 3.8
  • Alexander Hultnér
  • The walrus operator is the assignment expression that is coming in thanks to PEP 572.
    # From PEP 572:
    # Handle a matched regex
    if (match := pattern.search(data)) is not None:
        # Do something with match

    # A loop that can't be trivially rewritten using 2-arg iter()
    while chunk := file.read(8192):
        process(chunk)

    # Reuse a value that's expensive to compute
    [y := f(x), y**2, y**3]

    # Share a subexpression between a comprehension filter clause and its output
    filtered_data = [y for x in data if (y := f(x)) is not None]

    # Checking for an optional key
    for entry in sample_data:
        if title := entry.get("title"):
            print(f'Found title: "{title}"')
  • That code won’t fail if the title key doesn’t exist.
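For trying this out, here is a small self-contained sketch, assuming Python 3.8 or newer. The pattern, the in-memory stream, and the sample data are all made up for illustration; they stand in for the undefined names in the snippets above.

```python
import io
import re

# Handle a matched regex: bind the match and test it in one expression.
pattern = re.compile(r"\d+")
data = "order 66 shipped"
if (match := pattern.search(data)) is not None:
    first_number = int(match.group())

# Read a stream in fixed-size chunks until read() returns an empty value.
stream = io.BytesIO(b"abcdefghij")
chunks = []
while chunk := stream.read(4):
    chunks.append(chunk)

# Optional key: entries without "title" are simply skipped, no KeyError.
sample_data = [{"title": "Walrus"}, {"name": "no title here"}]
titles = [title for entry in sample_data if (title := entry.get("title"))]

print(first_number, chunks, titles)
```

One thing to keep in mind: the truthiness check also skips entries whose title is an empty string, which may or may not be what you want.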
Michael #4: bullet : Beautiful Python Prompts Made Simple
  • Have you ever wanted a dropdown select box for your CLI? Bullet!
  • Lots of design options
  • Also
    • Password “boxes”
    • Yes/No
    • Numbers
  • Looking for contributors, especially Windows support.
Brian #5: Hosting private pip packages using Azure Artifacts
  • Interesting idea to utilize artifacts as a private place to store built packages to pip install elsewhere.
  • Walkthrough is assuming you are working with a data pipeline.
  • You can package some of the work in earlier stages for use in later stages by packaging them and making them available as artifacts.
  • Includes a basic tutorial on setuptools packaging and building an sdist and a wheel.
  • Need to use CI in the Azure DevOps tool and use that to build the package and save the artifact
  • Now in a later stage where you want to install the package, there are some configs needed to get the pip credentials right, included in the article.
  • Very fun article/hack to beat Azure into a use model that maybe it wasn’t designed for.
  • Could be useful for non data pipeline usage, I’m sure.

  • Speaking of Azure, we brought up Anthony Shaw’s pytest-azurepipelines pytest plugin last week. Well, it is now part of the recommended Python template from Azure. Very cool.

Michael #6: Async/await for wxPython
  • via Andy Bulka
  • Remember asyncio and PyQt from last week?
  • Similar project called wxasync which does the same thing for wxPython!
  • He’s written a Medium article about it, with links to that project, and shares some real-life usage scenarios and fun demo apps.
  • wxPython is important because it's free, even for commercial purposes (unlike PyQt).
  • His article even contains a slightly controversial section entitled "Is async/await an anti-pattern?" which refers to the phenomenon of the async keyword potentially spreading through one's codebase, and some thoughts on how to mitigate that.

Michael: Mongo license followup

  • Will S. told me I was wrong! And I was. :)
  • The main clarification I wanted to make above was that the AGPL has been around for a while, and it is the new SSPL from MongoDB that targets cloud providers.
  • Also, one other point I didn't mention -- the reason the SSPL isn't considered open source is that it places additional conditions on providing the software as a service and the OSI's open source definition requires no discrimination based on field of endeavor.

Michael: python2 becomes self-aware, enters fifth stage of grief

$ python2 -m pip list
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.

Michael: PyDist — Simple Python Packaging

  • Your private and public dependencies, all in one place.
  • Looks to be paid, but with free beta?
  • It mirrors the public PyPI index, and keeps packages and releases that have been deleted from PyPI. It allows organizations to upload their own private dependencies, and seamlessly create private forks of public packages. And it integrates with standard Python tools almost as well as PyPI does.

A metajoke: pip install --user pyjokes or even better pipx install pyjokes. Then:

$ pyjoke

[hilarity ensues! …]

Audio Download

Posted on 16 March 2019 | 8:00 am

#120 AWS, MongoDB, and the Economic Realities of Open Source and more

Sponsored by

Brian #1: The Ultimate Guide To Memorable Tech Talks
  • Nina Zakharenko
  • 7 part series that covers choosing a topic, writing a talk proposal, tools, planning, writing, practicing, and delivering the talk
  • I’ve just read the tools section, and am looking forward to the rest of the series.
    • From the tools section: “I noticed I’d procrastinate on making the slides look good instead of focusing my time on making quality content.”
Michael #2: Running Flask on Kubernetes
  • via & Michael Herman
  • What is Kubernetes?
  • A step-by-step tutorial that details how to deploy a Flask-based microservice (along with Postgres and Vue.js) to a Kubernetes cluster.
  • Goals of tutorial
    1. Explain what container orchestration is and why you may need to use an orchestration tool
    2. Discuss the pros and cons of using Kubernetes over other orchestration tools like Docker Swarm and Elastic Container Service (ECS)
    3. Explain the following Kubernetes primitives - Node, Pod, Service, Label, Deployment, Ingress, and Volume
    4. Spin up a Python-based microservice locally with Docker Compose
    5. Configure a Kubernetes cluster to run locally with Minikube
    6. Set up a volume to hold Postgres data within a Kubernetes cluster
    7. Use Kubernetes Secrets to manage sensitive information
    8. Run Flask, Gunicorn, Postgres, and Vue on Kubernetes
    9. Expose Flask and Vue to external users via an Ingress
Brian #3: Changes in the CI landscape
Michael #4: Python server setup for macOS 🍎
  • what: hello world for Python server setup on macOS
  • why: most guides show setup on a Linux server (which makes sense) but macOS is useful for learning and for local dev
Brian #5: Learn Enough Python to be Useful: argparse
  • How to Get Command Line Arguments Into Your Scripts - Jeff Hale
  • “argparse is the ‘recommended command-line parsing module in the Python standard library.’ It’s what you use to get command line arguments into your program.”
  • “I couldn’t find a good intro guide for argparse when I needed one, so I wrote this article.”
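As a quick taste, here is a minimal argparse sketch. It is not taken from Jeff Hale’s article: the greeting script, its arguments, and the helper names build_parser and greet are all invented for illustration.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Greet someone from the command line.")
    parser.add_argument("name", help="who to greet")  # positional, required
    parser.add_argument("--times", type=int, default=1, help="how many times to greet")
    parser.add_argument("--shout", action="store_true", help="upper-case the greeting")
    return parser


def greet(args):
    line = f"Hello, {args.name}!"
    if args.shout:
        line = line.upper()
    return [line] * args.times


# Passing a list lets you try the parser without a real command line;
# call parse_args() with no arguments to read sys.argv instead.
args = build_parser().parse_args(["Brian", "--times", "2", "--shout"])
greetings = greet(args)
print("\n".join(greetings))
```

Saved as a script that calls parse_args() with no arguments, the same parser would also give you --help output for free.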
Michael #6: AWS, MongoDB, and the Economic Realities of Open Source
  • Related podcast:
  • Last week, from the AWS blog:

    Today we are launching Amazon DocumentDB (with MongoDB compatibility), a fast, scalable, and highly available document database that is designed to be compatible with your existing MongoDB applications and tools. Amazon DocumentDB uses a purpose-built SSD-based storage layer, with 6x replication across 3 separate Availability Zones. The storage layer is distributed, fault-tolerant, and self-healing, giving you the performance, scalability, and availability needed to run production-scale MongoDB workloads.

  • Like an increasing number of such projects, MongoDB is open source…or it was anyways. MongoDB Inc., a venture-backed company that IPO’d in October, 2017, made its core database server product available under the GNU Affero General Public License (AGPL).

  • AGPL extended the GPL to apply to software accessed over a network; since the software is only being used, not copied, the GPL itself would not be triggered.
  • MongoDB’s Business Model
  • We believe we have a highly differentiated business model that combines the developer mindshare and adoption benefits of open source with the economic benefits of a proprietary software subscription business model.
    • MongoDB enterprise and MongoDB atlas
  • Basically, MongoDB sells three things on top of its open source database server:
    • Additional tools for enterprise companies to implement MongoDB
    • A hosted service for smaller companies to use MongoDB
    • Legal certainty
  • What AWS Sells
  • the value of software is typically realized in three ways:
    • First is hardware.
    • Second is licenses. This was Microsoft’s core business for decades: licenses sold to OEMs (for the consumer market) or to companies directly (for the enterprise market).
    • Third is software-as-a-service.
  • AWS announced last week: “The storage layer is distributed, fault-tolerant, and self-healing, giving you the performance, scalability, and availability needed to run production-scale MongoDB workloads.”
  • AWS is not selling MongoDB: what they are selling is “performance, scalability, and availability.” DocumentDB is just one particular area of many where those benefits are manifested on AWS.
  • Thus we have arrived at a conundrum for open source companies:
    • MongoDB leveraged open source to gain mindshare.
    • MongoDB Inc. built a successful company selling additional tools for enterprises to run MongoDB.
    • More and more enterprises don’t want to run their own software: they want to hire AWS (or Microsoft or Google) to run it for them, because they value performance, scalability, and availability.
  • This leaves MongoDB Inc. not unlike the record companies after the advent of downloads: what they sold was not software but rather the tools that made that software usable, but those tools are increasingly obsolete as computing moves to the cloud. And now AWS is selling what enterprises really want.
  • This tradeoff is inescapable, and it is fair to wonder if the golden age of VC-funded open source companies will start to fade (although not open source generally). The monetization model depends on the friction of on-premise software; once cloud computing is dominant, the economic model is much more challenging.

PyTexas 2019 at #Austin on Apr 13th and 14th. Registrations now open. More info at

Michael: Sorry Ant!

Michael: RustPython follow up:

  • Q: Why was the developer unhappy at their job?
  • A: They wanted arrays.

  • Q: Where did the parallel function wash its hands?

  • A: Async

Audio Download

Posted on 5 March 2019 | 8:00 am

#119 Assorted files as Django ORM backends with Alkali

Sponsored by

Special guests

Michael #1: Incrementally migrating over one million lines of code from Python 2 to Python 3
  • Weighing in at over 1 million lines of Python logic, we had a massive surface area for potential issues in our migration from Python 2 to Python 3
  • First Py3 commit, hack week 2015
    • Unfortunately, it was clear that many features were completely broken by the upgrade
  • Official start H1 2017
  • Armed with Mypy, a static type-checking tool that we had adopted in the interim year, they made substantial strides towards enabling the Python 3 migration:
    • Ported our custom fork of Python to version 3.5
    • Upgraded some Python dependencies to Python 3-compatible versions, and forked some others (e.g. babel)
    • Modified some Dropbox client code to be Python 3 compatible
    • Set up automated jobs in our continuous integration (CI) to run the existing unit tests with the Python 3 interpreter, and Mypy type-checking in Python 3 mode
  • Crucially, the automated tests meant that we could be certain that the limited Python 3 compatibility that existed would not have regressed when the project was picked up again.
  • Prerequisites
  • Before we could begin working on migrating any of our application logic, we had to ensure that we could load the Python 3 interpreter and run until the entry point of the application. In the past, we had used “freezer” scripts to do this for us. However, none of these had support for Python 3 around this time, so in late 2016, we built a custom, more native solution which we internally referred to as “Anti-freeze” (more on that in the initial Python 3 migration blog post).
  • Incrementally enabling unit tests and type-checking
  • ‘Straddling’ Python 2 and Python 3
  • Letting it bake
  • Learnings (tl;dr)
    • Unit tests and typing are invaluable.
    • String encoding in Python is hard.
    • Incrementally migrate to Python 3 for great profit.
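The "string encoding is hard" learning comes down to keeping bytes and text apart, which Python 3 enforces and Python 2 did not. A minimal sketch of converting once at the boundary, assuming UTF-8 data; the helper names `to_text` and `to_bytes` are hypothetical and are not taken from the Dropbox codebase:

```python
def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding it if it arrived as bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def to_bytes(value, encoding="utf-8"):
    """Return `value` as bytes, encoding it if it arrived as text."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


raw = b"caf\xc3\xa9"      # bytes as they might arrive from disk or a socket
name = to_text(raw)       # decode once, where the data enters the program
print(len(raw), len(name))  # byte count and character count differ
```

Application code then works only with text, and `to_bytes` is called again only where data leaves the program.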
Eric #2: Network Automation Development with Python (for fun and for profit)
Trey #3: Alkali file as DB
  • If you have structured data you want to query (like RSS feed, CSV, JSON, or any custom format of your own creation) you can use a Django ORM-like syntax to query it
  • Save it to the same format or a different format because you control both the reading and the writing
  • Kurt is at PyCascades so I got to chat with him about this
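To give a feel for what "Django ORM-like syntax" over a plain file means, here is a rough standard-library sketch. It is not Alkali's API: the `RecordSet` class, the supported lookups, and the sample rows are all made up for illustration.

```python
import csv
import io
import json
import operator

# Lookup suffixes in the style of Django's ORM (field__gt=..., field__contains=...).
LOOKUPS = {
    "exact": operator.eq,
    "gt": operator.gt,
    "lt": operator.lt,
    "contains": lambda value, needle: needle in value,
}


class RecordSet:
    """Hypothetical stand-in for an ORM manager; this is not Alkali's API."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, **conditions):
        matched = self.rows
        for key, expected in conditions.items():
            # "downloads__gt" -> field "downloads", lookup "gt"; no suffix means exact match.
            field, _, lookup = key.partition("__")
            compare = LOOKUPS[lookup or "exact"]
            matched = [row for row in matched if compare(row[field], expected)]
        return RecordSet(matched)


# Made-up sample data standing in for a CSV file on disk.
SAMPLE = "title,downloads\npipx,120\nalkali,45\nfrozen-flask,80\n"
rows = [dict(row, downloads=int(row["downloads"]))
        for row in csv.DictReader(io.StringIO(SAMPLE))]

episodes = RecordSet(rows)
popular = episodes.filter(downloads__gt=50)
print([row["title"] for row in popular.rows])

# Read as CSV, write as JSON: the reader and the writer are independent.
print(json.dumps(popular.rows))
```

Because reading and writing are separate steps, the filtered rows can be written back out in whatever format is convenient.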
Dan #4: Carnegie Mellon Launches Undergraduate Degree in Artificial Intelligence
  • Carnegie Mellon University's School of Computer Science will offer a new undergraduate degree in artificial intelligence beginning this fall
  • The first offered by a U.S. university
  • "Specialists in artificial intelligence have never been more important, in shorter supply or in greater demand by employers," said Andrew Moore, dean of the School of Computer Science.
  • The bachelor's degree in AI will focus more on how complex inputs — such as vision, language and huge databases — are used to make decisions or enhance human capabilities
Michael #5: asyncio + PyQt5/PySide2
Dan #6: 4 things I want to see in Python 4.0
  1. JIT as a first class feature
  2. A stable .0 release
  3. Static type hinting
  4. A GPU story for multiprocessing
  5. More community contributions

Michael: My Python Async webcast recording is now available.

Michael: PyCon Israel in the first week of June, and the CFP opened today

Dan: Python Basics Book

  • Q: Why did the developer ground their kid?
  • A: They weren't telling the truthy

Audio Download

Posted on 26 February 2019 | 8:00 am

#118 Better Python executable management with pipx

Sponsored by

Brian #1: Frozen-Flask
  • “Frozen-Flask freezes a Flask application into a set of static files. The result can be hosted without any server-side software other than a traditional web server.”
  • 2012 tutorial, Dead easy yet powerful static website generator with Flask
  • Some of it is out of date, but it does point to the power of Frozen-Flask, as well as highlight a cool plugin, Flask-FlatPages, which allows pages from markdown.
Michael #2: pipx
  • by Chad Smith
  • Last week we spoke about pythonloc
  • Execute binaries from Python packages in isolated environments
  • pipx uses the word "binary" to describe a CLI application that can be run directly from the command line
  • Features
    • Safely install packages to isolated virtual environments, while globally exposing their CLI applications so you can run them from anywhere
    • Easily list, upgrade, and uninstall packages that were installed with pipx
    • Run the latest version of a CLI application from a package in a temporary virtual environment, leaving your system untouched after it finishes
    • Run binaries from the __pypackages__ directory per PEP 582 as companion tool to pythonloc
    • Runs with regular user permissions, never calling sudo pip install ... (you aren't doing that, are you? 😄).
  • You can globally install a CLI application by running: pipx install PACKAGE
  • "Just the “pipx upgrade-all” command is already a huge win over pipsi"
  • Check out How does this compare to pipsi?
Brian #3: Data science is different now
  • Vicki Boykis
  • There’s lots of buzz around data science.
  • This has resulted in loads of new data scientists looking for junior level positions.
    • Coming from boot camps, MOOCs, self taught, remote degrees, and other training.
  • “.. now that data science has changed from a buzzword to something even larger companies outside of the Silicon Valley bubble hire for, positions have not only become more codified, but with more rigorous entry requirements that will prefer people with previous data science experience every time.”
  • “ … the market can be very hard, and very discouraging for the flood of beginners.”
  • Data science is a misleading job req
    • “The reality is that “data science” has never been as much about machine learning as it has about cleaning, shaping data, and moving it from place to place.”
  • Advice:
    • Don’t get into data science (this amuses me).
    • “Don’t do what everyone else is doing, because it won’t differentiate you.”
      • “It’s much easier to come into a data science and tech career through the “back door”, i.e. starting out as a junior developer, or in DevOps, project management, and, perhaps most relevant, as a data analyst, information manager, or similar, than it is to apply point-blank for the same 5 positions that everyone else is applying to. It will take longer, but at the same time as you’re working towards that data science job, you’re learning critical IT skills that will be important to you your entire career.”
    • Learn the skills needed for data science today
      • Creating Python packages
      • Putting R in production
      • Optimizing Spark jobs so they run more efficiently
      • Version controlling data
      • Making models and data reproducible
      • Version controlling SQL
      • Building and maintaining clean data in data lakes
      • Tooling for time series forecasting at scale
      • Scaling sharing of Jupyter notebooks
      • Thinking about systems for clean data
      • Lots of JSON
  • Data science is turning more and more into a mostly engineering field.
  • Data scientists need to have “good generalist engineering skills with a data background.”
Michael #4: RustPython
  • via Fredrik Averpil
  • A Python-3 (CPython >= 3.5.0) Interpreter written in Rust.
  • Seems pretty active: Latest commit ac95b61 an hour ago…
  • Goals
    • Full Python-3 environment entirely in Rust (not CPython bindings)
    • A clean implementation without compatibility hacks
  • Contributing
    • To start contributing, there are a lot of things that need to be done.
    • Most tasks are listed in the issue tracker. Check issues labeled with good first issue if you wish to start coding.
  • Rust does have direct WebAssembly support…
Brian #5: Jupyter Notebook: An Introduction
  • Mike Driscoll on RealPython
  • Not the “all the cool things you can do with it”, but the “really, how do I start” tutorial.
    • I think it should have included a mention of installing it in a venv and how to use %pip install, so I’ll include those things in these notes.
  • Installing with pip install jupyter .
    • Also a note that Jupyter is included with the Anaconda distribution.
    • Note: Like everything else, I always install it in a virtual environment, if using pip, so the real installation instructions I recommend are:
      • python3 -m venv venv --prompt jupyter
      • source venv/bin/activate OR venv\scripts\activate.bat on Windows
      • pip install jupyter
      • pip install [HTML_REMOVED]
      • jupyter notebook
      • That will launch a localhost web interface.
  • Creating a new notebook within the web interface.
  • Changing the “Untitled” name by clicking on the name.
    • This was not obvious to me.
  • Running cells, including the shift-enter keyboard shortcut.
  • A run through the menu, stopping at non-obvious places
    • “File” has “Save and Checkpoint” which is super cool.
    • “Edit” has cell cut, copy, paste. But also has delete, split, merge, and cell movement.
    • “Cell” menu has lots of cool run options, like “Run all above” and “Run all below” and others.
  • Not just Python: you can also have terminal sessions and more from within Jupyter.
  • A look at the “Running” tab.
  • Quick overview of the markdown support for markdown cells
  • Exporting notebooks using jupyter nbconvert

  • Extra notes on installing packages from Jupyter:

    • To pip install from the notebook, do this: %pip install numpy in a code cell.
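Outside a notebook, the usual way to get the same effect as %pip (installing into the environment of the interpreter that is actually running) is to invoke pip through `sys.executable`. A small sketch; the function names `pip_install_command` and `pip_install` are hypothetical:

```python
import subprocess
import sys


def pip_install_command(*packages):
    """Build the argv for installing `packages` with the running interpreter's pip."""
    return [sys.executable, "-m", "pip", "install", *packages]


def pip_install(*packages):
    """Run the install; raises CalledProcessError if pip exits non-zero."""
    subprocess.check_call(pip_install_command(*packages))


# Only print the command here; call pip_install("numpy") to actually run it.
print(pip_install_command("numpy"))
```

Using `sys.executable` rather than a bare `pip` avoids installing into a different Python than the one the code is running under.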
Michael #6: Python Developers Survey 2018 Results
  • Python usage as a main language is up 5 percentage points from 79% in 2017, when the Python Software Foundation conducted its previous survey.
  • What do you use Python for? (2018/2017)
    • 59%/51% Data analysis
    • 56%/54% Web dev
    • 39%/32% ML
    • Web development is the only category with a large gap (56% vs. 36%) separating those using Python as their main language vs. as a supplementary language. For other types of development, the differences are much smaller.
  • What do you use Python for the most? (single answer)
    • 29%/29% web dev
    • 17%/17% data analysis
    • 11%/8% ML
  • Like last year:
    • 27% (Web development) ≈ 28% (Scientific development)
      • Science = 17% + 11% for Data analysis + Machine learning
  • Python 3 vs Python 2
    • 84% Python 3 vs 16% Python 2. The use of Python 3 continues to grow rapidly. According to the latest research in 2017, 75% were using Python 3 compared with 25% for Python 2.
  • Top 4 web frameworks (majority to the first two):
    • Flask
    • Django
    • Tornado
    • Pyramid
  • Databases
    • PostgreSQL
    • MySQL
    • SQLite
    • MongoDB
  • ORMs
    • SQLAlchemy and Django ORM tied
  • “Mentored sprints for diverse beginners” at PyCon
    • “A newcomer’s introduction to contributing to an open source project”
    • Call for applications for projects open Feb 8 to March 14
    • Call for contributors, participants in the sprint also open Feb 8 to March 14
    • “If you are wondering if this event is for you: it definitely is and we would love to have you taking part in this sprint.”
    • “This mentored sprint will take place on Saturday, May 4th, 2019 from 2:35pm to 6:30pm”
Joke: via Florian

  • Q: If you have some pseudo code (say in sample.txt) how do you most easily convert it to Python?
  • A: Change the extension to .py

Extra Joke: Python Song (with chapters!)

Audio Download

Posted on 22 February 2019 | 8:00 am