Python Bytes

by Michael Kennedy and Brian Okken

Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.

  

Latest Episodes

#117 Is this the end of Python virtual environments?

Sponsored by pythonbytes.fm/datadog

Brian #1: Goodbye Virtual Environments?

  • by Chad Smith
  • venvs are great, but they introduce some problems as well:
    • Learning curve: explaining “virtual environments” to people who just want to jump in and code is not always easy
    • Terminal isolation: Virtual Environments are activated and deactivated on a per-terminal basis
    • Cognitive overhead: Setting up, remembering installation location, activating/deactivating
  • PEP 582 — Python local packages directory
    • This PEP proposes to add to Python a mechanism to automatically recognize a __pypackages__ directory and prefer importing packages installed in this location over user or global site-packages. This will avoid the steps to create, activate or deactivate “virtual environments”. Python will use the __pypackages__ from the base directory of the script when present.
  • Try it now with pythonloc
    • pythonloc is a drop-in replacement for python and pip that automatically recognizes a __pypackages__ directory and prefers importing packages installed in this location over user or global site-packages. If you are familiar with node, __pypackages__ works similarly to node_modules.
    • Instead of running python you run pythonloc and the __pypackages__ path will automatically be searched first for packages. And instead of running pip you run piploc and it will install/uninstall from __pypackages__.
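
The lookup that PEP 582 describes can be approximated by hand. The following is a minimal sketch, assuming the __pypackages__/<major.minor>/lib layout from the PEP draft; local_packages_dir and enable_local_packages are hypothetical helper names, not part of pythonloc or of Python itself.

```python
import sys
from pathlib import Path


def local_packages_dir(script_dir, version_info=sys.version_info):
    """Return the directory a PEP 582 style lookup would search first.

    Assumes the __pypackages__/<major.minor>/lib layout from the PEP draft.
    """
    version = f"{version_info[0]}.{version_info[1]}"
    return Path(script_dir) / "__pypackages__" / version / "lib"


def enable_local_packages(script_dir, path=None):
    """Put the local packages directory ahead of site-packages, if it exists."""
    path = sys.path if path is None else path
    candidate = local_packages_dir(script_dir)
    if candidate.is_dir() and str(candidate) not in path:
        path.insert(0, str(candidate))
    return path
```

Calling enable_local_packages(Path(__file__).parent) at the top of a script would roughly mimic what pythonloc does on startup, without any activation step.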

Michael #2: webassets

  • Bundles and minifies CSS & JS files
  • Been doing a lot of work to rank higher on the sites
  • That led me to Google’s Lighthouse
  • Despite 25ms response time to the network, Google thought my site was “kinda slow”, yikes!
  • webassets has integration for the big three: Django, Flask, & Pyramid.
    • But I prefer to just generate them and serve them off disk
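    from typing import List

    import webassets

    def build_asset(env: webassets.Environment,
                    files: List[str],
                    filters: str,
                    output: str):
        # Bundle the input files, apply the filters, and write the result to output.
        bundle = webassets.Bundle(
            *files,
            filters=filters,
            output=output,
            env=env
        )
        # force=True rebuilds even if the output looks up to date.
        bundle.build(force=True)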

Brian #3: Bernat on Python Packaging

Michael #4: What the mock? — A cheatsheet for mocking in Python

  • Nice introduction
  • Some examples
    @mock.patch('work.os')
    def test_using_decorator(self, mocked_os):
        work_on()
        mocked_os.getcwd.assert_called_once()

And

    def test_using_context_manager(self):
        with mock.patch('work.os') as mocked_os:
            work_on()
            mocked_os.getcwd.assert_called_once()
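
For readers without a work module at hand, the same two patterns can be tried in a self-contained form. This is a minimal sketch: work_on here is a hypothetical stand-in defined in the same file, and it patches os.getcwd directly with mock.patch.object rather than patching 'work.os' as the cheatsheet does.

```python
import os
import unittest
from unittest import mock


def work_on():
    """Hypothetical stand-in for the function under test."""
    return os.getcwd()


class WorkTests(unittest.TestCase):
    # Decorator form: the mock is passed in as an extra argument.
    @mock.patch.object(os, "getcwd", return_value="/fake/dir")
    def test_using_decorator(self, mocked_getcwd):
        self.assertEqual(work_on(), "/fake/dir")
        mocked_getcwd.assert_called_once()

    # Context manager form: the patch is undone when the block exits.
    def test_using_context_manager(self):
        with mock.patch.object(os, "getcwd", return_value="/fake/dir") as mocked_getcwd:
            self.assertEqual(work_on(), "/fake/dir")
        mocked_getcwd.assert_called_once()
```

Both tests can be run with python -m unittest against the file that contains them.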

Brian #5: Transitions: The easiest way to improve your tech talk

  • By Saron Yitbarek
  • Jeff Atwood of CodingHorror noted “The people who can write and communicate effectively are, all too often, the only people who get heard. They get to set the terms of the debate.”
  • Effectively presenting is part of effective communication.
  • I love the focus of this article: it concentrates on one little aspect of improving the performance of a tech talk.

Michael #6: Steering council announced

  • Our new leaders are
    • Barry Warsaw
    • Brett Cannon
    • Carol Willing
    • Guido van Rossum
    • Nick Coghlan
  • Via Joe Carey
  • We both think it’s great Guido is on the council.

Extras:

Joke:

From the list from Ant, my votes.

  • Q: What's the second movie about a database engineer called? A: The SQL.

  • !false It's funny 'cause it's true.

  • A programmer's spouse tells them, "Run to the store and pick up a loaf of bread. If they have eggs, get a dozen." The programmer comes home with 12 loaves of bread.



Posted on 14 February 2019 | 8:00 am


#116 So you want Python in a 3D graphics engine?

Sponsored by pythonbytes.fm/digitalocean

Brian #1: Inside python dict — an explorable explanation

  • Interactive tutorial on dictionaries
    • Searching efficiently in a list
    • Why are hash tables called hash tables?
    • Putting it all together to make an “almost”-Python-dict
    • How Python dict really works internally
  • Yes this is a super deep dive, but wow it’s cool.
  • Tons of the code is runnable right there in the web page, with moving visual representations and the current line of code highlighted.
  • Some examples allow you to edit values and play with stuff.

Michael #2: Embed Python in Unreal Engine 4

Brian #3: Redirecting stdout with contextlib

  • When I want to test the stdout output of some code, that’s easy, I grab the capsys fixture from pytest.
  • But what if you want to grab the stdout of a method NOT while testing?
  • Enter [contextlib.redirect_stdout(new_target)](https://docs.python.org/3/library/contextlib.html#contextlib.redirect_stdout)
  • so cool. And very easy to read.
  • ex:
    import io
    from contextlib import redirect_stdout

    f = io.StringIO()
    with redirect_stdout(f):
        help(pow)
    s = f.getvalue()
  • also a version for stderr
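
The stderr variant mentioned above is contextlib.redirect_stderr, and the two can be combined in one with statement. A small sketch of a helper that captures both streams; capture_output and noisy_add are hypothetical names used only for illustration.

```python
import io
import sys
from contextlib import redirect_stderr, redirect_stdout


def capture_output(func, *args, **kwargs):
    """Run func and return (result, stdout text, stderr text)."""
    out, err = io.StringIO(), io.StringIO()
    with redirect_stdout(out), redirect_stderr(err):
        result = func(*args, **kwargs)
    return result, out.getvalue(), err.getvalue()


def noisy_add(a, b):
    """Hypothetical function that reports on both streams."""
    print(f"adding {a} and {b}")
    print("warning: demo only", file=sys.stderr)
    return a + b
```

Note that these context managers swap sys.stdout and sys.stderr for the whole process, so they are a poor fit for threaded code.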

Michael #4: Panda3D

  • via Kolja Lubitz
  • Panda3D is an open-source, completely free-to-use engine for realtime 3D games, visualizations, simulations, experiments
  • Not just games, could be science as well!
  • The full power of the graphics card is exposed through an easy-to-use API. Panda3D combines the speed of C++ with the ease of use of Python to give you a fast rate of development without sacrificing on performance.
  • Features:
    • Platform Portability
    • Flexible Asset Handling: Panda3D includes command-line tools for processing and optimizing source assets, allowing you to automate and script your content production pipeline to fit your exact needs.
    • Library Bindings: Panda3D comes with out-of-the-box support for many popular third-party libraries, such as the Bullet physics engine, Assimp model loader, OpenAL
    • Performance Profiling: Panda3D includes pstats — an over-the-network profiling system designed to help you understand where every single millisecond of your frame time goes.

Brian #5: Why PyPI Doesn't Know Your Projects Dependencies

  • Some questions you may have asked:
    • How can I produce a dependency graph for Python packages?
    • Why doesn’t PyPI show a project’s dependencies on its project page?
    • How can I get a project’s dependencies without downloading the package?
    • Can I search PyPI and filter out projects that have a certain dependency?
  • If everything is in requirements.txt, you just might be able to, but…
  • setup.py is dynamic. You gotta run it to see what’s needed.
  • Dependencies might be environment specific. Windows vs Linux vs Mac, as an example.
  • Nothing stopping someone from putting random.choice() for dependencies in a setup.py file. But that would be kinda evil. But could be done. (Listener homework?)
  • The wheel format is way more predictable because it limits some of this freedom. wheels don’t get run when they install, they really just get unpacked.
  • More info on wheels: Kind of a tangent, but why not:
    • From: https://pythonwheels.com
    • Advantages of wheels
      • Faster installation for pure Python and native C extension packages.
      • Avoids arbitrary code execution for installation. (Avoids setup.py)
      • Installation of a C extension does not require a compiler on Linux, Windows or macOS.
      • Allows better caching for testing and continuous integration.
      • Creates .pyc files as part of installation to ensure they match the Python interpreter used.
      • More consistent installs across platforms and machines.
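
Because a wheel is just a zip archive with a METADATA file inside its .dist-info directory, its declared dependencies can be read without executing anything. The sketch below builds a toy wheel in memory purely for illustration; wheel_requirements is a hypothetical helper and the demo package does not exist.

```python
import io
import zipfile
from email.parser import Parser


def wheel_requirements(wheel_file):
    """Return the Requires-Dist entries declared in a wheel's METADATA."""
    with zipfile.ZipFile(wheel_file) as zf:
        name = next(n for n in zf.namelist() if n.endswith(".dist-info/METADATA"))
        # METADATA uses an email-header style format, so the stdlib parser can read it.
        metadata = Parser().parsestr(zf.read(name).decode("utf-8"))
    return metadata.get_all("Requires-Dist") or []


def build_toy_wheel():
    """Create a minimal, made-up wheel archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(
            "demo-1.0.dist-info/METADATA",
            "Metadata-Version: 2.1\n"
            "Name: demo\n"
            "Version: 1.0\n"
            "Requires-Dist: requests\n"
            'Requires-Dist: colorama; sys_platform == "win32"\n',
        )
    buf.seek(0)
    return buf
```

The second entry carries an environment marker, which is how a wheel records a platform-specific dependency as static data rather than as code in setup.py.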

Michael #6: PyGame series

Extras:

Joke (maybe, Brian feel free to pick another one):

  • via @realpython
  • Why do Pythons live on land? They are above C-level!



Posted on 6 February 2019 | 8:00 am


#115 Dataclass CSV reader and Nina drops by

Sponsored by pythonbytes.fm/datadog

Special guest: Nina Zakharenko

Brian #1: Great Expectations
  • A set of tools intended for batch time testing of data pipeline data.
  • Introduction to the problem doc: Down with Pipeline debt / Introducing Great Expectations
  • expect_[something]() methods that return JSON-formatted descriptions of whether or not the passed-in data matches your expectations.
  • Can be used programmatically or interactively in a notebook. (video demo).
  • For programmatic use, I’m assuming you have to put code in place to stop a pipeline stage if expectations aren’t met, and write failing json result to a log or something.
  • Examples, just a few, full list is big:
    • Table shape:
      • expect_column_to_exist, expect_table_row_count_to_equal
    • Missing values, unique values, and types:
      • expect_column_values_to_be_unique, expect_column_values_to_not_be_null
    • Sets and ranges
      • expect_column_values_to_be_in_set
    • String matching
      • expect_column_values_to_match_regex
    • Datetime and JSON parsing
    • Aggregate functions
      • expect_column_stdev_to_be_between
    • Column pairs
    • Distributional functions
      • expect_column_chisquare_test_p_value_to_be_greater_than
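
To make the idea concrete, here is a plain-Python sketch of an expectation-style check that returns a JSON-serializable result, plus the kind of guard one might put around a pipeline stage. It does not use the Great Expectations library; check_values_unique, run_stage, and the result keys are all hypothetical.

```python
import json


def check_values_unique(rows, column):
    """Hypothetical check: report whether every value in column is unique."""
    seen, duplicates = set(), []
    for row in rows:
        value = row[column]
        if value in seen and value not in duplicates:
            duplicates.append(value)
        seen.add(value)
    return {
        "expectation": "values_unique",
        "column": column,
        "success": not duplicates,
        "unexpected_values": duplicates,
    }


def run_stage(rows, checks):
    """Stop the stage if any check fails, carrying the JSON result in the error."""
    for check in checks:
        result = check(rows)
        if not result["success"]:
            raise ValueError(json.dumps(result))
    return rows
```

In a real pipeline the failing result would more likely go to a log or a report than into an exception message.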
Nina #2: Using CircuitPython and MicroPython to write Python for wearable electronics and embedded platforms
  • I’ve been playing with electronics projects as a hobby for the past two years, and a few months ago turned my attention to Python on microcontrollers
  • MicroPython is a lean and efficient implementation of Python3 that can run on microcontrollers with just 256k of code space, and 16k of RAM. CircuitPython is a port of MicroPython, optimized for Adafruit devices.
  • Some of the devices that run Python are as small as a quarter.
  • My favorite Python hardware platform for beginners is Adafruit’s Circuit PlayGround Express. It has everything you need to get started with programming hardware without soldering. All you’ll need is alligator clips for the conductive pads.
    • The board features NeoPixel LEDs, buttons, switches, temperature, motion, and sound sensors, a tiny speaker, and lots more. You can even use it to control servos, tiny motor arms.
    • Best of all, it only costs $25.
  • If you want to program the Circuit PlayGround Express with a drag-n-drop style scratch-like interface, you can use Microsoft’s MakeCode. It’s perfect for kids and you’ll find lots of examples on their site.
  • Best of all, there are tons of guides for Python projects to build on their website, from making your own synthesizers, to jewelry, to silly little robots.
  • Check out the repo for my Python-powered earrings, see a photo, or a demo.
  • Sign up for the Adafruit Python for Microcontrollers mailing list here, or see the archives here.
Michael #3: Data class CSV reader
  • Map CSV to Data Classes
  • You probably know about reading CSV files
    • Maybe as tuples
    • Better with csv.DictReader
  • This library is similar but maps Python 3.7’s data classes to rows of CSV files
  • Includes type conversions (say string to int)
  • Automatic type conversion. DataclassReader supports str, int, float, complex and datetime
  • DataclassReader uses the type annotations to validate the data in the CSV file.
  • Helps you troubleshoot issues with the data in the CSV file. DataclassReader will show exactly which line of the CSV file contains errors.
  • Extract only the data you need. It will only parse the properties defined in the dataclass
  • It uses dataclass features that let you define metadata properties so the data can be parsed exactly the way you want.
  • Makes the code cleaner. No more extra loops to convert data to the correct type, perform validation, or set default values: DataclassReader does all this for you.
  • Default fallback values, more.
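
The core idea can be sketched with only the standard library: csv.DictReader supplies the rows, and the dataclass field annotations drive the type conversion. This is an illustration of the technique, not the library's API; read_as and User are hypothetical names, and it only handles types whose constructor accepts a string.

```python
import csv
from dataclasses import dataclass, fields


@dataclass
class User:
    name: str
    age: int


def read_as(cls, csv_file):
    """Yield one cls instance per CSV row, converting by field annotation."""
    # The header is line 1, so the first data row is line 2.
    for line_no, row in enumerate(csv.DictReader(csv_file), start=2):
        kwargs = {}
        for field in fields(cls):
            try:
                # Only columns named in the dataclass are read; extras are ignored.
                kwargs[field.name] = field.type(row[field.name])
            except (KeyError, ValueError) as exc:
                raise ValueError(
                    f"line {line_no}: bad value for {field.name!r}"
                ) from exc
        yield cls(**kwargs)
```

Passing an open file (or an io.StringIO) to read_as(User, f) yields User objects with age already converted to int.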
Brian #4: How to Rock Python Packaging with Poetry and Briefcase
  • Starts with a discussion of the packaging (for those readers that don’t listen to Python Bytes, I guess.) However, it also puts flit, pipenv, and poetry in context with each other, which is nice.
  • Runs through a tutorial of how to build a pyproject.toml based project using poetry and briefcase.
  • We’ve talked about Poetry before, on episode 100.
  • pyproject.toml is discussed extensively on Test & Code 52.
  • briefcase is new, though, it’s a project for creating standalone native applications for Mac, Windows, Linux, iOS, Android, and more.
  • The tutorial also discusses using poetry directly to publish to the test-pypi server. This is a nice touch. Use the test-pypi before pushing to the real pypi. Very cool.
Nina #5: awesome-python-security *🕶🐍🔐, a collection of tools, techniques, and resources to make your Python more secure*
  • All of your production and client-facing code should be written with security in mind
  • This list features a few resources I’ve heard of such as Anthony Shaw’s excellent 10 common security gotchas article which highlights problems like input injection and depending on assert statements in production, and a few that are new to me:
  • OWASP (Open Web Application Security Project) Python Resources at pythonsecurity.org
  • bandit a tool to find common security issues in Python
    • bandit features a lot of useful plugins, that test for issues like:
      • hardcoded password strings
      • leaving flask debug on in production
      • using exec() in your code
      • & more
  • detect-secrets, a tool to detect secrets left accidentally in a Python codebase
  • & lots more like resources for learning about security concepts like cryptography
  • See the full list for more
Michael #6: pydbg
  • Python implementation of the Rust dbg macro
  • Best seen with an example. Rather than printing things you want to inspect, you:
    a = 2
    b = 3

    dbg(a+b)

    def square(x: int) -> int:
        return x * x

    dbg(square(a))

outputs:

    [testfile.py:4] a+b = 5
    [testfile.py:9] square(a) = 4
Extras:

Brian:

  • pathlib + pytest tmpdir → tmp_path & tmp_path_factory

Michael:

  • The Art of Python is a miniature arts festival at PyCon North America 2019, focusing on narrative, performance, and visual art. We intend to encourage and showcase novel art that helps us share our emotionally charged experiences of programming (particularly in Python). We hope that by attending, our audience will discover new aspects of empathy and rapport, and find a different kind of delight and perspective than might otherwise be expected at a large conference.
  • StackOverflow Survey is Open! https://stackoverflow.az1.qualtrics.com/jfe/form/SV_1RGiufc1FCJcL6B
  • NumPy Is Awaiting Fix for Critical Remote Code Execution Bug
    • via Doug Sheehan
    • The issue was raised on January 16 and affects NumPy versions 1.10 (released in 2015) through 1.16, which is the latest release at the moment, released on January 14
    • The problem is with the 'pickle' module, which is used for transforming Python object structures into a format that can be stored on disk or in databases, or that allows delivery across a network.
    • The issue was reported by security researcher Sherwel Nan, who says that if a Python application loads malicious data via the numpy.load function an attacker can obtain remote code execution on the machine.
  • Get your google data

Nina:

  • I’m teaching a two day Intro and Intermediate Python course on March 19th and 20th. The class will live-stream for free here on each day of or join in-person from downtown Minneapolis. All of the course materials will be released for free as well.
  • I recently recorded a series of videos with Carlton Gibson (Django maintainer) on developing Django Web Apps with VS Code, deploying them to Azure with a few clicks, setting up a Continuous Integration / Continuous Delivery pipeline, and creating serverless apps. Watch the series here: https://aka.ms/python-videos
  • I’ll be a mentor at a brand new hatchery event at PyCon US 2019, mentored sprints for diverse beginners organized by Tania Allard. The goal is to help underrepresented folks at PyCon contribute to open source in a supportive environment. The details will be located here (currently a placeholder) when they’re finalized.
  • Catch my talk about electronics projects in Python with LEDs at PyCascades in Seattle on February 24th. Currently tickets are still for sale.
  • If you haven’t tried the Python extension for VS Code, now is a great time. The December release included some killer features, such as remote Jupyter support, and exporting Python files as Jupyter notebooks. Keep up with future releases at the Python at Microsoft blog.
  • Q: What do you call a snake that only eats dessert? A: A pie-thon. (might not make sense read out loud)
  • Q: How do you measure a python? A: In inches. They don't have any feet!
  • Q: What is a python’s favorite subject? A: Hiss-tory!



Posted on 2 February 2019 | 8:00 am


#114 What should be in the Python standard library?

Sponsored by pythonbytes.fm/digitalocean

Brian #1: What should be in the Python standard library?
  • on lwn.net by Jake Edge
  • There was a discussion recently about what should be in the standard library, triggered by a request to add LZ4 compression.
  • Kinda hard to summarize but we’ll try:
    • Jonathan Underwood proposed adding LZ4 compression to stdlib.
    • Can of worms opened
    • zlib and bz2 already in stdlib
    • Brett proposed making something similar to hashlib for compression algorithms.
    • Against adding it:
      • lz4 not needed for stdlib, and actually, bz2 isn’t either, but it’s kinda late to remove.
    • PyPI is easy enough. put stuff there.
    • Led to a discussion of the role of stdlib.
      • If it’s batteries included, shouldn’t we add new batteries
      • Some people don’t have access to PyPI easily
      • Do we never remove elements? really?
      • Maybe we should have a lean stdlib and a thicker standard distribution of selected packages
        • who would decide?
        • same problem exists then of depending on it. How to remove stuff?
        • Steve Dower would rather see a smaller standard library with some kind of "standard distribution" of PyPI modules that is curated by the core developers.
      • A leaner stdlib could speed up Python version schedules and reduce burden on core devs to maintain seldom used packages.
    • See? can of worms.
    • In any case, all this would require a PEP, so we have to wait until we have a PEP process decided on.
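
One way to picture a hashlib-style front end for compression is a small registry keyed by algorithm name. This is purely a sketch under that assumption: the new() function and Codec class are hypothetical and exist in no stdlib module; only zlib and bz2 are real.

```python
import bz2
import zlib
from typing import Callable, NamedTuple


class Codec(NamedTuple):
    name: str
    compress: Callable[[bytes], bytes]
    decompress: Callable[[bytes], bytes]


# Hypothetical registry; a third-party codec such as LZ4 would be one more entry.
_REGISTRY = {
    "zlib": Codec("zlib", zlib.compress, zlib.decompress),
    "bz2": Codec("bz2", bz2.compress, bz2.decompress),
}


def new(name):
    """Look up a codec by name, loosely mirroring hashlib.new(name)."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise ValueError(f"unsupported compression type {name!r}") from None
```

With this shape, calling code picks an algorithm by string and never imports the individual compression module.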
Michael #2: Data Science portal for Home Assistant launched
  • via Paul Cutler
  • Home Assistant is launching a data science portal to teach you how you can learn from your own smart home data.
  • In 15 minutes you set up a local data science environment running reports.
  • A core principle of Home Assistant is that a user has complete ownership of their personal data. A user’s data lives locally, typically on the SD card in their Raspberry Pi.
  • The Home Assistant Data Science website is your one-stop-shop for advice on getting started doing data science with your Home Assistant data.
  • To accompany the website, we have created a brand new Hass.io Add-on JupyterLab lite, which allows you to run a data science IDE called JupyterLab directly on your Raspberry Pi hosting Home Assistant. You do your data analysis locally, your data never leaves your local machine.
  • When you build something cool, you can share the notebook without the results, so people can run it at their homes too.
  • We have also created a Python library called the HASS-Data-Detective which makes it super easy to get started investigating your Home Assistant data using modern data science tools such as Pandas.
  • Check out the Getting Started notebook
  • IoT aside: I finally found my first IoT project: Recording in progress button.
Brian #3: What's the future of the pandas library?
  • Kevin Markham over at dataschool.io
  • pandas is gearing up to move towards a 1.0 release. Currently rc-ing 0.24
  • Plans are to get there “early 2019”.
  • Some highlights
    • method chaining - encouraged by core team
      • to encourage further, more methods will support chaining
    • Apache arrow likely to be part of pandas backend sometime after 1.0
    • Extension arrays - allow you to create custom data types
    • deprecations
      • inplace parameter. It doesn’t work with chaining, doesn’t actually prevent copies, and causes codebase complexity
      • ix accessor, use loc and iloc instead
      • Panel data structure. Use MultiIndex instead
      • SparseDataFrame. Just use a normal DataFrame
      • legacy python support
Michael #4: PyOxidizer
  • PyOxidizer is a collection of Rust crates that facilitate building libraries and binaries containing Python interpreters.
  • PyOxidizer is capable of producing a single file executable - with all dependencies statically linked and all resources (like .pyc files) embedded in the executable
  • The Oxidizer part of the name comes from Rust: executables produced by PyOxidizer are compiled from Rust and Rust code is responsible for managing the embedded Python interpreter and all its operations.
  • PyOxidizer is similar in nature to PyInstaller, Shiv, and other tools in this space. What generally sets PyOxidizer apart is
    • Produced executables contain an embedded, statically-linked Python interpreter
    • have no additional run-time dependency on the target system
    • runs everything from memory (as opposed to e.g. extracting Python modules to a temporary directory and loading them from there).
Brian #5: Working With Files in Python
  • by Vuyisile Ndlovu on RealPython
  • Very comprehensive write up on working with files and directories
  • Includes legacy and modern methods.
    • Pay attention to pathlib parts if you are using 3.4 plus
    • Also great for “if you used to do x, here’s how to do it with pathlib”.
  • Included:
    • Directory listings
    • getting file attributes
    • creating directories
    • file name pattern matching
    • traversing directories doing stuff with the files in there
    • creating temp directories and files
    • deleting, copying, moving, renaming
    • archiving with zip and tar including reading those
    • looping over files
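
A few of the operations in that list, done the pathlib way. This is a minimal sketch run inside a throwaway temporary directory; the file names and the python_files and demo helpers are made up for the example.

```python
import shutil
import tempfile
from pathlib import Path


def python_files(root):
    """Recursively list *.py files under root, relative to root."""
    root = Path(root)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.py"))


def demo():
    # Create a temp directory that is deleted when the block exits.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "pkg").mkdir()
        (root / "pkg" / "mod.py").write_text("x = 1\n")
        (root / "main.py").write_text("print('hi')\n")
        (root / "notes.txt").write_text("hello\n")

        # Copy a file, then rename (move) the copy.
        shutil.copy(root / "notes.txt", root / "notes.bak")
        (root / "notes.bak").rename(root / "notes.old")

        # Directory listing and filename pattern matching.
        top_level = sorted(p.name for p in root.iterdir())
        return top_level, python_files(root)
```

pathlib covers listing, matching, and renaming; copying still goes through shutil, which accepts Path objects.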
Michael #6: $ python == $ python3?
  • via David Furphy
  • Homebrew tried this recently & got "persuaded" to reverse.
  • Also in recent discussion of edits to PEP394, GvR said absolutely not now, probably not ever.
  • Guido van Rossum
    • RE: python doesn’t exist on macOS as a command: Did you mean python2 there? In my experience macOS comes with python installed (and invoking Python 2) but no python2 link (hard or soft). In any case I'm not sure how this strengthens your argument.
    • I'm also still unhappy with any kind of endorsement of python pointing to python3. When a user gets bitten by this they should receive an apology from whoever changed that link, not a haughty "the PEP endorses this".
    • Regardless of what macOS does I think I would be happier in a future where python doesn't exist and one always has to specify python2 or python3. Quite possibly there will be an age where Python 2, 3 and 4 all overlap, and EIBTI.
Extras:

Michael: A letter to the Python community in Africa

  • via Anthony Shaw
  • Believe the broader international Python and Software community can learn a lot from what so many amazing people are doing across Africa.
  • e.g. The attendance of PyCon NA was 50% male and 50% female.
Joke: via Luke Russell: A: “Knock Knock” B: “Who’s There" A: ……………………………………………………………………………………….“Java”

Also: Java 4EVER video is amazing: youtube.com/watch?v=kLO1djacsfg



Posted on 26 January 2019 | 8:00 am


#113 Python Lands on the Windows 10 App Store

Sponsored by https://pythonbytes.fm/digitalocean

Brian #1: Advent of Code 2018 Solutions
  • Michael Fogleman
  • Even if you didn’t have time or energy to do the 2018 AoC, you can learn from other people’s solutions. Here’s one set written up in a nice blog post.
Michael #2: Python Lands on the Windows 10 App Store
  • Python Software Foundation recently released Python 3.7 as an app on the official Windows 10 app store.
  • Python 3.7 is now available to install from the Microsoft Store, meaning you no longer need to manually download and install the app from the official Python website.
  • There is one limitation: “Because of restrictions on Microsoft Store apps, Python scripts may not have full write access to shared locations such as TEMP and the registry.”
  • Discussed with Steve Dower over on Talk Python 191
Brian #3: How I Built A Python Web Framework And Became An Open Source Maintainer
  • Florimond Manca
  • Bocadillo - “A modern Python web framework filled with asynchronous salsa”
  • “maintaining an open source project is a marathon, not a sprint.”
  • The end of the article includes tips, recommendations, and tool choices for the following topics:
    • Project definition
    • Marketing & Communication
    • Community
    • Project management
    • Code quality
    • Documentation
    • Versioning and releasing
Michael #4: Python maintainability score via Wily
  • via Anthony Shaw
  • A Python application for tracking, reporting on timing and complexity in tests
  • Easiest way to calculate it is with wily https://github.com/tonybaloney/wily … the metrics are ‘maintainability.mi’ and ‘maintainability.rank’ for a numeric and the A-F scale.
    • Build an index: wily build src
    • Inspect report: wily report file
    • Graph: wily graph file metric
Brian #5: A couple fun awesome lists
  • Awesome Python Security resources
    • Tools
      • web framework hardening, ex: secure.py
      • multi tools
      • static code analysis, ex: bandit
      • vulnerabilities and security advisories
      • cryptography
      • app templates
    • Education
      • lots of resources for learning
    • Companies
  • Awesome Flake8 Extensions
    • clean code
    • testing, including
    • security
    • documentation
    • enhancements
    • copyrights
Michael #6: fastlogging
  • via Robert Young
  • A faster replacement of the standard logging module with a mostly compatible API.
  • For a single log file it is ~5x faster and for rotating log file ~13x faster.
  • It comes with the following features:
    • (colored, if colorama is installed) logging to console
    • logging to file (maximum file size with rotating/history feature can be configured)
    • old log files can be compressed (the compression algorithm can be configured)
    • count same successive messages within a 30s time frame and log only once the message with the counted value.
    • log domains
    • log to different files
    • writing to log files is done in (per file) background threads, if configured
    • configure callback function for custom detection of same successive log messages
    • configure callback function for custom message formatter
    • configure callback function for custom log writer

Joke: >>> import antigravity



Posted on 18 January 2019 | 8:00 am


#112 Don't use the greater than sign in programming

Sponsored by https://pythonbytes.fm/datadog

Brian #1: nbgrader
  • nbgrader: A Tool for Creating and Grading Assignments in the Jupyter Notebook
    • The Journal of Open Source Education, paper accepted 6-Jan-2019
  • nbgrader documentation, including an intro video
  • From the JOSE article:
    • “nbgrader is a flexible tool for creating and grading assignments in the Jupyter Notebook (Kluyver et al., 2016). nbgrader allows instructors to create a single, master copy of an assignment, including tests and canonical solutions. From the master copy, a student version is generated without the solutions, thus obviating the need to maintain two separate versions. nbgrader also automatically grades submitted assignments by executing the notebooks and storing the results of the tests in a database. After auto-grading, instructors can manually grade free responses and provide partial credit using the formgrader Jupyter Notebook extension. Finally, instructors can use nbgrader to leave personalized feedback for each student’s submission, including comments as well as detailed error information.”
  • CS teaching methods have come a long way since I was turning in floppies and code printouts.
Michael #2: profanity-check
  • A fast, robust Python library to check for offensive language in strings.
  • profanity-check uses a linear SVM model trained on 200k human-labeled samples of clean and profane text strings.
  • This makes profanity-check both robust and extremely performant.
  • Other libraries like profanity-filter use more sophisticated methods that are much more accurate but at the cost of performance.
    • profanity-filter runs in 13,000ms vs 24ms for profanity-check in a benchmark
  • Two ways to use:
    • predict(text) → 0 or 1 (1 = bad)
    • predict_prob(text) → a probability between 0 and 1 that the text is offensive (1 = bad)
Brian #3: An Introduction to Python Packages for Absolute Beginners
  • Ever tried to explain the difference between module and package? Between package-in-the-directory-with-init sense and package-you-can-distribute-and-install-with-pip sense? Here’s the article to read beforehand.
  • Modules, packages, using packages, installing, importing, and more.
  • And that’s not even getting into flit and poetry, etc. But it’s a good place to start for people new to Python.
Michael #4: Python Dependencies and IoC
  • via Joscha Götzer
  • Open-closed principle is at work with these and is super valuable to testing (one of the SOLID principles): Software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification.
  • There is a huge debate around why Python doesn’t need DI or Inversion of Control (IoC), and a quick stackoverflow search yields multiple results along the lines of “python is a scripting language and dynamic enough so that DI/IoC makes no sense”. However, especially in large projects, it might reduce the cognitive load and improve the decoupling of individual components.
  • Dependency Injector: I couldn’t get this one to work on windows, as it needs to compile some C libraries and some Visual Studio tooling was missing that I couldn’t really install properly. The library looks quite promising though, but sort of static with heavy usage of containers and not necessarily pythonic.
  • Injector: The library that above mentioned article talks about, a little Java-esque
  • pinject: Has been unmaintained for about 5 years, and only recently got new attention from some open source people who try to port it to python3. A product under Google copyright, and looks quite nice despite the lack of python3 bindings. Probably the most feature-rich of the listed libraries.
  • python-inject: I discovered that one while writing this email, not really sure if it’s any good. Nice use of type annotations and testing features
  • di-py: Only works up to python 3.4, so I’ve also never tried it (I’m one of those legacy python haters, I’m sure you can relate 😄).
  • Serum: This one is a little too explicit to my mind. It makes heavy use of context managers (literally with Context(...): everywhere 😉) and I’m not immediately sure how to work with it. In this way, it is quite powerful though. Interesting use of class decorators.
  • And now on to my favorite and a repeated recommendation of mine around the internet: Haps. This lesser-known, lightweight library is sort of the new kid on the block, and really simple to use. Like some of the other libraries, it uses type annotations to determine the kind of object it is supposed to instantiate, and automatically discovers the required files in your project folder. Haps is very pythonic and fits into apps of any size, helping to ensure modularization, as the only dependency of your modules will be one of the types provided by the library. Pretty good example here.
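
For context on what these libraries automate, here is constructor injection written by hand, with no DI framework at all. It is a minimal sketch: Mailer, SignupService, and FakeMailer are hypothetical names, and none of the libraries listed above is used.

```python
from typing import Protocol


class Mailer(Protocol):
    """The interface the service depends on."""

    def send(self, to: str, body: str) -> None: ...


class SignupService:
    """Depends on the Mailer interface, not on a concrete mail client."""

    def __init__(self, mailer: Mailer) -> None:
        self._mailer = mailer

    def register(self, email: str) -> str:
        self._mailer.send(email, "Welcome!")
        return email


class FakeMailer:
    """Test double that records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, to: str, body: str) -> None:
        self.sent.append((to, body))
```

SignupService is closed for modification but open for extension: a test passes FakeMailer, production code passes a real client, and the service itself never changes. A DI library mainly takes over the wiring step of deciding which implementation to hand in.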
Brian #5: A Gentle Introduction to Pandas
  • Really a gentle introduction to the Pandas data structures Series and DataFrame.
  • Very gentle, with console examples.
  • Create series objects:
    • from an array
    • from an array, and change the indexing
    • from a dictionary
    • from a scalar, cool. didn’t know you could do that
  • Accessing elements in a series
  • DataFrames
    • sorting, slicing
    • selecting by label, position
    • statistics on columns
    • importing and exporting data
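A short sketch of the topics in that list, assuming pandas and NumPy are installed. The data and variable names are invented for illustration and are not taken from the article.

```python
import io

import numpy as np
import pandas as pd

# Series from an array, from an array with custom labels,
# from a dictionary, and from a scalar repeated over an index.
from_array = pd.Series(np.array([10, 20, 30]))
relabeled = pd.Series(np.array([10, 20, 30]), index=["a", "b", "c"])
from_dict = pd.Series({"x": 1, "y": 2})
from_scalar = pd.Series(7, index=["a", "b", "c"])

# Accessing elements by label and by position.
by_label = relabeled["b"]
by_position = relabeled.iloc[0]

# A small DataFrame: sorting, slicing, selecting, and column statistics.
df = pd.DataFrame({"name": ["ann", "bob", "cy"], "score": [3, 1, 2]})
sorted_df = df.sort_values("score")
first_two = df.iloc[:2]            # by position
bob_score = df.loc[1, "score"]     # by label
mean_score = df["score"].mean()

# Exporting to CSV text and importing it back.
csv_text = df.to_csv(index=False)
round_trip = pd.read_csv(io.StringIO(csv_text))
print(round_trip)
```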
Michael #6: Don't use the greater than sign in programming
  • One simple thing that comes up time and time again is the use of the greater than sign as part of a conditional while programming. Removing it cleans up code.
  • Let's say that I want to check that something is between 5 and 10.
  • There are many ways I can do this
    x > 5 and 10 > x
    5 < x and 10 > x
    x > 5 and x < 10
    10 < x and x < 5
    x < 10 and x > 5
    x < 10 and 5 < x
  • Sorry, one of those is incorrect. Go ahead and find out which one
  • If you remove the use of the greater than sign then only 2 options remain
    • x < 10 and 5 < x
    • 5 < x and x < 10
    • The last is nice because x is literally between 5 and 10
  • There is also a nice way of expressing that “x is outside the limits of 5 and 10”
    • x < 5 or 10 < x
    • Again, this expresses it nicely because x is literally outside of 5 to 10.
  • Interesting comment: What is cleaner or easier to read comes down to personal taste. But how to express "all numbers greater than 1" without '>'?
    • ans: 1 < allNumbers
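In Python specifically, chained comparisons take the "x is literally between 5 and 10" idea one step further. A small sketch follows; the function names are made up for illustration.

```python
def is_between(x, low=5, high=10):
    # Chained comparison: reads left to right like a number line.
    return low < x < high


def is_outside(x, low=5, high=10):
    # x sits visually outside the two limits.
    return x < low or high < x


def broken_candidate(x):
    # The incorrect expression from the list: nothing is both above 10 and below 5.
    return 10 < x and x < 5


for value in (4, 5, 7, 10, 11):
    print(value, is_between(value), is_outside(value))
```

Note that `is_outside` is not the exact negation of `is_between`: the endpoints 5 and 10 are neither between nor outside with strict comparisons.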
Extras

Michael

Joke: Harry Potter Parser Tongue via Nick Spirit


Audio Download

Posted on 11 January 2019 | 8:00 am


#111 loguru: Python logging made simple

Sponsored by https://pythonbytes.fm/datadog

Brian #1: loguru: Python logging made (stupidly) simple
  • Finally, a logging interface that is just slightly more syntax than print to do mostly the right thing, and all that fancy stuff like log rotation is easy to figure out.
  • i.e. a logging API that fits in my brain.
  • bonus: README is a nice tour of features with examples.
  • Features:
    • Ready to use out of the box without boilerplate
    • No Handler, no Formatter, no Filter: one function to rule them all
    • Easier file logging with rotation / retention / compression
    • Modern string formatting using braces style
    • Exceptions catching within threads or main
    • Pretty logging with colors
    • Asynchronous, Thread-safe, Multiprocess-safe
    • Fully descriptive exceptions
    • Structured logging as needed
    • Lazy evaluation of expensive functions
    • Customizable levels
    • Better datetime handling
    • Suitable for scripts and libraries
    • Entirely compatible with standard logging
    • Personalizable defaults through environment variables
    • Convenient parser
    • Exhaustive notifier
Michael #2: Python gets a new governance model
  • by Brett Cannon
  • July 2018, Guido steps down
  • Python progress has basically been on hold since then
  • ended up with 7 governance proposals
  • Voting was open to all core developers as we couldn't come up with reasonable criteria that we all agreed on as to what defined an "active" core dev
  • And the winner is ... In the end PEP 8016, the steering council proposal, won.
  • it was a decisive win against second place
  • PEP 8016 is heavily modeled on the Django project's organization (to the point that the PEP had stuff copy-and-pasted from the original Django governance proposal).
    • What it establishes is a steering council of five people who are to determine how to run the Python project. Short of not being able to influence how the council itself is elected (which includes how the electorate is selected), the council has absolute power.
    • While the result of the vote prevents us from ever having the Python project be leaderless again, it doesn't directly solve how to guide the language's design.
  • What's next? The next step is we elect the council. It's looking like nominations will be from Monday, January 07 to Sunday, January 20 and voting from Monday, January 21 to Sunday, February 03
  • A key point I hope people understand is that while we solved the issue of project management that stemmed from Guido's retirement, the council will need to be given some time to solve the other issue of how to manage the design of Python itself.
Brian #3: Why you should be using pathlib
  • Tour of pathlib from Trey Hunner
  • pathlib combines most of the commonly used file and directory operations from os, os.path, and glob.
  • uses objects instead of strings
  • as of Python 3.6, many parts of stdlib support pathlib
  • since pathlib.Path methods return Path objects, chaining is possible
  • convert back to strings if you really need to for pre-3.6 code
  • Examples:
    • make a directory: Path('src/__pypackages__').mkdir(parents=True, exist_ok=True)
    • rename a file: Path('.editorconfig').rename('src/.editorconfig')
    • find some files: top_level_csv_files = Path.cwd().glob('*.csv')
    • recursively: all_csv_files = Path.cwd().rglob('*.csv')
    • read a file: Path('some/file').read_text()
    • write to a file: Path('.editorconfig').write_text('# config goes here')
    • with open(path, mode) as x works with Path objects as of 3.6
  • Follow up article by Trey: No really, pathlib is great
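The one-liners above can be strung together into a runnable sketch. It works inside a temporary directory so nothing in your project is touched; the file and directory names are arbitrary.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Make nested directories in one call; the / operator joins path parts.
    data_dir = root / "data" / "nested"
    data_dir.mkdir(parents=True, exist_ok=True)

    # Write files without an explicit open()/close().
    (root / "top.csv").write_text("a,b\n1,2\n")
    (data_dir / "deep.csv").write_text("c,d\n3,4\n")

    # glob() looks at one level, rglob() searches recursively.
    top_level = sorted(p.name for p in root.glob("*.csv"))
    everything = sorted(p.name for p in root.rglob("*.csv"))

    # Rename (move) a file, then read it with plain open(), which accepts Path objects.
    target = data_dir / "moved.csv"
    (root / "top.csv").rename(target)
    with open(target) as handle:
        first_line = handle.readline().strip()

    # Convert back to a string only where an API really needs one.
    as_string = str(target)

print(top_level, everything, first_line)
```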
Michael #4: Altair and Altair Recipes
  • via Antonio Piccolboni (he wrote altair_recipes)
  • Altair: Declarative statistical visualization library for Python
    • Altair is developed by Jake VanderPlas and Brian Granger
    • By statistical visualization they mean:
      • The data source is a DataFrame that consists of columns of different data types (quantitative, ordinal, nominal and date/time).
      • The DataFrame is in a tidy format where the rows correspond to samples and the columns correspond to the observed variables.
      • The data is mapped to the visual properties (position, color, size, shape, faceting, etc.) using the group-by data transformation.
    • Nice example that I can get behind
    import altair as alt

    # cars = some pandas DataFrame with Horsepower, Miles_per_Gallon, and Origin columns
    alt.Chart(cars).mark_point().encode(
        x='Horsepower',
        y='Miles_per_Gallon',
        color='Origin',
    )
  • altair_recipes
    • Altair allows generating a wide variety of statistical graphics in a concise language, but lacks, by design, pre-cooked and ready to eat statistical graphics, like the boxplot or the histogram.
    • Examples: https://altair-recipes.readthedocs.io/en/latest/examples.html
    • They take a few lines only in altair, but I think they deserve to be one-liners. altair_recipes provides that level on top of altair. The idea is not to provide a multitude of creative plots with fantasy names (the way seaborn does) but a solid collection of classics that everyone understands and cover most major use cases: the scatter plot, the boxplot, the histogram etc.
    • Fully documented, highly consistent API (see next package), 90%+ test coverage, maintainability grade A, this is professional stuff if I may say so myself.
Brian #5: A couple fun pytest plugins
  • pytest-picked
    • Using git status, this plugin allows you to:
      • Run only tests from modified test files
      • Run tests from modified test files first, followed by all unmodified tests
    • Kinda hard to overstate the usefulness of this plugin to anyone developing or debugging a test. Very, very cool.
  • pytest-clarity
    • Colorized left/right comparisons
    • Early in development, but already helpful.
    • I recommend running it with -qq if you don’t normally run with -v/--verbose since it overrides the verbosity currently.
Michael #6: Secure 🔒 headers and cookies for Python web frameworks
  • Python package called Secure, which sets security headers and cookies (as a start) for Python web frameworks.
  • I was listening to the Talk Python To Me episode “Flask goes 1.0” with Flask maintainer David Lord. At the end of the interview he was asked about notable PyPI packages and spoke about Flask-Talisman, a third-party package to set security headers in Flask. As a security professional, it was surprising and encouraging to hear the maintainer of the most popular Python web framework speak passionately about a security package.
  • Caleb had recently been experimenting with emerging Python web frameworks and realized there was a gap in security packages. That inspired him to (humbly) see whether a package could fill that gap; he started with Responder and then expanded to support more frameworks.
  • The outcome was Secure with functions to support aiohttp, Bottle, CherryPy, Falcon, hug, Pyramid, Quart, Responder, Sanic, Starlette and Tornado (most of these, if not all have been featured on Talk Python) and can also be utilized by frameworks not officially supported. The goal is to be minimalistic, lightweight and be implemented in a way that does not disrupt an individual framework’s design.
  • I have had some great feedback and suggestions from the developer and OWASP community, including some awesome discussions with the OWASP Secure Project and the Sanic core team.
  • Added support for Flask and Django too.
  • Secure Cookies is nice in the mix
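For a feel of what "setting security headers" means in practice, here is a framework-agnostic WSGI sketch using only plain Python. It does not use the Secure package or reproduce its API; `add_security_headers` and the particular header values chosen are assumptions for illustration.

```python
# Illustrative values for a few widely used security headers; tune for a real site.
SECURITY_HEADERS = [
    ("Strict-Transport-Security", "max-age=63072000; includeSubDomains"),
    ("X-Content-Type-Options", "nosniff"),
    ("X-Frame-Options", "SAMEORIGIN"),
    ("Referrer-Policy", "no-referrer"),
]


def add_security_headers(app):
    """Hypothetical WSGI middleware that appends any missing security headers."""

    def middleware(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            present = {name.lower() for name, _ in headers}
            extra = [(n, v) for n, v in SECURITY_HEADERS if n.lower() not in present]
            return start_response(status, list(headers) + extra, exc_info)

        return app(environ, patched_start_response)

    return middleware


def hello_app(environ, start_response):
    # Tiny demo application that sets no security headers itself.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# Call the wrapped app directly and capture what it would send.
captured = {}


def fake_start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = dict(headers)


body = b"".join(add_security_headers(hello_app)({}, fake_start_response))
print(captured["headers"])
```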
Extras:

Michael: SQLite bug impacts thousands of apps, including all Chromium-based browsers

Michael: Follow up to our AI and healthcare conversation

  • via Bradley Hintze
  • I found your discussion of deep learning in healthcare interesting, no doubt because that is my area. I am the data scientist for the National Oncology Program at the Veterans Health Administration.
  • I work directly with clinicians and it is my strong opinion that AI cannot take the job from the MD. It will however make caring for patients much more efficient as AI takes care of the low hanging fruit, if you will.
  • Healthcare, believe it or not, is a science and an art. This is why AI is never going to make doctors obsolete. It will, however, make doctors more efficient and demand a more sophisticated doctor -- one that understands AI enough to not only trust it but, crucially, comprehend its limits.

Michael: Upgrade to Python 3.7.2

  • If you install via Homebrew, it’s time for brew update && brew upgrade

Michael: New course!


Audio Download

Posted on 5 January 2019 | 8:00 am


#110 Python Year in Review 2018 Edition

Sponsored by DigitalOcean: pythonbytes.fm/digitalocean

This episode originally aired on Talk Python at talkpython.fm/192.

It's been a fantastic year for Python. Literally, every year is better than the last with so much growth and excitement in the Python space. That's why I've asked two of my knowledgeable Python friends, Dan Bader and Brian Okken, to help pick the top 10 stories from the Python community for 2018.

Guests

10: Python 3.7:

9: Changes in versioning patterns

8: Python is becoming the world’s most popular coding language

7: 2018 was the year data science Pythonistas == web dev Pythonistas

6: Black

5: New PyPI launched!

4: Rise of Python in the embedded world

3: Legacy Python's days are fading?

2: It's the end of innocence for PyPI

1: Guido stepped down as BDFL


Audio Download

Posted on 26 December 2018 | 8:00 am


#109 CPython byte code explorer

Sponsored by DigitalOcean: pythonbytes.fm/digitalocean

Brian #1: Python Descriptors Are Magical Creatures
  • an excellent discussion of understanding @property and Python’s descriptor protocol.
  • discussion includes getter, setter, and deleter methods you can override.
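A compact sketch of both ideas: a hand-written descriptor with get, set, and delete hooks, next to a plain @property. The `Positive` and `Circle` classes are invented for illustration and are not from the article.

```python
class Positive:
    """Data descriptor that only accepts values greater than zero."""

    def __set_name__(self, owner, name):
        # Called once at class creation; remember where to store the value.
        self.public_name = name
        self.private_name = "_" + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not an instance
        return getattr(instance, self.private_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f"{self.public_name} must be positive")
        setattr(instance, self.private_name, value)

    def __delete__(self, instance):
        delattr(instance, self.private_name)


class Circle:
    radius = Positive()  # the descriptor lives on the class

    def __init__(self, radius):
        self.radius = radius  # goes through Positive.__set__

    @property
    def diameter(self):
        # property is itself implemented with the descriptor protocol.
        return self.radius * 2


circle = Circle(2)
print(circle.radius, circle.diameter)
```

Assigning `circle.radius = -1` raises `ValueError`, and `del circle.radius` runs the deleter, after which reading the attribute raises `AttributeError`.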
Michael #2: Data Science Survey 2018 JetBrains
  • JetBrains polled over 1,600 people involved in Data Science and based in the US, Europe, Japan, and China, in order to gain insight into how this industry sector is evolving
  • Key Takeaways
    • Most people assume that Python will remain the primary programming language in the field for the next 5 years.
    • Python is currently the most popular language among data scientists.
    • Data Science professionals tend to use Keras and Tableau, while amateur data scientists are more likely to prefer Microsoft Azure ML.
  • Most common activities among pros and amateurs:
    • Data processing
    • Data visualization
  • Main programming language for data analysis
    • Python 57%
    • R 15%
    • Julia 0%
  • IDEs and Editors
    • Jupyter 43%
    • PyCharm 38%
    • RStudio 23%
Brian #3: cache.py
  • cache.py is a one file python library that extends memoization across runs using a cache file.
  • memoization is an incredibly useful technique that many self taught or on the job taught developers don’t know about, because it’s not obvious.
  • example:
    import cache

    @cache.cache()
    def expensive_func(arg, kwarg=None):
      # Expensive stuff here
      return arg
  • The @cache.cache() function can take multiple arguments.
    • @cache.cache(timeout=20) - Only caches the function for 20 seconds.
    • @cache.cache(fname="my_cache.pkl") - Saves cache to a custom filename (defaults to hidden file .cache.pkl)
    • @cache.cache(key=cache.ARGS[KWARGS,NONE]) - Check against args, kwargs or neither of them when doing a cache lookup.
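To show what "memoization across runs" amounts to, here is a rough sketch of a disk-backed memoizer built on pickle. It is not cache.py's implementation; the `disk_cache` decorator and the temporary file location are assumptions for illustration, and it skips timeouts and key options entirely.

```python
import functools
import os
import pickle
import tempfile


def disk_cache(fname):
    """Hypothetical decorator: keep results in a pickle file so they survive restarts."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Load whatever earlier calls (or earlier runs) already stored.
            store = {}
            if os.path.exists(fname):
                with open(fname, "rb") as handle:
                    store = pickle.load(handle)

            # Arguments must be hashable and picklable to serve as a key.
            key = (args, tuple(sorted(kwargs.items())))
            if key not in store:
                store[key] = func(*args, **kwargs)
                with open(fname, "wb") as handle:
                    pickle.dump(store, handle)
            return store[key]

        return wrapper

    return decorator


calls = []  # records how often the real function body runs
cache_file = os.path.join(tempfile.mkdtemp(), "demo_cache.pkl")


@disk_cache(cache_file)
def expensive_func(arg, kwarg=None):
    calls.append(arg)
    return arg * 2


first = expensive_func(21)
second = expensive_func(21)  # served from the pickle file
print(first, second, calls)
```

Within a single run, functools.lru_cache from the standard library already covers in-memory memoization; the file is what lets the cached results outlive the process.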
Michael #4: Setting up the data science tools
  • part of a larger video series
  • walks through setting up the tools, ultimately working up to Keras
  • Tools
    • anaconda
    • tensorflow
    • Jupyter
    • Keras
  • good for true beginners
  • set up and activate a conda environment
  • Start up a notebook and switch envs
  • use conda, rather than pip
Brian #5: chartify
  • “Python library that makes it easy for data scientists to create charts.”
  • from the docs:
    • Consistent input data format: Spend less time transforming data to get your charts to work. All plotting functions use a consistent tidy input data format.
    • Smart default styles: Create pretty charts with very little customization required.
    • Simple API: We've attempted to make the API as intuitive and easy to learn as possible.
    • Flexibility: Chartify is built on top of Bokeh, so if you do need more control you can always fall back on Bokeh's API.
Michael #6: CPython byte code explorer
  • JupyterLab extension to inspect Python Bytecode
  • via Anton Helm
  • by Jeremy Tuloup
  • You’ll see exactly what it’s about if you watch the GIF movie at the github repo.
  • Can’t think of a better way to understand Python bytecode quickly than to play a little with this
  • Comparing versions of CPython: If you have several versions of Python installed on your machine (let's say in different conda environments), you can use the extension to check how the bytecode might differ.
  • Nice visualization of different performance aspects of while vs. for at the end
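If you want to poke at bytecode without JupyterLab, the standard library's dis module gives a plain-text view of the same thing. A small sketch comparing a while loop with a for loop follows; the function names are made up, and the exact instructions differ between CPython versions.

```python
import dis


def count_with_while(n):
    i = 0
    while i < n:
        i += 1
    return i


def count_with_for(n):
    i = 0
    for _ in range(n):
        i += 1
    return i


def opnames(func):
    """Return the list of bytecode instruction names for func."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


while_ops = opnames(count_with_while)
for_ops = opnames(count_with_for)

# Instruction counts vary by CPython version, so just print them.
print(len(while_ops), "instructions for while,", len(for_ops), "for for")

# Full human-readable disassembly of one of the functions.
dis.dis(count_with_for)
```

The for loop drives iteration with a dedicated FOR_ITER instruction, while the while loop has to compare and jump explicitly on every pass.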

Brian:


Audio Download

Posted on 18 December 2018 | 8:00 am