Python Bytes

by Michael Kennedy and Brian Okken

Python Bytes is a weekly podcast hosted by Michael Kennedy and Brian Okken. The show is a short discussion on the headlines and noteworthy news in the Python, developer, and data science space.

  

Latest Episodes

#99 parse - the regex antidote in Python

Sponsored by DigitalOcean: pythonbytes.fm/digitalocean

Forbes cyber article: Cyber Saturday—Doubts Swirl Around Bloomberg's China Chip Hack Report

Brian #1: parse

  • parse() is the opposite of format()
  • regex not required for parsing strings.
  • Exports parse(), search(), findall(), and with_pattern()
    >>> parse("It's {}, I love it!", "It's spam, I love it!")
    <Result ('spam',) {}>
    >>> search('Age: {:d}\n', 'Name: Rufus\nAge: 42\nColor: red\n')
    <Result (42,) {}>
    >>> ''.join(r.fixed[0] for r in findall(">{}<", "<p>the <b>bold</b> text</p>"))
    'the bold text'
  • Can also compile for repeated use.
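For comparison, here is a sketch of the hand-written regex that the search() example above replaces, using only the standard library's re module. The helper name find_age is hypothetical, and the regex is a simplification that only handles unsigned integers. The point is that the format-style pattern 'Age: {:d}\n' stands in for both the regex and the int() conversion.

```python
import re


def find_age(text):
    """Return the integer after 'Age: ' on a line, or None if absent."""
    match = re.search(r"Age: (\d+)\n", text)
    # re hands back strings, so the int conversion is manual;
    # with parse, the {:d} field does this step for you.
    return int(match.group(1)) if match else None


age = find_age("Name: Rufus\nAge: 42\nColor: red\n")
print(age)
```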

Michael #2: fman Build System

  • fbs lets you create GUI apps for Windows, Mac and Linux
  • via Michael Herrmann
  • Build Python GUIs, with Qt – in minutes
  • Write a desktop application with PyQt or Qt for Python.
  • Use fbs to package and deploy it on Windows, Mac and Linux.
  • Avoid months of painful work with the proven solutions provided by fbs.
  • Easy Packaging: Unlike other solutions, fbs makes packaging easy. Create installers for your app in seconds and distribute them to your users – on Windows, Mac and Linux!
  • Open Source: fbs's source code is available on GitHub. You can use it for free in open source projects licensed under the GPL. Commercial licenses are also offered.
    • Free under the GPL. If that's too restrictive, a commercial license is 250 Euros once.
    • PyQt's licensing is similar (GPL/Commercial). A license for it is € 450 (source).
  • Came from fman, a dual-pane file manager for Mac, Windows and Linux

Brian #3: fastjsonschema

  • Validate JSON against a schema, quickly.

Michael #4: IPython 7.0, Async REPL

  • via Nick Spirit
  • Article by Matthias Bussonnier
  • We are pleased to announce the release of IPython 7.0, the powerful Python interactive shell that goes above and beyond the default Python REPL with advanced tab completion, syntactic coloration, and more.
  • Not having to support Python 2 allowed us to make full use of new Python 3 features and bring never before seen capability in a Python Console, see the Python 3 Statement.
  • One of the core features we focused on for this release is the ability to (ab)use the async and await syntax available in Python 3.5+.
  • TL;DR: You can now use async/await at the top level in the IPython terminal and in the notebook, it should — in most of the cases — “just work”.
  • The only thing you need to remember is: If it is an async function you need to await it.
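A minimal sketch of what this changes in practice. In a plain script a coroutine has to be driven by an event loop, here via asyncio.run (Python 3.7+). In the IPython 7.0 terminal or notebook, per the announcement, the last step could instead be typed as a bare `await fetch_answer()` at the prompt. The coroutine name fetch_answer is hypothetical.

```python
import asyncio


async def fetch_answer():
    """Hypothetical coroutine standing in for any async call."""
    await asyncio.sleep(0)  # yield to the event loop once
    return 42


# Plain Python: start an event loop explicitly to run the coroutine.
result = asyncio.run(fetch_answer())
print(result)
```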

Brian #5: molten

Michael #6: A Python love letter

  • Dear Python, where have you been all my life? (reddit thread)
  • I am NOT a developer. But, I've tinkered with programming (in BASIC, Visual Basic, Perl, now Python) when needed over the years
  • I decided that I needed to script something, and hoped that learning how to do it in Python was going to take me significantly less time than doing it manually - with the benefit of future timesavings. No, I didn't go from 0 to production in a day. But if my coworkers will leave me alone, I might be in production by the end of the day tomorrow.
  • What I'm working on today isn't super complex — But putting together what I've done so far has just been a complete joy.
  • Overall it feels natural, intuitive, and relatively easy to understand and write the code for the basic things I'm doing - I haven't had this much fun doing stuff with code since the days fooling around with BASIC in my teens.
  • Feedback / comments
    • Welcome to the club. I came up on c++; my job highly trained me in C and assembly but every project I touch I think, wait, "we can do 95% this in python". And we do.
    • I used to have a chip on my shoulder. I wanted to do things the hard way to truly understand them. I went with C++. … I learned that doing things the smart way was better than doing things the hard way and didn't interfere with learning.
    • I felt the exact same way when I finally decided to learn it. It's like a breath of fresh air. Sadly there are few things in my life that made me feel like this, Python and Bitcoin both give me the same levels of enjoyment. … I've used Java, Groovy, Scala, Objective-C, C, C++, C#, Perl and Javascript in a professional capacity over the years and nothing feels as natural to me as Python does. The developers truly deserve any donations they get for making it. … Hell my next two planned tattoos are bitcoin and python logos on my wrists.
    • I taught myself Python a little over 3 years ago and I quickly went from not being a programmer to being a programmer. … However the real popularity of Python comes from the depth and quality of 3rd party libraries and how easy they are to install.



Audio Download

Posted on 16 October 2018 | 8:00 am


#98 Python-Electron as a Python GUI

Sponsored by DigitalOcean: pythonbytes.fm/digitalocean

Brian #1: Making Etch-a-Sketch Art With Python

  • Really nice write-up of methodically solving a problem by simplifying the problem space, figuring out which parts need to be solved, grabbing off-the-shelf bits that can help, and putting it all together.
  • Plus it would be a fun weekend (or several) project with kids helping.
  • Controlling the Etch-a-Sketch
    • Raspberry Pi, motors, cables, wood fixture
    • Software to control the motors
  • Picture simplification using Canny edge detection.
  • Converting lines to motor control using path finding with the networkx library.
  • Example results included in article.
  • Pentium song: https://www.youtube.com/watch?v=qpMvS1Q1sos

Michael #2: Dropbox moves to Python 3

  • They just rolled out one of the largest Python 3 migrations ever
  • Dropbox is one of the most popular desktop applications in the world
  • Much of the application is written using Python. In fact, Drew’s very first lines of code for Dropbox were written in Python for Windows using venerable libraries such as pywin32.
  • Though we’ve relied on Python 2 for many years (most recently, we used Python 2.7), we began moving to Python 3 back in 2015.
  • If you’re using Dropbox today, the application is powered by a Dropbox-customized variant of Python 3.5.
  • Why Python 3?
    • Exciting new features: Type annotations and async & await
    • Aging toolchains: As Python 2 has aged, the set of toolchains initially compatible for deploying it has largely become obsolete
  • Embedding Python
    • To solve the build and deploy problem, we decided on a new architecture to embed the Python runtime in our native application.
    • Deep integration with the OS (e.g. smart sync) means native apps are required
  • In future posts, we’ll look at:
    • How we report crashes on Windows and macOS and use them to debug both native and Python code.
    • How we maintained a hybrid Python 2 and 3 syntax, and what tools helped.
    • Our very best bugs and stories from the Python 3 migration.

Brian #3: Resources for PyCon that relate to really any talk venue

Michael #4: Electron as GUI of Python Applications

  • via Andy Bulka
  • Electron Python is a template of code where you use Electron (nodejs + chromium) as a GUI talking to Python 3 as a backend via zerorpc. Similar to Eel but much more capable e.g. you get proper native operating system menus — and users don’t need to have Chrome already installed.
  • Needs to run zerorpc server and then start electron separately — can be done via the node backend
  • using Electron as a GUI toolkit gets you
    • native menus, notifications
    • installers, automatic updates to your app
    • debugging and profiling that you are used to, using the Chrome debugger
    • ES6 syntax (a cleaner Javascript with classes, module imports, no need for semicolons etc.). Squint, look sideways, and it kinda looks like Python… ;-)
    • the full power of nodejs and its huge npm package repository
    • the large community and ecosystem of Electron
  • How to package this all?
  • Building a deployable Python-Electron App post by Andy Bulka
    • One of the great things about using Electron as a GUI for Python is that you get to use cutting edge web technologies and you don’t have to learn some old, barely maintained GUI toolkit
    • How much momentum, money, time and how many developer minds are focused on advancing web technologies? Answer: it’s staggeringly huge.
    • Compare this with the number of people maintaining old toolkits from the 90’s e.g. wxPython? Answer: perhaps one or two people in their spare time.
    • Which would you rather use?
    • Final quote: And someone please wrap Electron-Python into an IDE so that in the future all we have to do is click a ‘build’ button — like we could 20 years ago. :-)

Brian #5: pluggy: A minimalist production ready plugin system

  • docs
  • plugin management and hook system used by pytest
  • A separate package to allow other projects to include plugin capabilities without exposing unnecessary state or behavior of the host project.

Michael #6: How China Used a Tiny Chip to Infiltrate U.S. Companies

  • via Eduardo Orochena
  • The attack by Chinese spies reached almost 30 U.S. companies, including Amazon and Apple, by compromising America’s technology supply chain, according to extensive interviews with government and corporate sources.
  • In 2015, Amazon.com Inc. began quietly evaluating a startup called Elemental Technologies, a potential acquisition to help with a major expansion of its streaming video service, known today as Amazon Prime Video. (from Portland!)
  • To help with due diligence, AWS, which was overseeing the prospective acquisition, hired a third-party company to scrutinize Elemental’s security
  • servers were assembled for Elemental by Super Micro Computer Inc., a San Jose-based company (commonly known as Supermicro) that’s also one of the world’s biggest suppliers of server motherboards
  • Nested on the servers’ motherboards, the testers found a tiny microchip, not much bigger than a grain of rice, that wasn’t part of the boards’ original design.
  • Amazon reported the discovery to U.S. authorities, sending a shudder through the intelligence community. Elemental’s servers could be found in Department of Defense data centers, the CIA’s drone operations, and the onboard networks of Navy warships. And Elemental was just one of hundreds of Supermicro customers.
  • During the ensuing top-secret probe, which remains open more than three years later, investigators determined that the chips allowed the attackers to create a stealth doorway into any network that included the altered machines. Multiple people familiar with the matter say investigators found that the chips had been inserted at factories run by manufacturing subcontractors in China.
  • One government official says China’s goal was long-term access to high-value corporate secrets and sensitive government networks. No consumer data is known to have been stolen.
  • American investigators eventually figured out who else had been hit. Since the implanted chips were designed to ping anonymous computers on the internet for further instructions, operatives could hack those computers to identify others who’d been affected.



Audio Download

Posted on 8 October 2018 | 8:00 am


#97 Java goes paid

Sponsored by DataDog -- pythonbytes.fm/datadog

Brian #1: Making a PyPI-friendly README

  • twine now checks for rendering problems with README
  • Install the latest version of twine; version 1.12.0 or higher is required: pip install --upgrade twine
  • Build the sdist and wheel for your project as described under Packaging your project.
  • Run twine check on the sdist and wheel: twine check dist/*
  • This command will report any problems rendering your README. If your markup renders fine, the command will output Checking distribution FILENAME: Passed.

Michael #2: Java goes paid

  • Oracle's new Java SE subs: Code and support for $25/processor/month
  • Prepare for audit after inevitable change, says Oracle licensing consultant
  • There’s also a little bit of stick to go with the carrot, because come January 2019 Java SE 8 on the desktop won’t be updated any more … unless you buy a sub.
  • The short version is that every commercial enterprise needs to look at their Java SE (Standard Edition) usage to see if they need to do something with licensing.

Brian #3: Absolute vs Relative Imports in Python

  • Review of how imports are used, along with subpackages and from
    • ex: from package.sub import func
  • Relative: what does this mean:
    from .some_module import some_class
    from ..some_package import some_function
    from . import some_class
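
A sketch of how those relative forms resolve, assuming a hypothetical layout built in a temporary directory: a top-level package demo_pkg containing some_module.py, plus a subpackage demo_pkg.sub. One dot means "the package this module lives in", two dots mean "its parent package". All names here are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout, created on the fly:
#   demo_pkg/__init__.py
#   demo_pkg/some_module.py   defines some_class
#   demo_pkg/sub/__init__.py
#   demo_pkg/sub/sibling.py   defines helper
#   demo_pkg/sub/user.py      uses the relative imports
root = Path(tempfile.mkdtemp())
files = {
    "demo_pkg/__init__.py": "",
    "demo_pkg/some_module.py": "class some_class:\n    pass\n",
    "demo_pkg/sub/__init__.py": "",
    "demo_pkg/sub/sibling.py": "def helper():\n    return 'helper'\n",
    "demo_pkg/sub/user.py": (
        "from .sibling import helper  # one dot: same package (demo_pkg.sub)\n"
        "from ..some_module import some_class  # two dots: parent (demo_pkg)\n"
        "from . import sibling  # a whole module from the same package\n"
    ),
}
for name, source in files.items():
    path = root / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(source)

# Make the temporary directory importable, then load the module.
sys.path.insert(0, str(root))
importlib.invalidate_caches()
user = importlib.import_module("demo_pkg.sub.user")

print(user.some_class.__module__)
print(user.sibling.__name__)
print(user.helper())
```

Relative imports only work inside a package, which is why the sketch imports demo_pkg.sub.user by its full name rather than running user.py directly as a script.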

Michael #4: pyxel - A retro game engine for Python

  • Thanks to its simple specifications inspired by retro gaming consoles, such as displaying only 16 colors and playing back only 4 sounds at the same time, you can feel free to enjoy making pixel art style games.
  • Runs on Windows, Mac, and Linux
  • Code is written in Python 3
  • After installing Pyxel, the examples of Pyxel will be copied to the current directory with the following command: install_pyxel_examples

Brian #5: Click 7.0 Released

  • Changelog
  • Drop support for Python 2.6 and 3.3.
  • Add native ZSH autocompletion support.
  • Usage errors now hint at the --help option
  • Really long list of changes since the last release at the beginning of 2017

Michael #6: How we spent 30k USD in Firebase in less than 72 hours

  • the largest crowdfunding campaign in Colombia, collecting 3 times more than the previous record so far in only two days!
  • Run on the Vaki platform -- subject of this article
  • We had reached more than 2 million sessions, more than 20 million pages visited and received more than 15 thousand supports. This averages to a thousand users active on the site in average and collecting more than 20 supports per minute.
  • Site was running slow; tried things like upgrading the frontend frameworks
  • Logged into Firebase: had spent $30,356.56 USD in just 72 hours! Going at $600/hr
  • All came down to a very bad implementation of this.loadPayments().
  • Comments are interesting
  • It could happen to any of us, it happened to me this month.



Audio Download

Posted on 28 September 2018 | 8:00 am


#96 Python Language Summit 2018

Sponsored by DigitalOcean -- pythonbytes.fm/digitalocean

Brian #1: Plumbum: Shell Combinators and More

  • Toolbox of goodies to do shell-like things from Python.
  • “The motto of the library is “Never write shell scripts again”, and thus it attempts to mimic the shell syntax (shell combinators) where it makes sense, while keeping it all Pythonic and cross-platform.”

Example:

>>> from plumbum.cmd import ls, grep, wc, cat, head
>>> chain = ls["-a"] | grep["-v", "\\.py"] | wc["-l"]
>>> print(chain)
/bin/ls -a | /bin/grep -v '\.py' | /usr/bin/wc -l
>>> chain()
'13\n'
>>> ((cat < "setup.py") | head["-n", 4])()
'#!/usr/bin/env python\nimport os\n\ntry:\n'
>>> (ls["-a"] > "file.list")()
''
>>> (cat["file.list"] | wc["-l"])()
'17\n'

Michael #2: Windows 10 Linux subsystem for Python developers

  • via Marcus Sherman
  • “One of the hardest days in teaching introduction to bioinformatics material is the first day: Setting up your machine.”
  • While I have seen a very large bias towards Macs in academia, there are plenty of people that keep their Windows machines as a badge of pride... Marcus included.
  • Even though Anaconda is cross platform and helpful, how does this work on Windows?
    • python3 -m venv .env and source .env/bin/activate?
    • Spoiler alert: Not well.
  • Step by step getting Ubuntu on Windows
  • Shows how to set up an X server

Brian #3: Type hints cheat sheet (Python 3)

  • Do you remember how to type hint duck types?
    • Something accessed like an array (list or tuple or …) and holds strings → Sequence[str]
    • Something that works like a dictionary mapping integers to strings → Mapping[int, str]
  • As I’m adding more and more typing to interface functions, I keep this cheat sheet bookmarked.
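
A small sketch of the two duck-type hints above in use. The function names join_names and label_for are hypothetical. The annotations are not enforced at runtime; a static checker such as mypy is what reads them.

```python
from typing import Mapping, Sequence


def join_names(names: Sequence[str]) -> str:
    """Accepts a list, a tuple, or any other sequence that holds strings."""
    return ", ".join(names)


def label_for(code: int, labels: Mapping[int, str]) -> str:
    """Accepts a dict or any other read-only mapping from int to str."""
    return labels.get(code, "unknown")


print(join_names(["spam", "eggs"]))   # a list works
print(join_names(("spam", "eggs")))   # so does a tuple
print(label_for(404, {200: "OK", 404: "Not Found"}))
```

Annotating parameters with the abstract Sequence and Mapping rather than list and dict keeps the interface open to any caller-supplied container that behaves the right way.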

Michael #4: Python driving new languages

  • Here are five predictions for what programming will look like 10 years from now.
    • Programming will be more abstract
    • Trends like serverless technologies, containers, and low code platforms suggest that many developers may work at higher levels of abstraction in the future
    • AI will become part of every developer's toolkit—but won't replace them
    • A universal programming language will arise
    • To reap the benefits of emerging technologies like AI, programming has to be easy to learn and easy to build upon
    • "Python may be remembered as being the great-great-great grandmother of languages of the future, which underneath the hood may look like the English language, but are far easier to use,"
    • Every developer will need to work with data
    • Programming will be a core tenet of the education system

Brian #5: asyncio documentation rewritten from scratch

  • twitter thread by Yury Selivanov
    • “Big news! asyncio documentation has been rewritten from scratch! Read the new version here: https://docs.python.org/3/library/asyncio.html …. Huge thanks to @WillingCarol, @elprans, and @andrew_svetlov for support, ideas, and reviews!”
    • “BTW, this is just the beginning. We'll continue to refine and update the documentation. Next up is adding two tutorials: one teaching high-level concepts and APIs, and another teaching how to use protocols and transports. A section about asyncio architecture is also planned.”
    • “And this is just the beginning not only for asyncio documentation, but for asyncio itself. Just for Python 3.8 we plan to add:
      • new streaming API
      • TaskGroups and cancel scopes
      • Supervisors and tracing API
      • new SSL implementation
      • many usability improvements”

Michael #6: The 2018 Python Language Summit

  • Here are the sessions:
    • Subinterpreter support for Python: a way to have a better story for multicore scalability using an existing feature of the language.
      • Subinterpreters will allow multiple Python interpreters per process and there is the potential for zero-copy data sharing between them.
      • But subinterpreters share the GIL, so that needs to be changed in order to make it multicore friendly.
    • Modifying the Python object model: looking at changes to CPython data structures to increase the performance of the interpreter.
      • via Instagram and Carl Shapiro
      • By modifying the Python object model fairly substantially, they were able to roughly double the performance
      • A little controversial
      • Shapiro's overall point was that he felt Python sacrificed its performance for flexibility and generality, but the dynamic features are typically not used heavily in performance-sensitive production workloads.
    • A Gilectomy update: a status report on the effort to remove the GIL from CPython.
      • Larry Hastings updated attendees on the status of his Gilectomy project.
      • Since his status report at last year's summit, little has happened, which is part of why the session was so short. He hasn't given up on the overall idea, but it needs a new approach.
    • Using GitHub Issues for Python: a discussion on moving from bugs.python.org to GitHub Issues.
    • Shortening the Python release schedule: a discussion on possibly changing from an 18-month to a yearly cadence.
      • The Python release cycle has an 18-month cadence; a new major release (e.g. Python 3.7) is made roughly on that schedule.
      • But Łukasz Langa, who is the release manager for Python 3.8 and 3.9, would like to see things move more quickly—perhaps on a yearly cadence.
    • Unplugging old batteries: should some older, unloved modules be removed from the standard library?
      • Python is famous for being a "batteries included" language—its standard library provides a versatile set of modules with the language
      • There may be times when some of those batteries have reached their end of life.
      • Christian Heimes wanted to suggest a few batteries that may have outlived their usefulness and to discuss how the process of retiring standard library modules should work.
    • Linux distributions and Python 2: the end of life for Python 2 is coming, what distributions are doing to prepare.
      • To figure out how to help the Python downstreams so that Python 2 can be fully discontinued.
    • Python static typing update: a look at where static typing is now and where it is headed for Python 3.7.
      • Started things off by talking about stub files, which contain type information for libraries and other modules.
      • Right now, static typing is only partially useful for large projects because they tend to use a lot of packages from the Python Package Index (PyPI), which has limited stub coverage. There are only 35 stubs for third-party modules in the typeshed library, which is Python's stub repository.
      • He suggested that perhaps a centralized library for stubs is not the right development model. Some projects have stubs that live outside of typeshed, such as Django and SQLAlchemy.
      • PEP 561 ("Distributing and Packaging Type Information") will provide a way to pip install stubs from packages that advertise that they have them.
    • Python virtual environments: a short session on virtual environments and ideas for other ways to isolate local installations.
      • Steve Dower brought up the shortcomings of Python virtual environments, which are meant to create isolated installations of the language and its modules.
      • Thomas Wouters defended virtual environments in a response: The correct justification is that for the average person, not using a virtualenv all too soon creates confusion, pain, and very difficult to fix breakage. Starting with a virtualenv is the easiest way to avoid that, at very little cost.
      • But Beazley and others (including Dower) think that starting Python tutorials or training classes with a 20-minute digression on setting up a virtual environment is wasted time.
    • PEP 572 and decision-making in Python: a discussion of the controversy around PEP 572 and how to avoid a repeat of the thread explosion that it caused.
      • The "PEP 572 mess" was the topic of a 2018 Python Language Summit session led by benevolent dictator for life (BDFL) Guido van Rossum.
    • Getting along in the Python community: trying to find ways to keep the mailing list welcoming even in the face of rudeness.
      • About tkinter…
    • Mentoring and diversity for Python: a discussion on how to increase the diversity of the core development team.
      • Victor Stinner outlined some work he has been doing to mentor new developers on their path toward joining the core development ranks
      • Mariatta Wijaya gave a very personal talk that described the diversity problem while also providing some concrete action items that the project and individuals could take to help make Python more welcoming to minorities.

Extras

Listener feedback: CUDA is NVIDIA only, so no MacBook Pro unless you have a custom external GPU.


Audio Download

Posted on 22 September 2018 | 8:00 am


#95 Unleash the py-spy!

Sponsored by DataDog -- pythonbytes.fm/datadog

Brian #1: dataset: databases for lazy people

  • dataset provides a simple abstraction layer that removes most direct SQL statements without the necessity for a full ORM model - essentially, databases can be used like a JSON file or NoSQL store.
  • A simple data loading script using dataset might look like this:
    import dataset

    db = dataset.connect('sqlite:///:memory:')

    table = db['sometable']
    table.insert(dict(name='John Doe', age=37))
    table.insert(dict(name='Jane Doe', age=34, gender='female'))

    john = table.find_one(name='John Doe')
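
For contrast, here is a rough sketch of the same load written against the standard library's sqlite3 module, to show the SQL that dataset keeps out of sight. This is not how dataset is implemented. The explicit CREATE TABLE and ALTER TABLE steps are assumptions about what a hand-written version would need, since the second row introduces a gender column that the first row did not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows can be indexed by column name

# The schema has to be declared up front...
conn.execute(
    "CREATE TABLE sometable (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)"
)
conn.execute("INSERT INTO sometable (name, age) VALUES (?, ?)", ("John Doe", 37))

# ...and changed by hand when a new field shows up.
conn.execute("ALTER TABLE sometable ADD COLUMN gender TEXT")
conn.execute(
    "INSERT INTO sometable (name, age, gender) VALUES (?, ?, ?)",
    ("Jane Doe", 34, "female"),
)

# Rough equivalent of table.find_one(name='John Doe').
john = conn.execute(
    "SELECT * FROM sometable WHERE name = ? LIMIT 1", ("John Doe",)
).fetchone()
print(dict(john))
```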

Michael #2: CuPy GPU NumPy

  • A NumPy-compatible matrix library accelerated by CUDA
  • How many cores does a modern GPU have?
  • CuPy's interface is highly compatible with NumPy; in most cases it can be used as a drop-in replacement.
  • You can easily make a custom CUDA kernel if you want to make your code run faster, requiring only a small code snippet of C++. CuPy automatically wraps and compiles it to make a CUDA binary
  • PyCon 2018 presentation: Shohei Hido - CuPy: A NumPy-compatible Library for GPU
  • Code example
    >>> # This will run on your GPU!
    >>> import cupy as np # This is the only non-NumPy line

    >>> x = np.arange(6).reshape(2, 3).astype('f')
    >>> x
    array([[ 0.,  1.,  2.],
           [ 3.,  4.,  5.]], dtype=float32)
    >>> x.sum(axis=1)
    array([  3.,  12.], dtype=float32)           

Brian #3: Automate Python workflow using pre-commits

  • We covered pre-commit in episode 84, but I still had trouble getting my head around it.
  • This article by LJ Miranda does a great job with the workflow introduction and configuration necessary to get pre-commit working for black and flake8.
  • Includes a nice visual of the flow.
  • Demo of it all in action with a short video.

Michael #4: py-spy

  • Sampling profiler for Python programs
  • Written by Ben Frederickson
  • Lets you visualize what your Python program is spending time on without restarting the program or modifying the code in any way.
  • Written in Rust for speed
  • Doesn't run in the same process as the profiled Python program
  • Does NOT interrupt the running program in any way.
  • This means Py-Spy is safe to use against production Python code.
  • The default visualization is a top-like live view of your python program
  • How does py-spy work? Py-spy works by directly reading the memory of the python program using the process_vm_readv system call on Linux, the vm_read call on OSX or the ReadProcessMemory call on Windows.

Brian #5: SymPy is a Python library for symbolic mathematics

  • “Symbolic computation deals with the computation of mathematical objects symbolically. This means that the mathematical objects are represented exactly, not approximately, and mathematical expressions with unevaluated variables are left in symbolic form.”
  • example:
    >>> from sympy import integrate, sin, oo, symbols
    >>> x = symbols('x')
    >>> integrate(sin(x**2), (x, -oo, oo))
    √2⋅√π
    ─────
      2
  • examples on site are interactive so you can play with it without installing anything.

Michael #6: Starlette ASGI web framework

Extras:

Michael: PyCon 2019 dates out, put them on your calendar!

  • Tutorials: May 1-2 • Wednesday, Thursday
  • Talks and Events: May 3–5 • Friday, Saturday, Sunday
  • Sprints: May 6–9 • Monday through Thursday

Listener follow up on git pre-commit hooks util: pre-commit package

  • Matthew Layman, @mblayman
  • Heard the discussion about Git commit hooks at the end. I wanted to bring up pre-commit as an interesting project (written in Python!) that's useful for Git commit hooks.
  • tl;dr:
    • $ pip install pre-commit
    • $ ... create a .pre-commit-config.yaml
    • $ pre-commit install # This is a one time operation.
  • pre-commit's job is to manage a project's Git commit hooks. We use this on my team at work and the devs only need to run pre-commit install. This saves us from a bunch of failing CI builds where flake8 or other code style checks would fail.
  • We use pre-commit to run flake8 and black before allowing a commit to proceed. Some projects have a pre-commit configuration to use right out of the box (e.g., black https://github.com/ambv/black#version-control-integration).

Listener: You don't need that (pattern)

  • John Tocher
  • PyCon AU talk called “You don't need that” by Christopher Neugebauer. It was an interesting take on why, with a modern and powerful language like Python, you may not need the conventionally described design patterns, à la the "Gang of Four".


Audio Download

Posted on 15 September 2018 | 8:00 am


#94 Why don't you like notebooks?

Sponsored by DigitalOcean -- pythonbytes.fm/digitalocean

Brian #1: Python Patterns

Michael #2: Arctic: Millions of rows a sec (time data)

  • Arctic is a high-performance datastore for numeric data. It supports Pandas, numpy arrays and pickled objects out-of-the-box, with pluggable support for other data types and optional versioning.
  • Arctic can query millions of rows per second per client, achieves ~10x compression on network bandwidth, ~10x compression on disk, and scales to hundreds of millions of rows per second per MongoDB instance.
  • Arctic has been under active development at Man AHL since 2012.
  • Super fast, some latency numbers:
    • 1xDay Data: 4 ms for 10k rows (vs 2,210 ms from SQL Server)
    • Tick Data: 1 s for 3.5 MB (Python) or 15 MB (Java), vs 15-40 s from “other tick”
  • Versioned data
  • Built on MongoDB
  • Slides
  • Based on pandas
  • Tested with pytest

Brian #3: PyCon Australia videos

Michael #4: GAE: Introducing App Engine Second Generation runtimes and Python 3.7

  • Today, Google Cloud is announcing the availability of Second Generation App Engine standard runtimes, a significant upgrade to the platform that allows you to easily run web apps using up-to-date versions of popular languages, frameworks and libraries.
  • Python 3.7 is one of the new Second Generation runtimes that we announced at Cloud Next.
  • Based on technology from the gVisor container sandbox, these Second Generation runtimes eliminate many previous App Engine restrictions, giving you the ability to write portable web apps and microservices that take advantage of App Engine's unique auto-scaling, built-in security and pay-per-use billing model.
  • This new runtime allows you to take advantage of Python's vibrant ecosystem of open-source libraries and frameworks. While the Python 2 runtime only allowed the use of specific versions of whitelisted libraries, Python 3 supports arbitrary third-party libraries, including those that rely on C code and native extensions. Just add Django 2.0, NumPy, scikit-learn or your library of choice to a requirements.txt file. App Engine will install these libraries in the cloud when you deploy your app.

Brian #5: I don’t like notebooks

Michael #6: PEP 8000 -- Python Language Governance Proposal Overview

  • This PEP provides an overview of the selection process for a new model of Python language governance in the wake of Guido's retirement. Once the governance model is selected, it will be codified in PEP 13.
  • PEPs in the lower 8000s describe the general process for selecting a governance model.
    • PEP 8001 - Python Governance Voting Process
    • PEP 8002 - Open Source Governance Survey
  • PEPs in the 8010s describe the actual proposals for Python governance.

Extras

  • Free Brian Granger ACM webcast on Jupyter Friday
  • TIOBE jump to #3: https://www.tiobe.com/tiobe-index/


Audio Download

Posted on 6 September 2018 | 8:00 am


#93 Looking like there will be a PyBlazor!

Sponsored by DataDog -- pythonbytes.fm/datadog

Brian #1: Replacing Bash Scripting with Python.

  • reading & writing files
  • CLI’s and working with stdin, stdout, stderr
  • Path and shutil
  • replacing sed, grep, awk, with regex
  • running processes
  • dealing with datetime
  • see also:
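
A short sketch touching several of the topics above with only the standard library: pathlib for reading and writing files, re in place of grep, shutil for copying, and subprocess for running a process. The file names and log contents are made up for illustration.

```python
import re
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())

# Reading and writing files with pathlib.
log = workdir / "app.log"
log.write_text("INFO start\nERROR disk full\nINFO done\nERROR timeout\n")

# grep-like filtering with a regex.
errors = [
    line for line in log.read_text().splitlines() if re.search(r"^ERROR\b", line)
]

# cp-like copy with shutil.
backup = Path(shutil.copy(str(log), str(workdir / "app.log.bak")))

# Running a process and capturing its stdout (here: the current Python).
proc = subprocess.run(
    [sys.executable, "-c", "print('hello from a child process')"],
    stdout=subprocess.PIPE,
    universal_newlines=True,
    check=True,
)

print(errors)
print(backup.name)
print(proc.stdout.strip())
```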

Michael #2: pyodide

  • Scientific Python in the browser
    • ALL of CPython (allowed in the browser)
    • NumPy
    • Matplotlib
    • ...
  • Project by Mozilla
  • We asked “Will there be a PyBlazor?” just two weeks ago. I think we are on a path…

Brian #3: The subset of reStructuredText worth committing to memory

  • A lot of Python packages are documented with reStructuredText, and a lot of reStructuredText tutorials are overwhelming. This post is the answer.
  • paragraphs are separated by a blank line (two newlines)
  • headings are marked by underlining (and optionally overlining) the text with =, -, or ~
  • bulleted lists work with asterisks but spacing is important
  • italics and bold use one or two surrounding asterisks
  • inline code uses two backticks
  • links, code snippets, images, and internal references are weird, and I always have to look them up.
  • so I’ll bookmark this link

Michael #4: bandit

  • via Anthony Shaw
  • Bandit is a tool designed to find common security issues in Python code.
  • To do this Bandit processes each file, builds an AST from it, and runs appropriate plugins against the AST nodes. Once Bandit has finished scanning all the files it generates a report.
  • Issues detected:
    • B312 telnetlib
    • B307 eval
    • B110 try_except_pass
    • B602 subprocess_popen_with_shell_equals_true
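The "build an AST, then run plugins against its nodes" idea can be sketched with the standard library's ast module. This is a toy checker, not Bandit's actual code or plugin API: the `ToyChecker` class and its finding labels are invented for illustration, loosely mirroring the B307, B110, and B602 checks listed above.

```python
import ast

# A small piece of deliberately questionable code to scan.
SOURCE = '''
import subprocess

def risky(user_input):
    try:
        return eval(user_input)
    except Exception:
        pass
    subprocess.Popen(user_input, shell=True)
'''


class ToyChecker(ast.NodeVisitor):
    """Hypothetical mini-scanner: walks the AST and records findings."""

    def __init__(self):
        self.findings = []

    def visit_Call(self, node):
        # Flag direct calls to eval(...), in the spirit of B307.
        if isinstance(node.func, ast.Name) and node.func.id == "eval":
            self.findings.append((node.lineno, "eval"))
        # Flag any call passing shell=True, in the spirit of B602.
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                self.findings.append((node.lineno, "shell_equals_true"))
        self.generic_visit(node)

    def visit_ExceptHandler(self, node):
        # Flag `except ...: pass`, in the spirit of B110.
        if len(node.body) == 1 and isinstance(node.body[0], ast.Pass):
            self.findings.append((node.lineno, "try_except_pass"))
        self.generic_visit(node)


checker = ToyChecker()
checker.visit(ast.parse(SOURCE))
findings = sorted(checker.findings)

for lineno, name in findings:
    print(f"line {lineno}: {name}")
```

Because the checks work on the parsed tree rather than on raw text, a comment or string that merely mentions eval would not trigger a finding.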

Brian #5: Learn Python 3 within Jupyter Notebooks

  • just fun
  • Also shows how to run pytest in a cell.

Michael #6: detect-secrets

  • An enterprise friendly way of detecting and preventing secrets in code.
  • From Yelp
  • detect-secrets is an aptly named module for (surprise, surprise) detecting secrets within a code base.
  • However, unlike other similar packages that solely focus on finding secrets, this package is designed with the enterprise client in mind: providing a backwards compatible, systematic means of:
    1. Preventing new secrets from entering the code base,
    2. Detecting if such preventions are explicitly bypassed, and
    3. Providing a checklist of secrets to roll, and migrate off to a more secure storage.
  • Allows you to set a baseline
  • set it up as a git commit hook


Audio Download

Posted on 31 August 2018 | 8:00 am


#92 Will your Python be compiled?

Sponsored by DigitalOcean -- pythonbytes.fm/digitalocean

Brian #1: IEEE Survey Ranks Programming Languages

  • via Martin Rowe, @measureentblue
  • Python on top. Was last year also, but this year it’s on top even for embedded.
  • Some people dispute the numbers but I believe it.
  • Projects contributing to the rise of Python in embedded:

Michael #2: MyPyC

  • Thread on Python-Dev: Use of Cython
  • *It'd be really nice to at least be able to write some of the C API tests directly in Cython rather than having to fiddle about with splitting the test between the regrtest parts that actually define the test case and the extension module parts that expose the interfaces that we want to test.*
  • Later in the thread, Yury Selivanov dropped a bombshell.
    • Speaking of which, Dropbox is working on a new compiler they call "mypyc".
    • mypyc will compile type-annotated Python code to optimized C.
    • Essentially, mypyc will be similar to Cython, but mypyc is a subset of Python, not a superset.
    • Interfacing with C libraries can be easily achieved with cffi. Being a strict subset of Python means that mypyc code will execute just fine in PyPy. They can even apply some optimizations to it eventually, as it has a strict and static type system.
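For a sense of what "type-annotated Python that is a subset, not a superset" means in practice, here is a small sketch of the kind of code such a compiler could target. The functions are made up for illustration, and nothing here shows how mypyc itself is invoked. The point is that the file is ordinary Python and runs unchanged under CPython or PyPy.

```python
from typing import List


def fib(n: int) -> int:
    """Iterative Fibonacci; the int annotations give a compiler fixed types."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def total(values: List[float]) -> float:
    """Sum a list of floats with an explicit, statically typed accumulator."""
    result = 0.0
    for v in values:
        result += v
    return result


print(fib(10))
print(total([1.5, 2.5]))
```

Contrast this with Cython, where getting similar speedups typically means adding cdef declarations that a plain Python interpreter cannot run.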

Brian #3: Beyond Interactive: Notebook Innovation at Netflix

  • Netflix is doing some very cool things with Jupyter, and sharing much of it through open source projects.
  • Netflix has been growing their use of Jupyter notebooks for many data related roles:
    • business, data, & quantitative analysts
    • algorithm, analytics, & data engineers
    • data, machine learning, & research scientists
  • All of these roles have common needs that are solved by Jupyter and related projects:
    • data exploration, preparation, validation, and productionalization (is that a word?)
  • To help solve their use cases and make notebooks even easier to use for everyone at Netflix, they’ve started many open source projects that can be used by non-Netflix folks as well:
    • nteract is a next-gen React-based UI for Jupyter notebooks.
    • Papermill is a library for parameterizing, executing, and analyzing Jupyter notebooks.
    • Commuter is a lightweight, vertically-scalable service for viewing and sharing notebooks.
    • Titus is a container management platform that provides scalable and reliable container execution and cloud-native integration with Amazon AWS.
  • There’s a follow-on post that discusses how Netflix is scheduling notebook execution: Scheduling Notebooks

Michael #4: How to create a Windows Service in Python

  • We have spoken about how to run a Python script as a systemd service
  • Here’s the Windows edition
    • Run Python code on boot
    • When logged out or logged in as another user
    • As a restricted or different account
  • Based on pywin32 (very little documentation)
  • Derive from a given base class, then override the three main methods:
    • def start(self): for anything you need to do at service initialization.
    • This is a good place to initialize the running condition.
    • def stop(self): for anything you need to do just before the service is stopped.
    • This is a good place to invalidate the running condition.
    • def main(self): your actual run loop. Just create a loop based on your running condition.
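The start/stop/main pattern can be sketched without any Windows dependencies. Everything below is a hypothetical stand-in: `ToyServiceBase` is not the article's base class and uses no pywin32, and the stop request that Windows would normally deliver from outside is simulated inside the loop so the example terminates.

```python
class ToyServiceBase:
    """Hypothetical stand-in for a service base class (no pywin32 involved)."""

    def start(self):
        """Hook: called once before the run loop."""

    def stop(self):
        """Hook: called when the service is asked to stop."""

    def main(self):
        """Hook: the run loop itself."""

    def run(self):
        # A real service framework would call these in response to
        # service-control events; here we simply call them in order.
        self.start()
        self.main()


class CounterService(ToyServiceBase):
    def start(self):
        # Initialize the running condition.
        self.is_running = True
        self.ticks = 0

    def stop(self):
        # Invalidate the running condition so main() falls out of its loop.
        self.is_running = False

    def main(self):
        while self.is_running:
            self.ticks += 1
            if self.ticks >= 3:
                # Stand-in for an external stop request.
                self.stop()


service = CounterService()
service.run()
print(service.ticks, service.is_running)
```

Keeping the loop condition in a single attribute is what lets stop() end the service cleanly instead of killing the process mid-iteration.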

Brian #5: An Overview of Packaging for Python

  • Started from an essay by Mahmoud Hashemi, @mhashemi
  • Now part of PyPA documentation
    • Different techniques and tools for different types of Python projects
    • modules
    • packages
      • source distributions
      • wheels
      • binary distributions
    • applications
      • this is the hairy part where a bullet point summary just won’t be enough. :)

Michael #6: PEP 505 -- None-aware operators

  • Several modern programming languages have so-called "null-coalescing" or "null-aware" operators, including C# and Swift. These operators provide syntactic sugar for common patterns involving null references.
  • Why not Python?
  • Two cases:
    • The "null-coalescing" operator: To replace inline conditionals such as value if value is not None else "MISSING", which under the proposal could be written as just value ?? "MISSING"
    • The "null-aware member access" operator: Chain calls into a fluent interface without testing for None: return user?.orders.first()?.name would replace this:
    if user is None:
        return None

    first_order = user.orders.first()

    if first_order is None:
        return None

    return first_order.name
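Since the ?? and ?. operators are only proposed, here is a runnable sketch of the patterns they would replace, written in today's Python. The `User`, `Orders`, and `Order` classes are invented for illustration. The last lines also show why `or` is not a drop-in substitute for a None check.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Order:
    name: Optional[str] = None


@dataclass
class Orders:
    items: List[Order] = field(default_factory=list)

    def first(self) -> Optional[Order]:
        return self.items[0] if self.items else None


@dataclass
class User:
    orders: Orders = field(default_factory=Orders)


def first_order_name(user: Optional[User]) -> Optional[str]:
    """What `user?.orders.first()?.name` would express in one line."""
    if user is None:
        return None
    first_order = user.orders.first()
    if first_order is None:
        return None
    return first_order.name


def display_name(value: Optional[str]) -> str:
    """What `value ?? "MISSING"` would express."""
    return value if value is not None else "MISSING"


print(display_name(first_order_name(None)))
print(display_name(first_order_name(User())))
print(display_name(first_order_name(User(Orders([Order("book")])))))

# `or` treats every falsy value as missing, an explicit None check does not.
print(repr(display_name("")))
print(repr("" or "MISSING"))
```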

Extras:


Audio Download

Posted on 25 August 2018 | 8:00 am


#91 Will there be a PyBlazor?

Sponsored by Datadog -- pythonbytes.fm/datadog

Brian #1: What makes the Python Cool

  • Shankar Jha
  • “some of the cool feature provided by Python”
  • The Zen of Python: import this
  • XKCD: import antigravity
  • Swapping two variables in one line: a, b = b, a
  • Create a web server using one line: python -m http.server 8000
  • collections
  • itertools
  • Looping with index: enumerate
  • reverse a list: list(reversed(a_list))
  • zip tricks
  • list/set/dict comprehensions
  • Modern dictionary
  • pprint
  • _ when in interactive REPL
  • Lots of great external libraries
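A few of the features in the list above, gathered into one short runnable sketch. The sample data is made up, and the selection is just a sampler, not a summary of the original post.

```python
from collections import Counter
from itertools import chain

# Swap two variables in one line.
a, b = 1, 2
a, b = b, a

names = ["spam", "eggs", "ham"]

# Looping with an index.
indexed = [(i, name) for i, name in enumerate(names, start=1)]

# Reversing a list without mutating it.
backwards = list(reversed(names))

# zip tricks: build a dict from two sequences, and transpose pairs.
stock = dict(zip(names, [3, 1, 2]))
numbers, letters = zip(*[(1, "a"), (2, "b")])

# Dict and set comprehensions.
lengths = {name: len(name) for name in names}
unique_lengths = {len(name) for name in names}

# collections + itertools: count letters across all names.
letter_counts = Counter(chain.from_iterable(names))

print(a, b)
print(indexed)
print(backwards)
print(stock)
print(lengths, unique_lengths)
print(letter_counts["g"])
```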

Michael #2: Django 2.1 released

  • The release notes cover the smorgasbord of new features in detail; the model “view” permission is a highlight that many will appreciate.
  • Django 2.0 has reached the end of mainstream support. The final minor bug fix release (which is also a security release), 2.0.8, was issued today.
  • Features
    • model “view” feature: This allows giving users read-only access to models in the admin.
    • The new [ModelAdmin.delete_queryset()](https://docs.djangoproject.com/en/2.1/ref/contrib/admin/#django.contrib.admin.ModelAdmin.delete_queryset) method allows customizing the deletion process of the “delete selected objects” action.
    • You can now override the default admin site.
    • Lots of ORM features
    • Cache: The local-memory cache backend now uses a least-recently-used (LRU) culling strategy rather than a pseudo-random one.
    • Migrations: To support frozen environments, migrations may be loaded from .pyc files.
    • Lots more
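To illustrate what a least-recently-used culling strategy means, here is a toy cache built on collections.OrderedDict. This is not Django's code and does not use Django at all; `ToyLRUCache` and its methods are invented for illustration, and a real backend also has to deal with timeouts and locking.

```python
from collections import OrderedDict


class ToyLRUCache:
    """Hypothetical minimal LRU cache: evicts the least recently used key."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        # A read counts as a use, so move the key to the "recent" end.
        self._data.move_to_end(key)
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        # Cull from the "old" end until we are back within the limit.
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)

    def keys(self):
        return list(self._data)


cache = ToyLRUCache(max_entries=2)
cache.set("a", 1)
cache.set("b", 2)
cache.get("a")      # "a" is now more recently used than "b"
cache.set("c", 3)   # over the limit, so "b" is evicted

print(cache.keys())
print(cache.get("b"))
```

With a pseudo-random strategy, the frequently read "a" could just as easily have been the entry thrown away.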

Brian #3: Awesome Python Features Explained Using Harry Potter

  • Anna-Lena Popkes
  • Initial blog post
  • 100 Days of code, with a Harry Potter universe bent.
  • Up to day 18 so far.

Michael #4: Executing Encrypted Python with no Performance Penalty

  • Deploying Python in production presents a large attack surface that allows a malicious user to modify or reverse engineer potentially sensitive business logic.
  • This is worse in cases of distributed apps.
  • Common techniques to protect code in production are binary signing, obfuscation, or encryption. But, these techniques typically assume that we are protecting either a single file (EXE), or a small set of files (EXE and DLLs).
  • In Python, signing is not an option and source code is wide open.
  • Requirements were threefold:
    1. Work with the reference implementation of Python,
    2. Provide strong protection of code against malicious and natural threats,
    3. Be performant both in execution time and in stored space
  • This led to a pure Python solution using authenticated cryptography.
  • Created a .pyce file that is encrypted and signed
  • Customized import statement to load and decrypt them
  • Implementation has no overhead in production. This is due to Python's in-memory bytecode cache.
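The "verify before you execute" half of this idea can be sketched with the standard library. This is not the article's implementation and not the .pyce format: it only checks an HMAC signature and skips encryption entirely, since the standard library has no authenticated cipher. The `sign` and `load_signed_module` helpers, the key, and the sample source are all made up for illustration.

```python
import hashlib
import hmac
import types


def sign(source: bytes, key: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the module source."""
    return hmac.new(key, source, hashlib.sha256).digest()


def load_signed_module(name: str, source: bytes, tag: bytes, key: bytes):
    """Hypothetical loader: refuse to execute source whose tag does not match."""
    expected = sign(source, key)
    if not hmac.compare_digest(expected, tag):
        raise ImportError(f"signature check failed for {name}")
    module = types.ModuleType(name)
    # Compile once and execute into the fresh module's namespace.
    exec(compile(source, f"<signed {name}>", "exec"), module.__dict__)
    return module


KEY = b"demo-key-not-for-production"
SOURCE = b"def price(qty):\n    return qty * 3\n"
TAG = sign(SOURCE, KEY)

pricing = load_signed_module("pricing", SOURCE, TAG, KEY)
print(pricing.price(4))

# Any change to the source invalidates the tag.
try:
    load_signed_module("pricing", SOURCE + b"# tampered\n", TAG, KEY)
    tampered_error = None
except ImportError as exc:
    tampered_error = str(exc)
print(tampered_error)
```

The check happens once at load time. After that the module is ordinary in-memory bytecode, which is consistent with the claim above that there is no overhead once the code is loaded.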

Brian #5: icdiff and pytest-icdiff

  • icdiff: “Improved colored diff”
    • Jeff Kaufman
  • pytest-icdiff: “better error messages for assert equals in pytest”
    • Harry Percival

Michael #6: Will there be a PyBlazor?

  • The .NET guys, and Steve Sanderson in particular, are undertaking an interesting project with WebAssembly.
  • WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable target for compilation of high-level languages like C/C++/Rust, enabling deployment on the web for client and server applications.
  • Works in Firefox, Edge, Safari, and Chrome
  • Their project, Blazor, has nearly the entire .NET runtime (AKA the CLR) running natively in the browser via WebAssembly.
  • This is notable because the CLR is basically pure C code. What else is C code? Well, CPython!
  • Includes Interpreted and AOT mode:
    • Ahead-of-time (AOT) compiled mode: In AOT mode, your application’s .NET assemblies are transformed to pure WebAssembly binaries at build time.
  • Being able to run .NET in the browser is a good start, but it’s not enough. To be a productive app builder, you’ll need a coherent set of standard solutions to standard problems such as UI composition/reuse, state management, routing, unit testing, build optimization, and much more.
  • Mozilla called for this to exist for Python, but sadly didn’t contribute or kick anything off at PyCon 2018: https://www.youtube.com/watch?v=ITksU31c1WY
  • Gary Bernhardt’s Birth and Death of JavaScript video is required viewing as well (asm.js).

Extras and personal info:

Michael:


Audio Download

Posted on 15 August 2018 | 8:00 am