Writing an image annotation tool in 50 lines of Python

There are a couple of really nice image annotation libraries that are free and open source. For example, I use LabelImg whenever I need to hand-annotate bounding boxes to create new (or augment existing) datasets for object detection. It can output labels in both Pascal and YOLO formats, which is …


SciPy is partnering with JOSS! Part 1

The Python in Science Conference (SciPy) publishes conference proceedings every year (as one does). Our process is a little bit different from most conferences in that our review process occurs in two stages.

In stage one, the Program Committee and their area chairs solicit abstracts for talks and …


How to add plots to docstrings

Recently, we released functionality in niacin for performing data augmentation on timeseries. As a part of this, we wanted to be able to show before and afters in the documentation for how a timeseries (in this case, a sine curve) gets transformed by any particular augmenting function. In a lot …


Getting started with timeseries data augmentation

Data augmentation is a critical component in modern machine learning practice due to its benefits for model accuracy, generalizability, and robustness to adversarial examples. Elucidating the precise mechanisms by which this occurs is an active area of research, but a simplified explanation of the current proposals might look like …


Virtual epochs for PyTorch

A common problem when training neural networks is the size of the data. There are several strategies for storing and querying large amounts of data, or for increasing model throughput to speed up training when there are large amounts of data, but scale causes problems in much more mundane …


Superconvergence in PyTorch

In Super-Convergence: Very fast training of neural networks using large learning rates, Smith and Topin present evidence for a learning rate parametrization scheme that can result in a 10x decrease in training time, while maintaining similar accuracy. Specifically, they propose the use of a cyclical learning rate, which starts …


A faster way to generate thin plate splines

In Evading real-time person detectors by adversarial t-shirt, Xu and coauthors show that the adversarial patch attack described by Thys, Van Ranst, and Goedemé is less successful when applied to flexible media like fabric, due to the warping and folding that occurs.

[Figure: low success rate with adversarial patch from AUTHORS]

They propose to remedy this failure …


How to combine variable length sequences in PyTorch DataLoaders

If you're getting started with PyTorch for text, you've probably encountered an error that looks something like:

Sizes of tensors must match except in dimension 0.

The short explanation for this error is that sequences are often different lengths, but tensors are required to be rectangular. The fix for this …
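To make the "rectangular" requirement concrete, here is a minimal sketch in plain Python of one common workaround: right-padding every sequence in a batch to the length of the longest one. This is not necessarily the fix the post goes on to describe. The function name `pad_batch` and the `pad_value` parameter are made up for illustration.

```python
def pad_batch(sequences, pad_value=0):
    """Right-pad each sequence to the length of the longest one.

    Hypothetical helper for illustration only; returns the padded
    (rectangular) batch along with the original lengths.
    """
    # Length of the longest sequence; 0 if the batch is empty.
    max_len = max((len(seq) for seq in sequences), default=0)
    # Append pad_value until every row has max_len items.
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    # Keep the true lengths so downstream code can ignore the padding.
    lengths = [len(seq) for seq in sequences]
    return padded, lengths


batch = [[4, 8, 15], [16, 23], [42]]
padded, lengths = pad_batch(batch)
print(padded)   # [[4, 8, 15], [16, 23, 0], [42, 0, 0]]
print(lengths)  # [3, 2, 1]
```

Once every row has the same length, the batch can be converted to a single tensor. In PyTorch, logic like this would typically live in a function passed to the DataLoader's `collate_fn` argument.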


Adding data augmentation to torchtext datasets

It is universally acknowledged that artificially augmented datasets lead to models which are both more accurate and more generalizable. They do this by introducing variability which is likely to be encountered in ecologically valid settings but is not present in the training data; and, by providing negative examples of spurious …


SciPy Proceedings 2020 Survey

The mission of the SciPy Proceedings Committee (Proccom) is to celebrate and promote the work of the members of the SciPy community. This is taken in the broad sense to include a community of the authors and maintainers of core libraries; the scientists and engineers who use these libraries to …

