PyCon 2020 has been cancelled. Read the announcement here (Updated March 20)

PyCon Pittsburgh. April 15-23, 2020.

Posters

Using Python for the early detection of Parkinson's disease

Alankrita Tewari, Dipam Paul

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 1

Technology is now integrated into all of our lives in one way or another, and it is hard to imagine getting through the day without electronic devices that require us to interact with them (computers, phones, etc.). Now imagine if the act of typing on a keyboard, which you do organically and subconsciously after some regular use, could help predict the early onset of Parkinson’s, a disorder that has until now been impossible to diagnose in its early stages. Fascinating, right?

Recent studies have shown that analyzing people’s keystrokes as they type on a computer keyboard can reveal a great deal of information about the state of their motor function. We created a prediction algorithm that captures timing information from computer keystrokes, allowing researchers to detect patterns that distinguish typing that occurs when motor skills are impaired, in turn helping to diagnose Parkinson’s at its early onset and aiding the development of better treatments.
This could also open doors to strategies and algorithms for evaluating patients with other diseases that affect motor skills, such as rheumatoid arthritis and Dupuytren’s contracture, which I am currently working on.
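
As a rough illustration of the kind of timing information keystrokes carry, the sketch below derives two simple features, hold time and flight time, from a list of key events. The event format, the function name and the sample data are assumptions made for illustration; this is not the authors' prediction algorithm.

```python
from statistics import mean

# Hypothetical event format: (key, press_time_s, release_time_s)
def keystroke_features(events):
    """Return mean hold time and mean flight time for a typing sample."""
    # Hold time: how long each key stays pressed.
    hold_times = [release - press for _, press, release in events]
    # Flight time: gap between releasing one key and pressing the next.
    flight_times = [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]
    return {
        "mean_hold": mean(hold_times),
        "mean_flight": mean(flight_times) if flight_times else 0.0,
    }

# Made-up sample: three keys typed in sequence
sample = [("h", 0.00, 0.10), ("i", 0.25, 0.40), ("!", 0.50, 0.60)]
features = keystroke_features(sample)
```

Features like these could then be fed to any classifier; which features and which model work best is a question the poster itself addresses.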

Video Surveillance System for Female Security using Keras

Vibhor Agarwal, Prashant Nigam

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 2

Violence against women is among the most widespread and devastating human rights violations in the world. Every year, such incidents happen frequently on roads, especially in isolated places. In response, video surveillance systems (CCTV) have evolved significantly over the last several years. Thanks to the latest advancements in artificial intelligence, these systems are becoming more and more sophisticated and self-reliant in detecting suspicious activities. This reduces the need for manual monitoring at control centres and the errors associated with it.

In this project, we tackle the problem of suspicious activities against women in isolated areas. Our video surveillance system takes live video as input and detects such suspicious activities in real time. It then notifies the control centre and raises the on-site alarm. People at the control centre can then validate the alarm and call the police to help the women in need.

We trained a convolutional neural network model on our dataset to extract both spatial and temporal features from the videos. Then, using a softmax unit, the probability of maliciousness is predicted, and the alarm is raised if that probability is above some threshold. We did a comprehensive study of several CNN architectures to figure out which techniques work better for these tasks. All the experiments in this project were carried out using Python and its libraries. In particular, we make extensive use of Keras, a Python-based deep learning framework.
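
The thresholding step can be sketched in a few lines of plain Python. This is a minimal illustration only: the two-class layout, the class index and the 0.8 threshold are assumptions, and the logits would in practice come from the trained Keras model rather than being hard-coded.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def should_raise_alarm(logits, suspicious_index=1, threshold=0.8):
    """True when the 'suspicious' class probability exceeds the threshold."""
    return softmax(logits)[suspicious_index] > threshold

# Hypothetical model outputs for the classes [normal, suspicious]
alarm = should_raise_alarm([0.2, 2.5])
no_alarm = should_raise_alarm([1.0, 1.2])
```

Raising the threshold trades missed detections for fewer false alarms at the control centre.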

We believe that our approach can benefit the diverse communities attending PyCon who are looking for appropriate machine learning algorithms for the kind of video surveillance tasks it is designed to tackle. In this poster, we will showcase relevant Python tools and a comprehensive study of different machine learning architectures one could use to tackle similar tasks in video surveillance.

Centrality Measures to Classify Event and Non-event Synchrony

Prashant Nigam, LEELA SURYA TEJA MANGAMURI

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 3

Nowadays, social media has become an integral part of our lives, and studying and analyzing trends in social media has become very important. Our work mainly focuses on differentiating event and non-event synchrony using centrality measures.

Synchrony is defined as a social phenomenon in which the number of people performing a certain act first increases and then decreases. For example, during a non-event there is a rise in the number of tweets that is not related to any event. A generic hashtag under which people share their experiences, emotions, feelings, etc. can therefore be termed non-event synchrony. For example, #mondaymotivation is used by people to motivate themselves on Mondays.

In this project, Twitter was used as the platform for analyzing the behavior of people on social media. An algorithm was developed to help detect the onset of synchrony effectively. We built a graph network for each synchrony using the NetworkX package and computed centrality measures on each of them. We also analysed the retweet-to-tweet ratio and the profiles associated with a tweet to measure the impact of that tweet and determine whether it becomes viral.
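
To make the idea of a centrality measure concrete, here is a standard-library sketch of degree centrality on a small retweet graph. It uses the same normalisation as NetworkX's `degree_centrality` (degree divided by n - 1), but the edge list and account names are invented, and the poster does not say which centrality measures were actually used.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Degree centrality: fraction of the other nodes each node is linked to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {node: 0.0 for node in neighbours}
    return {node: len(adj) / (n - 1) for node, adj in neighbours.items()}

# Hypothetical retweet edges: (retweeter, original_author)
retweets = [("bob", "alice"), ("carol", "alice"), ("dave", "alice"), ("dave", "carol")]
centrality = degree_centrality(retweets)
most_central = max(centrality, key=centrality.get)
```

In this toy graph the account that everyone retweets ends up with the highest score, which is the kind of signal that could help separate event-driven bursts from generic hashtags.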

A practical application of this algorithm is to enable companies to identify sudden outbursts of negative content about them online, so that they can respond swiftly and take measures to stop further proliferation of negative responses.

Economics and the Programming Sandwich

Matthew W Thomas, Maria Fernanda Petri Betto

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 4

This poster is for anyone with an interest in numerical analysis, data work, or the marriage between them. In creating and applying Economic theory, there are often two steps that require programming – numerical and data analysis. Numerical solutions to models are conducted during the initial formative stages of the process while data analysis comes at the end. These two steps form the bread of the “programming sandwich”, and they require very different skills and processes. As a general programming language with a large data analysis community, Python is uniquely well situated to handle both tasks. Presented here is a real world example of a project using Python for both initial simulations and data analysis. We explain our process, the tools we used, and the decisions that we made along the way.

UNRAVELING FORENSIC WITH PYTHON

k.sasirekha, Sujatha

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 5

Fingerprints have been used for centuries to uniquely identify a person involved in a crime, as they do not change over time. During an investigation, the complete fingerprint may or may not be obtained. Even if it is obtained, the clarity of the collected sample is often minimal, so producing an accurate match and identifying the person responsible for the crime requires a lot of effort, time and search operations.
Python provides powerful libraries such as NumPy, pandas, Matplotlib, scikit-learn, TensorFlow, etc. for acquiring the data, training the model, making predictions and refining the model. In our project, these libraries help create the complete image of a fingerprint from partial fragments, without human intervention and in less time. The TensorFlow module provides a framework for achieving our objective efficiently. Python also provides an easy way to work with different databases for storing the images, using the sqlalchemy, sqlite3 and pymongo modules.
Because Python comes with substantial libraries, building a generative adversarial network for fingerprint generation is now quite simple. The primary aim of this poster is to show how our model, built with TensorFlow and Python, works for the fingerprint identification task.

Designing Libraries for Async and Synchronous I/O

Seth Michael Larson

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 6

Sans-I/O as a design pattern was announced 3 years ago during PyCon 2016. Now that several network protocols have been implemented with the sans-I/O design pattern, what is the next step towards better network client and server libraries?

Explore what tools and techniques a project named ‘Hip’ is using to support HTTP for both synchronous I/O and asynchronous I/O with multiple libraries (AsyncIO, Trio, Curio) in the same code-base using code generation and the sans-I/O pattern.
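
For readers new to the pattern, the sketch below shows the core idea on a toy line-based protocol: the parser never touches a socket or file, so the same class can sit behind a blocking driver and an asyncio driver. The class and function names are invented for this example, and it does not use code generation, so it should not be read as how Hip itself is implemented.

```python
import asyncio
import io

class LineProtocol:
    """Sans-I/O parser: turns raw bytes into complete lines, does no I/O itself."""

    def __init__(self):
        self._buffer = b""

    def receive_data(self, data):
        self._buffer += data
        # Everything before the last newline is complete; keep the remainder.
        *lines, self._buffer = self._buffer.split(b"\n")
        return [line.decode() for line in lines]

def read_lines_sync(stream, chunk_size=4):
    """Blocking driver: feeds the parser from a file-like object."""
    proto, out = LineProtocol(), []
    while chunk := stream.read(chunk_size):
        out.extend(proto.receive_data(chunk))
    return out

async def read_lines_async(reader, chunk_size=4):
    """asyncio driver: same parser, different I/O layer."""
    proto, out = LineProtocol(), []
    while chunk := await reader.read(chunk_size):
        out.extend(proto.receive_data(chunk))
    return out

async def _demo_async(payload):
    # In-memory stand-in for a network connection.
    reader = asyncio.StreamReader()
    reader.feed_data(payload)
    reader.feed_eof()
    return await read_lines_async(reader)

payload = b"hello\nworld\n"
sync_lines = read_lines_sync(io.BytesIO(payload))
async_lines = asyncio.run(_demo_async(payload))
```

The two drivers differ only in `await`, which is exactly the kind of duplication that code generation can remove.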

JustPy - An object-oriented, component based web framework that requires no front-end programming

Eliezer Mintz

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 7

JustPy is an object-oriented, component based, high-level Python Web Framework that requires no front-end programming. With a few lines of only Python code, you can create interactive websites without any JavaScript programming.

When developing with JustPy, there is no front-end/back-end distinction. All programming is done on the back-end, allowing a simpler and more Pythonic web development experience. JustPy removes the front-end/back-end distinction by intercepting the relevant events on the front-end and sending them to the back-end to be processed.

In JustPy, elements on the web page are instances of component classes. A component in JustPy is a Python class. Customized, reusable components can be created from other components. Out of the box, JustPy comes with support for HTML and SVG components as well as more complex components such as charts and grids. It also supports most of the components and the functionality of the Quasar library of Material Design 2.0 components.

JustPy encourages creating your own custom components and reusing them in different projects (and if applicable, sharing these components with others).

JustPy integrates nicely with pandas and simplifies building web sites based on pandas analysis.

JustPy supports visualization using matplotlib and Highcharts.

Hopefully, JustPy will enable teaching web development in introductory Python courses by reducing the complexity of web development.

Hello World!

import justpy as jp

def hello_world():
    wp = jp.WebPage()
    d = jp.Div(text='Hello world!')
    wp.add(d)
    return wp

jp.justpy(hello_world)

That’s it. The program above activates a web server that returns a web page with ‘Hello world!’ for any request. Locally, you would direct your browser to http://localhost:8000/ or http://127.0.0.1:8000 to see the result.

From nose to pytest - changing tests framework in Apache Airflow

Tomek Urbaszek, Jarosław Potiuk

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 8

Since December 2019, Apache Airflow has been using pytest to run its tests. Before that, we used good old nose. The migration took me 2 months of scratching my head and tracking down numerous side effects. It’s not an easy task when your test suite includes 4,000+ tests. The lessons we learnt during the process are worth sharing, because side effects at the database level are just the tip of the iceberg.

Tern: Using Python to Look Inside Container Images

Rose Judge

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 9

Popular container build tools (like Docker) do not specify a declarative Software Bill of Materials (SBoM), but rather, rely on an imperative command series to create the filesystems on which services are based. A well used feature of these container build tools is the ability to create new containers on top of existing deployed containers to allow for data deduplication. This layering method, however, makes the container software supply pipeline non-reproducible, and often leads to deploying images with unknown content. Not knowing what is installed in your running container makes complying with the licenses of the installed components more difficult.

Tern is an open source inspection tool focused on compliance that finds the metadata of packages installed in a provided container image or Dockerfile. Developed in Python 3, Tern uses overlayfs to mount the first filesystem layer used to build the container image (sometimes referred to as the BaseOS). It then executes scripts in a chroot environment to collect information about packages installed in that layer. With that information as a starting point, Tern continues to mount each subsequent layer, collecting information about the packages it finds there. Once Tern has iterated over all the layers, it generates a report in a variety of formats. Ultimately, Tern is a tool for existing containers that gives you a deeper understanding of your container’s software bill of materials, so you can make better decisions about your container-based infrastructure, integration and deployment strategies.
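
The layer-by-layer bookkeeping described above can be pictured with a small, purely illustrative sketch. It assumes the package listing for each layer has already been collected (the real tool does this by mounting layers and running scripts in a chroot); the function name, data layout and package versions are all hypothetical and are not Tern's API.

```python
def packages_by_layer(layers):
    """For each layer, list packages that are new or changed relative to the layers below."""
    seen = {}
    report = []
    for index, installed in enumerate(layers):
        added = {
            name: version
            for name, version in installed.items()
            if seen.get(name) != version
        }
        report.append({"layer": index, "packages": added})
        seen.update(installed)
    return report

# Hypothetical package listings, one dict per filesystem layer
layers = [
    {"libc": "2.31", "bash": "5.0"},                  # base OS layer
    {"libc": "2.31", "bash": "5.0", "curl": "7.68"},  # layer adding curl
    {"libc": "2.31", "bash": "5.1", "curl": "7.68"},  # layer upgrading bash
]
report = packages_by_layer(layers)
```

Attributing each package to the layer that introduced it is what makes a per-layer bill of materials possible.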

Teaching Python Through Arcade Games

Paul Vincent Craven

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 10

Learning to program should be fun. Arcade Academy makes free learning resources available. The Arcade library is an easy, high-performance 2D game library for Python. This poster presentation demonstrates the library, shows how it can be a great tool for educators, and presents the accompanying materials available.

Not all students learn the same way. Arcade Academy makes it possible to learn by reading, by watching instructional videos, and by example. The primary author has continually refined the material through student feedback and direct classroom observation for over ten years.

Extensive documentation and even an online book contain everything needed to get started. A tutorial video series has been recorded by a diverse group of people, so students can see that anyone can be a programmer.

The Arcade API is written with simplicity, education, and performance in mind. Arcade uses OpenGL and can display thousands of moving sprites, and hundreds of thousands of stationary sprites. Transparency, hitbox calculations, sprite rotation, and physics are easy to get started with.

Over 100 code samples show how to perform common game development tasks. They range from simple questions like how to draw a line, through intermediate questions on calculating angles for shooting, to advanced procedural dungeon generation.

Using Python to improve physics curriculum and build better students

Ravi Tavakley

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 11

Python is one of the most popular programming languages in industry, and for many scientists it is the go-to platform for computation, analysis, and simulation. Demand for Python programmers is high and rising. Traditional physics instruction often lacks digital computation and drawing power for new students. Integrating Python with physics improves the value of both the student and the curriculum. The power to program also lets students put formulae into action on digital computers, process large data sets and visualize results, enabling new insights into physics and perhaps sparking new interest in the subject.

At Augsburg University, we are integrating Python into physics courses across our major. We started this effort in 2016 by introducing programming in a single physics course. We first used MATLAB® for this effort, but have since shifted our focus to Python because of its broader appeal for students seeking employment.

The poster will show you the evolution of this effort from its inception. It shows our motivation and things we learned along the way, student feedback we received, the changes we made to improve the program, and student achievements.

Python Powered Excel: focus on data, instead of file formats

JENNIFER WATT

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 12

Do you face the common usability problems that arise when an Excel-file-driven web application is delivered to non-developer users (e.g. human resource administrators, project managers, etc.), or do you have to process different Excel formats in your program?

The fact is that not everyone knows (or cares) about the differences between the various Excel formats: csv, xls and xlsx are all the same to them. Instead of training those users about file formats, developers can use the various Python libraries that handle most Excel file formats through a common programming interface.

Python can provide tools to automate routine tasks, including office automation, to facilitate the process and avoid errors in generating reports, to clean data and to visualize large datasets.

I have explored the Python library options, and the ones below are the most likely to help with these common problems.

xlrd:
xlrd is a library for reading data and formatting information from Excel files. Versions before 2.0 read both .xls and .xlsx files; xlrd 2.0 and later read only .xls.

openpyxl:
openpyxl is a Python library to read/write Excel 2010 or later versions of xlsx/xlsm/xltx/xltm files.
It was born from the lack of an existing library to read/write the Office Open XML format natively from Python.

pyexcel:
This library makes processing information from various Excel files as easy as processing arrays and dictionaries, whether you are handling file uploads/downloads, importing data into and exporting it from SQL databases, or doing information analysis and persistence.
pyexcel and its plugins provide one uniform programming interface to handle csv, tsv, xls, xlsx, xlsm and ods formats.

Typing the untyped: Hints for adding type hints to untyped libraries (like argparse)

Kyle Swanson, Jesse Michel

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 13

Python developers are finding ways to improve software engineering features and practices. Type hinting (introduced in Python 3.5) allows developers to specify types, which serves as documentation and enables static type-checking, code completion, and source code navigation. Unfortunately, typed projects must often depend on untyped modules, which leads to a mixture of typed and untyped code, thus losing some of the benefits of type hints. Although developers could rewrite the entire untyped library using type hints, an easier solution is to provide a typed interface for existing untyped modules. We present a case study of taking an untyped module — argparse — and creating a typed wrapper — typed-argument-parser — to support type hinting. We discuss some of the strategies and challenges of making this conversion, emphasising the software engineering principles guiding the design.
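
To give a flavour of what a typed interface over argparse can look like, here is a minimal sketch in which arguments are declared as annotated class attributes and parsed into attributes that a type checker can see. The `TypedArgs` class is invented for this example and is not the typed-argument-parser API; it handles only simple types such as `str`, `int` and `float`.

```python
import argparse
from typing import Optional, Sequence, get_type_hints

class TypedArgs:
    """Declare arguments as annotated class attributes; a default makes one optional."""

    def __init__(self, argv: Optional[Sequence[str]] = None) -> None:
        parser = argparse.ArgumentParser()
        cls = type(self)
        for name, typ in get_type_hints(cls).items():
            if hasattr(cls, name):
                # A class-level default exists, so the flag is optional.
                parser.add_argument(f"--{name}", type=typ, default=getattr(cls, name))
            else:
                parser.add_argument(f"--{name}", type=typ, required=True)
        for name, value in vars(parser.parse_args(argv)).items():
            setattr(self, name, value)

class TrainArgs(TypedArgs):
    data_path: str      # required: no default
    epochs: int = 10    # optional, parsed as int
    lr: float = 0.01    # optional, parsed as float

args = TrainArgs(["--data_path", "train.csv", "--epochs", "3"])
```

Because `args.epochs` is declared as `int` on the class, editors and static checkers can offer completion and catch misuse, which plain `argparse.Namespace` attributes do not allow.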

Multi-Sensor Position Elderly Human Activity Recognition Using Python Libraries

Haruna Abdu

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 14

The increasing availability of wearable devices offers a prospective solution to the growing demand for human activity monitoring across almost all of its application domains. Because wearable devices embed a large number of sensors that are used for health monitoring of elderly people and for data collection in human activity monitoring, many techniques have been proposed and used in the process. However, most of the data used by researchers for human activity recognition (HAR) are collected in lab settings rather than natural settings, and the participants are not allowed to perform activities in their normal way. In addition, elderly people from rural areas are mostly not considered, which causes the resulting HAR models to generalize poorly. Finally, transitional activities are mostly not considered because of their short duration, which causes a lot of misclassification, especially among elderly people.

In this research, data were collected using wearable sensor devices attached at two different positions (ankle and waist) from fifteen elderly volunteers performing eight different activities, including static, dynamic and transitional activities, in their normal way. Five machine learning algorithms were used in the classification process: Support Vector Machine (SVM), Naïve Bayes, Decision Tree, Logistic Regression and K-nearest neighbor (KNN). The results show that Naïve Bayes, KNN and Logistic Regression achieved the highest accuracies, at 100%, 95% and 89.43% respectively, using the sensor attached at the waist.

“PyLadies Caravan”: a roadshow empowering Women Pythonistas all over Japan

Maaya Ishida, KANAE YAMASHITA

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 15

“PyLadies Caravan” is a series of meetups for Women Pythonistas all over Japan. The PyLadies Caravan staff team travels to regions that currently do not have an active PyLadies chapter to organize a meetup. In each region, we start by finding enough women who want to learn Python or who want to network with other women Pythonistas. PyLadies Tokyo started planning this project in October 2018, and it was launched in January 2019 (and it still continues to date!).

This project is very important because it empowers women engineers to be active members of the Python community regardless of their location. In our experience, when female Python engineers cannot find Pythonistas around them, they tend to become discouraged. So we wanted to help Women Pythonistas connect with each other, even in rural Japan, by organizing local meetups. In addition to the offline network, PyLadies offers an online community that connects Women Pythonistas throughout Japan. We have seen Women Pythonistas using these spaces for information sharing and problem solving.

Our achievements: In 2019, we organized meetups in 6 regions.

  • When we visited, Okinawa started a local PyLadies chapter, called PyLadies Okinawa.
  • After our visit, Kyoto restarted its local PyLadies chapter, called PyLadies Kyoto.
  • Other regions we visited (Ehime, Aichi, etc.) are connected via the PyLadies Japan Slack.

In this poster session, we will share how to organize such a project and will discuss what we learned and ways to improve.

  • How to organize a regional meetup
  • How to decide on a meetup theme
  • What problems are inherent to the regions
  • How to follow-up
  • How to improve communication

Datatest: Test driven data-wrangling and data validation

Shawn Brown

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 16

In the practice of data science, data preparation can be a huge part of the job. Practitioners often spend 50 to 80 percent of their time wrangling data. This critically important phase is time-consuming, unglamorous, and often poorly structured.

The datatest package was created to support test driven data-wrangling and provide a disciplined approach to an otherwise messy process. It repurposes software testing practices for data preparation and quality assurance projects.

This poster will discuss how datatest can facilitate quick edit-test cycles to help guide the selection, cleaning, integration, and formatting of data. Specifically, it will discuss:

  • How the validate() function’s rich comparison behavior can test a variety of data types.
  • Using datatest’s “acceptance protocol” to distinguish between ideal criteria and acceptable deviation.
  • Maintaining a record of checks and decisions regarding important data sets.
  • Using the RepeatingContainer class to run duplicate queries on multiple objects without repeating yourself.
  • How to structure test suites to guide the data wrangling work-flow.

Visualize Website Structures in 3D with WebGL and Python

Saurabh Chaturvedi, Harsh Sharma

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 17

Getting a 3-dimensional view of a website can offer some valuable insights into its structure. Webmasters can use such visualizations to improve site SEO, and can detect anomalous nodes (like outdated and stale pages, for example) and remove them to keep their websites neat and clean. Among other use cases, a non-trivial one is to perform clustering on the website’s nodes and links, and then compare the clusters’ visualization with the sitemap anatomy to get an idea of how intuitive and navigable the website is.

In this poster, we present our approach to building a web-based tool that offers such a visualization service. We’ll talk about how we used Python to create a spider that takes website entry points from a user and crawls from there to recreate the website structure and store the page-to-link mappings in a Postgres database, discarding outlinks to other domains along the way. The poster will also describe our custom, simplified PageRank algorithm, which iterates over the mappings stored in the database to get the relative importance of web pages in a website. The page ranks thus obtained are mapped to node radii in the visualization, so “more prevalent” nodes are larger than “less prevalent” nodes. We’ll also talk about how the network structure information cached in the database can be used to perform further analysis, e.g., node-embedding-based clustering to get even deeper insights into the website’s anatomy. The data thus processed is then exported for use by the frontend WebGL engine, which renders the website network in 3 dimensions back to the user.
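
For readers unfamiliar with PageRank, the sketch below shows a textbook iterative version over an in-memory page-to-links mapping. It is not the authors' custom, simplified algorithm: the damping factor, the iteration count, the handling of pages without outlinks and the sample site are all assumptions, and the mapping would in their tool come from the Postgres database.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iterative PageRank over a {page: [linked pages]} mapping."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical crawl result for a small site
site = {"/": ["/about", "/blog"], "/about": ["/"], "/blog": ["/", "/about"], "/old": ["/"]}
ranks = pagerank(site)
```

Scaling each rank to a node radius then gives the "larger means more prevalent" effect described above.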

Finally, our poster will also describe how our current approach to website network visualization can be extended with more algorithms, such as community detection and visualizing the most and least frequently used paths with the help of further input data from the user. It will also cover scalability challenges on both the Python crawler and analysis backend and the WebGL JavaScript frontend!

Utilizing BERT and Social Media data for Mental Health Predictions

Ethan Ho, Kartik Thakore

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 18

In medical machine learning, one of the biggest challenges is getting access to medical data, due to the sensitive and private nature of such datasets. In this poster, we use Google Colab, BERT (Bidirectional Encoder Representations from Transformers) and publicly available Reddit data on BigQuery to evaluate the validity and functionality of using social media data to predict depression. We exhibit an exciting new approach to obtaining medical data, discuss the pros and cons of leveraging social media data in a mental health context, and evaluate future iterations of this approach.

Text mining in Python for Evaluating the Ethical Foundations in Computer Science

Oliver Bonham-Carter

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 19

In pedagogy, it is critical for instructors to determine whether students are retaining important concepts. Typically, after modules are taught, tests are given that are filled with direct questions that serve to reveal weaknesses in knowledge of factual information. In computer science, where a single response exists for a majority of the questions, such tests are largely successful. However, when teaching ethical theory in computer science, in tandem with problem-solving and solution-building, such a test would fail since there may not be a single answer to a direct question. Instead, to assess the knowledge invested in problem-solving where ethical concepts are highly relevant, essays must be written to allow the student room to contemplate and explore the variety of responsible or not responsible positions.

Teaching responsible computing is critical in developing software that produces a positive impact on our society, economy, and individuals. In this project, we present an automated text-mining tool written in Python to assist the department in measuring the technical responsibility of students across our department’s computer science curriculum. Although learning objectives and, hence, outcomes vary in each course, our tool provides its analysis by following broad learning categories around internet health, ethics and responsible computing, including students’ understanding of relevant issues, their ability to analyze and evaluate information, and their capacity for integrating the understanding and analysis of ethical thinking into their own work.

Our tool automatically collects reflection documents written by students from their GitHub repositories and, using natural language processing, analyzes them for ethical considerations based on pre-determined questions and criteria. Various open-source Python libraries were used to build the tool, including scikit-learn, NLTK, gensim, PyQt5, Streamlit and pytest. Using our tool, it is possible to see the progression of a single student’s ethical thinking throughout a specific course and throughout the entire computer science curriculum, as well as to get an overall view of all students’ progress in developing an understanding of social responsibility in computer science across all levels of our courses.

Story generation using Python

NILESH JAIN, Rahul S.N

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 20

Automatically generating images according to natural language descriptions is a fundamental problem in many applications, such as art generation and computer-aided design.

This poster harnesses the power of Python libraries such as TensorFlow 2.0, Markovify and TextBlob, visualizes the image generation using animation libraries, and implements the state-of-the-art StoryGAN paper.

“Sorry, Could you repeat that again?” - Speech Recognition with Python

Javier Jorge Cano

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 21

Nowadays, we are surrounded by devices that can listen to us: Alexa, Siri, Cortana, etc, and the interaction with them has become easier and easier, and more intuitive. The first challenge to communicate in a colloquial way with all these devices is to convert the voice signal to text. To do this, several approaches based on searching methods, algorithmic techniques, and machine learning are combined in very smart and interesting ways.

In this poster, I will introduce the underlying speech recognition systems that these devices use. This will be illustrated with a guided example in which we will develop a system to recognize isolated words in Python.

Additionally, I will show how we are implementing these and more advanced techniques in our production systems, providing transcriptions for different companies and institutions, using Python in different parts of the process.

Python on Hardware : An interactive poster project

Ayan Pahwa, Aakanksha Agrawal

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 22

In recent years, running Python on hardware has been one of the favourite pastimes of DIY makers, not just for fun but to harness the power and simplicity of Python and combine it with hardware nodes for IoT or for solving any other problem.

Python and its batteries-included approach not only enables anyone to get up and running in no time, but also significantly lowers the barrier to entry for embedded systems, especially for those with no hardware background.

There are a few different ways to use Python in your next embedded project: interacting with your hardware over a serial port and giving commands via a Python script, running a Python script on a single-board computer such as the Raspberry Pi, or running it natively on your hardware in a bare-metal world without an operating system, using amazing open source projects like CircuitPython and MicroPython.

This poster presentation will demo such projects, making it an interactive hardware poster that showcases the different ways Python can be used with hardware and the cool projects you can make.

From fetching and displaying a live weather forecast, to moving objects using motors, to generating creative art patterns on RGB LEDs, to sensing and responding to the environment around you, this will be a good demonstration that the opportunities are endless once you know your options and what you plan on doing :)

Self-Service Infrastructure with Luigi, AIOHTTP, and Jira

Rory Scott

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 23

Being a Jira Ticket Monkey is no fun, but being a roadblock to other teams might be even less fun. By leveraging Spotify’s Luigi Workflow Engine and SOA/AIOHTTP, we can create a self-service interface through which repetitive tickets are resolved quickly in a standardized way.

Our Platform Team uses Jira Service Desk for customer requests, typically a service request with an approval process (e.g. I need a database, I need a bucket, this IP needs to be whitelisted, etc.). Jira is fine, but how many times have you said to yourself, “It takes me longer to find the ticket I’m looking for, change its status to In Progress, do the work, go back into Jira to find the ticket again, and finally close the ticket?” Our goal was to eliminate many of the manual tasks associated with ticket management, using Python, so that we could focus on improving our platform.

By embracing SOA on our Platform Team and offering infrastructure through our services, we codified the manual steps needed to provision resources using Luigi and directed acyclic graphs, which are triggered by Jira. Compliance is happy because they can approve requests, customers are happy because they aren’t slowed down by our team, and our team is happy to focus on more difficult problems than provisioning buckets, databases, and credentials.
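
The idea of codifying provisioning steps as a directed acyclic graph can be sketched with the standard library alone. This is only an illustration, not Luigi and not the team's actual workflow: the step names and the `provision` helper are hypothetical, and `graphlib.TopologicalSorter` (Python 3.9+) stands in for the workflow engine's dependency resolution.

```python
from graphlib import TopologicalSorter

# Hypothetical provisioning steps. Each key maps to the set of steps
# that must finish before it can start.
DEPENDENCIES = {
    "approve_request": set(),
    "create_bucket": {"approve_request"},
    "create_database": {"approve_request"},
    "issue_credentials": {"create_database"},
    "close_ticket": {"create_bucket", "issue_credentials"},
}


def provision(dependencies):
    """Visit every step in an order that respects the dependency graph."""
    completed = []
    for step in TopologicalSorter(dependencies).static_order():
        # A real workflow would call a service or update the ticket here.
        completed.append(step)
    return completed


completed_steps = provision(DEPENDENCIES)
print(completed_steps)
```

Because the order is derived from the graph rather than written by hand, adding a new resource type only means adding a node and its dependencies.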

Let's Script Games and Animations with Blender plus Python!

Ayemya M Moe

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 24

Instead of using a paintbrush, paper and traditional canvas, we will explore the art world in a different way: the Python way. What if a not-so-artistic enthusiast wants to jumpstart a new hobby? What if you want to take your first step as a cool game designer, yet you know nothing about design? No worries, “Blender plus Python” can help.

In this poster, we focus on how to use Python scripts with Blender to create amazing games and animations. Blender is a powerful, open-source 3D software toolset for producing graphics, games, animations, visual effects, etc. Why use Python with Blender? The goal is to utilize familiar Python tools such as the Console and Text Editor, along with Blender’s Preferences and Templates, for interactive scripting, customization, and, more importantly, a multi-tasking workflow with autocompletion. For our poster, we will be building exciting animations by using add-ons and scripts to customize and speed up Blender’s functionality. Throughout the process, we will also show how to efficiently apply Python functions, loops, objects and variables to make creativity shine.

In this poster, we aim to help the audience learn how to creatively leverage Python to build beautiful fast animations or games or anything 3-Dimensional.

Development of a Novel Computer-Aided Pneumonia Diagnosis System Using Python

Sakib Reza, Atia Amin, M.M.A Hashem, Ohida Binte Amin

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 25

Pneumonia is a common lung infection that kills thousands of people every year, especially infants and elderly people. Pneumonia is generally diagnosed by interpreting chest X-rays (CXR). Professional radiologists evaluate these X-rays, and this process may take a significant amount of time depending on the queue. Delay in the diagnosis can lead to a serious health risk for the patients. In this context, an automated computer-aided diagnosis system can assist and speed up the whole diagnosis procedure. In this poster, we propose a computer-aided diagnosis system built entirely with Python packages.

Our proposed pneumonia diagnosis system consists of four segments - Chest X-ray View Classification, Lungs Segmentation, Pneumonia Detection, and Medical Report Generation.

First, the system classifies an input CXR into one of two view positions, posterior-anterior (PA) or anterior-posterior (AP), using a view classifier. As these two view positions have different visual appearances, it is sensible to feed them into two distinct specialized models. This view classifier is developed using a novel feature extraction and selection method.

Then the input CXR moves forward to a lung segmentation model, where only the lung areas are segmented and passed to the next stage. As the opacities caused by pneumonia can only appear inside the lung area, this segmentation ensures that only valid opacities are detected. For lung segmentation, we developed a new encoder-decoder model which outperforms the state-of-the-art U-Net model.

After that, the segmented part of the CXR is pushed to the pneumonia detection model where the suspected opacities are detected. Here, a CNN Model with class activation mapping is applied to classify and localize the affected areas.

Finally, taking positions of the opacities, area of affected localities and the confidence score as the inputs, a report generator model generates a final medical report using a fuzzy inference system.

As far as we know, this is a novel type of specialized computer-aided diagnosis system for pneumonia. At PyCon US 2020, we anticipate demonstrating our diagnosis system in the format of a poster. The whole system is designed using different Python packages, including NumPy for matrix operations, SciPy for signal processing, OpenCV and scikit-image for image processing, DEAP for genetic algorithms, SkFuzzy for fuzzy inference, scikit-learn for machine learning and Keras for deep learning. We believe this will exhibit the effective use of Python open-source packages in developing a robust medical diagnosis tool and also allow us to connect with similarly interested Python practitioners.

Pydra - a new dataflow engine

DOROTA JARECKA

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 26

Pydra is a new dataflow engine written in Python. The package is part of the second generation of the Nipype ecosystem, an open-source project that provides a uniform interface to existing neuroimaging software and facilitates interaction between the packages it wraps, but Pydra itself is designed to be easily adapted to any other scientific domain.

The goal of Pydra is to provide a lightweight dataflow engine for workflow graph construction, manipulation, and distributed execution. It facilitates reproducibility of scientific pipelines.
The key features of the new engine are:
- similar user interfaces for a single job task and a workflow,
- flexible splitting options for tasks and workflows that have to be run multiple times for different sets of inputs (similar in concept to the MapReduce model),
- support for reusing previously computed components by using a global cache,
- support for execution in isolated environments (i.e. Docker and Singularity containers), which allows easy re-execution.
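
The splitting and caching features listed above can be illustrated with a small standard-library sketch. This is not Pydra's API: `cache_key`, `run_task`, `run_split`, and the toy `scale` task are hypothetical names, and an in-memory dictionary stands in for the global cache.

```python
import hashlib
import json


def cache_key(func, kwargs):
    """Derive a stable key from the task name and its inputs."""
    payload = json.dumps({"task": func.__name__, "inputs": kwargs}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def run_task(func, cache, **kwargs):
    """Run one task, reusing a cached result when the inputs were seen before."""
    key = cache_key(func, kwargs)
    if key not in cache:
        cache[key] = func(**kwargs)
    return cache[key]


def run_split(func, cache, split_over, values, **fixed):
    """Run the task once per value of the split input (the 'map' step)."""
    return [run_task(func, cache, **{split_over: v}, **fixed) for v in values]


calls = []  # records which inputs were actually computed


def scale(x, factor):
    calls.append(x)
    return x * factor


cache = {}
first = run_split(scale, cache, "x", [1, 2, 3], factor=10)
second = run_split(scale, cache, "x", [2, 3, 4], factor=10)
print(first, second, calls)
```

In the second run only the new input is computed; the results for the inputs already seen come from the cache.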

Python can be tidy too: pandas recipes for normalizing data

Jenna Jordan

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 27

Hadley Wickham’s 2014 paper “Tidy Data” and the resulting tidyverse packages revolutionized how R users organize and work with data for statistical analysis. The idea of tidy data is summarized in three maxims: each variable is a column, each observation is a row, and each type of observational unit is a table. Those familiar with relational databases will recognize that the concept of tidy data is, in fact, Codd’s 3rd normal form reframed for statisticians. As a relational database enthusiast, I love the idea of libraries designed around tidy data. As a Python user, I want those tidy data-oriented libraries to exist in my world. The good news is that they already do - they just aren’t explicitly designed around the tidy data philosophy. My poster will show how the Python library pandas can be used to “tidy” (or normalize) datasets through the use of some specific recipes that combine existing pandas functions. The poster will demonstrate these recipes using two data sources from my domain of political science which desperately need to be tidied: the Correlates of War project datasets and the UCDP/PRIO Armed Conflict dataset.
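
As a rough illustration of what such a recipe can look like, the sketch below reshapes a small wide table into tidy form with `DataFrame.melt`. The table and its column names are made-up toy data, not taken from the Correlates of War or UCDP/PRIO datasets, and this is not necessarily one of the recipes shown on the poster.

```python
import pandas as pd

# Toy "wide" table: one column per year, so the variable "year"
# is hidden in the column headers.
wide = pd.DataFrame(
    {
        "country": ["A", "B"],
        "1990": [10, 30],
        "2000": [20, 40],
    }
)

# Melt the year columns so that each row is one (country, year) observation.
tidy = wide.melt(id_vars="country", var_name="year", value_name="value")
tidy["year"] = tidy["year"].astype(int)
tidy = tidy.sort_values(["country", "year"]).reset_index(drop=True)

print(tidy)
```

After the reshape, every variable (country, year, value) is a column and every observation is a row.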

Learning to Code in Sixth Grade is Awesome

Alexander "Lex" Smirnow

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 28

I am an upper elementary student at Richmond Montessori School. At RMS, all sixth-graders choose a topic for a year-long research project. I chose to research Python programming. As part of this project, I am making a video game in Python. The video game is called “Splatformer.” In this game, the player jumps from block to block and solves puzzles by splatting into a block on the ground, creating a new block.

My poster shows the process I used to design and create this game. There are a lot of options available today to design game levels within existing game platforms, but I am interested in learning how to use multiple different coding languages to increase my skill in coding. Python is a good choice to create a game because it is a very diverse and flexible coding language. And since I am researching Python for my expert project this year, I can also spend time making a video game and it counts as homework.

I want to continue to make my game better, so during the poster session I would like to get advice and ideas from other people in the coding community.

Tune: Scalable hyperparameter tuning framework

Richard Liaw

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 29

If you’ve ever tried to tune hyperparameters for a machine learning model, you know that it can be a very painful process. Simple approaches quickly become time-consuming. And now more than ever, you absolutely need cutting-edge hyperparameter tuning tools to keep up with the state-of-the-art.

This poster introduces Tune, a powerful hyperparameter optimization library designed to remove the friction from scaling experiment execution and hyperparameter search. Here are some of Tune’s core features:

  • Tune provides distributed asynchronous optimization out of the box using Ray (https://github.com/ray-project/ray). This allows users to train multiple models very quickly.
  • Tune offers state-of-the-art algorithms including (but not limited to) ASHA, BOHB, and Population-Based Training.
  • Tune hyperparameter search jobs can scale from a single machine to a large distributed cluster without changing your code.
  • Tune supports any machine learning framework, including PyTorch, TensorFlow, XGBoost, LightGBM, scikit-learn, and Keras.
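
To give a feel for the early-stopping idea behind a scheduler such as ASHA, here is a simplified, synchronous successive-halving sketch in plain Python. It does not use Tune's API, and real ASHA is asynchronous; `successive_halving`, `toy_score`, and the candidate learning rates are hypothetical and exist only for illustration.

```python
import math


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Repeatedly keep the best 1/eta of the configs and give them more budget."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


def toy_score(config, budget):
    """Stand-in for a validation score: best near lr = 0.1, sharper with budget."""
    distance = abs(math.log10(config["lr"]) + 1.0)
    return -distance * (1.0 + 1.0 / budget)


candidates = [{"lr": lr} for lr in (1.0, 0.3, 0.1, 0.03, 0.01, 0.003, 0.001, 0.0001)]
best = successive_halving(candidates, toy_score)
print(best)
```

Weak configurations are dropped after a small budget, so most of the compute goes to the promising ones.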

Tune is open source at https://github.com/ray-project/ray.

Multi-Armed Bandit Problem and its Potential for Clinical Trial Application

Haque Ishfaq, Atia Amin, Hassan Saad Ifti, Hassan Sami Adnan, Samara Sharmeen

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 30

The multi-armed bandit (MAB) is a classic problem for studying the exploration-exploitation tradeoff in sequential decision making. Algorithms designed to tackle this problem have been used in various real-world applications such as click-ad optimization, A/B testing, webpage ranking, dialogue systems, anomaly detection, and portfolio design. In this poster we will show how to implement and experimentally analyze Thompson Sampling and Upper Confidence Bound (UCB), two of the most successful algorithms for this problem, using Python. For our implementation and experiments, we use numpy, pandas and PyPlot. We will also discuss how these algorithms can be used for the optimal design of clinical trials. Finally, we will discuss how solutions to the bandit problem can be used to address ethical issues faced in widely practiced randomized clinical trials.

Through this poster we hope the PyCon community will:

1. Get acquainted with the MAB problem and how its solutions are used in many real-world tech applications.
2. Learn how to implement algorithms that solve the MAB problem and perform comparative analysis in Python.
3. Learn its relevance to large-scale clinical trials, which will be useful to people working on health technologies.
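
For readers new to the two algorithms, a minimal sketch for a Bernoulli bandit follows. It uses only the standard library rather than the numpy and pandas stack mentioned above, and the arm success rates, round count, and function names are illustrative assumptions, not the poster's experiments.

```python
import math
import random


def thompson_sampling(true_rates, rounds, rng):
    """Bernoulli bandit with a Beta(1, 1) prior on each arm."""
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(rounds):
        # Sample a plausible success rate per arm from its posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1) for a in range(n_arms)
        ]
        arm = samples.index(max(samples))
        reward = 1 if rng.random() < true_rates[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


def ucb1(true_rates, rounds, rng):
    """UCB1: play the arm with the highest mean plus exploration bonus."""
    n_arms = len(true_rates)
    pulls = [0] * n_arms
    totals = [0.0] * n_arms
    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # play every arm once first
        else:
            scores = [
                totals[a] / pulls[a] + math.sqrt(2 * math.log(t) / pulls[a])
                for a in range(n_arms)
            ]
            arm = scores.index(max(scores))
        reward = 1 if rng.random() < true_rates[arm] else 0
        pulls[arm] += 1
        totals[arm] += reward
    return pulls


rates = [0.2, 0.5, 0.8]  # hypothetical success rate of each arm (treatment)
ts_pulls = thompson_sampling(rates, 2000, random.Random(0))
ucb_pulls = ucb1(rates, 2000, random.Random(0))
print("Thompson Sampling pulls:", ts_pulls)
print("UCB1 pulls:", ucb_pulls)
```

Both strategies shift most pulls toward the best arm as evidence accumulates, which is the property that makes them attractive for adaptive trial designs.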

Infectious Disease Identification in Human Transcriptome Using Python

Atia Amin, Haque Ishfaq, Hassan Saad Ifti, Hassan Sami Adnan, Samara Sharmeen

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 31

Infectious diseases are a global concern, causing more than 25% of annual deaths worldwide. To control newly emerging and re-emerging diseases, early and accurate detection of pathogens is crucial. Traditional diagnostic procedures are limited in terms of time, cost, sensitivity and the spectrum of available assay targets. Thus, developing more efficient, accurate, and easy-to-use tools for comprehensive diagnostics is necessary.
Higher-throughput, lower-cost next-generation sequencing technologies can help overcome these diagnostic challenges. In this poster, we will present a Python-based bioinformatics pipeline for identifying infectious agents from human genome sequencing data. To perform data analysis and visualization in this pipeline, we use the NumPy, Pandas and Matplotlib libraries. We will show that our pipeline is able to detect previously unrecognized infections and host responses, which will help to discover novel pathogens in the future.

We believe that pathogen detection using computational tools will increase diagnostic yield, improve targeted treatment and help during public health emergencies. The poster will exhibit how Python can be used to create such tools for controlling emerging infectious diseases and improving our health.

Do You Trust Your Data?

Hassan Saad Ifti, Atia Amin, Haque Ishfaq, Hassan Sami Adnan, Samara Sharmeen

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 32

The importance of data-driven solutions in our lives has become ever more significant. From an individual’s daily grocery shopping to high-end scientific solutions, data are playing a larger role than ever before. Contrary to popular belief, the collection of data is merely a small step in the whole process of data acquisition and interpretation. In the ‘real world,’ no data collection method is infallible. These ‘errors’, if not accounted for, could lead to erroneous conclusions. It is therefore necessary to quantify the ‘errors’ in data. This also allows for improved data collection in the future. In this poster, we present how a simple uncertainty quantification can be performed in Python using a Monte Carlo simulation. Instead of going through the Monte Carlo method step by step, we will show why it is important, how it is employed, and where you can apply this method. A script will be provided on GitHub. Whichever field you’re in, next time you present your data, you will know how much you trust your data.
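
A minimal sketch of Monte Carlo uncertainty propagation is shown below. It is not the script the authors mention: the pendulum example, the measurement values and uncertainties, and the names `monte_carlo_uncertainty` and `pendulum_g` are all hypothetical. The idea is to draw each measured input from a normal distribution, push every draw through the model, and read the uncertainty off the spread of the outputs.

```python
import math
import random
import statistics


def monte_carlo_uncertainty(model, inputs, n_samples=20000, seed=0):
    """Propagate input uncertainties through `model` by random sampling.

    `inputs` maps each argument name to a (mean, standard deviation) pair.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_samples):
        draw = {name: rng.gauss(mean, sigma) for name, (mean, sigma) in inputs.items()}
        outputs.append(model(**draw))
    return statistics.mean(outputs), statistics.stdev(outputs)


def pendulum_g(length, period):
    """Gravitational acceleration from a simple pendulum: g = 4*pi^2*L / T^2."""
    return 4 * math.pi**2 * length / period**2


# Hypothetical measurements: length in metres, period in seconds.
g_mean, g_sigma = monte_carlo_uncertainty(
    pendulum_g,
    {"length": (1.000, 0.005), "period": (2.007, 0.010)},
)
print(f"g = {g_mean:.2f} +/- {g_sigma:.2f} m/s^2")
```

The same helper works for any function of uncertain inputs, which is what makes the approach convenient when an analytic error-propagation formula is awkward to derive.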

Learning as a TA helper in an Adafruit Maker Class

Aarav, Meenal Pant

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 33

I am an 11-year-old student, who likes to do Maker projects. I attended PyCon 2019, where I did a fun workshop hosted by Adafruit. I also won a Maker kit in the Microsoft booth.
In summer 2019, I became a teaching assistant for a Maker class where I was able to teach what I learned at the Adafruit workshop. When the class needed help I would try my best to answer their questions or look it up on Google. One thing I helped the students with was resetting their microcontrollers to default when they messed them up. I also helped the teacher with supplies (batteries, scissors, etc.). This poster will describe the maker project that I helped out with and what I learned from this experience.
I will cover:

  • Introduction to our robot project
  • Learning and teaching
    • How to use Mu-editor
    • Going through Adafruit worksheet and various code samples with class
    • Challenges
  • Making the robot with materials like cardboard, silly eyes, servo motor, microcontroller
  • Sharing our learnings in a local showcase

Machine Learning for Genomics: Random Forest for Schizophrenia Gene Prioritization

Kynon JM Benjamin

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 34

Schizophrenia is a devastating neuropsychiatric disorder characterized by a combination of psychotic symptoms, like hallucinations and delusions, motivational symptoms, like anhedonia and avolition, and cognitive dysfunction. Schizophrenia is currently understood as a neurodevelopmental disorder in which complex interactions of genetic and environmental factors contribute to dysfunctional brain development. The power of high-throughput genomic technologies like RNA-sequencing (RNA-seq) is unquestionable; however, the high dimensionality and highly correlated structure of genomic data pose several challenges for traditional statistical and bioinformatic data analysis. Random forest is uniquely suited to dealing with correlation and interactions among genomic features due to the grouping property of tree building, and can be used to select and rank features via feature elimination. In this study, we leveraged the large datasets from the BrainSeq Consortium to apply random forest with feature elimination for prioritization of genomic features predictive of schizophrenia.

Unsupervised Neural Machine Translation from West African Pidgin to English

Kelechi Ogueji, Orevaoghene Ahia

Sunday 10 a.m.–1 p.m. in posters - Expo Hall - Board 35

Over 1000 languages are spoken across West and Central Africa, but one language significantly unifies them all: pidgin. Although it has over 100 million speakers, no natural language processing work has been done on this language. Our work provides the first ever pidgin corpus, trains the first ever pidgin word vectors and successfully performs unsupervised neural machine translation from and to English.