Data Wrangling 101 (with Python) (Virtual)

When
-
Where

Online

Data in the wild can be messy, malformed, or otherwise ill-suited to the requirements of statistical analyses and machine-learning techniques. In this intermediate-level workshop, you'll learn how to use Python to clean, reshape, and transform data prior to analysis. Topics covered may include:

  • Editing strings with regular expressions
  • Converting data between wide and long formats
  • Dealing with null values
  • Grouping and aggregating data
  • Working with time series/datetime types
  • Encoding categorical values
  • Importing data from and exporting data to common formats
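
To give a flavor of a few of these topics, here is a minimal sketch that touches regular expressions, null values, datetime parsing, grouping, and export. It is an illustration only, not the workshop's own material: the records, the field names, and the helper functions clean_site and clean_row are invented for this example, and it uses only the Python standard library, whereas the workshop itself may use other libraries.

```python
import csv
import io
import re
from collections import defaultdict
from datetime import datetime

# Hypothetical messy records, invented for illustration.
raw_rows = [
    {"site": " Site-A ", "date": "2024-01-05", "count": "12"},
    {"site": "site a", "date": "2024-01-06", "count": ""},
    {"site": "SITE_B", "date": "2024-01-05", "count": "7"},
]


def clean_site(name):
    """Trim, lowercase, and collapse spaces, hyphens, and underscores to '_'."""
    return re.sub(r"[\s_-]+", "_", name.strip().lower())


def clean_row(row):
    """Return a copy of the row with a normalized label, a date, and an int or None."""
    return {
        "site": clean_site(row["site"]),
        "date": datetime.strptime(row["date"], "%Y-%m-%d").date(),
        # Treat an empty string as a missing (null) value.
        "count": int(row["count"]) if row["count"].strip() else None,
    }


cleaned = [clean_row(r) for r in raw_rows]

# Group by site and sum the counts, skipping missing values.
totals = defaultdict(int)
for row in cleaned:
    if row["count"] is not None:
        totals[row["site"]] += row["count"]

# Export the aggregated result as CSV text (written to memory, not to a file).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["site", "total_count"])
for site, total in sorted(totals.items()):
    writer.writerow([site, total])
csv_text = buffer.getvalue()
```

Skipping missing counts during aggregation is only one possible choice; depending on the analysis, it may be more appropriate to fill them in or to drop the affected rows.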

Attendees will understand the importance of clean, well-formed datasets; practice using Python to clean, reshape, and format data in various ways; and become familiar with best practices in writing Python code. It will be helpful to have had prior exposure to Python, such as through the "Introduction to Python" workshop or Python Camp. No installation of Python is needed.

This workshop is part of the Using Programming and Code for Research workshop series for anyone who wants to get started with or learn more about using programming languages like Python and R, or other applications. These tools can help you collect, manipulate, clean, analyze, and visualize research data or automate repetitive tasks. If you need personalized assistance with a data analysis, programming, or coding project, consider booking a consultation with one of our librarian-experts.

All sessions are free to GW students, faculty, staff, and alumni.