Python Data Analysis with NumPy and pandas (PYT251)
If you or your team use, or plan to use, Python for data science or data analytics, this is the right Python course for you. The course assumes that you already have a good amount of Python training and/or experience. Your live instructor will start the class by teaching you how to use Jupyter Notebook, a great tool for writing, testing, and sharing short Python programs. Even if you do not end up using Jupyter Notebook as your main Python IDE, you will appreciate having it in your Python toolkit.
You will learn NumPy, which makes working with arrays and matrices (in place of lists and lists of lists) much more efficient, and pandas, which makes manipulating, munging, slicing, and grouping data much easier. You will also learn some simple data visualization techniques with matplotlib.
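As a rough illustration of the difference described above, the following minimal sketch contrasts a plain list with a NumPy array and then groups a small pandas DataFrame. The data and the names `prices`, `sales`, `region`, and `amount` are hypothetical and are not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only.
prices = [10.0, 20.0, 30.0, 40.0]

# With a plain list, element-wise math needs an explicit loop or comprehension.
doubled_list = [p * 2 for p in prices]

# With a NumPy array, the same operation is a single vectorized expression.
doubled_array = np.array(prices) * 2

# pandas adds labeled columns, which keeps slicing and grouping concise.
sales = pd.DataFrame({
    "region": ["east", "west", "east", "west"],
    "amount": prices,
})
totals = sales.groupby("region")["amount"].sum()

print(doubled_list)
print(doubled_array)
print(totals)
```

On four values the speed difference is negligible; the efficiency gain from vectorized NumPy operations typically shows up on much larger arrays.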
- Learn to work with Jupyter Notebook.
- Learn to use NumPy to work with arrays and matrices of numbers.
- Learn to work with pandas to analyze data.
- Learn to work with matplotlib from within pandas.
- Jupyter Notebook
  - Getting Started with Jupyter Notebook
  - Creating Your First Jupyter notebook
  - More Experimenting with Jupyter Notebook
  - Getting the Class Files
  - Markdown
  - Magic Commands
  - Automagic
  - Autosave
  - Directory Commands
  - Bookmarking
  - Command History
  - Last Three Inputs and Outputs
  - Environment Variables
  - Loading and Running Code from Files
  - Shell Execution
  - More Magic Commands
  - Getting Help
- NumPy
  - Efficiency
  - NumPy Arrays
  - Getting Basic Information about an Array
  - np.arange()
  - Similar to Lists
  - Different from Lists
  - Universal Functions
  - Multiplying Array Elements
  - Multi-dimensional Arrays
  - Retrieving Data from an Array
  - Modifying Parts of an Array
  - Adding a Row Vector to All Rows
  - More Ways to Create Arrays
  - Getting the Number of Rows and Columns in an Array
  - Random Sampling
  - Rolling Doubles
  - Using Boolean Arrays to Get New Arrays
  - More with NumPy Arrays
- pandas
  - Series
  - Other Ways of Creating Series
  - np.nan
  - Accessing Elements from a Series
  - Retrieving Data from a Series
  - Series Alignment
  - Using Boolean Series to Get New Series
  - Comparing One Series with Another
  - Element-wise Operations and the apply() Method
  - Series: A More Practical Example
  - DataFrame
  - Creating a DataFrame from a NumPy Array
  - Creating a DataFrame using Existing Series as Rows
  - Creating a DataFrame using Existing Series as Columns
  - Creating a DataFrame from a CSV
  - Exploring a DataFrame
  - Getting Columns
  - Exploring a DataFrame
  - Cleaning Data
  - Getting Rows
  - Combining Row and Column Selection
  - Scalar Data: at[] and iat[]
  - Boolean Selection
  - Using a Boolean Series to Filter a DataFrame
  - Series and DataFrames
  - Plotting with matplotlib
  - Inline Plots in Jupyter Notebook
  - Line Plot
  - Bar Plot
  - Annotation
  - Plotting a DataFrame
  - Other Kinds of Plots
  - Series
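To give a flavor of a few of the NumPy and pandas topics listed above, here is a short sketch. It is not drawn from the class files; the arrays, index labels, names, and scores are made up for illustration.

```python
import numpy as np
import pandas as pd

# np.arange() plus reshape gives a small multi-dimensional array.
grid = np.arange(6).reshape(2, 3)

# Adding a row vector to all rows: NumPy broadcasts the row across the array.
row = np.array([10, 20, 30])
shifted = grid + row

# Using a Boolean array to get a new array containing only the even values.
evens = grid[grid % 2 == 0]

# Series alignment: values are matched by index label, not by position.
# Labels that appear in only one Series produce NaN in the result.
a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])
aligned = a + b

# Using a Boolean Series to filter a DataFrame (hypothetical data).
df = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "score": [90, 72, 85]})
high = df[df["score"] > 80]

# Scalar access: at[] uses labels, iat[] uses integer positions.
first_by_label = df.at[0, "name"]
first_by_position = df.iat[0, 1]

print(shifted)
print(evens)
print(aligned)
print(high)
print(first_by_label, first_by_position)
```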
Each student will receive a comprehensive set of materials, including course notes and all the class examples.
Experience in the following is required for this Python class:
- Basic Python programming experience. In particular, you should be very comfortable with:
  - Working with strings.
  - Working with lists, tuples, and dictionaries.
  - Loops and conditionals.
  - Writing your own functions.