Intro to Machine Learning with Python
This course assumes a basic knowledge of Python, i.e. familiarity with basic data structures and operations, and how to write a script. The notebooks have been developed to be compatible with Python 2.7.x and 3.x, and may require some Python packages (e.g. scikit-learn) to be updated to the most recent version.
The dataset used in this workshop is compiled from Galaxy Zoo DR1 and the Sloan Digital Sky Survey (SDSS), using the DR9 SQL search.
The notebooks are detailed enough to be worked through at your own pace, and a solution notebook is also provided. The lecture notes explain the intuition behind how different machine learning algorithms work.
Topics covered
- Data preparation
- Exploratory data analysis
- Classification
- Cross-validation
- Learning curves
- Model tuning
- Reporting
- Regression
- Clustering
- Dimensionality reduction
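
To give a flavour of two of the topics above (classification and cross-validation), here is a minimal sketch using only the Python standard library. It is not taken from the workshop notebooks, which rely on scikit-learn and the Galaxy Zoo/SDSS data. The data here is synthetic, the nearest-centroid classifier is a stand-in chosen for brevity, and all function names (`k_fold_indices`, `fit_centroids`, `predict`, `cross_val_accuracy`) are hypothetical.

```python
from __future__ import division, print_function  # keeps Python 2.7 behaviour in line with 3.x

import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and deal them into n_folds test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def fit_centroids(X, y):
    """Compute the mean feature vector (centroid) of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(centroids, row):
    """Return the label of the centroid closest to row (squared Euclidean distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def cross_val_accuracy(X, y, n_folds=5, seed=0):
    """Train on all folds but one, score on the held-out fold, repeat for every fold."""
    scores = []
    for test_idx in k_fold_indices(len(X), n_folds, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        centroids = fit_centroids([X[i] for i in train_idx],
                                  [y[i] for i in train_idx])
        correct = sum(predict(centroids, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


# Synthetic two-class data (an assumption for illustration, not the workshop dataset):
# 50 points scattered around (0, 0) and 50 points scattered around (5, 5).
rng = random.Random(42)
X = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)] +
     [[rng.gauss(5, 1), rng.gauss(5, 1)] for _ in range(50)])
y = [0] * 50 + [1] * 50

scores = cross_val_accuracy(X, y, n_folds=5)
print("accuracy per fold:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

Each sample is held out exactly once, so the mean of the per-fold scores estimates how the classifier would perform on data it has not seen during training.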