{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Homework 5 - Classification of flower petal shapes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This data set consists of 3 different types of irises’ (Setosa, Versicolour, and Virginica) petal and sepal length, stored in a 150x4 numpy.ndarray.\n",
    "\n",
    "The rows being the samples and the columns being:\n",
    "\n",
    "1. sepal length,\n",
    "2. sepal width,\n",
    "3. petal length and\n",
    "4. petal width."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn import datasets\n",
    "iris = datasets.load_iris()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The data comes as a dictionary. You can access the predictors using `iris.data` and the classes using `iris.target`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = iris.data\n",
    "y = iris.target"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Task**: How many samples are in the data set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Task**: Plot the sepal length on the x-axis and the sepal width on the y-axis. Color each of the three types of irises differently.\n",
    "Add a legend that gives the correct iris type (0-Setosa, 1-Versicolour, 2-Virginica)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Task**:\n",
    "Split your data into a training and a test set.\n",
    "Put the first 40 samples within each class in the training set and the remaining samples in a test data set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the lecture you've heard about the classification method called\n",
    "*Linear discriminant analysis (LDA)*.\n",
    "\n",
    "**Task**: Find a way using `scikit-learn` to accomplish a linear discriminant analysis.\n",
    "\n",
    "Perform an LDA using only the first two predictors, i.e., `sepal length` and `sepal width`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Task**: What is the proportion of correctly classified irises in the *test* data set."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Task**: Now, incorporate all of the predictors. How does the proportion of correct classifications change?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}