Perceptron

Author

Alexandre Dauphin

1 Classification task

In this section, we discuss yet another linear classifier: the perceptron. The perceptron is a machine learning algorithm that separates a linearly separable dataset with a hyperplane.

To this end, we use the make_blobs function of scikit-learn to generate a dataset with two labelled clusters in two dimensions.

import numpy as np
from sklearn.datasets import make_blobs

# Two Gaussian clusters centered at (3, 3) and (0, 0); relabel the classes as -1/+1
x, y = make_blobs(centers=np.array([[3, 3], [0, 0]]), cluster_std=0.5)
y[y == 0] = -1
print(x[:4])
print(y[:4])
[[-1.18004026 -0.61965393]
 [-0.57032133  0.96112062]
 [ 3.67974571  2.09263418]
 [ 0.16344673 -0.22585435]]
[ 1  1 -1  1]

Figure 1 shows the generated dataset.

Code
import plotly.express as px

fig = px.scatter(x=x[:, 0], y=x[:, 1], color=y.astype(str))
fig.update_layout(xaxis_title='x1', yaxis_title='x2')
Figure 1: Generated dataset with two clusters

The goal of the perceptron algorithm is to find a hyperplane that separates these two clusters. Let us therefore consider the line

\[f(x) = w_0+w_1x_1+w_2 x_2=0\]

The unit vector normal to this line is given by \(\mathbf{w}^*=\mathbf{w}/\parallel \mathbf{w} \parallel\), where \(\mathbf{w}=(w_1,w_2)^T\).

Exercise

Show that

  • for any point \(\mathbf{x}_1\) on the line \[ \mathbf{w}^T \mathbf{x}_1 = -w_0;\]

  • for any two points \(\mathbf{x}_1\) and \(\mathbf{x}_2\) on the line \[ {\mathbf{w}^*}^T (\mathbf{x}_1-\mathbf{x}_2)=0.\]

The signed distance between a point \(\mathbf{x}\) and the line is then given by

\[{\mathbf{w}^*}^T(\mathbf{x}-\mathbf{x}_0)=\frac{\mathbf{w}^T\mathbf{x}+w_0}{\parallel \mathbf{w}\parallel}=\frac{f(\mathbf{x})}{\parallel \mathbf{w}\parallel},\]

where \(\mathbf{x}_0\) is any point on the line.
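As a quick numerical sanity check (with made-up weights \(w_0=1\), \(w_1=2\), \(w_2=-1\), not the ones learned later), one can verify the two properties of the exercise and the signed-distance formula directly:

import numpy as np

w = np.array([1.0, 2.0, -1.0])            # hypothetical weights [w0, w1, w2]
w_vec = w[1:]                              # w = (w1, w2)
w_star = w_vec / np.linalg.norm(w_vec)     # unit normal vector w*

# Two points on the line w0 + w1*x1 + w2*x2 = 0 (here x2 = 1 + 2*x1)
p1 = np.array([0.0, 1.0])
p2 = np.array([1.0, 3.0])

print(w_vec @ p1)              # -1.0, i.e. -w0          (first property)
print(w_star @ (p1 - p2))      # 0.0                     (second property)

# Signed distance of an arbitrary point, computed both ways
x_new = np.array([2.0, 0.0])
print(w_star @ (x_new - p1))                           # ~2.236
print((w_vec @ x_new + w[0]) / np.linalg.norm(w_vec))  # same value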

2 Loss function

The signed distance between the line and a point discussed in the previous section offers a natural way to define a loss function. For each data point \((\mathbf{x}_i, y_i)\), we would like to maximize the product

\[y_i (\mathbf{w}^T\mathbf{x}_i+w_0).\]

Indeed, when \(y_i\) and \(\mathbf{w}^T\mathbf{x}_i+w_0\) have the same sign, this quantity is positive.

Therefore, we define the loss function to minimize as the sum over the set \(\mathcal{M}\) of misclassified examples

\[L =-\sum_{i\in\mathcal{M}} y_i(\mathbf{w}^T\mathbf{x}_i+w_0).\]
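The animation below relies on a DataFrame df that stores the optimization history, one row per step, with the weight vector in the column 'w' and the corresponding loss in 'value'; the training code itself is not shown in this section. As a minimal sketch (not necessarily the code used here), such a history could be produced with the classic perceptron update, i.e. a stochastic gradient step on the loss above; the helper names train_perceptron and perceptron_loss, the learning rate lr, and the step budget n_steps are assumptions:

import numpy as np
import pandas as pd

def perceptron_loss(w, x, y):
    # Sum of -y_i * f(x_i) over the misclassified points only
    f = w[0] + x @ w[1:]
    mis = y * f <= 0
    return -np.sum(y[mis] * f[mis])

def train_perceptron(x, y, lr=0.1, n_steps=100, seed=0):
    rng = np.random.default_rng(seed)
    w = rng.normal(size=3)                    # [w0, w1, w2]
    history = []
    for _ in range(n_steps):
        history.append({'w': w.copy(), 'value': perceptron_loss(w, x, y)})
        f = w[0] + x @ w[1:]
        mis = np.where(y * f <= 0)[0]         # indices of misclassified points
        if mis.size == 0:                     # no misclassified point left: stop
            break
        i = rng.choice(mis)                   # pick one misclassified point
        w[0] += lr * y[i]                     # perceptron update on the bias
        w[1:] += lr * y[i] * x[i]             # ... and on the weights
    return pd.DataFrame(history)

df = train_perceptron(x, y)

Each update pushes the hyperplane towards the selected misclassified point: the gradient of \(-y_i(\mathbf{w}^T\mathbf{x}_i+w_0)\) with respect to \((w_0,\mathbf{w})\) is \(-(y_i, y_i\mathbf{x}_i)\), so the step above is a stochastic gradient descent step on \(L\).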

def ff(w, x):
    # Solve w0 + w1*x1 + w2*x2 = 0 for x2: ordinate of the line at abscissa x
    return -(w[1]*x + w[0]) / w[2]

# Two abscissas spanning the data range, used to draw the line
x1 = np.array([x[:, 0].min() - 1, x[:, 0].max() + 1])

# Ordinates of the candidate line at each step of the training history df
vec = np.zeros((df.shape[0], 2))
for i, w in enumerate(df['w']):
    vec[i, :] = ff(w, x1)

# Loss value at each step
loss = df['value'].to_numpy()
Code
import plotly.graph_objects as go

# Map the labels to explicit marker colors
color = np.where(y > 0, "blue", "red")

frames = [go.Frame(data=[go.Scatter(x=x1, y=vec[i,:],mode='lines')],layout=go.Layout(title_text=f'step:{i}, Loss:{loss[i]:.2f}')) for i in range(loss.size)]

buttons = [dict(label="Play",method="animate",
                args=[None, {"frame": {"duration": 100, "redraw": True},
                             "fromcurrent": True, 
                             "transition": {"duration": 300,"easing": "quadratic-in-out"}}]),
           dict(label="Pause",method="animate",
                args=[[None], {"frame": {"duration": 0, "redraw": False},"mode": "immediate","transition": {"duration": 0}}]),
          dict(label="Restart",method="animate",
                args=[None])]

Fig = go.Figure(
    data=[go.Scatter(x=x1, y= vec[0,:],mode='lines',name = 'line'),
          go.Scatter(x=x[:,0], y=x[:,1], mode="markers", marker_color=color,name='data',
                hovertemplate='x:%{x:.2f}'
                +'<br>y:%{y:.2f}</br><extra></extra>')],
    layout=go.Layout(
        xaxis=dict(range=[x[:,0].min()-2, x[:,0].max()+2], autorange=False),       
        yaxis=dict(range=[x[:,1].min()-2, x[:,1].max()+2], autorange=False),
        updatemenus=[dict(
            type="buttons",
            buttons=buttons)]
    ),
    frames= frames
)

Fig.show()
Figure 2: Animation of the perceptron training: the candidate separating line and the value of the loss at each step