ML-As-3

Problem 1: Hard-margin SVM. (18 pts)

You are given the following two sets of data points, each belonging to one of the two classes
(class 1 and class -1):

  • Class 1 (labeled as +1):
  • Class -1 (labeled as -1):

Please find the optimal separating hyperplane using a linear SVM and derive the equation of the hyperplane. Assume the hard-margin SVM.

1. Write down the formulation of SVM, including the separating hyperplane, the constraints, and the final optimization problem with parameters. (4 pts)

The hyperplane is defined through $w$ and $b$ as the set of points $x$ such that

  $w^\top x + b = 0$

where:

  • $w$: weight vector
  • $b$: scalar bias

Subject to the constraint

  $y_i (w^\top x_i + b) \ge 1, \quad i = 1, \dots, n$

  • $y_i \in \{+1, -1\}$ is the class label of $x_i$

Final optimization problem:

  $\min_{w, b} \ \frac{1}{2} \|w\|^2 \quad \text{subject to} \quad y_i (w^\top x_i + b) \ge 1, \ i = 1, \dots, n$

2. Write down the Lagrangian form for this problem using the parameters $w$, $b$ and the Lagrange multipliers $\alpha_i$. Please also write out its dual form. (10 pts)

The Lagrangian form:

  $L(w, b, \alpha) = \frac{1}{2} \|w\|^2 - \sum_{i=1}^{n} \alpha_i \left[ y_i (w^\top x_i + b) - 1 \right]$

where $\alpha_i \ge 0$ are the Lagrange multipliers associated with each constraint.

Setting $\partial L / \partial w = 0$ gives $w = \sum_i \alpha_i y_i x_i$, and $\partial L / \partial b = 0$ gives $\sum_i \alpha_i y_i = 0$. Substituting these back yields the dual form of the optimization problem:

  $\max_{\alpha} \ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j$

subject to

  $\alpha_i \ge 0, \quad \sum_{i=1}^{n} \alpha_i y_i = 0$
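
For concreteness, here is a minimal sketch of solving this dual numerically with cvxpy. The points in `X` and labels in `y` are hypothetical stand-ins, since the assignment's actual data are not reproduced here:

```python
import numpy as np
import cvxpy as cp

# Hypothetical, linearly separable toy points; the assignment's actual
# data are not reproduced here.
X = np.array([[3.0, 3.0], [4.0, 4.0], [1.0, 1.0], [0.0, 2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
n = len(y)

Q = (y[:, None] * X) @ (y[:, None] * X).T   # Q_ij = y_i y_j x_i^T x_j

alpha = cp.Variable(n)
cp.Problem(
    cp.Maximize(cp.sum(alpha) - 0.5 * cp.quad_form(alpha, cp.psd_wrap(Q))),
    [alpha >= 0, y @ alpha == 0],
).solve()

w = (alpha.value * y) @ X        # w = sum_i alpha_i y_i x_i
s = int(np.argmax(alpha.value))  # any point with alpha_s > 0 is a support vector
b = y[s] - w @ X[s]              # b = y_s - w^T x_s
print("w =", w, " b =", float(b))
```

Any point with $\alpha_i > 0$ at the optimum is a support vector, which is how `b` is recovered in the last step.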

3. Assume that the Lagrange multipliers $\alpha_i$'s are all 0.5 and that the point given in the problem is a support vector, for ease of calculation. Please calculate the values of the weight vector $w$ and bias $b$. Write out the explicit form of the hyperplane. (4 pts)

If $\alpha_i = 0.5$ for all $i$, then from the stationarity condition:

  $w = \sum_{i} \alpha_i y_i x_i = 0.5 \sum_{i} y_i x_i$

Since the support vector $x_s$ satisfies $y_s (w^\top x_s + b) = 1$ with $y_s \in \{+1, -1\}$, we have:

  $b = y_s - w^\top x_s$

Substituting the computed $w$ and $b$ into $w^\top x + b = 0$ gives the explicit form of the hyperplane.
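
As a sanity check on these formulas, a short numpy sketch; the data and the support-vector index are stand-ins, since the assignment's actual values are not shown here:

```python
import numpy as np

# Stand-in data; the assignment's actual points are not shown here.
X = np.array([[3.0, 3.0], [4.0, 4.0], [1.0, 1.0], [0.0, 2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha = np.full(len(y), 0.5)     # alpha_i = 0.5 for all i, as assumed

w = (alpha * y) @ X              # w = sum_i alpha_i y_i x_i
s = 0                            # assumed index of the given support vector
b = y[s] - w @ X[s]              # from y_s (w^T x_s + b) = 1

print(f"hyperplane: {w[0]:.2f}*x1 + {w[1]:.2f}*x2 + {b:+.2f} = 0")
```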

Problem 2: Soft-margin SVM. (20 pts)

Suppose we have data points $x_1, \dots, x_n$ with corresponding labels $y_1, \dots, y_n \in \{+1, -1\}$. We want to use a soft-margin SVM to classify these data points with a regularization parameter $C$.

1. Write down the formulation of the soft-margin SVM for this problem using $w$, $b$, $\xi$, and $C$. Write out their dimensions explicitly. (3 pts)

For a soft-margin SVM, the optimization problem can be formulated as follows:

  $\min_{w, b, \xi} \ \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i$

subject to:

  $y_i (w^\top x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \dots, n$

where:

  • $w$ is the weight vector,
  • $b$ is the bias,
  • $\xi = (\xi_1, \dots, \xi_n)$ is the vector of slack variables, and
  • $C$ is the regularization parameter that controls the trade-off between maximizing the margin and minimizing the classification error.

Dimensions (for $n$ data points in $d$ dimensions):

  • the stacked data matrix $X$ has dimension $n \times d$,
  • $y$ has dimension $n \times 1$,
  • $w$ has dimension $d \times 1$,
  • $b$ is a scalar,
  • $\xi$ has dimension $n \times 1$.
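
A minimal cvxpy sketch of this primal, assuming toy data and $C = 1$ (the assignment's actual values are not shown here):

```python
import numpy as np
import cvxpy as cp

# Hypothetical toy data and C; the assignment's actual values are not shown here.
X = np.array([[3.0, 3.0], [4.0, 4.0], [1.0, 1.0], [0.0, 2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
C = 1.0
n, d = X.shape

w = cp.Variable(d)    # weight vector, d x 1
b = cp.Variable()     # scalar bias
xi = cp.Variable(n)   # slack variables, n x 1

cp.Problem(
    cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi)),
    [cp.multiply(y, X @ w + b) >= 1 - xi, xi >= 0],
).solve()

print("w =", w.value, " b =", float(b.value), " xi =", xi.value)
```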

2. Write down the Lagrangian form and derive the dual for the problem. Write down the detailed derivation steps. (12 pts)

The primal Lagrangian is:

  $L(w, b, \xi, \alpha, \mu) = \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i - \sum_{i=1}^{n} \alpha_i \left[ y_i (w^\top x_i + b) - 1 + \xi_i \right] - \sum_{i=1}^{n} \mu_i \xi_i$

where $\alpha_i \ge 0$ and $\mu_i \ge 0$ are Lagrange multipliers for the constraints $y_i (w^\top x_i + b) \ge 1 - \xi_i$ and $\xi_i \ge 0$, respectively.

To derive the dual problem, we take the partial derivatives of $L$ with respect to $w$, $b$, and $\xi_i$ and set them to zero:

Partial derivative with respect to $w$:

  $\frac{\partial L}{\partial w} = w - \sum_{i=1}^{n} \alpha_i y_i x_i = 0 \quad \Rightarrow \quad w = \sum_{i=1}^{n} \alpha_i y_i x_i$

Partial derivative with respect to $b$:

  $\frac{\partial L}{\partial b} = -\sum_{i=1}^{n} \alpha_i y_i = 0 \quad \Rightarrow \quad \sum_{i=1}^{n} \alpha_i y_i = 0$

Partial derivative with respect to $\xi_i$:

  $\frac{\partial L}{\partial \xi_i} = C - \alpha_i - \mu_i = 0 \quad \Rightarrow \quad \mu_i = C - \alpha_i$

By substituting these back into $L$ (the terms in $\xi_i$ cancel), we obtain the dual problem:

  $\max_{\alpha} \ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j$

subject to:

  $0 \le \alpha_i \le C, \quad \sum_{i=1}^{n} \alpha_i y_i = 0$
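
This dual can be solved numerically as well; compared with the hard-margin sketch in Problem 1, the only change is the box constraint $0 \le \alpha_i \le C$. Data and $C$ below are stand-ins:

```python
import numpy as np
import cvxpy as cp

# Stand-in data and C, as in the primal sketch above.
X = np.array([[3.0, 3.0], [4.0, 4.0], [1.0, 1.0], [0.0, 2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
C = 1.0
Q = (y[:, None] * X) @ (y[:, None] * X).T   # Q_ij = y_i y_j x_i^T x_j

alpha = cp.Variable(len(y))
cp.Problem(
    cp.Maximize(cp.sum(alpha) - 0.5 * cp.quad_form(alpha, cp.psd_wrap(Q))),
    [alpha >= 0, alpha <= C, y @ alpha == 0],   # box constraint 0 <= alpha_i <= C
).solve()
print("alpha =", alpha.value)
```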

3. Obtain the decision boundary. (3 pts)

The decision boundary is given by:

  $w^\top x + b = \sum_{i=1}^{n} \alpha_i y_i \, x_i^\top x + b = 0$

where $w = \sum_{i=1}^{n} \alpha_i y_i x_i$ from the dual problem. To classify a new point $x_{\text{new}}$, we use the decision function:

  $f(x_{\text{new}}) = \operatorname{sign}\left( \sum_{i=1}^{n} \alpha_i y_i \, x_i^\top x_{\text{new}} + b \right)$
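
A small numpy sketch of this decision function, with stand-in values for $\alpha$, $b$, and the data (none of these are the assignment's actual values):

```python
import numpy as np

# Decision function f(x) = sign(sum_i alpha_i y_i x_i^T x + b), with
# hypothetical alpha, b, and data standing in for the assignment's values.
X = np.array([[3.0, 3.0], [4.0, 4.0], [1.0, 1.0], [0.0, 2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha = np.array([0.5, 0.0, 0.5, 0.0])   # assumed multipliers
b = -2.0                                 # assumed bias

def decide(x_new):
    score = (alpha * y) @ (X @ x_new) + b
    return 1 if score > 0 else -1

print(decide(np.array([5.0, 5.0])))      # -> 1 with these stand-in values
```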

4. Explain why $\xi$ disappears in the dual. (2 pts)

In the dual formulation, $\xi$ disappears because it enters the primal only through the objective term $C \sum_i \xi_i$ and the constraints, not through any quantity that survives into the dual variables.
Taking the partial derivative of $L$ with respect to $\xi_i$ and setting it to zero gives $\mu_i = C - \alpha_i$; substituting this back makes every term containing $\xi_i$ cancel, since $(C - \alpha_i - \mu_i)\,\xi_i = 0$.
Consequently, $\xi$ is not explicitly present in the dual formulation, as its effect is fully captured by the box constraint on $\alpha$ (specifically, $0 \le \alpha_i \le C$).

Problem 3: Kernel SVM. (17 pts)

Consider the following dataset with four training points:

We want to use the polynomial kernel $K(x_i, x_j)$ to classify these data points with a soft-margin SVM, with regularization parameter $C$.


1. Compute the Kernel Matrix (6 pts)

The kernel matrix $K$ is a $4 \times 4$ matrix whose entries are $K_{ij} = K(x_i, x_j)$. Each entry is computed by applying the given kernel function to the corresponding pair of training points. Since $K$ is symmetric ($K_{ij} = K_{ji}$), only the diagonal and upper-triangular entries need to be computed explicitly; the remaining entries are filled in by symmetry.
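
A short numpy sketch of this computation, assuming a degree-2 polynomial kernel $K(x, z) = (x^\top z + 1)^2$ and stand-in points, since the assignment's exact kernel and data are not shown here; note how the loop exploits symmetry:

```python
import numpy as np

# Assumed polynomial kernel K(x, z) = (x^T z + 1)^2; the assignment's exact
# kernel (degree, offset) and its four training points are not shown here.
def poly_kernel(x, z, degree=2, c=1.0):
    return (x @ z + c) ** degree

# Stand-in training points.
X = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]])

n = len(X)
K = np.empty((n, n))
for i in range(n):
    for j in range(i, n):          # compute diagonal and upper triangle ...
        K[i, j] = poly_kernel(X[i], X[j])
        K[j, i] = K[i, j]          # ... and fill the lower triangle by symmetry
print(K)
```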

2. Set up the Dual Optimization Problem. You can use the results from Problem 2. (4 pts)

Using the results from Problem 2, the dual problem for a soft-margin SVM with a kernel function becomes:

  $\max_{\alpha} \ \sum_{i=1}^{4} \alpha_i - \frac{1}{2} \sum_{i=1}^{4} \sum_{j=1}^{4} \alpha_i \alpha_j y_i y_j \, K(x_i, x_j)$

subject to:

  $0 \le \alpha_i \le C, \quad \sum_{i=1}^{4} \alpha_i y_i = 0$

where $K(x_i, x_j)$ is the polynomial kernel specified in this problem.
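
A sketch of this kernelized dual in cvxpy, reusing the assumed degree-2 kernel and stand-in labels from above (all values are hypothetical):

```python
import numpy as np
import cvxpy as cp

# Kernelized dual: identical to Problem 2's dual with K(x_i, x_j) replacing
# x_i^T x_j. Stand-in labels, points, and C (actual values not shown here).
y = np.array([1.0, 1.0, -1.0, -1.0])
X = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]])
K = (X @ X.T + 1.0) ** 2          # assumed degree-2 polynomial kernel matrix
C = 1.0

Q = np.outer(y, y) * K            # Q_ij = y_i y_j K(x_i, x_j)
alpha = cp.Variable(4)
cp.Problem(
    cp.Maximize(cp.sum(alpha) - 0.5 * cp.quad_form(alpha, cp.psd_wrap(Q))),
    [alpha >= 0, alpha <= C, y @ alpha == 0],
).solve()
print("alpha =", alpha.value)
```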

3. Suppose the Lagrange multipliers $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ take the values given in the problem, and the indicated training point $x_s$ is a support vector, for ease of calculation. Calculate the bias $b$. (2 pts)

The bias $b$ follows from the support-vector condition $y_s \left( \sum_i \alpha_i y_i K(x_i, x_s) + b \right) = 1$, which with $y_s \in \{+1, -1\}$ gives:

  $b = y_s - \sum_{i=1}^{4} \alpha_i y_i \, K(x_i, x_s)$

where $x_s$ (with label $y_s$) is the given support vector.

Substituting $x_s$ and evaluating each term $\alpha_i y_i \, K(x_i, x_s)$ with the kernel entries from part 1 (terms with $\alpha_i = 0$ vanish), then summing these values and subtracting the total from $y_s$, yields $b$.
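
A numpy sketch of this bias computation, with stand-in multipliers, labels, kernel, and support-vector index (none of these are the assignment's actual values):

```python
import numpy as np

# Bias from the support-vector condition: b = y_s - sum_i alpha_i y_i K(x_i, x_s).
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha = np.array([0.25, 0.0, 0.25, 0.0])   # assumed multipliers
X = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]])
K = (X @ X.T + 1.0) ** 2                   # assumed degree-2 polynomial kernel
s = 0                                      # assumed support-vector index

b = y[s] - (alpha * y) @ K[:, s]           # terms with alpha_i = 0 drop out
print("b =", b)
```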

4. Classify a New Point $x_{\text{new}}$ using the learned kernel SVM model. (5 pts)

To classify the point $x_{\text{new}}$, we use the decision function:

  $f(x_{\text{new}}) = \operatorname{sign}\left( \sum_{i=1}^{4} \alpha_i y_i \, K(x_i, x_{\text{new}}) + b \right)$

First, compute each kernel value $K(x_i, x_{\text{new}})$ by applying the polynomial kernel to the pair $(x_i, x_{\text{new}})$.

Next, substitute these kernel values, the given $\alpha_i$'s and labels $y_i$, and the bias $b$ from part 3 into the summation; terms with $\alpha_i = 0$ drop out.

Adding the remaining terms up with $b$ gives the value inside the sign function. If it is positive, we classify $x_{\text{new}}$ as belonging to class $+1$; if negative, to class $-1$.
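
Finally, a numpy sketch of the classification step, again with stand-in values throughout (the assignment's data, multipliers, kernel, bias, and query point are not shown here):

```python
import numpy as np

# Classify a new point: f(x) = sign(sum_i alpha_i y_i K(x_i, x) + b).
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha = np.array([0.25, 0.0, 0.25, 0.0])   # assumed multipliers
X = np.array([[1.0, 1.0], [2.0, 2.0], [-1.0, -1.0], [-2.0, -2.0]])
b = -1.0                                   # stand-in bias from part 3

def poly_kernel(x, z):
    return (x @ z + 1.0) ** 2              # assumed degree-2 kernel

def classify(x_new):
    k = np.array([poly_kernel(xi, x_new) for xi in X])   # K(x_i, x_new)
    f = (alpha * y) @ k + b
    return 1 if f > 0 else -1

x_new = np.array([1.5, 1.5])               # hypothetical query point
print("predicted class:", classify(x_new))
```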