[Coursera Machine Learning] Support Vector Machines: Week 7 Programming Assignment

Source: Internet · Editor: 程序博客网 · Time: 2024/06/05 20:14

1.2 SVM with Gaussian Kernels

You should now complete the code in gaussianKernel.m to compute the Gaussian kernel between two examples, (x^(i), x^(j)). The Gaussian kernel function is defined as:

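$$K_{gaussian}(x^{(i)}, x^{(j)}) = \exp\left(-\frac{\|x^{(i)} - x^{(j)}\|^2}{2\sigma^2}\right) = \exp\left(-\frac{\sum_{k=1}^{n} \left(x_k^{(i)} - x_k^{(j)}\right)^2}{2\sigma^2}\right)$$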

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return the similarity between x1
%               and x2 computed using a Gaussian kernel with bandwidth
%               sigma
%
%
sim = exp(-(sum((x1 - x2).^2)) / (2 * sigma^2));
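As a quick sanity check on the expression above, the following sketch evaluates the kernel by hand on one pair of vectors. The input values are chosen here purely for illustration, and the anonymous function gaussianSim is a hypothetical stand-in for gaussianKernel.m so that the snippet runs on its own.

```octave
% Hypothetical stand-in for gaussianKernel.m, written as an anonymous function.
% The (:) makes it indifferent to row versus column vectors.
gaussianSim = @(x1, x2, sigma) exp(-sum((x1(:) - x2(:)).^2) / (2 * sigma^2));

% Illustrative inputs (assumed values, not taken from the text above).
x1 = [1; 2; 1];
x2 = [0; 4; -1];
sigma = 2;

% The squared distance is 1 + 4 + 4 = 9, so the kernel equals exp(-9/8).
sim = gaussianSim(x1, x2, sigma);
fprintf('sim = %f\n', sim);   % about 0.324652

% A vector compared with itself has distance 0, so the similarity is exactly 1.
selfSim = gaussianSim(x1, x1, sigma);
```

The similarity falls from 1 towards 0 as the two points move apart, and a larger sigma makes that decay slower.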

1.2.3 Example Dataset 3

Your task is to use the cross validation set Xval, yval to determine the best C and σ parameters to use. You should write any additional code necessary to help you search over the parameters C and σ. For both C and σ, we suggest trying values in multiplicative steps (e.g., 0.01, 0.03, 0.1, 0.3, 1, 3, 10, 30). Note that you should try all possible pairs of values for C and σ (e.g., C = 0.3 and σ = 0.1).

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return the optimal C and sigma
%               learning parameters found using the cross validation set.
%               You can use svmPredict to predict the labels on the cross
%               validation set. For example,
%                   predictions = svmPredict(model, Xval);
%               will return the predictions on the cross validation set.
%
%  Note: You can compute the prediction error using
%        mean(double(predictions ~= yval))
%
steps = [0.01 0.03 0.1 0.3 1 3 10 30];
minError = Inf;
minC = Inf;
minSigma = Inf;
% The nested loops try every (C, sigma) pair.
for i = 1:length(steps)
    for j = 1:length(steps)
        currentC = steps(i);
        currentSigma = steps(j);
        model = svmTrain(X, y, currentC, @(x1, x2) gaussianKernel(x1, x2, currentSigma));
        predictions = svmPredict(model, Xval);
        % Named valError so it does not shadow the built-in error() function.
        valError = mean(double(predictions ~= yval));
        if (valError < minError)
            minError = valError;
            minC = currentC;
            minSigma = currentSigma;
        end
    end
end
C = minC;
sigma = minSigma;
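The same exhaustive search can also be organised around an error matrix, which makes it easy to inspect how the validation error varies across the grid afterwards. The sketch below is only an illustration: toyError is a made-up stand-in for "train with svmTrain, predict with svmPredict, measure the error on Xval", chosen so the script runs without the assignment files.

```octave
steps = [0.01 0.03 0.1 0.3 1 3 10 30];

% Hypothetical stand-in for the validation error of one (C, sigma) pair.
% In the assignment this would be train + predict + mean(predictions ~= yval).
% This toy surface has its minimum at C = 1, sigma = 0.1 by construction.
toyError = @(C, sigma) (log10(C)).^2 + (log10(sigma) - log10(0.1)).^2;

% errors(i, j) holds the error for C = steps(i), sigma = steps(j).
errors = zeros(length(steps), length(steps));
for i = 1:length(steps)
    for j = 1:length(steps)
        errors(i, j) = toyError(steps(i), steps(j));
    end
end

% Locate the smallest entry and map it back to the parameter values.
[minError, linearIdx] = min(errors(:));
[bestI, bestJ] = ind2sub(size(errors), linearIdx);
bestC = steps(bestI);
bestSigma = steps(bestJ);
fprintf('%d pairs tried, best C = %g, best sigma = %g\n', numel(errors), bestC, bestSigma);
```

With 8 candidate values per parameter the grid has 64 pairs, so the real search trains 64 models. Keeping the full matrix costs little and lets you check whether the best pair sits on the edge of the grid.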

2.1.1 Vocabulary List

Your task now is to complete the code in processEmail.m to perform this mapping. In the code, you are given a string str which is a single word from the processed email. You should look up the word in the vocabulary list vocabList and check whether it exists there. If the word exists, you should add the index of the word to the word_indices variable. If the word does not exist, and is therefore not in the vocabulary, you can skip the word.

    % Look up the word in the dictionary and add to word_indices if
    % found
    % ====================== YOUR CODE HERE ======================
    % Instructions: Fill in this function to add the index of str to
    %               word_indices if it is in the vocabulary. At this point
    %               of the code, you have a stemmed word from the email in
    %               the variable str. You should look up str in the
    %               vocabulary list (vocabList). If a match exists, you
    %               should add the index of the word to the word_indices
    %               vector. Concretely, if str = 'action', then you should
    %               look up the vocabulary list to find where in vocabList
    %               'action' appears. For example, if vocabList{18} =
    %               'action', then, you should add 18 to the word_indices
    %               vector (e.g., word_indices = [word_indices ; 18]; ).
    %
    % Note: vocabList{idx} returns the word with index idx in the
    %       vocabulary list.
    %
    % Note: You can use strcmp(str1, str2) to compare two strings (str1 and
    %       str2). It will return 1 only if the two strings are equivalent.
    %
    for i = 1:length(vocabList)
        % Use brace indexing to get the word itself rather than a 1x1 cell.
        if (strcmp(vocabList{i}, str))
            % The trailing semicolon keeps the vector from being printed.
            word_indices = [word_indices; i];
            break;
        end
    end
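The explicit loop can also be replaced by a single strcmp call, since strcmp accepts a cell array and returns one logical value per cell. The following is a minimal sketch under stated assumptions: the five-word vocabList and the words list are toy data invented here, not the assignment's 1899-word vocabulary.

```octave
% Toy vocabulary (assumed for illustration only).
vocabList = {'aa', 'ab', 'action', 'anyon', 'zip'};
word_indices = [];

% Hypothetical stream of stemmed words; 'notaword' is not in the vocabulary.
words = {'action', 'notaword', 'zip'};
for w = 1:length(words)
    str = words{w};
    % strcmp compares str against every cell at once and returns a logical mask;
    % find(..., 1) returns the first matching position, or empty if there is none.
    idx = find(strcmp(vocabList, str), 1);
    if ~isempty(idx)
        word_indices = [word_indices; idx];
    end
end
disp(word_indices');   % word_indices is [3; 5]
```

Words that are missing from the vocabulary produce an empty idx and are skipped, which matches the behaviour described above.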

2.2 Extracting Features from Emails

You should now complete the code in emailFeatures.m to generate a feature vector for an email, given the word_indices.

% ====================== YOUR CODE HERE ======================
% Instructions: Fill in this function to return a feature vector for the
%               given email (word_indices). To help make it easier to
%               process the emails, we have already pre-processed each
%               email and converted each word in the email into an index in
%               a fixed dictionary (of 1899 words). The variable
%               word_indices contains the list of indices of the words
%               which occur in one email.
%
%               Concretely, if an email has the text:
%
%                  The quick brown fox jumped over the lazy dog.
%
%               Then, the word_indices vector for this text might look
%               like:
%
%                   60  100   33   44   10     53  60  58   5
%
%               where, we have mapped each word onto a number, for example:
%
%                   the   -- 60
%                   quick -- 100
%                   ...
%
%               (note: the above numbers are just an example and are not the
%               actual mappings).
%
%               Your task is to take one such word_indices vector and construct
%               a binary feature vector that indicates whether a particular
%               word occurs in the email. That is, x(i) = 1 when word i
%               is present in the email. Concretely, if the word 'the' (say,
%               index 60) appears in the email, then x(60) = 1. The feature
%               vector should look like:
%
%               x = [ 0 0 0 0 1 0 0 0 ... 0 0 0 0 1 ... 0 0 0 1 0 ..];
%
%
for i = 1:length(word_indices)
    x(word_indices(i)) = 1;
end
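Because indexed assignment accepts a whole vector of indices, the loop can be collapsed into one statement. The sketch below uses a toy dictionary size and made-up indices (both assumptions for illustration) so that the result is easy to check by eye.

```octave
n = 10;                           % toy dictionary size (the assignment uses 1899)
word_indices = [6; 10; 3; 6; 5];  % illustrative indices, with index 6 repeated

x = zeros(n, 1);
x(word_indices) = 1;              % a repeated index just sets the same entry again

fprintf('%d distinct words present\n', sum(x));
```

The feature vector is binary, so a word that appears several times in the email still contributes a single 1.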

0 0