machine learning - SVM TensorFlow implementation
I've been following Prof. Ng's lectures and trying to implement an SVM in a Jupyter notebook using TensorFlow. However, the model doesn't seem to converge properly.
I guess I have the wrong loss function, which might end up fitting the model improperly.
Below is the full graph construction code of the model:
    import numpy as np
    import tensorflow as tf

    tf.reset_default_graph()

    # Training hyperparameters
    learning_rate = 0.000001
    c = 20
    gamma = 50

    x = tf.placeholder(tf.float32, shape=(None, 2))
    y = tf.placeholder(tf.float32, shape=(None, 1))
    landmark = tf.placeholder(tf.float32, shape=(None, 2))

    # num_data = number of training points (defined elsewhere in the notebook)
    w = tf.Variable(np.random.random((num_data)), dtype=tf.float32)
    b = tf.Variable(np.random.random((1)), dtype=tf.float32)
    batch_size = tf.shape(x)[0]

    # RBF kernel: f[i, j] = exp(-gamma * ||x_i - landmark_j||^2)
    tile = tf.tile(x, (1, num_data))
    diff = tf.reshape(tile, (-1, num_data, 2)) - landmark
    tile_shape = tf.shape(diff)
    sq_diff = tf.square(diff)
    sq_dist = tf.reduce_sum(sq_diff, axis=2)
    f = tf.exp(tf.negative(sq_dist * gamma))

    # Decision function and hard prediction
    wf = tf.reduce_sum(w * f, axis=1) + b
    condition = tf.greater_equal(wf, 0)
    h = tf.where(condition, tf.ones_like(wf), tf.zeros_like(wf))

    # Hinge loss (labels y in {0, 1}) plus L2 regularization on w
    error_loss = c * tf.reduce_sum(y * tf.maximum(0., 1 - wf) + (1 - y) * tf.maximum(0., 1 + wf))
    weight_loss = tf.reduce_sum(tf.square(w)) / 2
    total_loss = error_loss + weight_loss

    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    train = optimizer.minimize(total_loss)
I'm using a Gaussian kernel and feeding the whole training set as the landmarks.
The loss function should be the same as the one shown in the lecture, as long as I've implemented it correctly.
I'm pretty sure I'm missing something.
Note that the kernel matrix should have batch_size^2 entries, while your tensor wf has shape (batch_size, 2). The idea is to compute k(x_i, x_j) for every pair (x_i, x_j) in the dataset, and then use these kernel values as the inputs to the SVM.
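To make the shapes concrete, here is a minimal sketch (not your exact code) of computing the full batch_size x batch_size RBF kernel matrix in TensorFlow 1.x, reusing the x and gamma names from your question:

    import tensorflow as tf

    gamma = 50
    x = tf.placeholder(tf.float32, shape=(None, 2))

    # Pairwise squared distances ||x_i - x_j||^2 via the identity
    # ||a - b||^2 = ||a||^2 - 2<a, b> + ||b||^2, using broadcasting.
    sq_norms = tf.reduce_sum(tf.square(x), axis=1)               # shape (batch_size,)
    sq_dist = (tf.expand_dims(sq_norms, 1)
               - 2.0 * tf.matmul(x, x, transpose_b=True)
               + tf.expand_dims(sq_norms, 0))                    # shape (batch_size, batch_size)

    # K[i, j] = k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2):
    # batch_size^2 entries, one per pair of training points.
    K = tf.exp(-gamma * sq_dist)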
I'm using Andrew Ng's lecture notes on SVMs as a reference; on page 20 he derives the final optimization problem. You'll want to replace the inner product <x_i, x_j> with your kernel function.
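For reference, the soft-margin dual problem has this general form (I'm writing it out from the standard derivation rather than quoting the notes verbatim):

    \max_\alpha \; W(\alpha) = \sum_{i=1}^{m} \alpha_i
        - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} y_i y_j \alpha_i \alpha_j \langle x_i, x_j \rangle
    \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{m} \alpha_i y_i = 0

The kernel trick simply substitutes K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2) for <x_i, x_j>, which is exactly what the kernel matrix sketched above supplies.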
I recommend starting with a linear kernel instead of the RBF and comparing your code against an out-of-the-box SVM implementation such as sklearn's, to make sure your optimization code is actually working before adding the kernel.
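For example, a quick sanity check along those lines might look like this (the toy blobs here are made up purely for illustration):

    import numpy as np
    from sklearn.svm import SVC

    # Toy 2-D data: two well-separated blobs (illustrative only).
    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(50, 2) + [2, 2], rng.randn(50, 2) - [2, 2]])
    y = np.array([1] * 50 + [0] * 50)

    # Reference linear SVM, using the same C as your code.
    clf = SVC(kernel='linear', C=20)
    clf.fit(X, y)
    print(clf.score(X, y))  # compare against your TensorFlow model's training accuracy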
A final note: though it should be possible to train an SVM using gradient descent, SVMs are almost never trained that way in practice. The SVM optimization problem can be solved via quadratic programming, and most methods for training SVMs take advantage of this.