machine learning - SVM TensorFlow implementation
I've been following Prof. Ng's lectures and trying to implement an SVM in a Jupyter notebook using TensorFlow. However, the model doesn't seem to converge properly.
I guess I have the wrong loss function, which might end up fitting the model improperly.
Below is the full graph construction code of the model:
    import numpy as np
    import tensorflow as tf

    tf.reset_default_graph()

    # Training hyperparameters
    learning_rate = 0.000001
    c = 20
    gamma = 50

    x = tf.placeholder(tf.float32, shape=(None, 2))
    y = tf.placeholder(tf.float32, shape=(None, 1))
    landmark = tf.placeholder(tf.float32, shape=(None, 2))

    # num_data = number of training points (defined elsewhere in the notebook)
    w = tf.Variable(np.random.random((num_data)), dtype=tf.float32)
    b = tf.Variable(np.random.random((1)), dtype=tf.float32)
    batch_size = tf.shape(x)[0]

    # RBF kernel: f[i, j] = exp(-gamma * ||x_i - landmark_j||^2)
    tile = tf.tile(x, (1, num_data))
    diff = tf.reshape(tile, (-1, num_data, 2)) - landmark
    tile_shape = tf.shape(diff)
    sq_diff = tf.square(diff)
    sq_dist = tf.reduce_sum(sq_diff, axis=2)
    f = tf.exp(tf.negative(sq_dist * gamma))

    # Decision function and hard prediction
    wf = tf.reduce_sum(w * f, axis=1) + b
    condition = tf.greater_equal(wf, 0)
    h = tf.where(condition, tf.ones_like(wf), tf.zeros_like(wf))

    # Hinge loss (labels y in {0, 1}) plus L2 regularization on w
    error_loss = c * tf.reduce_sum(y * tf.maximum(0., 1 - wf) + (1 - y) * tf.maximum(0., 1 + wf))
    weight_loss = tf.reduce_sum(tf.square(w)) / 2
    total_loss = error_loss + weight_loss

    optimizer = tf.train.GradientDescentOptimizer(learning_rate)
    train = optimizer.minimize(total_loss)
I'm using a Gaussian kernel and feeding the whole training set as the landmarks.
The loss function should be the same as the one shown in the lecture, as long as I've implemented it correctly.
I'm pretty sure I'm missing something.
Note that the kernel matrix should have batch_size^2 entries, while your tensor wf has shape (batch_size, 2). The idea is to compute k(x_i, x_j) for every pair (x_i, x_j) in the dataset, and then use these kernel values as the inputs to the SVM.
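To make the shapes concrete, here is a minimal sketch (not your exact code) of computing the full batch_size x batch_size RBF kernel matrix in TensorFlow 1.x, reusing the x and gamma names from your question:

    import tensorflow as tf

    gamma = 50
    x = tf.placeholder(tf.float32, shape=(None, 2))

    # Pairwise squared distances ||x_i - x_j||^2 via the identity
    # ||a - b||^2 = ||a||^2 - 2<a, b> + ||b||^2, using broadcasting.
    sq_norms = tf.reduce_sum(tf.square(x), axis=1)               # shape (batch_size,)
    sq_dist = (tf.expand_dims(sq_norms, 1)
               - 2.0 * tf.matmul(x, x, transpose_b=True)
               + tf.expand_dims(sq_norms, 0))                    # shape (batch_size, batch_size)

    # K[i, j] = k(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2):
    # batch_size^2 entries, one per pair of training points.
    K = tf.exp(-gamma * sq_dist)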
I'm using Andrew Ng's lecture notes on SVMs as a reference; on page 20 he derives the final optimization problem. You'll want to replace the inner product <x_i, x_j> with your kernel function.
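For reference, the soft-margin dual problem has this general form (I'm writing it out from the standard derivation rather than quoting the notes verbatim):

    \max_\alpha \; W(\alpha) = \sum_{i=1}^{m} \alpha_i
        - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} y_i y_j \alpha_i \alpha_j \langle x_i, x_j \rangle
    \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \quad \sum_{i=1}^{m} \alpha_i y_i = 0

The kernel trick simply substitutes K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2) for <x_i, x_j>, which is exactly what the kernel matrix sketched above supplies.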
I recommend starting with a linear kernel instead of the RBF and comparing your code against an out-of-the-box SVM implementation such as sklearn's, to make sure your optimization code is actually working before adding the kernel.
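For example, a quick sanity check along those lines might look like this (the toy blobs here are made up purely for illustration):

    import numpy as np
    from sklearn.svm import SVC

    # Toy 2-D data: two well-separated blobs (illustrative only).
    rng = np.random.RandomState(0)
    X = np.vstack([rng.randn(50, 2) + [2, 2], rng.randn(50, 2) - [2, 2]])
    y = np.array([1] * 50 + [0] * 50)

    # Reference linear SVM, using the same C as your code.
    clf = SVC(kernel='linear', C=20)
    clf.fit(X, y)
    print(clf.score(X, y))  # compare against your TensorFlow model's training accuracy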
A final note: though it should be possible to train an SVM using gradient descent, SVMs are almost never trained that way in practice. The SVM optimization problem can be solved via quadratic programming, and most methods for training SVMs take advantage of this.