Python 2.7 - Getting predicted probabilities back in fastFM / factorisation machine models


I built a model with fastFM and have y_prediction_hat (back-calculated from the equations) for an x_test vector, using the learnt model coefficients (inputs: x_train, x_test, y_train).

I used MCMC (Monte Carlo) and have the bias, the coefficients of the linear combination, and the second-order factor matrix (together, the model coefficients), plus the hyperparameters.

I am wondering how to get the predicted probabilities at runtime.

    import numpy as np

    def get_first_order_weights(x_test, w_):
        m_x_test = np.matrix(x_test)
        m_w_ = np.matrix(w_).transpose()
        return float(m_x_test * m_w_)

    def get_weight(v_, i, j, rank=None):
        """Returns the weight of the (i, j) interaction.
        See equation #2, Steffen Rendle (2011, SIGIR)."""
        weight = 0
        for rank_ in xrange(rank):
            weight = weight + v_[rank_][i] * v_[rank_][j]
        return weight

    def get_second_order_weights(x_test, v_, rank):
        second_order_w = 0
        for i in xrange(len(x_test)):
            for j in range(i, len(x_test)):
                if i != j:
                    if x_test[i] != 0 and x_test[j] != 0:
                        weight = get_weight(v_, i=i, j=j, rank=rank)
                        second_order_w = second_order_w + weight * x_test[i] * x_test[j]
        return second_order_w

    def find_prediction(x_test, w0_, w_, v_, rank):
        y_pred = w0_ + get_first_order_weights(x_test, w_) + get_second_order_weights(x_test, v_, rank)
        return y_pred

    """quick test"""
    y_pred_proba = 0.666498763558  # what I want: output of fm.fit_predict_proba(x_train, y_train, x_test) for x_test
    y_pred = 0.551379351199  # what find_prediction(x_test, w0_, w_, v_) gives
    hyper_param_ = [0.72957556, 1.09654054, 4.28621216, 0.80880482, -0.26278131,
                    0.17551987, -0.17272419]
    x_test = [0, 0, 0, 1, 0, 1, 0, 0]
    w0_ = 0.44717137049755357
    rank = 2  # rank of the factorization used for the second-order interactions
    w_ = [0.13673361, -0.50175393, -0.43582785, 0.91480033, 0.64150534,
          0.85911802, -0.20877941, -0.20461079]
    v_ = [[-0.30315417, -0.01520948, 0.35000127, 0.54788385, -0.26731813,
           -0.07202204, 0.74163199, 0.25263453],
          [0.73052313, 0.93649875, 0.55294677, 1.23317741, -0.88026332,
           -1.321992, -0.44626548, 0.27878056]]

    get_y_prediction = find_prediction(x_test, w0_, w_, v_, rank=rank)
    print get_y_prediction  # 0.551379337337
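The pairwise loop above can also be written with the well-known linear-time form of the factorization machine interaction term, 0.5 * sum_f [ (sum_i v[f][i] * x[i])^2 - sum_i (v[f][i] * x[i])^2 ]. The following is a minimal pure-Python sketch, assuming the same coefficient layout as in the post (one row of v_ per factor). The name `fm_score` is a made-up helper for illustration and is not part of fastFM.

```python
def fm_score(x, w0, w, v):
    """Second-order factorization machine score for a single sample.

    x  : feature values (length n)
    w0 : global bias
    w  : linear weights (length n)
    v  : factor matrix as a list of `rank` rows, each of length n
    """
    linear = sum(w_i * x_i for w_i, x_i in zip(w, x))
    pairwise = 0.0
    for v_f in v:
        # (sum of v*x)^2 minus sum of (v*x)^2 leaves only the i != j cross terms, counted twice
        s = sum(v_fi * x_i for v_fi, x_i in zip(v_f, x))
        s_sq = sum((v_fi * x_i) ** 2 for v_fi, x_i in zip(v_f, x))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Coefficients copied from the quick test in the post
x_test = [0, 0, 0, 1, 0, 1, 0, 0]
w0_ = 0.44717137049755357
w_ = [0.13673361, -0.50175393, -0.43582785, 0.91480033, 0.64150534,
      0.85911802, -0.20877941, -0.20461079]
v_ = [[-0.30315417, -0.01520948, 0.35000127, 0.54788385, -0.26731813,
       -0.07202204, 0.74163199, 0.25263453],
      [0.73052313, 0.93649875, 0.55294677, 1.23317741, -0.88026332,
       -1.321992, -0.44626548, 0.27878056]]

score = fm_score(x_test, w0_, w_, v_)
print(score)
```

With these coefficients the sketch should agree with the 0.551379337337 printed by find_prediction, since only features 3 and 5 are non-zero and a single interaction contributes.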

Question: find_prediction(x_test, w0_, w_, v_, rank) (0.551379337337) above maps to the output of fm.predict(x_test).

What I tried:

  1. Modelling the probit link function

1a)

    from scipy.stats import norm
    x_beta = np.matrix(x_test) * np.matrix(w_).transpose()
    norm.cdf(x_beta)  # gives 0.96196167; y = Φ(xβ + ε), the cumulative normal CDF

1b)

    import math

    def phi(x):
        return (1.0 + math.erf(x / math.sqrt(2.0))) / 2.0

    phi(float(x_beta) + w0_)  # gives 0.9868275574290135; y = Φ(xβ + ε): CDF of the standard normal distribution

  2. Using the sigmoid function:

    1.0 / (1 + math.exp(-1 * float(y_pred)))  # gives 0.6344555517817589, different from the 0.666498763558 I want (see side note #2)

  3. Mapping the logit link function, which gives 0.6394991222434503:

    import math
    import numpy as np
    x_beta = np.matrix(x_test) * np.matrix(w_).transpose()  # see x_test & w_ above
    math.pow(1 + math.pow(float(x_beta), -1), -1)  # intended: Pr(y=1|x) = [1 + e^(-x'β)]^(-1); as written this evaluates 1/(1 + 1/x'β)
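Attempts 1a and 1b above apply the normal CDF to the linear part only (without and with the bias), not to the full back-calculated score. As a small sketch for comparison, the snippet below applies both candidate link functions to the full score 0.551379337337 from the post. The helper names `sigmoid` and `probit` are illustrative, and nothing here is claimed to be what fastFM computes internally.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))


def probit(z):
    """Standard normal CDF, written with math.erf to avoid a SciPy dependency."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


y_pred = 0.551379337337  # back-calculated score (bias + linear + pairwise) from the post

p_logit = sigmoid(y_pred)
p_probit = probit(y_pred)
print(p_logit, p_probit)
```

The two values come out at about 0.63 and 0.71, so neither link applied to this single score reproduces the 0.666498763558 returned by fit_predict_proba.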

What I would like: a function that reproduces the output of fm.fit_predict_proba(x_train, y_train, x_test). How do I transform y_pred_hat to give probability values instead? (Rendle, equation #2, page 4: https://www.ismll.uni-hildesheim.de/pub/pdfs/rendle_et_al2011-context_aware.pdf )

I need these at runtime, after saving w0_, w1_, v_, and the hyperparameters.

fm.fit_predict(x_train, y_train,x_test) // class "labels" 1/0

fm.fit_predict_proba(x_train, y_train,x_test) // class probabilities

fm.predict(x_test) // continuous number, eg : 15.35047575

Additional note: fm was initialised as fm = mcmc.FMClassification(n_iter=100, init_stdev=0.1, rank=rank, random_state=seed, copy_X=True)

Side notes:

  1. I looked at the author's paper (Immanuel Bayer). It mentions that MCMC classification is modelled with a probit (MAP) loss function, and it mentions probit and sigmoid, but I couldn't see an option to specify the loss function. (I was hoping to use the model parameters to get the probability value if that were clear.)

  2. To get y_pred (probability values) the author uses fit_predict_proba(self, x_train, y_train, x_test), which gets y_pred from ffm.ffm_mcmc_fit_predict, which in turn calls cffm.ffm_mcmc_fit_predict, and then ffm.h/ffm.c. I couldn't locate the actual function that produces the probability value in y_pred. https://github.com/ibayer/fastfm/blob/master/fastfm/mcmc.py

Update:

Here are the results: the probabilities returned by fm.fit_predict_proba (red), the sigmoid of y_pred_hat (green; y_pred_hat is the back-calculated real number that maps to fm.predict(x_test)), and the probability values corrected by the median of the % difference between red and green (yellow). The closest I have got to retrieving the probabilities is the corrected values (yellow), which are closer to red.
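The median correction described in the update can be sketched as follows. This is only one reading of it: the post does not give the exact definition of the % difference, so the relative difference (red - green) / green is an assumption. Only the first pair of values comes from the post (rounded); the others are placeholders, and the names `median_corrected`, `red`, `green` and `yellow` are illustrative.

```python
def median(values):
    """Median of a non-empty list of numbers."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else 0.5 * (s[mid - 1] + s[mid])


def median_corrected(reference, approx):
    """Scale `approx` by the median relative difference to `reference`.

    Assumption: '% difference' means (reference - approx) / approx.
    """
    pct_diff = [(r - a) / a for r, a in zip(reference, approx)]
    m = median(pct_diff)
    return [a * (1.0 + m) for a in approx]


# First pair: values from the post, rounded. Remaining pairs: placeholders only.
red = [0.6665, 0.30, 0.80]     # fm.fit_predict_proba output
green = [0.6345, 0.28, 0.78]   # sigmoid of the back-calculated score
yellow = median_corrected(red, green)
print(yellow)
```

A single global scaling factor like this can only reduce the average gap between the two series. It does not recover the per-sample probabilities exactly.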


