Python 2.7 - Getting predicted probabilities back in FastFM / factorisation machine models
I built a model with fastFM and have y_prediction_hat (back-calculated from the model equations) for an x_test vector, using the learnt model coefficients (inputs: x_train, x_test, y_train).
I used MCMC (Markov chain Monte Carlo) and have the bias, the coefficients of the linear combination, the second-order factor matrix (together, the model coefficients) and the hyperparameters.
I am wondering how to get the predicted probabilities at runtime.
    import numpy as np

    def get_first_order_weights(x_test, w_):
        # linear term: x . w
        m_x_test = np.matrix(x_test)
        m_w_ = np.matrix(w_).transpose()
        return float((m_x_test * m_w_))

    def get_weight(v_, i, j, rank=None):
        """Returns the weight of i, j.
        See equation #2, Steffen Rendle (2011, SIGIR).
        """
        weight = 0
        for rank_ in range(rank):
            weight = weight + v_[rank_][i] * v_[rank_][j]
        return weight

    def get_second_order_weights(x_test, v_, rank):
        # pairwise term: sum over i < j of <v_i, v_j> * x_i * x_j
        second_order_w = 0
        for i in range(len(x_test)):
            for j in range(i, len(x_test)):
                if i != j:
                    if x_test[i] != 0 and x_test[j] != 0:
                        weight = get_weight(v_, i=i, j=j, rank=rank)
                        second_order_w = second_order_w + weight * x_test[i] * x_test[j]
        return second_order_w

    def find_prediction(x_test, w0_, w_, v_, rank):
        y_pred = w0_ + get_first_order_weights(x_test, w_) + get_second_order_weights(x_test, v_, rank)
        return y_pred

    # quick test
    y_pred_proba = 0.666498763558  # what I want: the output of fm.fit_predict_proba(x_train, y_train, x_test) for x_test
    y_pred = 0.551379351199  # what find_prediction(x_test, w0_, w_, v_) gives
    hyper_param_ = [0.72957556, 1.09654054, 4.28621216, 0.80880482, -0.26278131, 0.17551987, -0.17272419]
    x_test = [0, 0, 0, 1, 0, 1, 0, 0]
    w0_ = 0.44717137049755357
    rank = 2  # rank of the factorization used for the second-order interactions
    w_ = [0.13673361, -0.50175393, -0.43582785, 0.91480033, 0.64150534, 0.85911802, -0.20877941, -0.20461079]
    v_ = [[-0.30315417, -0.01520948, 0.35000127, 0.54788385, -0.26731813, -0.07202204, 0.74163199, 0.25263453],
          [0.73052313, 0.93649875, 0.55294677, 1.23317741, -0.88026332, -1.321992, -0.44626548, 0.27878056]]
    get_y_prediction = find_prediction(x_test, w0_, w_, v_, rank=rank)
    print(get_y_prediction)  # 0.551379337337

Question: the result of find_prediction(x_test, w0_, w_, v_, rank) above (0.551379337337) maps to the output of fm.predict(x_test).
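The double loop above can also be written with the linear-time identity from Rendle's factorization machine formulation: the sum over i < j of <v_i, v_j> x_i x_j equals 0.5 * sum over f of [(sum_i v_f,i x_i)^2 - sum_i v_f,i^2 x_i^2]. A minimal NumPy sketch, assuming v_ has shape (rank, n_features) as in the quick test; the name fm_predict_vectorized is hypothetical and not part of fastFM:

```python
import numpy as np

def fm_predict_vectorized(x, w0, w, v):
    """Second-order FM score for one sample; v has shape (rank, n_features)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    v = np.asarray(v, dtype=float)
    linear = w0 + w.dot(x)                 # bias + first-order term
    vx = v.dot(x)                          # per-factor sums, shape (rank,)
    v2x2 = (v ** 2).dot(x ** 2)            # per-factor sums of squares
    pairwise = 0.5 * np.sum(vx ** 2 - v2x2)
    return float(linear + pairwise)

# Coefficients copied from the quick test in the question.
x_test = [0, 0, 0, 1, 0, 1, 0, 0]
w0_ = 0.44717137049755357
w_ = [0.13673361, -0.50175393, -0.43582785, 0.91480033,
      0.64150534, 0.85911802, -0.20877941, -0.20461079]
v_ = [[-0.30315417, -0.01520948, 0.35000127, 0.54788385,
       -0.26731813, -0.07202204, 0.74163199, 0.25263453],
      [0.73052313, 0.93649875, 0.55294677, 1.23317741,
       -0.88026332, -1.321992, -0.44626548, 0.27878056]]

score = fm_predict_vectorized(x_test, w0_, w_, v_)
print(score)  # should agree with find_prediction for the same inputs
```

This gives the raw score only; it says nothing yet about how that score becomes a probability.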
What I tried

- Modelling with a probit link function

1a)

    from scipy.stats import norm
    x_beta = np.matrix(x_test) * np.matrix(w_).transpose()
    norm.cdf(x_beta)  # gives 0.96196167; y = Φ(xβ + ε), cumulative normal CDF

1b)

    import math
    def phi(x):
        return (1.0 + math.erf(x / math.sqrt(2.0))) / 2.0
    phi(float(x_beta) + w0_)  # gives 0.9868275574290135; y = Φ(xβ + ε), the cumulative distribution function of the standard normal distribution

- Using a sigmoid function

    1.0 / (1 + math.exp(-1 * float(y_pred)))  # gives 0.6344555517817589, different from the 0.666498763558 I want (see side note #2)

- Mapping with a logit link function, which gives 0.6394991222434503

    import math
    import numpy as np
    x_beta = np.matrix(x_test) * np.matrix(w_).transpose()  # see x_test and w_ above
    math.pow(1 + math.pow(float(x_beta), -1), -1)  # intended: Pr(y=1∣x) = [1 + e^(−x′β)]^(−1); as written this evaluates [1 + (x′β)^(−1)]^(−1)
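Attempts 1a) and 1b) pass only the linear part into the link (without, then with, the intercept), while the sigmoid attempt uses the full score. For a like-for-like comparison, here is a small sketch that applies both the standard normal CDF and the logistic function to the same back-calculated score from the quick test. It uses only the standard library, the helper names are illustrative, and it does not claim to reproduce what fit_predict_proba does internally:

```python
import math

def probit_link(score):
    """Standard normal CDF of the score."""
    return 0.5 * (1.0 + math.erf(score / math.sqrt(2.0)))

def logistic_link(score):
    """Logistic (sigmoid) function of the score."""
    return 1.0 / (1.0 + math.exp(-score))

y_pred = 0.551379337337   # full FM score printed by the quick test
target = 0.666498763558   # value reported for fit_predict_proba in the question

for name, link in (("probit", probit_link), ("logistic", logistic_link)):
    p = link(y_pred)
    print("%s: p=%.6f, gap to target=%+.6f" % (name, p, p - target))
```

For these numbers the logistic value lands below the target and the probit value above it, so neither link applied to this single score matches the reported probability.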
What I would like: a function that reproduces the output of fm.fit_predict_proba(x_train, y_train, x_test). How do I transform y_pred_hat to give probability values instead? (Rendle, equation #2, page 4: https://www.ismll.uni-hildesheim.de/pub/pdfs/rendle_et_al2011-context_aware.pdf )
I need these at runtime, after saving w0_, w_, v_ and the hyperparameters.
fm.fit_predict(x_train, y_train, x_test) // class "labels" 1/0
fm.fit_predict_proba(x_train, y_train, x_test) // class probabilities
fm.predict(x_test) // continuous number, e.g. 15.35047575
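For the runtime part, one way to persist the coefficients and reload them later is a NumPy .npz archive. This is only a sketch: save_fm_params and load_fm_params are hypothetical helper names, not fastFM functions, and the values are the ones from the quick test.

```python
import os
import shutil
import tempfile
import numpy as np

def save_fm_params(path, w0, w, v, hyper_param):
    """Store the FM coefficients in a single .npz archive."""
    np.savez(path, w0=w0, w=np.asarray(w), v=np.asarray(v),
             hyper_param=np.asarray(hyper_param))

def load_fm_params(path):
    """Read the coefficients back as (w0, w, v, hyper_param)."""
    with np.load(path) as data:
        return (float(data["w0"]), data["w"].copy(),
                data["v"].copy(), data["hyper_param"].copy())

# Values copied from the quick test in the question.
w0_ = 0.44717137049755357
w_ = [0.13673361, -0.50175393, -0.43582785, 0.91480033,
      0.64150534, 0.85911802, -0.20877941, -0.20461079]
v_ = [[-0.30315417, -0.01520948, 0.35000127, 0.54788385,
       -0.26731813, -0.07202204, 0.74163199, 0.25263453],
      [0.73052313, 0.93649875, 0.55294677, 1.23317741,
       -0.88026332, -1.321992, -0.44626548, 0.27878056]]
hyper_param_ = [0.72957556, 1.09654054, 4.28621216, 0.80880482,
                -0.26278131, 0.17551987, -0.17272419]

tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "fm_params.npz")   # illustrative file name
save_fm_params(path, w0_, w_, v_, hyper_param_)
w0_loaded, w_loaded, v_loaded, h_loaded = load_fm_params(path)
shutil.rmtree(tmp_dir)                          # clean up the temporary directory

print(w0_loaded, w_loaded.shape, v_loaded.shape, h_loaded.shape)
```

The reloaded arrays can then be passed to find_prediction at runtime to get the raw score.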
Additional note: fm was initialised as follows.

    fm = mcmc.FMClassification(n_iter=100, init_stdev=0.1, rank=rank, random_state=seed, copy_X=True)

Side notes:
1. I looked at the author's paper (Immanuel Bayer). It mentions that MCMC classification is modelled with a loss function of probit (MAP), probit, or sigmoid, but I couldn't see an option to specify the loss function. (I was hoping to use the model parameters to get the probability value if that were clear.)
2. To get y_pred (probability values) the author uses fit_predict_proba(self, x_train, y_train, x_test), which gets y_pred from ffm.ffm_mcmc_fit_predict, which in turn calls cffm.ffm_mcmc_fit_predict, from ffm.h / ffm.c. I couldn't locate the actual function that gives the probability value in y_pred. https://github.com/ibayer/fastfm/blob/master/fastfm/mcmc.py
Update:
Here are the results: the probabilities returned by fm.fit_predict_proba (red), the sigmoid of y_pred_hat (green; y_pred_hat is the back-calculated real number that maps to fm.predict(x_test)), and the probability values corrected by the median of the % difference between red and green (yellow). The closest I have come to retrieving the probabilities is the corrected values (yellow), which are closer to red.
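The correction described in the update can be sketched roughly as below. This is one reading of "corrected by the median of the % difference", assuming the relative gap is (red - green) / green and that the green values are scaled by one plus its median. Only the first red/green pair comes from the question; the other values are made-up placeholders for illustration, not model output.

```python
import numpy as np

def median_pct_correction(proba_ref, proba_sigmoid):
    """Scale sigmoid outputs by 1 + the median relative gap to the reference."""
    ref = np.asarray(proba_ref, dtype=float)
    sig = np.asarray(proba_sigmoid, dtype=float)
    pct_diff = (ref - sig) / sig                 # relative gap per sample
    factor = 1.0 + float(np.median(pct_diff))
    corrected = np.clip(sig * factor, 0.0, 1.0)  # keep results inside [0, 1]
    return corrected, factor

# First pair is from the question; the remaining values are placeholders.
red = np.array([0.666498763558, 0.72, 0.41, 0.58])        # fit_predict_proba
green = np.array([0.6344555517817589, 0.69, 0.39, 0.55])  # sigmoid of y_pred_hat

yellow, factor = median_pct_correction(red, green)
print(factor)
print(yellow)
```

The factor is fitted on samples where the fit_predict_proba output is already known, so this is a calibration heuristic rather than a derivation of the probabilities from the coefficients.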

