I would like to know how probabilities are calculated in a support vector machine.
I used the Iris data set, and here are the decision values for the three "SupportVectorMachine" elements (see the PMML below for the support vectors and coefficient values).
setosa/versicolor   setosa/virginica   versicolor/virginica
1.196152            1.091757           0.6708810
Based on the voting mechanism explained on the "dmg.org" site, I obtained "setosa" as the result.
What confuses me is how the probability values are calculated. These are the probability parameters obtained from R:
model$probA
[1] -3.589058 -3.793546 -3.518305
model$probB
[1] -0.16396052 -0.04387233 0.13178304
The probabilities that R reports for the above decision values and probability parameters are as follows.
setosa      versicolor   virginica
0.9795937   0.01161942   0.008786859
More info:
Kernel type : RBF
Input record:
Sepal_Length   Sepal_Width   Petal_Length   Petal_Width
5.1            3.5           1.4            0.2
Could you please explain how the probabilities are calculated?
PMML
<SupportVectorMachineModel
modelName="C-SVC"
functionName="classification"
svmRepresentation="SupportVectors">
<MiningSchema>
<MiningField
name="Sepal_Length"
usageType="active"/>
<MiningField
name="Sepal_Width"
usageType="active"/>
<MiningField
name="Petal_Length"
usageType="active"/>
<MiningField
name="Petal_Width"
usageType="active"/>
<MiningField
name="Species"
usageType="predicted"/>
</MiningSchema>
<RadialBasisKernelType
gamma="0.1"/>
<VectorDictionary
numberOfVectors="55">
<VectorFields
numberOfFields="4">
<FieldRef
field="Sepal_Length_scaled"/>
<FieldRef
field="Sepal_Width_scaled"/>
<FieldRef
field="Petal_Length_scaled"/>
<FieldRef
field="Petal_Width_scaled"/>
</VectorFields>
<VectorInstance
id="vec1">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.222222 0.541667 0.118644 0.166667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec2">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.194444 0.416667 0.101695 0.0416667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec3">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.0555556 0.125 0.0508475 0.0833333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec4">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.194444 0.625 0.101695 0.208333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec5">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.138889 0.416667 0.0677966 0.0833333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec6">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.75 0.5 0.627119 0.541667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec7">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.583333 0.5 0.59322 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec8">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.722222 0.458333 0.661017 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec9">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.333333 0.125 0.508475 0.5 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec10">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.611111 0.333333 0.610169 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec11">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.388889 0.333333 0.59322 0.5 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec12">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.555556 0.541667 0.627119 0.625 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec13">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.166667 0.166667 0.389831 0.375 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec14">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.444444 0.416667 0.542373 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec15">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.5 0.375 0.627119 0.541667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec16">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.361111 0.375 0.440678 0.5 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec17">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.361111 0.416667 0.59322 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec18">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.527778 0.0833333 0.59322 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec19">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.444444 0.5 0.644068 0.708333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec20">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.555556 0.208333 0.661017 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec21">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.694444 0.333333 0.644068 0.541667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec22">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.666667 0.416667 0.677966 0.666667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec23">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.472222 0.375 0.59322 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec24">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.388889 0.25 0.423729 0.375 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec25">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.472222 0.291667 0.694915 0.625 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec26">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.305556 0.416667 0.59322 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec27">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.472222 0.583333 0.59322 0.625 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec28">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.666667 0.458333 0.627119 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec29">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.555556 0.125 0.576271 0.5 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec30">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.5 0.416667 0.610169 0.541667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec31">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.194444 0.125 0.389831 0.375 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec32">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.222222 0.208333 0.338983 0.416667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec33">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.416667 0.291667 0.694915 0.75 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec34">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.555556 0.375 0.779661 0.708333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec35">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.166667 0.208333 0.59322 0.666667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec36">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.611111 0.5 0.694915 0.791667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec37">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.583333 0.291667 0.728814 0.75 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec38">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.388889 0.208333 0.677966 0.791667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec39">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.611111 0.416667 0.762712 0.708333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec40">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.472222 0.0833333 0.677966 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec41">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.361111 0.333333 0.661017 0.791667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec42">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.555556 0.291667 0.661017 0.708333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec43">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.805556 0.5 0.847458 0.708333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec44">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.527778 0.333333 0.644068 0.708333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec45">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.5 0.416667 0.661017 0.708333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec46">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.805556 0.416667 0.813559 0.625 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec47">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>1 0.75 0.915254 0.791667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec48">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.555556 0.333333 0.694915 0.583333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec49">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.5 0.25 0.779661 0.541667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec50">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.583333 0.458333 0.762712 0.708333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec51">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.472222 0.416667 0.644068 0.708333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec52">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.416667 0.291667 0.694915 0.75 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec53">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.555556 0.208333 0.677966 0.75 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec54">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.611111 0.416667 0.711864 0.791667 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
<VectorInstance
id="vec55">
<REAL-SparseArray
n="4">
<Indices>1 2 3 4 </Indices>
<REAL-Entries>0.444444 0.416667 0.694915 0.708333 </REAL-Entries>
</REAL-SparseArray>
</VectorInstance>
</VectorDictionary>
<SupportVectorMachine>
<Extension
extender="spss.com">
<ResponseCategory
Response="setosa"
NonResponse="versicolor"/>
<ProbabilityParameter
paramA="-2.97547968747972"
paramB="-0.142289488974155"/>
</Extension>
<SupportVectors
numberOfAttributes="4"
numberOfSupportVectors="10">
<SupportVector
vectorId="vec1"/>
<SupportVector
vectorId="vec2"/>
<SupportVector
vectorId="vec3"/>
<SupportVector
vectorId="vec4"/>
<SupportVector
vectorId="vec5"/>
<SupportVector
vectorId="vec13"/>
<SupportVector
vectorId="vec16"/>
<SupportVector
vectorId="vec24"/>
<SupportVector
vectorId="vec31"/>
<SupportVector
vectorId="vec32"/>
</SupportVectors>
<Coefficients
absoluteValue="0.100637757773477"
numberOfCoefficients="10">
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="9.19776233533869"/>
<Coefficient
value="1.79961796126222"/>
<Coefficient
value="-10"/>
<Coefficient
value="-0.997380296600909"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
</Coefficients>
</SupportVectorMachine>
<SupportVectorMachine>
<Extension
extender="spss.com">
<ResponseCategory
Response="setosa"
NonResponse="virginica"/>
<ProbabilityParameter
paramA="-3.02394365729689"
paramB="-0.161186331924648"/>
</Extension>
<SupportVectors
numberOfAttributes="4"
numberOfSupportVectors="5">
<SupportVector
vectorId="vec1"/>
<SupportVector
vectorId="vec3"/>
<SupportVector
vectorId="vec4"/>
<SupportVector
vectorId="vec35"/>
<SupportVector
vectorId="vec48"/>
</SupportVectors>
<Coefficients
absoluteValue="0.0259927211275794"
numberOfCoefficients="5">
<Coefficient
value="10"/>
<Coefficient
value="3.75012722182129"/>
<Coefficient
value="3.31782269260525"/>
<Coefficient
value="-10"/>
<Coefficient
value="-7.06794991442654"/>
</Coefficients>
</SupportVectorMachine>
<SupportVectorMachine>
<Extension
extender="spss.com">
<ResponseCategory
Response="versicolor"
NonResponse="virginica"/>
<ProbabilityParameter
paramA="-3.81578321075645"
paramB="0.211406625456119"/>
</Extension>
<SupportVectors
numberOfAttributes="4"
numberOfSupportVectors="45">
<SupportVector
vectorId="vec6"/>
<SupportVector
vectorId="vec7"/>
<SupportVector
vectorId="vec8"/>
<SupportVector
vectorId="vec9"/>
<SupportVector
vectorId="vec10"/>
<SupportVector
vectorId="vec11"/>
<SupportVector
vectorId="vec12"/>
<SupportVector
vectorId="vec14"/>
<SupportVector
vectorId="vec15"/>
<SupportVector
vectorId="vec17"/>
<SupportVector
vectorId="vec18"/>
<SupportVector
vectorId="vec19"/>
<SupportVector
vectorId="vec20"/>
<SupportVector
vectorId="vec21"/>
<SupportVector
vectorId="vec22"/>
<SupportVector
vectorId="vec23"/>
<SupportVector
vectorId="vec25"/>
<SupportVector
vectorId="vec26"/>
<SupportVector
vectorId="vec27"/>
<SupportVector
vectorId="vec28"/>
<SupportVector
vectorId="vec29"/>
<SupportVector
vectorId="vec30"/>
<SupportVector
vectorId="vec33"/>
<SupportVector
vectorId="vec34"/>
<SupportVector
vectorId="vec35"/>
<SupportVector
vectorId="vec36"/>
<SupportVector
vectorId="vec37"/>
<SupportVector
vectorId="vec38"/>
<SupportVector
vectorId="vec39"/>
<SupportVector
vectorId="vec40"/>
<SupportVector
vectorId="vec41"/>
<SupportVector
vectorId="vec42"/>
<SupportVector
vectorId="vec43"/>
<SupportVector
vectorId="vec44"/>
<SupportVector
vectorId="vec45"/>
<SupportVector
vectorId="vec46"/>
<SupportVector
vectorId="vec47"/>
<SupportVector
vectorId="vec48"/>
<SupportVector
vectorId="vec49"/>
<SupportVector
vectorId="vec50"/>
<SupportVector
vectorId="vec51"/>
<SupportVector
vectorId="vec52"/>
<SupportVector
vectorId="vec53"/>
<SupportVector
vectorId="vec54"/>
<SupportVector
vectorId="vec55"/>
</SupportVectors>
<Coefficients
absoluteValue="-0.0019238862926096"
numberOfCoefficients="45">
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="4.24131327716667"/>
<Coefficient
value="10"/>
<Coefficient
value="8.20985324384774"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-1.88732510918139"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-0.563841411833021"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
<Coefficient
value="-10"/>
</Coefficients>
</SupportVectorMachine>
</SupportVectorMachineModel>
Best Answer
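Your SVM implementation is very likely based on the LibSVM library. Please refer to LibSVM's documentation (e.g. this FAQ item) for an explanation of how probabilities are calculated.
In brief, the probabilities come from a separate procedure that is not the voting rule used to predict the class. Each pairwise decision value is passed through a sigmoid fitted during training (the probA and probB parameters), and the resulting pairwise probabilities are then combined into one probability per class. Because the two procedures are distinct, it is even possible for the vote and the calculated probabilities to pick different classes as the winner.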
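To make the procedure concrete, here is a minimal Python sketch of what LibSVM appears to do, written from my reading of its source rather than taken from any official API. The function names are my own transcriptions, the pair order (setosa/versicolor, setosa/virginica, versicolor/virginica) and the sign of the decision values are assumed to match what the question shows, and the stopping tolerance and clipping constant are assumptions copied from LibSVM's defaults as I understand them.

```python
import math

def sigmoid_predict(dec_value, a, b):
    """Platt-style sigmoid: P(first class of the pair) = 1 / (1 + exp(a*f + b))."""
    fapb = dec_value * a + b
    if fapb >= 0:  # numerically stable form for each sign
        return math.exp(-fapb) / (1.0 + math.exp(-fapb))
    return 1.0 / (1.0 + math.exp(fapb))

def multiclass_probability(r, max_iter=100):
    """Couple pairwise probabilities r[i][j] = P(i | i or j) into class probabilities."""
    k = len(r)
    q = [[0.0] * k for _ in range(k)]
    for t in range(k):
        for j in range(k):
            if j != t:
                q[t][t] += r[j][t] ** 2
                q[t][j] = -r[j][t] * r[t][j]
    p = [1.0 / k] * k
    eps = 0.005 / k  # assumed: LibSVM's (fairly loose) stopping tolerance
    for _ in range(max_iter):
        qp = [sum(q[t][j] * p[j] for j in range(k)) for t in range(k)]
        pqp = sum(p[t] * qp[t] for t in range(k))
        if max(abs(qp[t] - pqp) for t in range(k)) < eps:
            break
        for t in range(k):
            diff = (pqp - qp[t]) / q[t][t]
            p[t] += diff
            pqp = (pqp + diff * (diff * q[t][t] + 2 * qp[t])) / (1 + diff) ** 2
            for j in range(k):
                qp[j] = (qp[j] + diff * q[t][j]) / (1 + diff)
                p[j] /= (1 + diff)  # renormalise so the probabilities sum to 1
    return p

# Values quoted in the question (pairs: 0/1, 0/2, 1/2).
classes = ["setosa", "versicolor", "virginica"]
dec_values = [1.196152, 1.091757, 0.6708810]
prob_a = [-3.589058, -3.793546, -3.518305]
prob_b = [-0.16396052, -0.04387233, 0.13178304]

k = len(classes)
min_prob = 1e-7  # assumed clipping constant
r = [[0.0] * k for _ in range(k)]
idx = 0
for i in range(k):
    for j in range(i + 1, k):
        pij = sigmoid_predict(dec_values[idx], prob_a[idx], prob_b[idx])
        pij = min(max(pij, min_prob), 1 - min_prob)
        r[i][j], r[j][i] = pij, 1 - pij
        idx += 1

probs = multiclass_probability(r)
for name, value in zip(classes, probs):
    print(name, value)
```

With the numbers from the question, this sketch should come out close to the probabilities R reported (roughly 0.98, 0.012 and 0.009). Note that the result depends on the loose stopping tolerance: the iteration stops early, so a coupling solver run to full convergence would give slightly different values.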
The PMML specification does not support LibSVM's probability calculation procedure. Hence, you can't use the probability output feature with SVM models.