The pmml package for R (used by Rattle, which is mentioned in highBandWidth's answer) provides a fairly transparent look at how to turn a model into PMML output.
The pmml package reference manual gives an example that builds a linear model for the iris data set and then produces PMML from it:
> library("pmml")
> (iris.lm <- lm(Sepal.Length ~ ., data=iris))
> pmml(iris.lm)
This will produce the following PMML:
<PMML version="3.2" xmlns="http://www.dmg.org/PMML-3_2" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.dmg.org/PMML-3_2 http://www.dmg.org/v3-2/pmml-3-2.xsd">
  <Header copyright="Copyright (c) 2011 user" description="Linear Regression Model">
    <Extension name="user" value="user" extender="Rattle/PMML"/>
    <Application name="Rattle/PMML" version="1.2.27"/>
    <Timestamp>2011-08-27 23:17:42</Timestamp>
  </Header>
  <DataDictionary numberOfFields="5">
    <DataField name="Sepal.Length" optype="continuous" dataType="double"/>
    <DataField name="Sepal.Width" optype="continuous" dataType="double"/>
    <DataField name="Petal.Length" optype="continuous" dataType="double"/>
    <DataField name="Petal.Width" optype="continuous" dataType="double"/>
    <DataField name="Species" optype="categorical" dataType="string">
      <Value value="setosa"/>
      <Value value="versicolor"/>
      <Value value="virginica"/>
    </DataField>
  </DataDictionary>
  <RegressionModel modelName="Linear_Regression_Model" functionName="regression" algorithmName="least squares" targetFieldName="Sepal.Length">
    <MiningSchema>
      <MiningField name="Sepal.Length" usageType="predicted"/>
      <MiningField name="Sepal.Width" usageType="active"/>
      <MiningField name="Petal.Length" usageType="active"/>
      <MiningField name="Petal.Width" usageType="active"/>
      <MiningField name="Species" usageType="active"/>
    </MiningSchema>
    <RegressionTable intercept="2.17126629215507">
      <NumericPredictor name="Sepal.Width" exponent="1" coefficient="0.495888938388551"/>
      <NumericPredictor name="Petal.Length" exponent="1" coefficient="0.829243912234806"/>
      <NumericPredictor name="Petal.Width" exponent="1" coefficient="-0.315155173326474"/>
      <CategoricalPredictor name="Species" value="setosa" coefficient="0"/>
      <CategoricalPredictor name="Species" value="versicolor" coefficient="-0.72356195778073"/>
      <CategoricalPredictor name="Species" value="virginica" coefficient="-1.02349781449083"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
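To make the RegressionTable concrete, here is a minimal sketch in base R that recomputes a prediction the way the table reads: intercept, plus each numeric coefficient times its field, plus the coefficient for the observed Species level. The helper name score_row is hypothetical (it is not part of pmml), and the sketch assumes R's default treatment contrasts, under which the reference level (setosa) has no fitted coefficient and so contributes 0.

```r
# Refit the model from the example above (base R only, no pmml needed).
iris.lm <- lm(Sepal.Length ~ ., data = iris)
b <- coef(iris.lm)

# Hypothetical helper: score one observation as the RegressionTable describes.
score_row <- function(row, b) {
  # Coefficient names for factor levels look like "Speciesversicolor".
  # The reference level has no entry, which PMML expresses as coefficient="0".
  species_name <- paste0("Species", as.character(row$Species))
  species_term <- if (species_name %in% names(b)) b[[species_name]] else 0

  b[["(Intercept)"]] +
    b[["Sepal.Width"]]  * row$Sepal.Width +
    b[["Petal.Length"]] * row$Petal.Length +
    b[["Petal.Width"]]  * row$Petal.Width +
    species_term
}

# One row per species: rows 1, 51 and 101 of iris.
rows <- c(1, 51, 101)
manual <- sapply(rows, function(i) score_row(iris[i, ], b))
reference <- unname(predict(iris.lm, iris[rows, ]))

# Compare the hand-computed scores with R's own predictions.
print(all.equal(manual, reference))
```

If the two agree, a PMML consumer that evaluates the table in this way should reproduce what predict() returns for the same inputs.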
Source Code
The relevant source code for this linear model is in the pmml package's pmml.R and pmml.lm.R files. As with any PMML producer, the code reads the model parameters (here the model is in iris.lm) and then builds up the XML nodes from the model data.
The code in pmml.lm.R is fairly straightforward: it builds up the PMML node by node.
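As a rough illustration of that node-by-node idea, the sketch below assembles the RegressionTable element from a fitted lm object. It is not the package's code: regression_table_lines is a hypothetical helper, it emits plain strings rather than XML node objects to stay dependency-free, and it assumes main effects only, default treatment contrasts, and field names that need no XML escaping.

```r
iris.lm <- lm(Sepal.Length ~ ., data = iris)

# Hypothetical helper (not part of pmml): RegressionTable as character lines.
regression_table_lines <- function(model) {
  b <- coef(model)
  tt <- model$terms
  classes <- attr(tt, "dataClasses")
  # Drop the response; what remains are the predictor fields.
  predictors <- names(classes)[-attr(tt, "response")]

  num <- function(x) format(x, digits = 15)
  lines <- sprintf('<RegressionTable intercept="%s">', num(b[["(Intercept)"]]))

  for (p in predictors) {
    if (classes[[p]] == "numeric") {
      lines <- c(lines, sprintf(
        '  <NumericPredictor name="%s" exponent="1" coefficient="%s"/>',
        p, num(b[[p]])))
    } else if (classes[[p]] == "factor") {
      for (lev in model$xlevels[[p]]) {
        # The reference level has no fitted coefficient, so it is written as 0.
        key <- paste0(p, lev)
        value <- if (key %in% names(b)) b[[key]] else 0
        lines <- c(lines, sprintf(
          '  <CategoricalPredictor name="%s" value="%s" coefficient="%s"/>',
          p, lev, num(value)))
      }
    }
  }
  c(lines, "</RegressionTable>")
}

table_lines <- regression_table_lines(iris.lm)
cat(table_lines, sep = "\n")
```

Strings keep the example short, but a real producer is better off with node objects, which handle escaping and nesting for you.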
Below are some of the queries on the fitted model that are used (indirectly) in pmml.lm.R:
> terms <- attributes(iris.lm$terms)
> terms$dataClasses
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
   "numeric"    "numeric"    "numeric"    "numeric"     "factor"
> iris.lm$xlevels
$Species
[1] "setosa" "versicolor" "virginica"
> iris.lm$coefficients
      (Intercept)       Sepal.Width      Petal.Length       Petal.Width Speciesversicolor  Speciesvirginica
        2.1712663         0.4958889         0.8292439        -0.3151552        -0.7235620        -1.0234978
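The first two queries are enough to sketch how a DataDictionary could be derived: dataClasses gives each field's type, and xlevels gives the allowed values of each factor. The helper data_dictionary_lines below is hypothetical and makes a simplifying assumption: every non-factor field is treated as a continuous double, which holds for iris but not for every model.

```r
iris.lm <- lm(Sepal.Length ~ ., data = iris)

# Hypothetical helper (not part of pmml): DataDictionary as character lines.
data_dictionary_lines <- function(model) {
  classes <- attr(model$terms, "dataClasses")
  lines <- sprintf('<DataDictionary numberOfFields="%d">', length(classes))

  for (field in names(classes)) {
    if (classes[[field]] == "factor") {
      # Factor fields list their levels as <Value> children.
      lines <- c(lines,
        sprintf('  <DataField name="%s" optype="categorical" dataType="string">', field),
        sprintf('    <Value value="%s"/>', model$xlevels[[field]]),
        "  </DataField>")
    } else {
      # Assumption: anything that is not a factor is a continuous double.
      lines <- c(lines, sprintf(
        '  <DataField name="%s" optype="continuous" dataType="double"/>', field))
    }
  }
  c(lines, "</DataDictionary>")
}

dictionary_lines <- data_dictionary_lines(iris.lm)
cat(dictionary_lines, sep = "\n")
```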
The qcc package comes to mind. A quick search through the packages list at http://cran.r-project.org/ shows other packages that may be helpful: graphicsQC, IQCC, qualityTools, SixSigma, and two Rcmdr plugins.
Flowing Data has a tutorial on how to use the map.market function in the portfolio package in R.