If you merely want what you describe ("Like, if there were some way to break down the pixels into small areas, but the pixels still having the same values, it also would work."), then simply change the cell size to the new number (say from 9 to 3). You can do this in QGIS via the raster calculator or gdalwarp (see "How to resample GeoTIFF images to the same resolution?").
If you instead want to decrease the cell size and estimate new values, you could convert the cells to points and then use kriging if required, or methods such as splines, weighted nearest neighbor, and IDW. I think a simple TIN would suffice, though; you would then convert the TIN to a raster. You may even be able to create the TIN directly from the raster.
This may be a suitable case for Tobler's pycnophylactic interpolation in raster. This mass-preserving analysis will assign values to your new cells based on their neighbors. I have only ever done this in ArcGIS, but it can be run in GRASS 7 using v.surf.mass. Your cells would be broken into, say, 9 cells per cell, and you would then use pycnophylactic interpolation at this level.
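As a minimal sketch of what "smaller cells, same values" means, the base R snippet below splits each cell of a small matrix into a 3 x 3 block of identical values. The matrix `m` and the factor `fact` are made up for illustration; for an actual raster object, `raster::disaggregate(x, fact = 3)` should do the equivalent split.

```r
# Hypothetical 2 x 2 "raster" stored as a plain matrix
m <- matrix(c(1, 2,
              3, 4), nrow = 2, byrow = TRUE)

# Split every cell into fact x fact smaller cells
fact <- 3

# Repeat each row and column index 'fact' times;
# values are copied, not interpolated
fine <- m[rep(seq_len(nrow(m)), each = fact),
          rep(seq_len(ncol(m)), each = fact)]

dim(fine)                 # grid is now 6 x 6
unique(as.vector(fine))   # only the original cell values are present
```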
I would suggest you start with a TIN.
Changing Raster Resolution
Resizing image resolution from 1000 m to 250 m in ArcGIS Desktop?
How to create a TIN in QGIS
https://docs.qgis.org/2.6/en/docs/user_manual/plugins/plugins_interpolation.html
Just a quick note on this "problem". When you read in a raster, be it a single raster or one in a stack/brick, the default names are the names of the on-disk files. When using the raster::predict function, the names in the model object must match the names in the stack/brick. As such, it is good convention to assign the names that you want to use across your modeling workflow. This also provides an additional advantage in easing data management.
Let's say you have a naming convention in your raster layers that corresponds to your covariates. You can define a vector of covariate names and then use the vector to read the data with very efficient code.
Dummy covariate/raster names:
covariates <- paste0(rep("v", 10), 1:10)
Create a vector of rasters (tif) in the specified directory. If it is different from your working directory, you can use the full.names = TRUE argument in list.files.
rlist <- list.files(getwd(), "tif$")
Then you can use grep to query the vector of rasters for your covariate names and, since you already have a vector of names, assign it to the stack object. The grep function returns the index of the matches, hence the brackets. Using paste with collapse lets you pass multiple values to grep as a single pattern built from the covariates vector.
vars <- stack(rlist[grep(paste(covariates, collapse = "|"), rlist)])
names(vars) <- covariates
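One thing worth checking before assigning names by position: list.files returns file names in alphabetical order, so v10.tif is listed before v2.tif. The sketch below uses made-up file names only (no rasters are read, and `layer_names` and `ord` are hypothetical helpers) to show the mismatch and one possible way to keep names and layers aligned.

```r
# Hypothetical covariate names and matching file names (no files are read here)
covariates <- paste0("v", 1:10)

# Mimic the alphabetical ordering of list.files(); radix sorting is locale independent
rlist <- sort(paste0(covariates, ".tif"), method = "radix")

# Same grep query as above
matched <- rlist[grep(paste(covariates, collapse = "|"), rlist)]

# The file order differs from the order of 'covariates'
head(matched, 3)

# Derive layer names from the matched files so names follow the layers
layer_names <- tools::file_path_sans_ext(basename(matched))

# Position of each covariate among the matched files, for reordering if needed
ord <- match(covariates, layer_names)
```

With a real stack, something like `names(vars) <- layer_names`, or stacking the files in the order `matched[ord]`, should keep each name attached to the intended layer.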
Now the names issue is solved for the raster::predict function, so we should address calling the function itself. It is important to keep in mind that raster::predict is a wrapper for other predict functions, each of which has its own data structures. The example at hand is the predict method for randomForest, randomForest:::predict.randomForest. In a classification model, if type="prob" or "votes", a matrix is returned with n columns, one for each class. You will notice that raster::predict has some arguments that control output. The fun argument lets you pass a custom predict function, superseding any existing predict method for the model object. The index argument lets you define which column of a multi-column data.frame or matrix returned by a given predict method is used. With randomForest probability predictions a column is returned for each class, so you have to define which column you want using index. For a binomial model, to return the prevalence class ["1"] you would use index=2.
raster::predict(model=rf1, object=ApPl_stack, type="prob", index=2)
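To make the role of index concrete, here is a small base R sketch with a made-up probability matrix shaped like the output of a two-class probability prediction (classes "0" and "1"). The values and object names are purely illustrative.

```r
# Made-up class probabilities for 4 cells of a two-class ("0"/"1") model;
# each row sums to 1, with one column per class
probs <- matrix(c(0.9, 0.1,
                  0.3, 0.7,
                  0.6, 0.4,
                  0.2, 0.8),
                ncol = 2, byrow = TRUE,
                dimnames = list(NULL, c("0", "1")))

# index = 2 corresponds to keeping the second column, here class "1"
p_class1 <- probs[, 2]

# Checking the column names first guards against an unexpected class order
colnames(probs)
p_by_name <- probs[, "1"]
```

If the class levels of the response are ordered differently, the column position of the class of interest changes as well, so it is worth checking the levels of the response before fixing index.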
I would also note, based on the OP's code, that you want to avoid symbolic (formula) model calls if an index interface is available. For some reason, symbolic calls really slow down predictions such as this, specifically in randomForest. Here is what an index call looks like for randomForest:
rf1 <- randomForest(y=factor(dcc.s.dummy[,"SITE_NONSITE"]),
x=dcc.s.dummy[,-which(names(dcc.s.dummy)=="SITE_NONSITE")])
Or, if you know the positions of your covariates, simply:
rf1 <- randomForest(y=factor(dcc.s.dummy[,"SITE_NONSITE"]),
x=dcc.s.dummy[,2:ncol(dcc.s.dummy)])
For this model, I would also highly recommend addressing model fit through parameter selection. Otherwise, you are fitting random variation in your models, and this is reflected in the spatial estimates. Parsimony is an important factor in spatial estimates using nonparametric methods. You can address model/parameter selection using rfUtilities::rf.modelSel, as well as address multivariate multicollinearity issues and evaluate model fit/performance through a bootstrap approach.
A raster is an NxM grid where every cell has a value. If the value is not known, it is generally set to NA (or some special value like "-9999"). There's no way you can "remove" a value from a cell without replacing it with something else.
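A small base R sketch of the same idea, using a plain matrix as a stand-in for a raster (the values and the -9999 flag are only illustrative):

```r
# Hypothetical 3 x 3 grid of cell values
r <- matrix(1:9 * 10, nrow = 3)

# "Removing" the centre value means replacing it with NA; the grid keeps its shape
r[2, 2] <- NA
dim(r)

# A nodata flag such as -9999 can be mapped to NA in the same way
flagged <- c(12, -9999, 7)
flagged[flagged == -9999] <- NA

# NA cells are skipped in summaries when na.rm = TRUE
mean(r, na.rm = TRUE)
```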