[GIS] Link raster (.tiff) with attribute table (.txt) in QGIS or R

qgis r

I am new to GIS, and I downloaded a gridded dataset from this source.

The data come with a raster file as well as a range of attribute data files. In ArcGIS I can easily join the .mdb Microsoft Access attribute table with the .adf raster. However, I am now on a Mac, which can't handle .mdb files. Fortunately, the data are also downloadable in an interchangeable format, consisting of a .tiff raster file together with several .txt files with attributes.

How can I link the TIFF with the TXT data in R or QGIS from a Mac?

Once loaded, how can I select one particular attribute (and discard the others), similar to what I would do with the lookup function? The final goal is to save a TIFF raster containing only one specific column from among the variables in the attribute tables.

Best Answer

The data appear to be in three formats: a .mdb in ./Data/, a .tif and some .txt files in ./Interchangeable_format/, and an Arc/Info binary grid in ./GISfiles with a raster layer called wise30sec_fin.

Both the tif and the Arc/Info grid can be read by R or QGIS via the underlying GDAL library. The tif appears to hold just the integers that will need to be used for lookups, whereas the Arc/Info grid holds the same integers plus an attribute table, so I'll use that.

Use the raster package to read it in:

> library(raster)
> r = raster("./wise30sec_fin/")
> r
class       : RasterLayer 
dimensions  : 16753, 43201, 723746353  (nrow, ncol, ncell)
resolution  : 0.008333333, 0.008333333  (x, y)
extent      : -180, 180.0083, -55.98333, 83.625  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
data source : /home/rowlings/Downloads/wise_30sec_v1/GISfiles/wise30sec_fin 
names       : wise30sec_fin 
values      : 0, 32156  (min, max)
attributes  :
          ID MU_GLOBAL MUGLB_NEW COVERAGE CODE CLIMATE_CO    NEWSUID     COUNT
 from:     0         0         0        0             Y-0            502784485
 to  : 32156     31111     32156        5    C    C-32156 WD30032156        51
 VALUE30SEC
          0
      32156

Note that this hasn't actually read the raster into memory because of its size (about 724 million cells). R will read parts of it from disk as needed, and we have to be careful not to pull it all into memory unless we have a lot of RAM. You may struggle to process the whole raster in R, so crop it to your study area if you have one.
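As a rough illustration of cropping, here is a minimal sketch using a small in-memory raster as a stand-in for the real grid, so it runs without the downloaded data. The study-area extent is an arbitrary example, not something taken from the dataset.

```r
library(raster)

# Toy stand-in for the real grid: a 1-degree global raster of integer IDs.
# (With the real data you would start from r = raster("./wise30sec_fin/").)
toy <- raster(nrows = 180, ncols = 360,
              xmn = -180, xmx = 180, ymn = -90, ymx = 90)
values(toy) <- seq_len(ncell(toy))

# Hypothetical study area: xmin, xmax, ymin, ymax in degrees.
study_area <- extent(-10, 5, 35, 45)

# crop() keeps only the cells that fall inside the extent.
toy_crop <- crop(toy, study_area)
dim(toy_crop)  # rows, columns, layers
```

With the real raster it is probably worth passing a filename argument to crop() so that the cropped result is written to disk rather than held in memory.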

The raster has a single band and a set of attributes: a data frame that links the integer values in the raster to attribute values. Let's look at the first few rows by calling levels on the raster. Because there is only one band, the attribute table is the first element of the result, extracted with [[1]]:

> levels(r)[[1]][1:4,1:5]
  ID MU_GLOBAL MUGLB_NEW COVERAGE CODE
1  0         0         0        0     
2  2         2         2        4    A
3 36        36        36        4    B
4 37        37        37        4    A

So to get the information for a location, get the value of the raster at that location and that's the ID in the attributes table:

> r[5000,7000]

4713 
> subset(levels(r)[[1]], ID==4713)
       ID MU_GLOBAL MUGLB_NEW COVERAGE CODE CLIMATE_CO    NEWSUID COUNT
1999 4713      4713      4713        4    D     D-4713 WD40004713  8522
     VALUE30SEC
1999       4713
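Note that r[5000,7000] indexes by row and column. If you have coordinates instead, the same lookup can be sketched with extract(). The example below uses a toy raster and a toy attribute table, and the NEWSUID codes in it are made up for illustration.

```r
library(raster)

# Toy 2x3 raster of integer map-unit IDs (stand-in for wise30sec_fin).
toy <- raster(nrows = 2, ncols = 3, xmn = 0, xmx = 3, ymn = 0, ymx = 2)
values(toy) <- c(0, 2, 36, 37, 2, 0)  # filled row by row from the top-left

# Toy attribute table keyed on ID, mimicking levels(r)[[1]].
# The NEWSUID codes here are invented.
rat <- data.frame(ID = c(0, 2, 36, 37),
                  NEWSUID = c("", "WD10000002", "WD10000036", "WD10000037"),
                  stringsAsFactors = FALSE)

# Cell value at an x/y (longitude/latitude) pair.
# The point (2.5, 1.5) falls in row 1, column 3 of the toy raster.
id <- extract(toy, cbind(2.5, 1.5))

# Row of the attribute table for that ID.
rat[match(id, rat$ID), ]
```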

Now I can read in one of the data files and look up rows in it using the NEWSUID code:

> data = read.table("./Interchangeable_format/HW30s_wD1.txt",header=TRUE,sep=",")

The data is one row shorter than the attribute table, so it's possible one NEWSUID code is missing:

> dim(data)
[1] 16413    55
> dim(levels(r)[[1]])
[1] 16414     9

and it's the empty string, which I guess is the code for the sea.
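One way to check which code is missing is setdiff() on the two NEWSUID columns. This is a sketch on toy data frames standing in for levels(r)[[1]] and the data file, with invented codes and values.

```r
# Toy stand-ins: 'rat' mimics levels(r)[[1]], 'dat' mimics the data file.
rat <- data.frame(ID = c(0, 2, 36),
                  NEWSUID = c("", "WD10000002", "WD10000036"),
                  stringsAsFactors = FALSE)
dat <- data.frame(NEWSUID = c("WD10000002", "WD10000036"),
                  CFRAG = c(1, 12),
                  stringsAsFactors = FALSE)

# Codes present in the attribute table but absent from the data file.
missing_codes <- setdiff(rat$NEWSUID, dat$NEWSUID)
missing_codes
```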

Anyway, we can now select the values from the data file since we know the NEWSUID value at r[5000,7000]:

> subset(data, NEWSUID=="WD40004713")
         NEWSUID NofComponents SoilUnits Layer PROP_aw MiscUnits PROP_misc
12604 WD40004713             1   GLm100     D1     100                   0
      Drain DrainProp DrainNum DrainMin DrainMax TopDep BotDep CFRAG CFRAG_std
12604     P       100        2        2        2      0     20     1         1

So those are the fundamentals of looking up data values given the raster and some data. You can do all sorts of clever things, like adding the data to levels(r), but the best approach depends on what you actually want to end up doing with the data: sampling at a few points calls for something different from creating a whole load of global maps of a quantity in the data files.
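For the map-making case, one possible approach is to build a two-column lookup table (raster ID to chosen attribute) and substitute it into the raster with subs(), then write the result out as a GeoTIFF. This is only a sketch on toy data. The codes and CFRAG values are invented, it assumes one row per NEWSUID in the data file, and on the full raster you would probably want to crop first.

```r
library(raster)

# Toy 2x3 raster of map-unit IDs (stand-in for wise30sec_fin).
toy <- raster(nrows = 2, ncols = 3, xmn = 0, xmx = 3, ymn = 0, ymx = 2)
values(toy) <- c(0, 2, 36, 37, 2, 0)

# Toy attribute table (like levels(r)[[1]]) and data file (like HW30s_wD1.txt).
rat <- data.frame(ID = c(0, 2, 36, 37),
                  NEWSUID = c("", "WD10000002", "WD10000036", "WD10000037"),
                  stringsAsFactors = FALSE)
dat <- data.frame(NEWSUID = c("WD10000002", "WD10000036", "WD10000037"),
                  CFRAG = c(1, 12, 5),
                  stringsAsFactors = FALSE)

# Two-column lookup table: raster ID -> the one attribute we want to keep.
lut <- merge(rat, dat, by = "NEWSUID")[, c("ID", "CFRAG")]

# Replace each ID with its CFRAG value.
# by = 1 is the key column of lut, which = 2 is the value column.
# IDs with no match in lut (here 0) become NA.
cfrag <- subs(toy, lut, by = 1, which = 2)

# Save the single-attribute raster as a GeoTIFF (temporary path for the demo).
out <- file.path(tempdir(), "cfrag.tif")
writeRaster(cfrag, filename = out, format = "GTiff", overwrite = TRUE)
```

Using subs() rather than editing levels(r) keeps the output a plain numeric raster, which is what most other software expects from a TIFF.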

There's a useful answer on manipulating raster level attributes here:

https://stackoverflow.com/questions/28617678/using-a-raster-attribute-from-a-multi-attribute-raster-for-colour-levels-in-a-pl

but watch out: some of those approaches might try to load the whole raster into memory, which could be very slow or hang your computer.

Alternatively, using only the files in ./Interchangeable_format/: the tif file stores pixel values, the .tsv file maps those pixel values to NEWSUID codes, and the NEWSUID codes can then be looked up in the data file. Again, the size of the raster puts me off doing this for the whole thing, but everything is well named.
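The two-step lookup can be sketched in base R as below. The example writes its own tiny tab-separated file so that it runs without the download. The column name VALUE and all codes and values are assumptions for illustration, so check the header of the real .tsv file before adapting it.

```r
# Write a toy tab-separated lookup file to a temporary location.
# The column names are assumptions, not the real file's header.
tsv <- tempfile(fileext = ".tsv")
writeLines(c("VALUE\tNEWSUID",
             "2\tWD10000002",
             "36\tWD10000036"), tsv)
px2suid <- read.delim(tsv, stringsAsFactors = FALSE)

# Toy data file keyed on NEWSUID (invented values).
dat <- data.frame(NEWSUID = c("WD10000002", "WD10000036"),
                  CFRAG = c(1, 12),
                  stringsAsFactors = FALSE)

# A few pixel values as they might come out of the TIFF (0 has no entry).
px <- c(36, 0, 2)

# Step 1: pixel value -> NEWSUID. Step 2: NEWSUID -> attribute value.
suid  <- px2suid$NEWSUID[match(px, px2suid$VALUE)]
cfrag <- dat$CFRAG[match(suid, dat$NEWSUID)]
cfrag
```

Pixel values with no entry in either table come through as NA, which is a convenient way to spot sea or no-data cells.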
