PyQGIS – Logic Behind the Order of Features When Getting Features by PyQGIS

pyqgis qgis

  • When I run the script below, I get the features ordered by id, as expected.

    layer = iface.activeLayer()
    
    # getFeatures()
    ids = [f.id() for f in layer.getFeatures()]
    print(ids)
    
    # OUTPUT: [0L, 1L, ..... 499L, 500L]
    
  • If I select all features and run the next script, I get the same output.

    # selectedFeatures()
    ids = [f.id() for f in layer.selectedFeatures()] 
    print(ids)
    
    # OUTPUT: [0L, 1L, ..... 499L, 500L]
    
  • But if I manually select some features (for example, the features with ids 12, 13 and 17 in my shapefile) and run the previous script, I get an unordered list of ids. I always get the same output, no matter what the selection order is.

    ids = [f.id() for f in layer.selectedFeatures()]
    print(ids)
    
    # OUTPUT: [17L, 12L, 13L]
    
    ## OR selectedFeaturesIterator()
    ids = [f.id() for f in layer.selectedFeaturesIterator()]
    print(ids)
    
    # OUTPUT: [17L, 12L, 13L]
    

Why do selectedFeatures() and selectedFeaturesIterator() return features unordered by id, while getFeatures() returns them ordered? What is the logic behind this?

EDIT: My question is not whether features are ordered in the database/file/memory. What I mean by "why" is: why does one method return ordered features while another returns unordered ones, and why does the same method return differently ordered results? Look at selectedFeatures(). For example, if all features are selected, it returns [1, 2, 3, 4]. But it returns [2, 1, 3] if only some features are selected.

NOTE:

  • getFeatures() and selectedFeaturesIterator() return an iterator (QgsFeatureIterator), selectedFeatures() returns a list (QgsFeatureList).
  • In QGIS 3, getSelectedFeatures() replaces the selectedFeaturesIterator() of QGIS 2.

  • I get the same result in both QGIS v2.18 and v3.2.

Best Answer

Since you haven't specified a sort order, it is wrong to assume the result is sorted.

Seeing the result as ordered by ID is usually nothing more than luck and typically occurs because the data is inserted in the same order as the ID.

How the data is physically stored on disk or in the DB matters. Start inserting records with random IDs and the output will no longer look ordered. Add some columns to your layer (especially in a DB) and the order of the returned records can also be severely affected.
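
As a small illustration of the point about insertion order, here is a sketch that uses Python's built-in sqlite3 module rather than QGIS. The table name `features` and its columns are made up for the example, and the ids are the ones from the question. Rows are inserted with ids that do not follow the insertion order, then read back with and without an ORDER BY clause. Only the second query makes any promise about the order.

```python
import sqlite3

# Hypothetical table, not a real QGIS layer: "id" is a plain column,
# so it has no influence on how the rows are physically stored.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE features (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO features VALUES (?, ?)",
    [(17, "c"), (12, "a"), (13, "b")],  # ids deliberately out of order
)

# No ORDER BY: the engine is free to return the rows in any order.
unordered = [row[0] for row in conn.execute("SELECT id FROM features")]

# ORDER BY: the only form that guarantees ascending ids.
ordered = [row[0] for row in conn.execute("SELECT id FROM features ORDER BY id")]

print(unordered)
print(ordered)
conn.close()
```

The first list typically reflects how the rows happen to be stored, which here is unrelated to the id values. The second list is sorted because the query asks for it.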

Similarly, the order can be affected by the query used to retrieve the records. Add or remove columns in the query and the database planner (I am not sure about other data storage) will generate a different execution plan, which can result in a different retrieval order. This is likely what is happening in your question.

Edit

The values used in the WHERE clause can also affect the execution plan for data stored in a DB. See this post as an example, where ids below a given value lead to a plan that includes a sort, while ids above that value lead to a totally different plan that 'mixes' all the data.

So, to generalize, the order of the returned rows depends on the data driver (the DB engine, file reader, etc.) and not on the code/app calling it (QGIS, Python, ArcObjects, etc.). It is dangerous even to assume that the same query will always return results in the same order, because "environmental conditions" may have changed and lead to a different order (e.g. a new index, DB vacuuming, a new column, or new values in existing rows that cause a row to span several (and distant!) pages).
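
If the order matters to your script, the safe approach is to sort explicitly instead of relying on whatever the data driver returns. Below is a minimal sketch in plain Python. `FakeFeature` is a hypothetical stand-in for QgsFeature that only models `id()`, so the snippet runs outside QGIS. In the QGIS console the input would be `layer.selectedFeatures()` or `layer.getFeatures()` instead.

```python
class FakeFeature:
    """Hypothetical stand-in for QgsFeature; only id() is modelled."""

    def __init__(self, fid):
        self._fid = fid

    def id(self):
        return self._fid


def ids_sorted(features):
    """Return the feature ids in ascending order, whatever order they arrive in."""
    return sorted(f.id() for f in features)


# Same arrival order as the manual selection in the question.
selected = [FakeFeature(17), FakeFeature(12), FakeFeature(13)]
print(ids_sorted(selected))  # [12, 13, 17]
```

Sorting on the Python side works the same way for every data provider, at the cost of reading all requested features into memory first.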
