QGIS – Divergence in Feature Count Between ‘Show Feature Count’ and Attribute Table

csvimport

I have a CSV file with 187.533 observations. They are points: violence events with their latitude and longitude. I imported the CSV into QGIS (version 3.20.1). When I enable "Show Feature Count", it reports 175.896 observations. However, when I open the attribute table, it shows 187.533, the original number of observations.

I attach two screenshots of what I am describing.

enter image description here

enter image description here

Does anyone know what might be happening, or the reason for this divergence?

To give you more information: when I run an "attribute by location" operation with a shapefile that divides Africa into 0.5×0.5 degree cells, I obtain 182.989 observations as a result, which makes sense if the observations are 175.896 and not 187.533.

Update due to a comment:

enter image description here

Best Answer

Quick solution for importing CSV with all geometries

When loading your CSV, be sure to set the semicolon ; as the only delimiter and enter a double quote " for Quote, so that semicolons inside a quoted text string are ignored. Otherwise, they are misinterpreted as field delimiters. See below for more details.

What the difference means

The difference arises because the Layers panel shows the number of geometries, whereas the attribute table shows the number of features. That means that not all features could be correctly generated as geometries. Identify the values in the lat/lon columns that are problematic. This seems to be connected to your earlier problem.

See this simple example: one feature without lat/lon values was not generated as a point, so you have three points but still four features:

enter image description here
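The same distinction can be sketched outside QGIS. This is only an illustration, not how QGIS counts internally: the function name `count_features_and_points` and the coordinate values are made up, and a row counts as a "point" here simply when both coordinates parse as numbers.

```python
def count_features_and_points(rows):
    """Return (number of rows, number of rows with usable lat/lon)."""
    def is_number(value):
        try:
            float(value)
            return True
        except (TypeError, ValueError):
            return False

    points = sum(1 for lat, lon in rows if is_number(lat) and is_number(lon))
    return len(rows), points


# Hypothetical data mirroring the four-feature example:
# the third row has no coordinates, so it cannot become a point.
rows = [
    ("-15.41", "28.28"),
    ("-1.29", "36.82"),
    ("", ""),
    ("9.06", "7.49"),
]

features, points = count_features_and_points(rows)
print(features, points)  # 4 3
```

The first number corresponds to what the attribute table shows, the second to what the feature count in the Layers panel shows.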

Understand why loading the CSV does not create geometries for all features

You loaded your CSV with the Quote field left empty in the File Format section of the Data Source Manager. However, you should enter a double quote " there. Why?

Have a look at your CSV where you have lines like this one here:

894;ZAM1104;1104;14-dic-16;2016;1;Protests;Peaceful protest;Protesters (Zambia);"Health Workers (Zambia); Teachers (Zambia)";6;;;0;60;Southern Africa;Zambia;Lusaka;Lusaka;;Lusaka;-15.4169998168945;28.2830009460449;1;Zambia Reports;National;Doctors at the University Teaching Hopsital are commencing a 10-day sit-in protest to bring attention to unfilfilled commitments by the government;0;1567465455

Column no. 10 looks like this:

"Health Workers (Zambia); Teachers (Zambia)"

You see that you have a semicolon ; as part of the string inside this field. Normally, however, the semicolon in this file works as a delimiter between fields - that is what we told QGIS when importing: when it finds a semicolon, interpret it as the end of the field and put what follows into the next field (column).

In this case, however, the semicolon is part of a text string, and the text is marked with double quotes ". So we have to tell QGIS to ignore all semicolons inside text quoted with ": where a ; appears inside a " text string, don't use it as a delimiter, but treat it as part of the text string.

That's why you have to define the " as the quote character in the import dialog:

enter image description here

What went wrong in your case

When you import the CSV without that setting, every semicolon inside double quotes is wrongly interpreted as a delimiter and the fields that follow are shifted to the right. In these cases, the lat/lon fields no longer hold valid coordinates and end up as NULL, so no point can be created.
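To see the shift on the Zambia record from above, here is a small sketch using Python's csv module rather than QGIS itself, so details of the QGIS parser may differ. The helper `split_fields` is made up for this illustration, the long notes text is shortened to "notes", and the positions of latitude and longitude are an assumption counted from that single line.

```python
import csv

# The Zambia record quoted above; the long notes text is shortened to "notes".
LINE = (
    '894;ZAM1104;1104;14-dic-16;2016;1;Protests;Peaceful protest;'
    'Protesters (Zambia);"Health Workers (Zambia); Teachers (Zambia)";'
    '6;;;0;60;Southern Africa;Zambia;Lusaka;Lusaka;;Lusaka;'
    '-15.4169998168945;28.2830009460449;1;Zambia Reports;National;'
    'notes;0;1567465455'
)

# Assumed zero-based positions of latitude/longitude, counted from this line.
LAT, LON = 21, 22


def split_fields(line, quote):
    """Split one line on ';'. quote=None mimics an empty Quote field."""
    if quote is None:
        reader = csv.reader([line], delimiter=';', quoting=csv.QUOTE_NONE)
    else:
        reader = csv.reader([line], delimiter=';', quotechar=quote)
    return next(reader)


with_quote = split_fields(LINE, '"')
without_quote = split_fields(LINE, None)

# With the quote character set, the coordinates sit where they belong.
print(len(with_quote), with_quote[LAT], with_quote[LON])
# Without it, there is one extra field and the latitude slot holds "Lusaka".
print(len(without_quote), without_quote[LAT], without_quote[LON])
```

With the quote character the line has 29 fields; without it there are 30, and the text "Lusaka" lands in the latitude position, which cannot be read as a coordinate.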

Go to the layer with the wrong number of points, draw a rectangle with the Select features by Area tool and select all features on the map canvas. Then open the attribute table, invert the selection and move the selected features to the top. You then see all the features that are not rendered as points (have a look at the next screenshot).

In the field assoc_actor_2 you have a text string starting with ". This text is "dismembered" at the ; and continues in the next field, inter2. In some cases (with two semicolons in the text), it is even shifted twice before the final " is reached. So the field for latitude (and the one for longitude as well, where you had two ; inside double quotes) is empty.

Selected (blue) features with wrong lat/lon values; white features are correctly interpreted: enter image description here
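If you prefer to locate the affected records in the file itself before importing, a rough check along the following lines may help. It is only a sketch: `find_shifted_lines` and the sample text are made up for illustration, and it assumes one record per line (no line breaks inside quoted fields).

```python
import csv
import io


def find_shifted_lines(text, delimiter=';', quote='"'):
    """Return {line_number: shift} for lines with a delimiter inside quotes.

    shift is the number of extra fields that appear when quotes are ignored,
    i.e. how far the following columns move to the right.
    """
    shifted = {}
    for number, line in enumerate(io.StringIO(text), start=1):
        line = line.rstrip('\r\n')
        if not line:
            continue
        naive = len(line.split(delimiter))  # quotes ignored
        proper = len(next(csv.reader([line], delimiter=delimiter,
                                     quotechar=quote)))  # quotes respected
        if naive != proper:
            shifted[number] = naive - proper
    return shifted


# Hypothetical miniature file: line 3 has one semicolon inside quotes,
# line 4 has two.
SAMPLE = (
    'id;actor;lat;lon\n'
    '1;Protesters;-15.41;28.28\n'
    '2;"Health Workers; Teachers";-15.41;28.28\n'
    '3;"A; B; C";-1.29;36.82\n'
)

print(find_shifted_lines(SAMPLE))  # {3: 1, 4: 2}
```

A shift of 1 corresponds to the rows where only latitude is lost, a shift of 2 to the rows where longitude is lost as well.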