I have a CSV file with 187.533 observations. They are points: violence events with their latitude and longitude. I imported the CSV into QGIS (version 3.20.1). When I click "Show Feature Count", it shows 175.896 observations. However, when I open the attribute table, it shows 187.533, the original number of observations.
I attach two screenshots of what I am describing.
Does anyone know what could be happening, or the reason for this discrepancy?
To give you more information: when I run an "attribute by location" operation with a shapefile that divides Africa into 0.5×0.5 degree cells, I obtain 182.989 observations as a result, which makes sense if the observations are 175.896 and not 187.533.
Update due to a comment:
Best Answer
Quick solution for importing CSV with all geometries
When loading your CSV, be sure to set the semicolon (;) as the only delimiter and to enter a double quote (") for Quote, so that semicolons inside a text string (a "quote") are ignored. Otherwise, they are misinterpreted as field delimiters. See below for more details.
What the difference means
The difference arises because the Layers panel shows the number of geometries, whereas the attribute table shows the number of features. That means not all features could be correctly generated as geometries. Identify the problematic values in the lat/lon columns. This seems to be connected to your earlier problem.
See this simple example: one point without lat/lon value was not generated as a point, thus you have three points, but still 4 features:
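The same distinction can be checked outside QGIS. The following is only a minimal sketch using Python's standard `csv` module: the sample data and the field names `latitude` and `longitude` are assumptions made for illustration, not taken from the actual file. It counts all rows (features) and, separately, the rows whose coordinates parse as numbers (the ones that can become points).

```python
import csv
import io

# Hypothetical sample: four rows, one of them without coordinates.
SAMPLE = """id;latitude;longitude
1;9.05;7.49
2;-1.29;36.82
3;;
4;14.69;-17.44
"""


def is_number(text):
    """Return True if text can be parsed as a float."""
    try:
        float(text)
        return True
    except (TypeError, ValueError):
        return False


def count_features_and_points(text, lat_field="latitude", lon_field="longitude"):
    """Return (number of rows, number of rows with usable coordinates)."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";", quotechar='"')
    rows = list(reader)
    points = [
        row for row in rows
        if is_number(row[lat_field]) and is_number(row[lon_field])
    ]
    return len(rows), len(points)


features, points = count_features_and_points(SAMPLE)
print(features, points)  # 4 3
```

If the two numbers differ for the real file, the rows in between are the ones worth inspecting.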
Understand why loading the CSV does not create geometries for all features
You loaded your CSV with an empty field for Quote in the File Format section of the Data Source Manager. However, you should enter a double quote (") in that field. Why?
Have a look at your CSV, where you have lines like this one:
Column no. 10 looks like:
You can see that there is a semicolon (;) as part of the string inside this field. Normally, however, the semicolon in this file works as a field delimiter. That is what we told QGIS when importing: when it finds a semicolon, interpret it as the end of the field and put what follows into the next field (column).
In this case, however, the semicolon is part of a text string, and the text is marked with double quotes ("). So we have to tell QGIS to ignore all semicolons inside a quoted string: where a ; appears inside a text string enclosed in ", don't use it as a delimiter, but treat it as part of the text string.
That's why you have to define the " in the import dialog:
What went wrong in your case
When you import the CSV without that setting, every semicolon inside double quotes is wrongly interpreted as a delimiter, and the fields that follow are shifted to the right. In these cases, the lat/lon fields are no longer correct, just NULL, so no point can be created.
Go to the layer with the wrong number of points, draw a rectangle with the Select Features by Area tool and select all features on the map canvas. Then open the attribute table, invert the selection and move the selected features to the top. You then see all the features that are not rendered as points (have a look at the next screenshot).
In the field assoc_actor_2 you have a text string starting with ". This text is "dismembered" where the ; was and continues in the next field, inter2. In some cases (with two semicolons in the text), it is even shifted twice before the final ". So the field for latitude (and for longitude as well, where you had two ; inside double quotes) is empty.
Selected (blue) features with wrong lat/lon values; white features are correctly interpreted:
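The field shift described above can also be reproduced outside QGIS. This is a rough sketch using Python's standard `csv` module, not what QGIS does internally. The field names `assoc_actor_2`, `latitude` and `longitude` come from the discussion above, while the `location` column and all values are invented for illustration. Parsing the same line once with the double quote honoured and once with quoting disabled shows how the coordinate columns end up holding the wrong values.

```python
import csv
import io

# Hypothetical data: one row whose first field contains a quoted semicolon.
SAMPLE = (
    "assoc_actor_2;location;latitude;longitude\n"
    '"Militia A; Militia B";Abuja;9.05;7.49\n'
)


def parse(text, honour_quotes):
    """Parse semicolon-delimited text, with or without quote handling."""
    quoting = csv.QUOTE_MINIMAL if honour_quotes else csv.QUOTE_NONE
    reader = csv.reader(
        io.StringIO(text), delimiter=";", quotechar='"', quoting=quoting
    )
    header = next(reader)
    # zip() drops any surplus field created by the wrong split.
    return [dict(zip(header, row)) for row in reader]


good = parse(SAMPLE, honour_quotes=True)
bad = parse(SAMPLE, honour_quotes=False)

print(good[0]["assoc_actor_2"])  # Militia A; Militia B
print(good[0]["latitude"])       # 9.05
print(bad[0]["assoc_actor_2"])   # "Militia A
print(bad[0]["latitude"])        # Abuja
```

With quoting disabled, the text after the semicolon moves into the next column and everything after it shifts one position to the right, so the latitude column no longer holds a number and no point can be built from that row.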