You should check what the scale and offset are for your file. This can be done as follows:
van_taken.header.scale
van_taken.header.offset
This almost looks like an overflow error to me. The lower-case x, y, and z properties need to re-scale and re-offset the coordinates in order to store them as integers (which is how LAS files store them). To be honest, setting the coordinates via the scaled values is a bit of an anti-pattern, because you lose control over how precision is lost.
Here's the code that handles your set operation, located in base.py:
def set_x(self, X, scale=False):
    '''Wrapper for set_dimension("X", new_dimension)'''
    if not scale:
        self.set_dimension("X", X)
        return
    self.set_dimension("X", np.round((X - self.header.offset[0]) / self.header.scale[0]))
If you can figure out what the unscaled integer representation of your converted data should be, that's probably a better way to store it (e.g., using the capital letter X, Y, and Z properties of the file).
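A minimal sketch of that approach (the scale, offset, and sample coordinate below are made up for illustration; in practice you'd pick them yourself and write them into your header):

```python
import numpy as np

# Hypothetical scale/offset and a sample coordinate, for illustration only
scale = 0.01
offset = 0.0
x_float = np.array([269873.21570411])

# Compute the unscaled integer representation yourself...
X_int = np.round((x_float - offset) / scale).astype(np.int32)

# ...then assign it to the capital-X property, e.g.:
# van_taken.X = X_int
```

Doing the rounding yourself means you decide exactly where precision is lost, instead of trusting whatever happens to be in the header.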
If you're fine with the above approach to converting between the integer representation and the floating point representation, then I'd consider adjusting your scale to ensure that you don't end up with integers greater than four bytes in size.
If this isn't explicable via integer overflow due to scaling issues, then we definitely need to figure out what's going on. If it's an overflow issue, I'd be open to trying to guard against this case so long as it doesn't have too terrible a performance penalty.
Edit:
It looks like overflow is definitely the issue.
When you're assigning a scaled coordinate value into a LAS file, laspy needs to find some way of representing it as a four byte integer. Currently, it faithfully believes the information in the header. That is, it will subtract the appropriate offset (for the X, Y, or Z dimension) and divide by the appropriate scale (for the X, Y, or Z dimension). The result is then rounded to produce an integer.
Your file has an X scale of 1e-7, and an X offset of -83.11. Thus, to convert any new scaled value of x to its integer representation (which is what happens when you assign into the lower case 'x' property of your file), you need to add 83.11 and divide by 1e-7. For your first value, 269873.21570411, this results in a value of 2.699563e+12. The largest number you can store in four bytes is 2.14e9 for signed integers and 4.29e9 for unsigned.
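To see the problem concretely, here's the arithmetic from above, using the scale, offset, and first value quoted:

```python
import numpy as np

scale = 1e-7
offset = -83.11
x = 269873.21570411

# The integer laspy would try to store: subtract the offset, divide by the
# scale, round. This lands at roughly 2.6996e12.
X = np.round((x - offset) / scale)

# Far beyond what four bytes can hold
int32_max = np.iinfo(np.int32).max    # 2147483647 (~2.14e9)
uint32_max = np.iinfo(np.uint32).max  # 4294967295 (~4.29e9)
```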
Currently, laspy doesn't check for this mistake, resulting in an integer overflow. As I mentioned above, it's probably best to assign the integer values (to the capital X, Y, and Z properties) yourself to avoid any ambiguity.
As a quick fix, however, you can simply change the offset. The following ought to work:
van_taken.header.scale = [0.01,0.01,0.01]
van_taken.header.offset = [0,0,0]
You can increase the precision of your conversion by using a larger offset and a smaller scale. For example, if all of your scaled X coordinates are greater than 200000, you could use an X offset of 200000. Then, when a small scale like 1e-7 is used, the numbers it will be inflating will be smaller. That's something to play around with, keeping in mind the four byte limit.
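One way to reason about the trade-off: with four-byte signed integers, a given scale can only represent coordinates within roughly scale * 2**31 of the offset. A quick sketch:

```python
# Maximum coordinate span representable on either side of the offset
# for a given scale, using 4-byte signed integers
def max_span(scale):
    return scale * (2**31 - 1)

max_span(1e-7)  # ~214.7 units
max_span(0.01)  # ~21.5 million units
```

So a scale of 1e-7 only works if your coordinates all sit within a couple hundred units of the offset; coarser scales buy you range at the cost of precision.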
In a lot of problems, and in a lot of computing environments, it's easy to gloss over the fact that floating point arithmetic is fundamentally not like real number arithmetic. Unfortunately, working with LAS files is not one of those cases.
Edit 2:
So can I change the scale value? Will this affect the data in any other ways?
Yes, the reason you can change the scale in this case is that you're supplying the scaled value. If you tell the LAS file that the scale is some particular value, that's the value it will use when re-scaling data. You wouldn't want to mess with the scale if you were reading existing LAS data.
Edit 3:
Last question. So it seems that you were right about the scales. But I tried the .01 and it wasn't giving me the number. I then tried 1.000001 and it seemed closer, but they still aren't the same. Any tips for selecting the correct value, other than trial and error?
I don't think there's really a 'correct' value per se; it's a trade-off. No matter what you do, you're trying to store a (probably 8-byte) floating point vector as a vector of 4-byte integers, multiplied by an 8-byte (double precision) floating point scale term and added to an 8-byte floating point offset term. That's not a lossless conversion in general, but it's what you have to do to store data in a standards-compliant LAS file.
I would consider trying something like this (not tested):
import numpy as np

# Pull the coordinate columns out of the converted CSV array
converted_x = converted_csv[:, 1]
converted_y = converted_csv[:, 2]
converted_z = converted_csv[:, 3]

# Bounding box of the data
xmin = np.floor(np.min(converted_x))
ymin = np.floor(np.min(converted_y))
zmin = np.floor(np.min(converted_z))
xmax = np.ceil(np.max(converted_x))
ymax = np.ceil(np.max(converted_y))
zmax = np.ceil(np.max(converted_z))

# Coordinate ranges
xrg = xmax - xmin
yrg = ymax - ymin
zrg = zmax - zmin

# Offset to the minimum corner so the integer values start near zero
van_taken.header.offset = [xmin, ymin, zmin]

# Choose scales so the largest stored integer stays well under the
# four-byte signed limit (~2.1e9); safety_factor = 2 leaves headroom
safety_factor = 2
maxval = 2e9
van_taken.header.scale = [safety_factor * xrg / maxval,
                          safety_factor * yrg / maxval,
                          safety_factor * zrg / maxval]

# Assigning to the lower-case properties triggers the re-scaling
van_taken.x = converted_x
van_taken.y = converted_y
van_taken.z = converted_z
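As a sanity check on that scheme, you can verify that the resulting integers fit in four bytes and that the round-trip error is bounded by half the scale. A sketch with made-up sample coordinates standing in for one converted column:

```python
import numpy as np

# Made-up sample coordinates, for illustration only
converted_x = np.array([269873.21570411, 269900.5, 270120.75])

xmin = np.floor(np.min(converted_x))
xrg = np.ceil(np.max(converted_x)) - xmin

safety_factor = 2
maxval = 2e9
scale = safety_factor * xrg / maxval

# Integer representation that would be stored in the file
X = np.round((converted_x - xmin) / scale)
assert X.max() <= np.iinfo(np.int32).max  # fits in four bytes

# Round-trip back to floating point; the error is at most scale / 2
roundtrip = X * scale + xmin
max_err = np.abs(roundtrip - converted_x).max()
```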
Best Answer
Run the example code in the help:
Converting that to an sf object gives one point per LAS input point (37,000 of them), with tree IDs for 269 trees. (Note that I can't find a direct las-to-sf function, so I'm using lidR::as.spatial to make an sp object, then st_as_sf to convert to sf.)

To save that as CSV, convert to a data frame with the coordinates, and that can be saved using write.csv or write.table or whatever.

If you want one record per tree, use tree_metrics to summarize over trees, then convert to sf: that gives 269 rows, one per tree. Convert to a data frame as before and save as before.