Python – Save DataFrame as DBF with simpledbf in Python

attribute-joinsdbfpandaspython

I am doing a number of joins outside of ArcGIS as pandas seems to outperform it. I read in the dbf file with simpledbf, convert it to a dataframe, then do the join:

import pandas as pd
from simpledbf import Dbf5

#Source of the join
data_source=pd.read_excel('C:\\Users\\myfolder\\mysource.xlsx')

#Target of the join
dbf = Dbf5('C:\\Users\\myfolder\\mydbf.dbf')
d_b_f = dbf.to_dataframe()

#Join
d_b_f_2=pd.merge(d_b_f, data_source, on='COLUMN', how='left')

How can I save this dataframe as a dbf and overwrite the existing dbf?

I expected something along these lines to work:

final_dbf = Dbf5('C:\\Users\\myfolder\\finaldbf.dbf')
final_dbf = d_b_f_2.to_dbf()

but an error pops up saying that 'C:\Users\myfolder\finaldbf.dbf' does not exist. What do you suggest as a workaround? I wouldn't want to look into other packages, as I am sure there must be a way of doing this within simpledbf itself.

Best Answer

1) simpledbf cannot export dbf format as output (only CSV, SQL,pandas DataFrame, HDF5 Table)
2) Pandas DataFrames don't have a .to_dbf method

Simply use PySAL(dbf) and DBF files and the pandas DataFrame with the functions df2dbf(convert a pandas.DataFrame into a dbf), dbf2df(read a dbf file as a pandas.DataFrame) and appendcol2dbf(append a column and the associated data to a DBF) in dataIO.py) that you can adapt

import numpy as np
import pysal as ps
import pandas as pd
# adaptation of the function `df2dbf` to write your resulting dbf file
type2spec = {int: ('N', 20, 0),
             np.int64: ('N', 20, 0),
             float: ('N', 36, 15),
             np.float64: ('N', 36, 15),
             str: ('C', 14, 0),
             unicode: ('C', 14, 0)}
types = [type(d_b_f_2[i].iloc[0]) for i in d_b_f_2.columns]
specs = [type2spec[t] for t in types]
with ps.open("finaldbf.dbf", 'w') as db:
    db.header = list(df.columns)
    db.field_spec = specs
    for i, row in d_b_f_2.T.iteritems():
        db.write(row)
Related Question