[GIS] Use psycopg2 to do loop in postgresql

postgresqlpsycopg2python-2.7windows 8

I use postgresql 8.4 to route a river network, and I want to use psycopg2 to loop through all data points in my river network.

#set up python and postgresql connection
import psycopg2

query = """
    select *
    from driving_distance ($$
        select
            gid as id,
            start_id::int4 as source,
            end_id::int4 as target,
            shape_leng::double precision as cost
        from network
        $$, %s, %s, %s, %s
    )
;"""


conn = psycopg2.connect("dbname = 'routing_template' user = 'postgres' host = 'localhost' password = '****'")
cur = conn.cursor()
while True:
    i = 1
    if i <= 2:
        cur.execute(query, (i, 1000000, False, False))
        i = i + 1
    else:
        break
rs = cur.fetchall()
conn.close()
print rs

The code above costs a lot of time to run even though I have set the maximum iterator i equals to 2, and the output is an error message contains garbage,

enter image description here

I am thinking that if postgresql can accept only one result at one time, so I tried to put this line in my loop,

rs(i) = cur.fetchall()

and the error message said that this line has bugs,

enter image description here

I know that I can't write code like rs(i), but I don't know the replacement to validate my assumption.

So should I save one result to a file first then use the next iterator to run the loop, and again and again?

I am working with postgresql 8.4, python 2.7.6 under Windows 8.1 x64.


Update#1

I can do loop using the code below, and the result follows,

import sys
import psycopg2

query = """
    select *
    from driving_distance ($$
        select
            gid as id,
            start_id::int4 as source,
            end_id::int4 as target,
            shape_leng::double precision as cost
        from network
        $$, %s, %s, %s, %s
    )
;"""


conn = psycopg2.connect("dbname = 'routing_template' user = 'postgres' host = 'localhost' password = '****'")
cur = conn.cursor()

rs = []
i = 1
while i <= k:
    cur.execute(query, (i, 1000000, False, False))
    rs.append(cur.fetchall())
    i = i + 1

for record in rs:
    print record

conn.close()


[(1, 2, 0.0), (2, 2, 4729.33082850235), (3, 19, 4874.27571718902), (4, 3, 7397.215962901), (5, 4, 
6640.31749097187), (6, 7, 10285.3869655786), (7, 7, 14376.1087618696), (8, 5, 15053.164236979), (9, 10, 16243.5973710466), (10, 8, 19307.3024368889), (11, 9, 21654.8669532788), (12, 11, 23522.6224229233), (13, 18, 29706.6964721152), (14, 21, 24034.6792693279), (15, 18, 25408.306370489), (16, 20, 34204.1769580924), (17, 11, 26465.8348728118), (18, 20, 38596.7313209197), (19, 13, 35184.9925532175), (20, 16, 36530.059646027), (21, 15, 35789.4069722436), (22, 15, 38168.1750567026)]
[(1, 2, 4729.33082850235), (2, 2, 0.0), (3, 19, 144.944888686669), (4, 3, 2667.88513439865), (5, 4, 1910.98666246952), (6, 7, 5556.05613707624), (7, 7, 9646.77793336723), (8, 5, 10323.8334084767), (9, 10, 11514.2665425442), (10, 8, 14577.9716083866), (11, 9, 16925.5361247765), (12, 11, 18793.2915944209), (13, 18, 24977.3656436129), (14, 21, 19305.3484408255), (15, 18, 20678.9755419867), (16, 20, 29474.8461295901), (17, 11, 21736.5040443094), (18, 20, 33867.4004924174), (19, 13, 30455.6617247151), (20, 16, 31800.7288175247), (21, 15, 31060.0761437413), (22, 15, 33438.8442282003)]

but if I want to get this look of output,

(1, 2, 7397.215962901)
(2, 2, 2667.88513439865)
(3, 19, 2522.94024571198)
(4, 3, 0.0)
(5, 4, 4288.98201949483)
(6, 7, 7934.05149410155)
(7, 7, 12024.7732903925)
(8, 5, 12701.828765502)
(9, 10, 13892.2618995696)
(10, 8, 16955.9669654119)
(11, 9, 19303.5314818018)
(12, 11, 21171.2869514462)
(13, 18, 27355.3610006382)
(14, 21, 21683.3437978508)
(15, 18, 23056.970899012)
(16, 20, 31852.8414866154)
(17, 11, 24114.4994013347)
(18, 20, 36245.3958494427)
(19, 13, 32833.6570817404)
(20, 16, 34178.72417455)
(21, 15, 33438.0715007666)
(22, 15, 35816.8395852256)

What should I make a little change in the code?

Best Answer

For one thing, you should generally avoid explicit looping like that in Python whenever possible. Psycopg2's cursor objects support the iterator protocol. This means you can iterate row by row over the results without needing to manually take care of indices.

Another thing is that you are calling the execute function many times inside that loop when it only needs to be called once. I would not be surprised if it were bottlenecking your code's speed significantly.

conn = psycopg2.connect("dbname = 'routing_template' user = 'postgres' host = 'localhost' password = '****'")
cur = conn.cursor()
cur.execute(query)
for row in cur:
    #do something with every single row here
    #optionally print the row
    print row
conn.close()