Linux Tips I Wish I Knew Long Ago: Postgresql & Python

This one isn’t strictly Linux per se, but I really went down the rabbit hole and wasted about a half of day trying to track this one down! Spoiler: the answer was in the documentation the whole time, of course!

The scenario is this: I have a python script, it queries an API periodically and stores some of the results in a Postgresql database. The API object I’m pulling is like an article and there’s metadata and statistics that can change and get updated over time, which I keep track of in my DB. So, my python script compares these fields and would make a DB update if required. The SQL query looks loosely like this:

UPDATE post set {field}={value} where id={lookupid}

In this query, “field”, “value” and “lookupid” are all variables in the Python script. When building database queries we have to do things the right way to be sure the code is not vulnerable to any SQL injection attacks. In Python the psycopg module for postgresql ( http://initd.org/psycopg/docs/usage.html ) does this quite nicely and the documentation covers the correct way to pass in values:

# Don't use the string operator, pass execute the variables as the second input:
cur.execute("INSERT INTO numbers VALUES (%s, %s)" % (10, 20)) # WRONG
cur.execute("INSERT INTO numbers VALUES (%s, %s)", (10, 20))  # correct

# The second argument must always be a sequence, even if it contains a single variable
cur.execute("INSERT INTO foo VALUES (%s)", ("bar"))  # WRONG
cur.execute("INSERT INTO foo VALUES (%s)", ("bar",)) # correct

# The placeholder must not be quoted. Psycopg will add quotes where needed       
cur.execute("INSERT INTO numbers VALUES ('%s')", (10,)) # WRONG
cur.execute("INSERT INTO numbers VALUES (%s)", (10,))   # correct         

This was all well and good, and the part about the single values still going in as a sequence was something else I learned!

The problem was that I was trying to pass in the “field name” in addition to the values so that I could update only the fields that needed to be updated with my code. Following these examples, I made my query look similar to this:

cur.execute("UPDATE post set %s=%s where id=%s", (field, value, lookupid)) 

This results in the “field name” having single quotes as well as the “value” and “lookupid” which ends up being incorrect syntax in postgresql, and errors such as:

syntax error at or near "'Flag'"
LINE 1: UPDATE post set 'Flag'='True' where i... 

This is where I got stuck and started Googling furiously. Searches like:

“postgresql python single quotes and no single qutoes in string”
“postgresql python quote_ident in UPDATE”
“postgresql execute no single quote”
“postgresql field name single quote”
“python postgresql mogrify”

The key to this issue is that we need to treat “field name” differently and not have it quoted in the same was as “values” or “lookupid”. And after combing through stack overflow posts, blog posts, and the documentation, I finally found what I was looking for in the documentation:

Only query values should be bound via this method: it shouldn’t be used to merge table or field names to the query (Psycopg will try quoting the table name as a string value, generating invalid SQL). If you need to generate dynamically SQL queries (for instance choosing dynamically a table name) you can use the facilities provided by the psycopg2.sql module:

>>> cur.execute("INSERT INTO %s VALUES (%s)", ('numbers', 10))  # WRONG 
>>> cur.execute(SQL("INSERT INTO {} VALUES (%s)").format(Identifier('numbers')), ...     (10,))

There it was! So, I was able to adjust my code to the following:

cur.execute(sql.SQL("UPDATE post set {}=%s where id=%s").format(sql.Identifier(value.lower())),(str(post[value]), idkeys)

And then everything works nicely: the field name is entered properly, and lowercase, and all the values are properly quoted as the strings they need to be. Not exactly a simple solution, but of course it was there in the documentation, as long as you looked in the right place.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.