EzDevInfo.com

psycopg2

Python PostgreSQL adapter PostgreSQL + Python | Psycopg python adapter for postgresql

Psycopg2 image not found

Trying to setup postgres with the postgres mac app and hit this error, which I haven't been able to solve. Any thoughts?

    ImportError: dlopen(/Users/Craig/pyenv/mysite/lib/python2.7/site-packages/psycopg2/_psycopg.so, 2): Library not loaded: @executable_path/../lib/libssl.1.0.0.dylib
  Referenced from: /Applications/Postgres.app/Contents/MacOS/lib/libpq.dylib
  Reason: image not found

Source: (StackOverflow)

Mac + virtualenv + pip + postgresql = Error: pg_config executable not found

I was trying to install postgres for a tutorial, but pip gives me error:

pip install psycopg

A snip of error I get:

Error: pg_config executable not found.

Please add the directory containing pg_config to the PATH

or specify the full executable path with the option:

    python setup.py build_ext --pg-config /path/to/pg_config build ...

or with the pg_config option in 'setup.cfg'.

Where is pg_config in my virtualenv? How to configure it? I'm using virtualenv because I do not want a system-wide installation of postgres.

Source: (StackOverflow)

psycopg2: insert multiple rows with one query

I need to insert multiple rows with one query (number of rows is not constant), so I need to execute query like this one:

INSERT INTO t (a, b) VALUES (1, 2), (3, 4), (5, 6);

The only way I know is

args = [(1,2), (3,4), (5,6)]
args_str = ','.join(cursor.mogrify("%s", (x, )) for x in args)
cursor.execute("INSERT INTO t (a, b) VALUES "+args_str)

but I want some simpler way.

Source: (StackOverflow)

DatabaseError: current transaction is aborted, commands ignored until end of transaction block

I got a lot of errors with the message :

"DatabaseError: current transaction is aborted, commands ignored until end of transaction block"

after changed from python-psycopg to python-psycopg2 as Django project's database engine.

The code remains the same, just dont know where those errors are from.

Source: (StackOverflow)

Improving Postgres psycopg2 query performance for Python to the same level of Java's JDBC driver

Overview

I'm attempting to improve the performance of our database queries for SQLAlchemy. We're using psycopg2. In our production system, we're chosing to go with Java because it is simply faster by at least 50%, if not closer to 100%. So I am hoping someone in the Stack Overflow community has a way to improve my performance.

I think my next step is going to be to end up patching the psycopg2 library to behave like the JDBC driver. If that's the case and someone has already done this, that would be fine, but I am hoping I've still got a settings or refactoring tweak I can do from Python.

Details

I have a simple "SELECT * FROM someLargeDataSetTable" query running. The dataset is GBs in size. A quick performance chart is as follows:

Timing Table

        Records    | JDBC  | SQLAlchemy[1] |  SQLAlchemy[2] |  Psql
-------------------------------------------------------------------- 
         1 (4kB)   | 200ms |         300ms |          250ms |   10ms
        10 (8kB)   | 200ms |         300ms |          250ms |   10ms
       100 (88kB)  | 200ms |         300ms |          250ms |   10ms
     1,000 (600kB) | 300ms |         300ms |          370ms |  100ms
    10,000 (6MB)   | 800ms |         830ms |          730ms |  850ms  
   100,000 (50MB)  |    4s |            5s |           4.6s |     8s
 1,000,000 (510MB) |   30s |           50s |            50s |  1m32s  
10,000,000 (5.1GB) | 4m44s |         7m55s |          6m39s |    n/a
-------------------------------------------------------------------- 
 5,000,000 (2.6GB) | 2m30s |         4m45s |          3m52s | 14m22s
-------------------------------------------------------------------- 
[1] - With the processrow function
[2] - Without the processrow function (direct dump)

I could add more (our data can be as much as terabytes), but I think changing slope is evident from the data. JDBC just performs significantly better as the dataset size increases. Some notes...

Timing Table Notes:

The datasizes are approximate, but they should give you an idea of the amount of data.
I'm using the 'time' tool from a Linux bash commandline.
The times are the wall clock times (i.e. real).
I'm using Python 2.6.6 and I'm running with python -u
Fetch Size is 10,000
I'm not really worried about the Psql timing, it's there just as a reference point. I may not have properly set fetchsize for it.
I'm also really not worried about the timing below the fetch size as less than 5 seconds is negligible to my application.
Java and Psql appear to take about 1GB of memory resources; Python is more like 100MB (yay!!).
I'm using the [cdecimals] library.
I noticed a [recent article] discussing something similar to this. It appears that the JDBC driver design is totally different to the psycopg2 design (which I think is rather annoying given the performance difference).
My use-case is basically that I have to run a daily process (with approximately 20,000 different steps... multiple queries) over very large datasets and I have a very specific window of time where I may finish this process. The Java we use is not simply JDBC, it's a "smart" wrapper on top of the JDBC engine... we don't want to use Java and we'd like to stop using the "smart" part of it.
I'm using one of our production system's boxes (database and backend process) to run the query. So this is our best-case timing. We have QA and Dev boxes that run much slower and the extra query time can become significant.

testSqlAlchemy.py

#!/usr/bin/env python
# testSqlAlchemy.py
import sys
try:
    import cdecimal
    sys.modules["decimal"]=cdecimal
except ImportError,e:
    print >> sys.stderr, "Error: cdecimal didn't load properly."
    raise SystemExit
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

def processrow (row,delimiter="|",null="\N"):
    newrow = []
    for x in row:
        if x is None:
            x = null
        newrow.append(str(x))
    return delimiter.join(newrow)

fetchsize = 10000
connectionString = "postgresql+psycopg2://usr:pass@server:port/db"
eng = create_engine(connectionString, server_side_cursors=True)
session = sessionmaker(bind=eng)()

with open("test.sql","r") as queryFD:
   with open("/dev/null","w") as nullDev:
        query = session.execute(queryFD.read())
        cur = query.cursor
        while cur.statusmessage not in ['FETCH 0','CLOSE CURSOR']:
            for row in query.fetchmany(fetchsize):
                print >> nullDev, processrow(row)

After timing, I also ran a cProfile and this is the dump of worst offenders:

Timing Profile (with processrow)

Fri Mar  4 13:49:45 2011    sqlAlchemy.prof

         415757706 function calls (415756424 primitive calls) in 563.923 CPU seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001  563.924  563.924 {execfile}
        1   25.151   25.151  563.924  563.924 testSqlAlchemy.py:2()
     1001    0.050    0.000  329.285    0.329 base.py:2679(fetchmany)
     1001    5.503    0.005  314.665    0.314 base.py:2804(_fetchmany_impl)
 10000003    4.328    0.000  307.843    0.000 base.py:2795(_fetchone_impl)
    10011    0.309    0.000  302.743    0.030 base.py:2790(__buffer_rows)
    10011  233.620    0.023  302.425    0.030 {method 'fetchmany' of 'psycopg2._psycopg.cursor' objects}
 10000000  145.459    0.000  209.147    0.000 testSqlAlchemy.py:13(processrow)

Timing Profile (without processrow)

Fri Mar  4 14:03:06 2011    sqlAlchemy.prof

         305460312 function calls (305459030 primitive calls) in 536.368 CPU seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.001    0.001  536.370  536.370 {execfile}
        1   29.503   29.503  536.369  536.369 testSqlAlchemy.py:2()
     1001    0.066    0.000  333.806    0.333 base.py:2679(fetchmany)
     1001    5.444    0.005  318.462    0.318 base.py:2804(_fetchmany_impl)
 10000003    4.389    0.000  311.647    0.000 base.py:2795(_fetchone_impl)
    10011    0.339    0.000  306.452    0.031 base.py:2790(__buffer_rows)
    10011  235.664    0.024  306.102    0.031 {method 'fetchmany' of 'psycopg2._psycopg.cursor' objects}
 10000000   32.904    0.000  172.802    0.000 base.py:2246(__repr__)

Final Comments

Unfortunately, the processrow function needs to stay unless there is a way within SQLAlchemy to specify null = 'userDefinedValueOrString' and delimiter = 'userDefinedValueOrString' of the output. The Java we are using currently already does this, so the comparison (with processrow) needed to be apples to apples. If there is a way to improve the performance of either processrow or SQLAlchemy with pure Python or a settings tweak, I'm very interested.

Source: (StackOverflow)

How do you get PyPy, Django and PostgreSQL to work together?

What fork, or combination of packages should one to use to make PyPy, Django and PostgreSQL play nice together?

I know that PyPy and Django play nice together, but I am less certain about PyPy and PostgreSQL. I do see that Alex Gaynor has made a fork of PyPy called pypy-postgresql. I also know that some people are using psycopg2-ctypes.

Is there a difference between these forks? Or should we use the stable 1.9 PyPy and use psycopg2-ctypes? Using the ctypes options could hurt performance, see the comment below.

Also, has anyone experienced any pitfalls with using PyPy with pyscopg2? It seems easy enough to fall back on CPython if something isn't working right, but mostly I'm looking for things a programmer can do ahead of time to prepare.

I looked around, it doesn't seem that psycopg2 works natively with PyPy. Although, psycopg2-ctypes does seem to be working for some people, there was a discussion on pypy-dev. I work on Windows, and I don't think psycopg2-ctypes is ready for Windows yet, sadly.

Source: (StackOverflow)

Error: No module named psycopg2.extensions

I am trying to set up a PostgreSQL database for my django project, which I believe I have done now thanks to the replies to my last question Problems setting up a postgreSQL database for a django project. I am now trying to run the command 'python manage.py runserver' in Terminal to get my localhost up but when I run the command, I see this response...

Error: No module named psycopg2.extensions

I'm not sure what this means - I have tried to download psycopg2 but can't seem to find a way to download psycopg2 using homebrew. I have tried easy_install, pip install and sudo but all return errors like this...

Downloading http://www.psycopg.org/psycopg/tarballs/PSYCOPG-2-4/psycopg2-2.4.5.tar.gz
Processing psycopg2-2.4.5.tar.gz
Writing /tmp/easy_install-l7Qi62/psycopg2-2.4.5/setup.cfg
Running psycopg2-2.4.5/setup.py -q bdist_egg --dist-dir /tmp/easy_install-l7Qi62/psycopg2-2.4.5/egg-dist-tmp-PBP5Ds
no previously-included directories found matching 'doc/src/_build'
unable to execute gcc-4.0: No such file or directory
error: Setup script exited with error: command 'gcc-4.0' failed with exit status 1

Any help on how to fix this would be very much appreciated!

Thanks

Jess

(Sorry if this is a really simple problem to solve - this is my first django project)

Source: (StackOverflow)

How to setup PostgreSQL Database in Django?

I'm new to Python and Django.

I'm configuring a Django project using PostgreSQL database engine backend, But I'm getting errors on each database operations, for example when i run manage.py syncdb, I'm getting:

C:\xampp\htdocs\djangodir>python manage.py syncdb
Traceback (most recent call last):
  File "manage.py", line 11, in <module>
    execute_manager(settings)
  File "C:\Python27\lib\site-packages\django\core\management\__init__.py", line
438, in execute_manager
    utility.execute()
  File "C:\Python27\lib\site-packages\django\core\management\__init__.py", line
379, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "C:\Python27\lib\site-packages\django\core\management\__init__.py", line
261, in fetch_command
    klass = load_command_class(app_name, subcommand)
  File "C:\Python27\lib\site-packages\django\core\management\__init__.py", line
67, in load_command_class
    module = import_module('%s.management.commands.%s' % (app_name, name))
  File "C:\Python27\lib\site-packages\django\utils\importlib.py", line 35, in im
port_module
    __import__(name)
  File "C:\Python27\lib\site-packages\django\core\management\commands\syncdb.py"
, line 7, in <module>
    from django.core.management.sql import custom_sql_for_model, emit_post_sync_
signal
  File "C:\Python27\lib\site-packages\django\core\management\sql.py", line 6, in
 <module>
    from django.db import models
  File "C:\Python27\lib\site-packages\django\db\__init__.py", line 77, in <modul
e>
    connection = connections[DEFAULT_DB_ALIAS]
  File "C:\Python27\lib\site-packages\django\db\utils.py", line 92, in __getitem
__
    backend = load_backend(db['ENGINE'])
  File "C:\Python27\lib\site-packages\django\db\utils.py", line 33, in load_back
end
    return import_module('.base', backend_name)
  File "C:\Python27\lib\site-packages\django\utils\importlib.py", line 35, in im
port_module
    __import__(name)
  File "C:\Python27\lib\site-packages\django\db\backends\postgresql\base.py", li
ne 23, in <module>
    raise ImproperlyConfigured("Error loading psycopg module: %s" % e)
django.core.exceptions.ImproperlyConfigured: Error loading psycopg module: No mo
dule named psycopg

Can someone give me a clue on what is going on?

Source: (StackOverflow)

Installing psycopg2 into virtualenv when PostgreSQL is not installed on development system

Is it possible to install psycopg2 into a virtualenv when PostgreSQL isn't installed on my development system—MacBook Pro with OS X 10.6?

When I run pip install psycopg2 from within my virtualenv, I received the error shown below.

I'm trying to connect to a legacy database on a server using Django, and I'd prefer not to install PostgreSQL on my development system if possible.

Why not install PostgreSQL?

I received an error when installing PostgreSQL using homebrew. I have Xcode4—and only Xcode4—installed on my MacBook Pro and am thinking it's related to missing gcc 4.0. However, this is a problem for another StackOverflow question.

Update 8:37 AM on April 12, 2011: I'd still like to know if this is possible without installing PostgreSQL on my MacBook Pro. However, I ran brew update and forced a reinstallation of ossp-uuid with brew install --force ossp-uuid and now brew install postgresql works. With PostgreSQL successfully installed, I was able to pip install psycopg2 from within my virtualenv.

Error from `pip install psycopg2`

$ pip install psycopg2
Downloading/unpacking psycopg2
  Running setup.py egg_info for package psycopg2

    Error: pg_config executable not found.

    Please add the directory containing pg_config to the PATH
    or specify the full executable path with the option:

        python setup.py build_ext --pg-config /path/to/pg_config build ...

    or with the pg_config option in 'setup.cfg'.
    Complete output from command python setup.py egg_info:
    running egg_info

writing pip-egg-info/psycopg2.egg-info/PKG-INFO
writing top-level names to pip-egg-info/psycopg2.egg-info/top_level.txt
writing dependency_links to pip-egg-info/psycopg2.egg-info/dependency_links.txt
warning: manifest_maker: standard file '-c' not found

Error: pg_config executable not found.

Please add the directory containing pg_config to the PATH
or specify the full executable path with the option:

    python setup.py build_ext --pg-config /path/to/pg_config build ...

or with the pg_config option in 'setup.cfg'.

----------------------------------------
Command python setup.py egg_info failed with error code 1
Storing complete log in /Users/matthew/.pip/pip.log

Preliminary Research

Below are the articles I read as preliminary research:

Source: (StackOverflow)

Problems using psycopg2 on Mac OS (Yosemite)

Currently i am installing psycopg2 for work within eclipse with python.

I am finding a lot of problems:

The first problem sudo pip3.4 install psycopg2 is not working and it is showing the following message

Error: pg_config executable not found.

FIXED WITH:export PATH=/Library/PostgreSQL/9.4/bin/:"$PATH”

When I import psycopg2 in my project i obtein:

ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/psycopg2/_psycopg.so Library libssl.1.0.0.dylib Library libcrypto.1.0.0.dylib

FIXED WITH: sudo ln -s /Library/PostgreSQL/9.4/lib/libssl.1.0.0.dylib /usr/lib sudo ln -s /Library/PostgreSQL/9.4/lib/libcrypto.1.0.0.dylib /usr/lib

Now I am obtaining:

ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/psycopg2/_psycopg.so, 2): Symbol not found: _lo_lseek64 Referenced from: /Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/psycopg2/_psycopg.so Expected in: /usr/lib/libpq.5.dylib in /Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/psycopg2/_psycopg.so

After 3 hours I didn´t find the solution. Can you help me?

Thank you so much! Regards Benja.

Source: (StackOverflow)

xcrun/lipo freezes with OS X Mavericks and XCode 4.x

Been trying to install psycopg2 with either easy_install or pip, and the terminal gets stuck in a loop between xcrun and lipo.

sidwyn$ sudo easy_install psycopg2
Searching for psycopg2
Reading https://pypi.python.org/simple/psycopg2/
Reading http://initd.org/psycopg/
Reading http://initd.org/projects/psycopg2
Best match: psycopg2 2.5.1
Downloading https://pypi.python.org/packages/source/p/psycopg2/psycopg2-2.5.1.tar.gz#md5=1b433f83d50d1bc61e09026e906d84c7
Processing psycopg2-2.5.1.tar.gz
Writing /tmp/easy_install-dTk7cd/psycopg2-2.5.1/setup.cfg
Running psycopg2-2.5.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-dTk7cd/psycopg2-2.5.1/egg-dist-tmp-4jaXas
clang: warning: argument unused during compilation: '-mno-fused-madd'

It bounces between xcrun and lipo and is stuck forever in this loop. Would appreciate some insights on this.

I'm on OS X Mavericks 10.9, latest build.

Source: (StackOverflow)

Checking if a postgresql table exists under python (and probably Psycopg2)

How can I determine if a table exists using the Psycopg2 Python library? I want a true or false boolean.

Source: (StackOverflow)

How do I get a list of column names from a psycopg2 cursor?

I would like a general way to generate column labels directly from the selected column names, and recall seeing that python's psycopg2 module supports this feature.

Source: (StackOverflow)

Python/postgres/psycopg2: getting ID of row just inserted

I'm using Python and psycopg2 to interface to postgres.

When I insert a row...

           sql_string = "INSERT INTO hundred (name,name_slug,status) VALUES ("
           sql_string += hundred_name + ", '" + hundred_slug + "', " + status + ");"
           cursor.execute(sql_string)

... how do I get the ID of the row I've just inserted? Trying:

           hundred = cursor.fetchall()

returns an error, while using RETURNING id:

           sql_string = "INSERT INTO domes_hundred (name,name_slug,status) VALUES ("
           sql_string += hundred_name + ", '" + hundred_slug + "', " + status + ") RETURNING id;"
           hundred = cursor.execute(sql_string)

simply returns None.

UPDATE: So does currval (even though using this command directly into postgres works):

           sql_string = "SELECT currval(pg_get_serial_sequence('hundred', 'id'));"
           hundred_id = cursor.execute(sql_string)

Can anyone advise?

thanks!

Source: (StackOverflow)

DictCursor doesn't seem to work under psycopg2

I haven't worked with psycopg2 before but I'm trying to change the cursor factory to DictCursor so that fetchall or fetchone will return a dictionary instead of a list.

I created a test script to make things simple and only test this functionality. Here's my little bit of code that I feel should work

import psycopg2
import psycopg2.extras

conn = psycopg2.connect("dbname=%s user=%s password=%s" % (DATABASE, USERNAME, PASSWORD))

cur = conn.cursor(cursor_factory = psycopg2.extras.DictCursor)
cur.execute("SELECT * from review")

res = cur.fetchall()

print type(res)
print res

The res variable is always a list and not a dictionary as I would expect.

A current workaround that I've implemented is to use this function that builds a dictionary and run each row returned by fetchall through it.

def build_dict(cursor, row):
    x = {}
    for key,col in enumerate(cursor.description):
        x[col[0]] = row[key]
    return d

Python is version 2.6.7 and psycopg2 is version 2.4.2.

Source: (StackOverflow)

psycopg2

Overview

Details

Timing Table

Timing Table Notes:

testSqlAlchemy.py

Timing Profile (with processrow)

Timing Profile (without processrow)

Final Comments

Why not install PostgreSQL?

Error from pip install psycopg2

Preliminary Research

Error from `pip install psycopg2`