EzDevInfo.com

sqlalchemy interview questions

Top sqlalchemy frequently asked interview questions

SQLAlchemy Inheritance

I'm a bit confused about inheritance under sqlalchemy, to the point where I'm not even sure what type of inheritance (single table, joined table, concrete) I should be using here. I've got a base class with some information that's shared amongst the subclasses, and some data that are completely separate. Sometimes, I'll want data from all the classes, and sometimes only from the subclasses. Here's an example:

class Building:
    def __init__(self, x, y):
        self.x = x
        self.y = y
class Commercial(Building):
    def __init__(self, x, y, business):
        Building.__init__(self, x, y)
        self.business = business
class Residential(Building):
    def __init__(self, x, y, numResidents):
        Building.__init__(self, x, y)
        self.numResidents = numResidents

How would I convert this to SQLAlchemy using declarative? How, then, would I query which buildings are within x>5 and y>3? Or which Residential buildings have only 1 resident?
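
One possible shape for this, sketched with joined table inheritance and an in-memory SQLite database. The `kind` discriminator column, the table names, and the `num_residents` spelling are assumptions made for illustration; single table inheritance would look similar with the subclass columns moved onto the base table.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Building(Base):
    __tablename__ = "building"
    id = Column(Integer, primary_key=True)
    x = Column(Integer)
    y = Column(Integer)
    kind = Column(String(20))  # discriminator column (hypothetical name)
    __mapper_args__ = {"polymorphic_on": kind, "polymorphic_identity": "building"}

class Commercial(Building):
    __tablename__ = "commercial"
    id = Column(Integer, ForeignKey("building.id"), primary_key=True)
    business = Column(String(50))
    __mapper_args__ = {"polymorphic_identity": "commercial"}

class Residential(Building):
    __tablename__ = "residential"
    id = Column(Integer, ForeignKey("building.id"), primary_key=True)
    num_residents = Column(Integer)
    __mapper_args__ = {"polymorphic_identity": "residential"}

engine = create_engine("sqlite://")  # in-memory database, for the sketch only
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([
    Commercial(x=6, y=4, business="bakery"),
    Residential(x=7, y=9, num_residents=1),
    Residential(x=1, y=1, num_residents=4),
])
session.commit()

# Querying the base class returns instances of every subclass.
in_area = session.query(Building).filter(Building.x > 5, Building.y > 3).all()
# Querying a subclass restricts the result to that subclass.
single = session.query(Residential).filter(Residential.num_residents == 1).all()
```

Querying `Building` covers data from all classes, while querying `Residential` or `Commercial` covers only that subclass, which matches the two access patterns described in the question.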


Source: (StackOverflow)

sqlalchemy flush() and get inserted id?

I want to do something like this:

f = Foo(bar='x')
session.add(f)
session.flush()

# do additional queries using f.id before commit()
print f.id # should be not None

session.commit()

But f.id is None when I try it. How can I get this to work?

-Dan
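
A minimal sketch of the expected behaviour, assuming `Foo` has an integer primary key that the database generates (the model definition and the in-memory SQLite engine are assumptions, since the question does not show them). If `f.id` stays `None` after `flush()`, the primary key column definition is a likely place to look.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Foo(Base):
    __tablename__ = "foo"
    id = Column(Integer, primary_key=True)  # autoincrementing integer key
    bar = Column(String(10))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

f = Foo(bar="x")
session.add(f)
id_before_flush = f.id   # nothing has been sent to the database yet
session.flush()          # emits the INSERT inside the open transaction
id_after_flush = f.id    # populated from the database, still uncommitted

session.rollback()       # the row disappears again; nothing was committed
remaining = session.query(Foo).count()
```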


Source: (StackOverflow)


SQLAlchemy Obtain Primary Key With Autoincrement Before Commit

When I have created a table with an auto-incrementing primary key, is there a way to obtain what the primary key would be (that is, do something like reserve the primary key) without actually committing?

I would like to place two operations inside a transaction however one of the operations will depend on what primary key was assigned in the previous operation.


Source: (StackOverflow)

SQLAlchemy ─ Mapping a Class against Multiple Tables

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# login_frontend.py

""" Python        2.7.3
    Cherrypy      3.2.2
    PostgreSQL    9.1
    psycopg2      2.4.5
    SQLAlchemy    0.7.10
"""

I'm having a problem joining four tables in one Python/SQLAlchemy class. I'm trying this, so I can iterate the instance of this class, instead of the named tuple, which I get from joining tables with the ORM.

Why all of this? Because I already started that way and have come too far to just abandon it. Also, it has to be possible, so I want to know how it's done.

For this project (cherrypy web-frontend) I got an already completed module with the table classes. I moved it to the bottom of this post, because maybe it isn't even necessary for you.

The following is just one example of a joined multiple tables class attempt. I picked a simple case with more than only two tables and a junction table. Here I don't write into these joined tables, but it is necessary somewhere else. That's why classes would be a nice solution to this problem.


My attempt of a join class,

which is a combination of the given table classes module and the examples from these two websites:

-Mapping a Class against Multiple Tables
-SQLAlchemy: one classes – two tables

class JoinUserGroupPerson (Base):

    persons = md.tables['persons']
    users = md.tables['users']
    user_groups = md.tables['user_groups']
    groups = md.tables['groups']

    user_group_person =(
        join(persons, users, persons.c.id == users.c.id).
        join(user_groups, users.c.id == user_groups.c.user_id).
        join(groups, groups.c.id == user_groups.c.group_id))

    __table__ = user_group_person

    """ I expanded the redefinition of 'id' to three tables,
        and removed this following one, since it made no difference:
        users_id = column_property(users.c.id, user_groups.c.user_id)
    """

    id = column_property(persons.c.id, users.c.id, user_groups.c.user_id)
    groups_id = column_property(groups.c.id, user_groups.c.group_id)
    groups_name = groups.c.name

    def __init__(self, group_name, login, name, email=None, phone=None):
        self.groups_name = group_name
        self.login = login
        self.name = name
        self.email = email
        self.phone = phone

    def __repr__(self):
        return(
            "<JoinUserGroupPerson('%s', '%s', '%s', '%s', '%s')>" %(
            self.groups_name, self.login, self.name, self.email, self.phone))

Different table accesses with this join class

  • This is how I tried to query this class in another module:

    pg = sqlalchemy.create_engine(
        'postgresql://{}:{}@{}:{}/{}'.
        format(user, password, server, port, data))
    Session = sessionmaker(bind=pg)
    s1 = Session()
    
    query = (s1.query(JoinUserGroupPerson).
        filter(JoinUserGroupPerson.login==user).
        order_by(JoinUserGroupPerson.id))
    
    record = {}
    for rowX in query:
        for colX in rowX.__table__.columns:
            record[colX.name] = getattr(rowX,colX.name)
    
    
    """ RESULT:
    """
    
    
    Traceback (most recent call last):
      File "/usr/local/lib/python2.7/dist-packages/cherrypy/_cprequest.py", line 656, in respond
        response.body = self.handler()
      File "/usr/local/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 228, in __call__
        ct.params['charset'] = self.find_acceptable_charset()
      File "/usr/local/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 134, in find_acceptable_charset
        if encoder(encoding):
      File "/usr/local/lib/python2.7/dist-packages/cherrypy/lib/encoding.py", line 86, in encode_string
        for chunk in self.body:
      File "XXX.py", line YYY, in ZZZ
        record[colX.name] = getattr(rowX,colX.name)
    AttributeError: 'JoinUserGroupPerson' object has no attribute 'user_id'
    
  • Then I checked the table attributes:

    for rowX in query:
        return (u'{}'.format(rowX.__table__.columns))
    
    
    """ RESULT:
    """
    
    
    ['persons.id',
     'persons.name',
     'persons.email',
     'persons.phone',
     'users.id',
     'users.login',
     'user_groups.user_id',
     'user_groups.group_id',
     'groups.id',
     'groups.name']
    
  • Then I used a counter to check whether the query or my class works at all. It works up to (count == 5), which covers the first two joined tables. With the condition set to (count == 6), I got the first error message again: AttributeError: 'JoinUserGroupPerson' object has no attribute 'user_id'.

    list = []
    for rowX in query:
        for count, colX in enumerate(rowX.__table__.columns):
            list.append(getattr(rowX,colX.name))
            if count == 5:
                break
    return (u'{}'.format(list))
    
    
    """ RESULT:
    """
    
    
    [4, u'user real name', None, None, 4, u'user']
    
    
    """ which are these following six columns:
        persons[id, name, email, phone], users[id, login]
    """
    
  • Then I checked each column:

    list = []
    for rowX in query:
        for colX in rowX.__table__.columns:
            list.append(colX)
    return (u'{}'.format(list))
    
    
    """ RESULT:
    """
    
    
    [Column(u'id', INTEGER(), table=<persons>, primary_key=True, nullable=False, server_default=DefaultClause(, for_update=False)),
     Column(u'name', VARCHAR(length=252), table=<persons>, nullable=False),
     Column(u'email', VARCHAR(), table=<persons>),
     Column(u'phone', VARCHAR(), table=<persons>),
     Column(u'id', INTEGER(), ForeignKey(u'persons.id'), table=<users>, primary_key=True, nullable=False),
     Column(u'login', VARCHAR(length=60), table=<users>, nullable=False),
     Column(u'user_id', INTEGER(), ForeignKey(u'users.id'), table=<user_groups>, primary_key=True, nullable=False),
     Column(u'group_id', INTEGER(), ForeignKey(u'groups.id'), table=<user_groups>, primary_key=True, nullable=False),
     Column(u'id', INTEGER(), table=<groups>, primary_key=True, nullable=False),
     Column(u'name', VARCHAR(length=60), table=<groups>, nullable=False)]
    
  • Then I tried another two direct accesses, which got me both KeyErrors for 'id' and 'persons.id':

    for rowX in query:
        return (u'{}'.format(rowX.__table__.columns['id'].name))
    
    for rowX in query:
        return (u'{}'.format(rowX.__table__.columns['persons.id'].name))
    

Conclusion

I tried a few other things, which were even more confusing. Since they didn't reveal any more information, I didn't add them, and anyway, this post is already long enough. I really would appreciate some help with this matter, because I don't see, where my class is wrong.

I guess, somehow I must have set the class in a way, which would only correctly join the first two tables. But the join works at least partially, because when the 'user_groups' table was empty, I got an empty query as well.

Or maybe I did something wrong with the mapping of this 'user_groups' table. Since with the join some columns are double, they need an additional definition. And the 'user_id' is already part of the persons and users table, so I had to map it twice.

I even tried to remove the 'user_groups' table from the join, because it's in the relationships (with secondary). It got me a foreign key error message. But maybe I just did it wrong.

Admittedly, I don't even know why ...

rowX.__table__.columns                  # column names with table name prefix

... has different attribute names than ...

colX in rowX.__table__.columns        # column names without table names

So please help, thank you.


Extra Edits

  • Another thought! Would all of this be possible with inheritance? Each class has its own mapping, but then the user_groups class may be necessary. The joins would have to be between the single classes instead. The __init__() and __repr__() would still have to be redefined.

  • It probably has something to do with the 'user_groups' table, because I even couldn't join it with the 'groups' or 'users' table. And it always says, that the class object has no attribute 'user_id'. Maybe it's something about the many-to-many relationship.
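
A reduced sketch of the same four-table join, assuming hand-written `Table` definitions (without the email and phone columns) and an in-memory SQLite database in place of the reflected PostgreSQL schema. The point it illustrates: once `user_groups.user_id` is folded into `id` through `column_property`, the mapped class has no `user_id` attribute, so iterating over the mapper's attribute keys instead of `__table__.columns` avoids the AttributeError. The attribute names `group_id` and `group_name` are choices made for this sketch.

```python
from sqlalchemy import (Column, ForeignKey, Integer, String, Table,
                        create_engine, inspect, join)
from sqlalchemy.orm import column_property, declarative_base, sessionmaker

Base = declarative_base()
md = Base.metadata

persons = Table("persons", md,
                Column("id", Integer, primary_key=True),
                Column("name", String(252), nullable=False))
groups = Table("groups", md,
               Column("id", Integer, primary_key=True),
               Column("name", String(60), nullable=False))
users = Table("users", md,
              Column("id", Integer, ForeignKey("persons.id"), primary_key=True),
              Column("login", String(60), nullable=False))
user_groups = Table("user_groups", md,
                    Column("user_id", Integer, ForeignKey("users.id"), primary_key=True),
                    Column("group_id", Integer, ForeignKey("groups.id"), primary_key=True))

class JoinUserGroupPerson(Base):
    __table__ = (
        join(persons, users, persons.c.id == users.c.id)
        .join(user_groups, users.c.id == user_groups.c.user_id)
        .join(groups, groups.c.id == user_groups.c.group_id))
    # Columns that always hold the same value share one attribute.
    id = column_property(persons.c.id, users.c.id, user_groups.c.user_id)
    group_id = column_property(groups.c.id, user_groups.c.group_id)
    # groups.name would collide with persons.name, so it gets its own key.
    group_name = column_property(groups.c.name)

engine = create_engine("sqlite://")
md.create_all(engine)
with engine.begin() as conn:
    conn.execute(persons.insert(), [{"id": 1, "name": "Joseph Schneider"},
                                    {"id": 3, "name": "Hans Dampf"}])
    conn.execute(groups.insert(), [{"id": 1, "name": "IT"}, {"id": 2, "name": "SSG"}])
    conn.execute(users.insert(), [{"id": 1, "login": "jschneid"},
                                  {"id": 3, "login": "dummy"}])
    conn.execute(user_groups.insert(), [{"user_id": 1, "group_id": 1},
                                        {"user_id": 3, "group_id": 2}])

session = sessionmaker(bind=engine)()
row = (session.query(JoinUserGroupPerson)
       .filter(JoinUserGroupPerson.login == "jschneid")
       .one())

# Iterate over mapped attribute keys, not over raw column names.
keys = [attr.key for attr in inspect(JoinUserGroupPerson).column_attrs]
record = {key: getattr(row, key) for key in keys}
```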


Attachment

Here is the already given SQLAlchemy module, with header, without specific information about the database, and the classes of the joined tables:

#!/usr/bin/python
# vim: set fileencoding=utf-8 :

import sqlalchemy
from sqlalchemy import join
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, backref, column_property

pg = sqlalchemy.create_engine(
    'postgresql://{}@{}:{}/{}'.format(user, host, port, data))

md = sqlalchemy.MetaData(pg, True)
Base = declarative_base()



""" ... following, three of the four joined tables.
    UserGroups isn't necessary, so it wasn't part of the module.
    And the other six classes shouldn't be important for this ...
"""


class Person(Base):
    __table__ = md.tables['persons']

    def __init__(self, name, email=None, phone=None):
        self.name = name
        self.email = email
        self.phone = phone

    def __repr__(self):
        return(
            "<Person(%s, '%s', '%s', '%s')>" %(
            self.id, self.name, self.email, self.phone))

class Group(Base):
    __table__ = md.tables['groups']

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return("<Group(%s, '%s')>" %(self.id, self.name))

class User(Base):
    __table__ = md.tables['users']

    person = relationship('Person')
    groups = relationship(
        'Group', secondary=md.tables['user_groups'], order_by='Group.id',
        backref=backref('users', order_by='User.login'))

    def __init__(self, person, login):
        if isinstance(person, Person):
            self.person = person
        else:
            self.id = person
        self.login = login

    def __repr__(self):
        return("<User(%s, '%s')>" %(self.id, self.login))

Maybe the following script, which created the database, and also was already given, will prove useful here. As last part of it comes some test data - but between the columns are supposed to be tabs, no spaces. Because of that, this script also can be found as gist on github:

-- file create_str.sql
-- database creation script
-- central script for creating all database objects

-- set the database name
\set strdbname logincore

\c admin

BEGIN;
\i str_roles.sql
COMMIT;

DROP DATABASE IF EXISTS :strdbname;
CREATE DATABASE :strdbname TEMPLATE template1 OWNER str_db_owner
    ENCODING 'UTF8';
\c :strdbname

SET ROLE str_db_owner;

BEGIN;
\i str.sql
COMMIT;
RESET ROLE;





-- file str_roles.sql
-- create roles for the database

-- owner of the database objects
SELECT create_role('str_db_owner', 'NOINHERIT');

-- role for using
SELECT create_role('str_user');

-- make str_db_owner member in all relevant roles
GRANT str_user TO str_db_owner WITH ADMIN OPTION;





-- file str.sql
-- creation of database

-- prototypes
\i str_prototypes.sql

-- domain for non empty text
CREATE DOMAIN ntext AS text CHECK (VALUE<>'');

-- domain for email addresses
CREATE DOMAIN email AS varchar(252) CHECK (is_email_address(VALUE));

-- domain for phone numbers
CREATE DOMAIN phone AS varchar(60) CHECK (is_phone_number(VALUE));

-- persons
CREATE TABLE persons (
    id    serial       PRIMARY KEY,
    name  varchar(252) NOT NULL,
    email email,
    phone phone
);

GRANT SELECT, INSERT, UPDATE, DELETE ON persons TO str_user;
GRANT USAGE ON SEQUENCE persons_id_seq TO str_user;

CREATE TABLE groups (
    id   integer     PRIMARY KEY,
    name varchar(60) UNIQUE NOT NULL
);

GRANT SELECT ON groups TO str_user;

-- database users
CREATE TABLE users (
    id    integer     PRIMARY KEY REFERENCES persons(id) ON UPDATE CASCADE,
    login varchar(60) UNIQUE NOT NULL
);

GRANT SELECT ON users TO str_user;

-- user <-> groups
CREATE TABLE user_groups (
    user_id  integer NOT NULL REFERENCES users(id)
                              ON UPDATE CASCADE ON DELETE CASCADE,
    group_id integer NOT NULL REFERENCES groups(id)
                              ON UPDATE CASCADE ON DELETE CASCADE,
    PRIMARY KEY (user_id, group_id)
);

-- functions
\i str_functions.sql





-- file str_prototypes.sql
-- prototypes for database

-- simple check for correct email address
CREATE FUNCTION is_email_address(email varchar) RETURNS boolean
    AS $CODE$
    SELECT FALSE
    $CODE$ LANGUAGE sql IMMUTABLE STRICT;

-- simple check for correct phone number
CREATE FUNCTION is_phone_number(nr varchar) RETURNS boolean
    AS $CODE$
    SELECT FALSE
    $CODE$ LANGUAGE sql IMMUTABLE STRICT;





-- file str_functions.sql
-- functions for database

-- simple check for correct email address
CREATE OR REPLACE FUNCTION is_email_address(email varchar) RETURNS boolean
    AS $CODE$
    SELECT $1 ~ E'^[A-Za-z0-9.!#$%&\'\*\+\-/=\?\^_\`{\|}\~\.]+@[-a-z0-9\.]+$'
    $CODE$ LANGUAGE sql IMMUTABLE STRICT;

-- simple check for correct phone number
CREATE OR REPLACE FUNCTION is_phone_number(nr varchar) RETURNS boolean
    AS $CODE$
    SELECT $1 ~ E'^[-+0-9\(\)/ ]+$'
    $CODE$ LANGUAGE sql IMMUTABLE STRICT;





-- file fill_str_test.sql
-- test data for database
-- between the columns are supposed to be tabs, no spaces !!!

BEGIN;

COPY persons (id, name, email) FROM STDIN;
1   Joseph Schneider    jschneid@lab.uni.de
2   Test User   jschneid@lab.uni.de
3   Hans Dampf  \N
\.
SELECT setval('persons_id_seq', (SELECT max(id) FROM persons));

COPY groups (id, name) FROM STDIN;
1   IT
2   SSG
\.

COPY users (id, login) FROM STDIN;
1   jschneid
2   tuser
3   dummy
\.

COPY user_groups (user_id, group_id) FROM STDIN;
1   1
2   1
3   2
\.

COMMIT;


Source: (StackOverflow)

Bulk insert with SQLAlchemy ORM

Is there any way to get SQLAlchemy to do a bulk insert rather than inserting each object individually? That is, doing:

INSERT INTO `foo` (`bar`) VALUES (1), (2), (3)

rather than:

INSERT INTO `foo` (`bar`) VALUES (1)
INSERT INTO `foo` (`bar`) VALUES (2)
INSERT INTO `foo` (`bar`) VALUES (3)

I've just converted some code to use sqlalchemy rather than raw sql. Although it is now much nicer to work with, it seems to be slower (up to a factor of 10), and I'm wondering if this is the reason.

Maybe I could improve the situation by using sessions more efficiently. At the moment I have autocommit=False and do a session.commit() after I've added some stuff. However, this seems to cause the data to go stale if the DB is changed elsewhere: even if I run a new query, I still get old results back.

Thanks for your help!
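
A sketch of two Core-level options, assuming a table `foo` with an integer column `bar` and an in-memory SQLite database (both stand-ins for the real schema). Passing a list of dictionaries to `insert().values()` builds a single statement with several VALUES groups, while passing the list as the second argument to `execute()` hands the batch to the driver's `executemany()`. How either one compares in speed depends on the driver and the SQLAlchemy version.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine

metadata = MetaData()
foo = Table("foo", metadata,
            Column("id", Integer, primary_key=True),
            Column("bar", Integer))

engine = create_engine("sqlite://")
metadata.create_all(engine)

# One INSERT statement carrying three VALUES groups.
multi_values = foo.insert().values([{"bar": 1}, {"bar": 2}, {"bar": 3}])

with engine.begin() as conn:
    conn.execute(multi_values)
    # One INSERT statement, executed for a batch of parameter sets.
    conn.execute(foo.insert(), [{"bar": 4}, {"bar": 5}])
    bars = [row.bar for row in conn.execute(foo.select().order_by(foo.c.bar))]
```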


Source: (StackOverflow)

Sqlalchemy - Difference between query and query.all in for loops

I would like to ask what's the difference between

for row in session.query(Model1):
    pass

and

for row in session.query(Model1).all():
    pass

Is the first somehow an iterator bombarding your DB with single queries, while the latter "eagerly" queries the whole thing as a list (like range(x) vs xrange(x))?
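
One way to check this empirically is to count the statements each form sends. The sketch below assumes a small `Model1` table in an in-memory SQLite database and uses the `before_cursor_execute` engine event to record every statement. In this setup both loops send a single SELECT; the difference is that `.all()` builds the complete list of objects before the loop starts.

```python
from sqlalchemy import Column, Integer, String, create_engine, event
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Model1(Base):
    __tablename__ = "model1"
    id = Column(Integer, primary_key=True)
    name = Column(String(20))

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([Model1(name="row %d" % i) for i in range(5)])
session.commit()

statements = []

@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    # Called once for every statement handed to the DBAPI cursor.
    statements.append(statement)

ids_from_iteration = [row.id for row in session.query(Model1)]
count_iteration = len(statements)

del statements[:]
ids_from_all = [row.id for row in session.query(Model1).all()]
count_all = len(statements)
```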


Source: (StackOverflow)

Why is SQLAlchemy insert with sqlite 25 times slower than using sqlite3 directly?

Why is this simple test case inserting 100,000 rows 25 times slower with SQLAlchemy than it is using the sqlite3 driver directly? I have seen similar slowdowns in real-world applications. Am I doing something wrong?

#!/usr/bin/env python
# Why is SQLAlchemy with SQLite so slow?
# Output from this program:
# SqlAlchemy: Total time for 100000 records 10.74 secs
# sqlite3:    Total time for 100000 records  0.40 secs


import time
import sqlite3

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String,  create_engine 
from sqlalchemy.orm import scoped_session, sessionmaker

Base = declarative_base()
DBSession = scoped_session(sessionmaker())

class Customer(Base):
    __tablename__ = "customer"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))

def init_sqlalchemy(dbname = 'sqlite:///sqlalchemy.db'):
    engine  = create_engine(dbname, echo=False)
    DBSession.configure(bind=engine, autoflush=False, expire_on_commit=False)
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)

def test_sqlalchemy(n=100000):
    init_sqlalchemy()
    t0 = time.time()
    for i in range(n):
        customer = Customer()
        customer.name = 'NAME ' + str(i)
        DBSession.add(customer)
    DBSession.commit()
    print "SqlAlchemy: Total time for " + str(n) + " records " + str(time.time() - t0) + " secs"

def init_sqlite3(dbname):
    conn = sqlite3.connect(dbname)
    c = conn.cursor()
    c.execute("DROP TABLE IF EXISTS customer")
    c.execute("CREATE TABLE customer (id INTEGER NOT NULL, name VARCHAR(255), PRIMARY KEY(id))")
    conn.commit()
    return conn

def test_sqlite3(n=100000, dbname = 'sqlite3.db'):
    conn = init_sqlite3(dbname)
    c = conn.cursor()
    t0 = time.time()
    for i in range(n):
        row = ('NAME ' + str(i),)
        c.execute("INSERT INTO customer (name) VALUES (?)", row)
    conn.commit()
    print "sqlite3: Total time for " + str(n) + " records " + str(time.time() - t0) + " sec"

if __name__ == '__main__':
    test_sqlalchemy(100000)
    test_sqlite3(100000)

I have tried numerous variations (see http://pastebin.com/zCmzDraU )


Source: (StackOverflow)

Example of what SQLAlchemy can do, and Django ORM cannot

I've been doing a lot of research lately into using Pyramid with SQLAlchemy versus keeping a current application in Django. That by itself is an entire debate, but I'm not here to discuss that.

What I do want to know is, why is SQLAlchemy universally considered better than Django ORM? Almost every, if not every, comparison I've found between the two favors SQLAlchemy. I assume performance is a big one, as the structure of SQLAlchemy lets it translate to SQL much more smoothly.

But, I've also heard that with harder tasks, Django ORM is nearly impossible to use. I want to scope out how huge of an issue this can be. I've been reading one of the reasons to switch to SQLAlchemy is when Django ORM is no longer suiting your needs.

So, in short, could someone provide a query (doesn't have to be actual SQL syntax) that SQLAlchemy can do, but Django ORM cannot possibly do without adding in additional raw SQL?

Update:

I've been noticing this question getting quite some attention since I first asked, so I'd like to throw in my extra two cents.

In the end we ended up using SQLAlchemy and I must say I'm happy with the decision.

I'm revisiting this question to provide an additional feature of SQLAlchemy that, so far, I've not been able to replicate in Django ORM. If someone can provide an example of how to do this I'll gladly eat my words.

Let's say you want to use some postgresql function, such as similarity(), which provides a fuzzy comparison (see: Finding similar strings with postgresql quickly - tl;dr input two strings get back a percent similarity).

I've done some searching on how to do this using the Django ORM and have found nothing other than using raw sql as seems to be apparent from their documentation: https://docs.djangoproject.com/en/dev/topics/db/sql/.

i.e.

Model.objects.raw('SELECT * FROM app_model ORDER BY \
similarity(name, %s) DESC;', [input_name])

SQLalchemy, however, has func(), as described here: http://docs.sqlalchemy.org/en/latest/core/sqlelement.html#sqlalchemy.sql.expression.func

from sqlalchemy import desc, func
session.query(Model).order_by(desc(func.similarity(Model.name, input_name)))

This allows you to generate sql for any defined sql/postgresql/etc function and not require raw sql.
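
The generated SQL can be inspected without a database connection. This sketch assumes a Core table named `app_model` with a `name` column, mirroring the raw query above, and only renders the statement as a string; running it would still require a database where `similarity()` is available.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, desc, func

metadata = MetaData()
app_model = Table("app_model", metadata,
                  Column("id", Integer, primary_key=True),
                  Column("name", String(100)))

input_name = "sqlalchemy"  # hypothetical search term

# func.<anything> renders a call to a SQL function of that name.
stmt = app_model.select().order_by(
    desc(func.similarity(app_model.c.name, input_name)))
sql = str(stmt)
```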


Source: (StackOverflow)

memory-efficient built-in SqlAlchemy iterator/generator?

I have a ~10M record MySQL table that I interface with using SqlAlchemy. I have found that queries on large subsets of this table will consume too much memory even though I thought I was using a built-in generator that intelligently fetched bite-sized chunks of the dataset:

for thing in session.query(Thing):
    analyze(thing)

To avoid this, I find I have to build my own iterator that bites off in chunks:

lastThingID = None
while True:
    chunk = query.order_by(Thing.id)
    if lastThingID is not None:
        # continue after the last id that was already processed
        chunk = chunk.filter(Thing.id > lastThingID)
    things = chunk.limit(querySize).all()
    if not things:
        break
    for thing in things:
        lastThingID = thing.id
        analyze(thing)

Is this normal or is there something I'm missing regarding SA built-in generators?

The answer to this question seems to indicate that the memory consumption is not to be expected.
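
For comparison, SQLAlchemy's `Query.yield_per()` asks the ORM to build objects in batches instead of all at once. The sketch below assumes a small `Thing` table in an in-memory SQLite database and a batch size of 10. Whether this lowers memory use against MySQL also depends on the driver, which may buffer the full result set on the client side, so the manual chunking above can still be the safer choice.

```python
from sqlalchemy import Column, Integer, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Thing(Base):
    __tablename__ = "thing"
    id = Column(Integer, primary_key=True)
    value = Column(Integer)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([Thing(value=i) for i in range(25)])
session.commit()

seen = []
# ORM objects are produced 10 rows at a time rather than in one list.
for thing in session.query(Thing).yield_per(10):
    seen.append(thing.id)
```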


Source: (StackOverflow)

method of iterating over sqlalchemy model's defined columns?

I've been trying to figure out how to iterate over the list of columns defined in a SqlAlchemy model. I want it for writing some serialization and copy methods for a couple of models. I can't just iterate over obj.__dict__ since it contains a lot of SA-specific items.

Anyone know of a way to just get the id, and desc names from the following?

class JobStatus(Base):
    __tablename__ = 'jobstatus'

    id = Column(Integer, primary_key=True)
    desc = Column(Unicode(20))

In this small case I could easily create a:

def logme(self):
    return {'id': self.id, 'desc': self.desc}

but I'd prefer something that was auto generating for larger objects.

Thanks for the assistance.
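
One way to do this, sketched with the runtime inspection API: `inspect(obj).mapper.column_attrs` lists only the column-based attributes of the mapped class. The helper name `column_dict` is made up for this example.

```python
from sqlalchemy import Column, Integer, Unicode, inspect
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class JobStatus(Base):
    __tablename__ = "jobstatus"

    id = Column(Integer, primary_key=True)
    desc = Column(Unicode(20))

def column_dict(obj):
    """Return {attribute name: value} for the mapped columns of obj."""
    return {attr.key: getattr(obj, attr.key)
            for attr in inspect(obj).mapper.column_attrs}

status = JobStatus(id=1, desc=u"queued")
as_dict = column_dict(status)
```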


Source: (StackOverflow)

How to serialize SqlAlchemy result to JSON?

Django has some good automatic serialization of ORM models returned from DB to JSON format.

How to serialize SQLAlchemy query result to JSON format?

I tried jsonpickle.encode but it encodes query object itself. I tried json.dumps(items) but it returns

TypeError: <Product('3', 'some name', 'some desc')> is not JSON serializable

Is it really so hard to serialize SQLAlchemy ORM objects to JSON/XML? Isn't there any default serializer for it? It's a very common task to serialize ORM query results nowadays.

What I need is just to return JSON or XML data representation of SQLAlchemy query result.

The SQLAlchemy query result in JSON/XML format is needed for use in a JavaScript datagrid (JQGrid http://www.trirand.com/blog/)
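
A minimal sketch of one common approach: convert each object to a plain dictionary first and pass that to `json.dumps`. The `Product` columns and the `to_dict` helper are assumptions for illustration. It only works as written when attribute names match column names and every value is a type that `json` can encode; dates or decimals would need extra handling.

```python
import json

from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Product(Base):
    __tablename__ = "product"
    id = Column(Integer, primary_key=True)
    name = Column(String(50))
    description = Column(String(200))

def to_dict(obj):
    """Map column names to values; assumes attributes are named like columns."""
    return {col.name: getattr(obj, col.name) for col in obj.__table__.columns}

# Stand-in for the result of a query such as session.query(Product).all()
items = [Product(id=3, name="some name", description="some desc")]
payload = json.dumps([to_dict(item) for item in items], sort_keys=True)
```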


Source: (StackOverflow)

How can I profile a SQLAlchemy powered application?

Does anyone have experience profiling a Python/SQLAlchemy app? And what is the best way to find bottlenecks and design flaws?

We have a Python application where the database layer is handled by SQLAlchemy. The application uses a batch design, so a lot of database requests are made sequentially and within a limited timespan. It currently takes a bit too long to run, so some optimization is needed. We don't use the ORM functionality, and the database is PostgreSQL.
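
One lightweight starting point is to time each statement with engine events, which works for Core-only code as described here. The sketch below uses an in-memory SQLite engine as a stand-in for PostgreSQL and collects results in a plain list named `timings`; both are assumptions. Python's built-in `cProfile` module would complement this by showing time spent outside the database.

```python
import time

from sqlalchemy import create_engine, event, text

engine = create_engine("sqlite://")
timings = []  # (statement, seconds) pairs collected by the listeners

@event.listens_for(engine, "before_cursor_execute")
def before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    # Keep a stack of start times on the connection itself.
    conn.info.setdefault("query_start_time", []).append(time.time())

@event.listens_for(engine, "after_cursor_execute")
def after_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    elapsed = time.time() - conn.info["query_start_time"].pop(-1)
    timings.append((statement, elapsed))

with engine.connect() as conn:
    conn.execute(text("SELECT 1"))

slowest = max(timings, key=lambda item: item[1])
```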


Source: (StackOverflow)

SQLAlchemy: print the actual query

I'd really like to be able to print out valid SQL for my application, including values, rather than bind parameters, but it's not obvious how to do this in SQLAlchemy (by design, I'm fairly sure).

Has anyone solved this problem in a general way?
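
For simple cases, a statement can be compiled with the `literal_binds` flag, which inlines parameter values into the SQL string. The table `foo` below is a made-up example. This is intended for debugging output only: it handles basic types such as strings and integers, and the resulting string should not be built from untrusted input and then executed.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table

metadata = MetaData()
foo = Table("foo", metadata,
            Column("id", Integer, primary_key=True),
            Column("bar", String(10)))

stmt = foo.select().where(foo.c.bar == "x")

with_params = str(stmt)  # bind parameter placeholders are kept
with_values = str(stmt.compile(compile_kwargs={"literal_binds": True}))
```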


Source: (StackOverflow)

MongoKit vs MongoEngine vs Flask-MongoAlchemy for Flask [closed]

Does anyone have experience with MongoKit, MongoEngine or Flask-MongoAlchemy for Flask?

Which one do you prefer? Positive or negative experiences? Too many options for a Flask newbie.


Source: (StackOverflow)

Getting random row through SQLAlchemy

How do I select a (or some) random row(s) from a table using SQLAlchemy?
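
A sketch of the usual database-side approach, assuming an `Item` model and SQLite, where the SQL function is `random()` (PostgreSQL uses the same name, MySQL uses `rand()`). Ordering the whole table randomly can be slow on large tables, so this is best suited to small or medium ones.

```python
from sqlalchemy import Column, Integer, create_engine, func
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Item(Base):
    __tablename__ = "item"
    id = Column(Integer, primary_key=True)
    value = Column(Integer)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()
session.add_all([Item(value=i) for i in range(10)])
session.commit()

# ORDER BY random() shuffles the rows inside the database.
one_random = session.query(Item).order_by(func.random()).first()
three_random = session.query(Item).order_by(func.random()).limit(3).all()
```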


Source: (StackOverflow)