SQLAlchemy

Jul 7, 2021 · 3 min read ·

Share on:

About
Key Concepts
Usage - Basic Setup Connecting to Existing Database
General Usage Recommendations
- Connection Engine
Justification
See Also
Reference

This is incomplete at this point. I tend to use this package to interface with Pandas, and I'm in the process of working through the 1.4 to 2.0 migration.

About

SQLAlchemy is an ORM and general purpose data abstraction tool within Python.

Key Concepts

SQLAlchemy is a tool to create abstraction between your database and application and also provides an ORM. I'm not as much of a user with respect to the ORM, but the metadata and engine are handy to interact with the database.

The typical usage for most people is to generate a database via models described in an application. The flow would be to build the models, create the engine connection that aligns with the db you want to use (typically sqllite for development) and then you can generate the database. A session object would then be created and the ORM uses the session and metadata to create object representations of the data.

I don't use SQLAlchemy in this manner. Typically I'm connecting to existing database and I want to use it in conjunction with other packages in a data pipeline scenario. So many of the more powerful automation features aren't as useful, but there are still plenty of benefits, especially if you can keep your code somewhat minimal.

Usage - Basic Setup Connecting to Existing Database

.env

1CONN=DRIVER={SQL Server Native Client 10.0};SERVER=dagger;DATABASE=test;UID=user;PWD=password

conn/source.py

 1#!/usr/bin/python
 2# -*- coding: utf-8 -*-
 3import os
 4import urllib
 5
 6from sqlalchemy import create_engine
 7
 8conn = os.environ.get('CONN')
 9
10params = urllib.quote_plus(conn)
11engine = create_engine('mssql+pyodbc:///?odbc_connect=%s' % params, fast_executemany=True)

./models.py

If you need to specify the schema, do so in the MetaData definition and the reflect and table methods will inherit appropriately. Depending on your needs, it may be appropriate to separate into models/source.py.

1#!/usr/bin/python
2# -*- coding: utf-8 -*-
3from sqlalchemy import MetaData
4
5from conn.source import engine
6
7metadata = MetaData()
8metadata.reflect(engine, only=['user', 'address']) # only limits the tables captured

./main.py

This is just a random example to show how to access table information from the metadata.

1#!/usr/bin/python
2# -*- coding: utf-8 -*-
3from conn.source import engine
4from models import metadata
5
6with engine.connect() as connection:
7    users = connection.execute(metadata.tables['user'].delete())

General Usage Recommendations

Connection Engine

The standard connection format is generally formatted like this:

1engine = create_engine('dialect+driver://username:password@host:port/database')

That format is fine, but I typically don't use that format. Instead, I ran across a different format that I find useful that is based on typical odbc conventions:

1import urllib
2
3params = urllib.quote_plus('DRIVER={SQL Server Native Client 10.0};SERVER=dagger;DATABASE=test;UID=user;PWD=password')
4engine = create_engine('mssql+pyodbc:///?odbc_connect=%s' % params,fast_executemany=True)

Make sure to update your driver to what is available on your machine. If using pyodbc, you can query what drivers are available by running:

1import pyodbc
2pyodbc.drivers()

Note: I'm not certain this works on all different dialect+driver situations, but it definately works on mssql+pyodbc.