SQLAlchemy
- About
- Key Concepts
- Usage - Basic Setup Connecting to Existing Database
- General Usage Recommendations
- Justification
- See Also
- Reference
This is incomplete at this point. I tend to use this package to interface with Pandas, and I'm in the process of working through the 1.4 to 2.0 migration.
About
SQLAlchemy is an ORM and general purpose data abstraction tool within Python.
Key Concepts
SQLAlchemy is a tool to create abstraction between your database and application and also provides an ORM. I'm not as much of a user with respect to the ORM, but the metadata and engine are handy to interact with the database.
The typical usage for most people is to generate a database via models described in an application. The flow would be to build the models, create the engine connection that aligns with the db you want to use (typically sqllite for development) and then you can generate the database. A session object would then be created and the ORM uses the session and metadata to create object representations of the data.
I don't use SQLAlchemy in this manner. Typically I'm connecting to existing database and I want to use it in conjunction with other packages in a data pipeline scenario. So many of the more powerful automation features aren't as useful, but there are still plenty of benefits, especially if you can keep your code somewhat minimal.
Usage - Basic Setup Connecting to Existing Database
.env
1CONN=DRIVER={SQL Server Native Client 10.0};SERVER=dagger;DATABASE=test;UID=user;PWD=password
conn/source.py
1#!/usr/bin/python
2# -*- coding: utf-8 -*-
3import os
4import urllib
5
6from sqlalchemy import create_engine
7
8conn = os.environ.get('CONN')
9
10params = urllib.quote_plus(conn)
11engine = create_engine('mssql+pyodbc:///?odbc_connect=%s' % params, fast_executemany=True)
./models.py
If you need to specify the schema, do so in the MetaData definition and the reflect and table methods will inherit appropriately. Depending on your needs, it may be appropriate to separate into models/source.py.
1#!/usr/bin/python
2# -*- coding: utf-8 -*-
3from sqlalchemy import MetaData
4
5from conn.source import engine
6
7metadata = MetaData()
8metadata.reflect(engine, only=['user', 'address']) # only limits the tables captured
./main.py
This is just a random example to show how to access table information from the metadata.
1#!/usr/bin/python
2# -*- coding: utf-8 -*-
3from conn.source import engine
4from models import metadata
5
6with engine.connect() as connection:
7 users = connection.execute(metadata.tables['user'].delete())
General Usage Recommendations
Connection Engine
The standard connection format is generally formatted like this:
1engine = create_engine('dialect+driver://username:password@host:port/database')
That format is fine, but I typically don't use that format. Instead, I ran across a different format that I find useful that is based on typical odbc conventions:
1import urllib
2
3params = urllib.quote_plus('DRIVER={SQL Server Native Client 10.0};SERVER=dagger;DATABASE=test;UID=user;PWD=password')
4engine = create_engine('mssql+pyodbc:///?odbc_connect=%s' % params,fast_executemany=True)
Make sure to update your driver to what is available on your machine. If using pyodbc, you can query what drivers are available by running:
1import pyodbc
2pyodbc.drivers()
Note: I'm not certain this works on all different dialect+driver situations, but it definately works on mssql+pyodbc.