Engine Configuration — SQLAlchemy 2.0.0b1 documentation

From Get docs
Sqlalchemy/docs/latest/core/engines

Engine Configuration

The _engine.Engine is the starting point for any SQLAlchemy application. It’s “home base” for the actual database and its DBAPI, delivered to the SQLAlchemy application through a connection pool and a Dialect, which describes how to talk to a specific kind of database/DBAPI combination.

The general structure can be illustrated as follows:

../_images/sqla_engine_arch.png Where above, an _engine.Engine references both a Dialect and a _pool.Pool, which together interpret the DBAPI’s module functions as well as the behavior of the database.

Creating an engine is just a matter of issuing a single call, _sa.create_engine():

from sqlalchemy import create_engine
engine = create_engine('postgresql://scott:tiger@localhost:5432/mydatabase')

The above engine creates a Dialect object tailored towards PostgreSQL, as well as a _pool.Pool object which will establish a DBAPI connection at localhost:5432 when a connection request is first received. Note that the _engine.Engine and its underlying _pool.Pool do not establish the first actual DBAPI connection until the _engine.Engine.connect() method is called, or an operation which is dependent on this method such as _engine.Engine.execute() is invoked. In this way, _engine.Engine and _pool.Pool can be said to have a lazy initialization behavior.

The _engine.Engine, once created, can either be used directly to interact with the database, or can be passed to a Session object to work with the ORM. This section covers the details of configuring an _engine.Engine. The next section, Working with Engines and Connections, will detail the usage API of the _engine.Engine and similar, typically for non-ORM applications.

Supported Databases

SQLAlchemy includes many Dialect implementations for various backends. Dialects for the most common databases are included with SQLAlchemy; a handful of others require an additional install of a separate dialect.

See the section Dialects for information on the various backends available.


Database Urls

The _sa.create_engine() function produces an _engine.Engine object based on a URL. These URLs follow RFC-1738, and usually can include username, password, hostname, database name as well as optional keyword arguments for additional configuration. In some cases a file path is accepted, and in others a “data source name” replaces the “host” and “database” portions. The typical form of a database URL is:

dialect+driver://username:password@host:port/database

Dialect names include the identifying name of the SQLAlchemy dialect, a name such as sqlite, mysql, postgresql, oracle, or mssql. The drivername is the name of the DBAPI to be used to connect to the database using all lowercase letters. If not specified, a “default” DBAPI will be imported if available - this default is typically the most widely known driver available for that backend.

As the URL is like any other URL, special characters such as those that may be used in the password need to be URL encoded to be parsed correctly.. Below is an example of a URL that includes the password "kx%jj5/g", where the percent sign and slash characters are represented as %25 and %2F, respectively:

postgresql+pg8000://dbuser:kx%25jj5%2Fg@pghost10/appdb

The encoding for the above password can be generated using urllib.parse:

>>> import urllib.parse
>>> urllib.parse.quote_plus("kx%jj5/g")
'kx%25jj5%2Fg'

Examples for common connection styles follow below. For a full index of detailed information on all included dialects as well as links to third-party dialects, see Dialects.

PostgreSQL

The PostgreSQL dialect uses psycopg2 as the default DBAPI. pg8000 is also available as a pure-Python substitute:

# default
engine = create_engine('postgresql://scott:tiger@localhost/mydatabase')

# psycopg2
engine = create_engine('postgresql+psycopg2://scott:tiger@localhost/mydatabase')

# pg8000
engine = create_engine('postgresql+pg8000://scott:tiger@localhost/mydatabase')

More notes on connecting to PostgreSQL at PostgreSQL.


MySQL

The MySQL dialect uses mysql-python as the default DBAPI. There are many MySQL DBAPIs available, including MySQL-connector-python and OurSQL:

# default
engine = create_engine('mysql://scott:tiger@localhost/foo')

# mysqlclient (a maintained fork of MySQL-Python)
engine = create_engine('mysql+mysqldb://scott:tiger@localhost/foo')

# PyMySQL
engine = create_engine('mysql+pymysql://scott:tiger@localhost/foo')

More notes on connecting to MySQL at MySQL and MariaDB.


Oracle

The Oracle dialect uses cx_oracle as the default DBAPI:

engine = create_engine('oracle://scott:[email protected]:1521/sidname')

engine = create_engine('oracle+cx_oracle://scott:tiger@tnsname')

More notes on connecting to Oracle at Oracle.


Microsoft SQL Server

The SQL Server dialect uses pyodbc as the default DBAPI. pymssql is also available:

# pyodbc
engine = create_engine('mssql+pyodbc://scott:tiger@mydsn')

# pymssql
engine = create_engine('mssql+pymssql://scott:tiger@hostname:port/dbname')

More notes on connecting to SQL Server at Microsoft SQL Server.


SQLite

SQLite connects to file-based databases, using the Python built-in module sqlite3 by default.

As SQLite connects to local files, the URL format is slightly different. The “file” portion of the URL is the filename of the database. For a relative file path, this requires three slashes:

# sqlite://<nohostname>/<path>
# where <path> is relative:
engine = create_engine('sqlite:///foo.db')

And for an absolute file path, the three slashes are followed by the absolute path:

# Unix/Mac - 4 initial slashes in total
engine = create_engine('sqlite:////absolute/path/to/foo.db')

# Windows
engine = create_engine('sqlite:///C:\\path\\to\\foo.db')

# Windows alternative using raw string
engine = create_engine(r'sqlite:///C:\path\to\foo.db')

To use a SQLite :memory: database, specify an empty URL:

engine = create_engine('sqlite://')

More notes on connecting to SQLite at SQLite.


Others

See Dialects, the top-level page for all additional dialect documentation.


Engine Creation API

Pooling

The _engine.Engine will ask the connection pool for a connection when the connect() or execute() methods are called. The default connection pool, QueuePool, will open connections to the database on an as-needed basis. As concurrent statements are executed, QueuePool will grow its pool of connections to a default size of five, and will allow a default “overflow” of ten. Since the _engine.Engine is essentially “home base” for the connection pool, it follows that you should keep a single _engine.Engine per database established within an application, rather than creating a new one for each connection.

Note

QueuePool is not used by default for SQLite engines. See SQLite for details on SQLite connection pool usage.


For more information on connection pooling, see Connection Pooling.


Custom DBAPI connect() arguments / on-connect routines

For cases where special connection methods are needed, in the vast majority of cases, it is most appropriate to use one of several hooks at the _sa.create_engine() level in order to customize this process. These are described in the following sub-sections.

Special Keyword Arguments Passed to dbapi.connect()

All Python DBAPIs accept additional arguments beyond the basics of connecting. Common parameters include those to specify character set encodings and timeout values; more complex data includes special DBAPI constants and objects and SSL sub-parameters. There are two rudimentary means of passing these arguments without complexity.

Add Parameters to the URL Query string

Simple string values, as well as some numeric values and boolean flags, may be often specified in the query string of the URL directly. A common example of this is DBAPIs that accept an argument encoding for character encodings, such as most MySQL DBAPIs:

engine = create_engine(
    "mysql+pymysql://user:pass@host/test?charset=utf8mb4"
)

The advantage of using the query string is that additional DBAPI options may be specified in configuration files in a manner that’s portable to the DBAPI specified in the URL. The specific parameters passed through at this level vary by SQLAlchemy dialect. Some dialects pass all arguments through as strings, while others will parse for specific datatypes and move parameters to different places, such as into driver-level DSNs and connect strings. As per-dialect behavior in this area currently varies, the dialect documentation should be consulted for the specific dialect in use to see if particular parameters are supported at this level.

Tip

A general technique to display the exact arguments passed to the DBAPI for a given URL may be performed using the Dialect.create_connect_args() method directly as follows:

>>> from sqlalchemy import create_engine
>>> engine = create_engine("mysql+pymysql://some_user:some_pass@some_host/test?charset=utf8mb4")
>>> args, kwargs = engine.dialect.create_connect_args(engine.url)
>>> args, kwargs
([], {'host': 'some_host', 'database': 'test', 'user': 'some_user', 'password': 'some_pass', 'charset': 'utf8mb4', 'client_flag': 2})

The above args, kwargs pair is normally passed to the DBAPI as dbapi.connect(*args, **kwargs).


Use the connect_args dictionary parameter

A more general system of passing any parameter to the dbapi.connect() function that is guaranteed to pass all parameters at all times is the :paramref:`_sa.create_engine.connect_args` dictionary parameter. This may be used for parameters that are otherwise not handled by the dialect when added to the query string, as well as when special sub-structures or objects must be passed to the DBAPI. Sometimes it’s just that a particular flag must be sent as the True symbol and the SQLAlchemy dialect is not aware of this keyword argument to coerce it from its string form as presented in the URL. Below illustrates the use of a psycopg2 “connection factory” that replaces the underlying implementation the connection:

engine = create_engine(
    "postgresql://user:pass@hostname/dbname",
    connect_args={"connection_factory": MyConnectionFactory}
)

Another example is the pyodbc “timeout” parameter:

engine = create_engine(
  "mssql+pyodbc://user:pass@sqlsrvr?driver=ODBC+Driver+13+for+SQL+Server",
  connect_args={"timeout": 30}
)

The above example also illustrates that both URL “query string” parameters as well as :paramref:`_sa.create_engine.connect_args` may be used at the same time; in the case of pyodbc, the “driver” keyword has special meaning within the URL.


Controlling how parameters are passed to the DBAPI connect() function

Beyond manipulating the parameters passed to connect(), we can further customize how the DBAPI connect() function itself is called using the DialectEvents.do_connect() event hook. This hook is passed the full *args, **kwargs that the dialect would send to connect(). These collections can then be modified in place to alter how they are used:

from sqlalchemy import event

engine = create_engine("postgresql://user:pass@hostname/dbname")

@event.listens_for(engine, "do_connect")
def receive_do_connect(dialect, conn_rec, cargs, cparams):
    cparams['connection_factory'] = MyConnectionFactory

Generating dynamic authentication tokens

DialectEvents.do_connect() is also an ideal way to dynamically insert an authentication token that might change over the lifespan of an _sa.engine.Engine. For example, if the token gets generated by get_authentication_token() and passed to the DBAPI in a token parameter, this could be implemented as:

from sqlalchemy import event

engine = create_engine("postgresql://user@hostname/dbname")

@event.listens_for(engine, "do_connect")
def provide_token(dialect, conn_rec, cargs, cparams):
    cparams['token'] = get_authentication_token()

See also

mssql_pyodbc_access_tokens - a more concrete example involving SQL Server


Modifying the DBAPI connection after connect, or running commands after connect

For a DBAPI connection that SQLAlchemy creates without issue, but where we would like to modify the completed connection before it’s actually used, such as for setting special flags or running certain commands, the PoolEvents.connect() event hook is the most appropriate hook. This hook is called for every new connection created, before it is used by SQLAlchemy:

from sqlalchemy import event

engine = create_engine(
    "postgresql://user:pass@hostname/dbname"
)

@event.listens_for(engine, "connect")
def connect(dbapi_connection, connection_record):
    cursor_obj = dbapi_connection.cursor()
    cursor_obj.execute("SET some session variables")
    cursor_obj.close()

Fully Replacing the DBAPI connect() function

Finally, the DialectEvents.do_connect() event hook can also allow us to take over the connection process entirely by establishing the connection and returning it:

from sqlalchemy import event

engine = create_engine(
    "postgresql://user:pass@hostname/dbname"
)

@event.listens_for(engine, "do_connect")
def receive_do_connect(dialect, conn_rec, cargs, cparams):
    # return the new DBAPI connection with whatever we'd like to
    # do
    return psycopg2.connect(*cargs, **cparams)

The DialectEvents.do_connect() hook supersedes the previous :paramref:`_sa.create_engine.creator` hook, which remains available. DialectEvents.do_connect() has the distinct advantage that the complete arguments parsed from the URL are also passed to the user-defined function which is not the case with :paramref:`_sa.create_engine.creator`.


Configuring Logging

Python’s standard logging module is used to implement informational and debug log output with SQLAlchemy. This allows SQLAlchemy’s logging to integrate in a standard way with other applications and libraries. There are also two parameters :paramref:`_sa.create_engine.echo` and :paramref:`_sa.create_engine.echo_pool` present on _sa.create_engine() which allow immediate logging to sys.stdout for the purposes of local development; these parameters ultimately interact with the regular Python loggers described below.

This section assumes familiarity with the above linked logging module. All logging performed by SQLAlchemy exists underneath the sqlalchemy namespace, as used by logging.getLogger('sqlalchemy'). When logging has been configured (i.e. such as via logging.basicConfig()), the general namespace of SA loggers that can be turned on is as follows:

  • sqlalchemy.engine - controls SQL echoing. Set to logging.INFO for SQL query output, logging.DEBUG for query + result set output. These settings are equivalent to echo=True and echo="debug" on :paramref:`_sa.create_engine.echo`, respectively.
  • sqlalchemy.pool - controls connection pool logging. Set to logging.INFO to log connection invalidation and recycle events; set to logging.DEBUG to additionally log all pool checkins and checkouts. These settings are equivalent to pool_echo=True and pool_echo="debug" on :paramref:`_sa.create_engine.echo_pool`, respectively.
  • sqlalchemy.dialects - controls custom logging for SQL dialects, to the extend that logging is used within specific dialects, which is generally minimal.
  • sqlalchemy.orm - controls logging of various ORM functions to the extent that logging is used within the ORM, which is generally minimal. Set to logging.INFO to log some top-level information on mapper configurations.

For example, to log SQL queries using Python logging instead of the echo=True flag:

import logging

logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)

By default, the log level is set to logging.WARN within the entire sqlalchemy namespace so that no log operations occur, even within an application that has logging enabled otherwise.

Note

The SQLAlchemy _engine.Engine conserves Python function call overhead by only emitting log statements when the current logging level is detected as logging.INFO or logging.DEBUG. It only checks this level when a new connection is procured from the connection pool. Therefore when changing the logging configuration for an already-running application, any _engine.Connection that’s currently active, or more commonly a Session object that’s active in a transaction, won’t log any SQL according to the new configuration until a new _engine.Connection is procured (in the case of Session, this is after the current transaction ends and a new one begins).


More on the Echo Flag

As mentioned previously, the :paramref:`_sa.create_engine.echo` and :paramref:`_sa.create_engine.echo_pool` parameters are a shortcut to immediate logging to sys.stdout:

>>> from sqlalchemy import create_engine, text
>>> e = create_engine("sqlite://", echo=True, echo_pool='debug')
>>> with e.connect() as conn:
...    print(conn.scalar(text("select 'hi'")))
...
2020-10-24 12:54:57,701 DEBUG sqlalchemy.pool.impl.SingletonThreadPool Created new connection <sqlite3.Connection object at 0x7f287819ac60>
2020-10-24 12:54:57,701 DEBUG sqlalchemy.pool.impl.SingletonThreadPool Connection <sqlite3.Connection object at 0x7f287819ac60> checked out from pool
2020-10-24 12:54:57,702 INFO sqlalchemy.engine.Engine select 'hi'
2020-10-24 12:54:57,702 INFO sqlalchemy.engine.Engine ()
hi
2020-10-24 12:54:57,703 DEBUG sqlalchemy.pool.impl.SingletonThreadPool Connection <sqlite3.Connection object at 0x7f287819ac60> being returned to pool
2020-10-24 12:54:57,704 DEBUG sqlalchemy.pool.impl.SingletonThreadPool Connection <sqlite3.Connection object at 0x7f287819ac60> rollback-on-return

Use of these flags is roughly equivalent to:

import logging
logging.basicConfig()
logging.getLogger("sqlalchemy.engine").setLevel(logging.INFO)
logging.getLogger("sqlalchemy.pool").setLevel(logging.DEBUG)

It’s important to note that these two flags work independently of any existing logging configuration, and will make use of logging.basicConfig() unconditionally. This has the effect of being configured in addition to any existing logger configurations. Therefore, when configuring logging explicitly, ensure all echo flags are set to False at all times, to avoid getting duplicate log lines.


Setting the Logging Name

The logger name of instance such as an Engine or Pool defaults to using a truncated hex identifier string. To set this to a specific name, use the :paramref:`_sa.create_engine.logging_name` and :paramref:`_sa.create_engine.pool_logging_name` with sqlalchemy.create_engine():

>>> from sqlalchemy import create_engine
>>> from sqlalchemy import text
>>> e = create_engine("sqlite://", echo=True, logging_name='myengine')
>>> with e.connect() as conn:
...     conn.execute(text("select 'hi'"))
...
2020-10-24 12:47:04,291 INFO sqlalchemy.engine.Engine.myengine select 'hi'
2020-10-24 12:47:04,292 INFO sqlalchemy.engine.Engine.myengine ()

Setting Per-Connection / Sub-Engine Tokens

New in version 1.4.0b2.


While the logging name is appropriate to establish on an _engine.Engine object that is long lived, it’s not flexible enough to accommodate for an arbitrarily large list of names, for the case of tracking individual connections and/or transactions in log messages.

For this use case, the log message itself generated by the _engine.Connection and _engine.Result objects may be augmented with additional tokens such as transaction or request identifiers. The :paramref:`_engine.Connection.execution_options.logging_token` parameter accepts a string argument that may be used to establish per-connection tracking tokens:

>>> from sqlalchemy import create_engine
>>> e = create_engine("sqlite://", echo="debug")
>>> with e.connect().execution_options(logging_token="track1") as conn:
...     conn.execute("select 1").all()
2021-02-03 11:48:45,754 INFO sqlalchemy.engine.Engine [track1] select 1
2021-02-03 11:48:45,754 INFO sqlalchemy.engine.Engine [track1] [raw sql] ()
2021-02-03 11:48:45,754 DEBUG sqlalchemy.engine.Engine [track1] Col ('1',)
2021-02-03 11:48:45,755 DEBUG sqlalchemy.engine.Engine [track1] Row (1,)

The :paramref:`_engine.Connection.execution_options.logging_token` parameter may also be established on engines or sub-engines via :paramref:`_sa.create_engine.execution_options` or _engine.Engine.execution_options(). This may be useful to apply different logging tokens to different components of an application without creating new engines:

>>> from sqlalchemy import create_engine
>>> e = create_engine("sqlite://", echo="debug")
>>> e1 = e.execution_options(logging_token="track1")
>>> e2 = e.execution_options(logging_token="track2")
>>> with e1.connect() as conn:
...     conn.execute("select 1").all()
2021-02-03 11:51:08,960 INFO sqlalchemy.engine.Engine [track1] select 1
2021-02-03 11:51:08,960 INFO sqlalchemy.engine.Engine [track1] [raw sql] ()
2021-02-03 11:51:08,960 DEBUG sqlalchemy.engine.Engine [track1] Col ('1',)
2021-02-03 11:51:08,961 DEBUG sqlalchemy.engine.Engine [track1] Row (1,)

>>> with e2.connect() as conn:
...     conn.execute("select 2").all()
2021-02-03 11:52:05,518 INFO sqlalchemy.engine.Engine [track2] Select 1
2021-02-03 11:52:05,519 INFO sqlalchemy.engine.Engine [track2] [raw sql] ()
2021-02-03 11:52:05,520 DEBUG sqlalchemy.engine.Engine [track2] Col ('1',)
2021-02-03 11:52:05,520 DEBUG sqlalchemy.engine.Engine [track2] Row (1,)

Hiding Parameters

The logging emitted by _engine.Engine also indicates an excerpt of the SQL parameters that are present for a particular statement. To prevent these parameters from being logged for privacy purposes, enable the :paramref:`_sa.create_engine.hide_parameters` flag:

>>> e = create_engine("sqlite://", echo=True, hide_parameters=True)
>>> with e.connect() as conn:
...     conn.execute(text("select :some_private_name"), {"some_private_name": "pii"})
...
2020-10-24 12:48:32,808 INFO sqlalchemy.engine.Engine select ?
2020-10-24 12:48:32,808 INFO sqlalchemy.engine.Engine [SQL parameters hidden due to hide_parameters=True]