Advanced Topics

As with all frameworks, Geniusql can't cover every need out-of-the-box. However, Geniusql has been specifically designed to be hackable. In particular, the addition of new providers is a well-defined process. Read on if there's a feature you need that you might consider building yourself.

Passing through SQL

If you want to keep writing SQL, there's nothing stopping you from doing so. If nothing else, it can be a handy way to prototype or migrate an application, and then replace the SQL with Geniusql API calls later on. You'll need to make your deployers aware that you're using SQL directly (and which DB's your SQL runs on), so they don't try to deploy your application with an unsupported store.

Use the extension methods built into the Database class to get data:

>>> rows, cols = db.fetch("SELECT djvFields.Value, Count(djvCity.ID) AS NumCities "
                          "FROM djvFields LEFT JOIN djvCity ON djvFields.ID = "
                          "djvCity.Field GROUP BY djvFields.Value")
>>> cols     # [(name, type), (name, type), ...]
[(u'Value', 202), (u'NumCities', 3)]
>>> rows
[(u'Baja California', 3), (u'Ciudad Juarez', 1), (u'Puerto Penasco', 0), (u'Yucatan Peninsula', 1)]

...or update:

>>> db.execute("UPDATE djvFields SET ShortCode = Left(Value, 1) WHERE ShortCode Is Null;")

There are a lot of other things you can do with the builtin Database, Table, Column, and SQLDeparser classes. Feel free to open them up in an interactive session and explore.

Custom Providers

In most cases, you will add new functionality to Geniusql itself by creating a custom "provider" (set of Database/Table/Column/etc objects) for a new backend. Providers must conform to simple interfaces which are specific to each class. They are free to implement that functionality however they like.

Generic Database Wrappers

Writing a provider for a database is relatively straightforward, mostly because Geniusql doesn't have complicated storage interfaces or demands. If you find your application depends heavily upon using advanced features of a particular database, or upon hand-crafted SQL, then Geniusql is not for you or your application. A Geniusql provider module for a database usually includes:

  1. DatabaseType objects, which model database types and their properties. Base classes can be found in geniusql.dbtypes.
  2. Adapters, which coerce values from Python types to database types and back again. Base classes can be found in geniusql.adapters.
  3. An SQLDeparser, which converts geniusql Expression objects (essentially, Python lambdas) into SQL.
  4. Subclasses of geniusql.Database, Schema, Table, Column, and IndexSet. These handle requests to SELECT data using the above components, as well as ALTER TABLE, etc.
  5. A ConnectionManager, which provides pooling (or not), plus transaction management.
Type Mapping

Any database code in a general-purpose programming language will eventually have to come to terms with the gap between native code types and native storage types. For us, this means matching Python types (like int and datetime) to database types (like INT8 and TEXT). Geniusql provides this layer for databases by using a mapping layer between your model code (Tables and Columns) and the underlying tables and columns. The implementation of that is unimportant (and possibly storage-dependent), but Geniusql needs to know the database types in effect in order to translate data safely.

If your application has created all of its own tables using Geniusql, then there is generally nothing to worry about in terms of the "type gap"; Geniusql will default to creating columns of the types it knows best. But if you are building a Geniusql interface onto an existing database, or if you customize/optimize your database by hand, then you should look into creating your own DatabaseType and/or Adapter subclasses.

DatabaseTypes

You model a database by making new subclasses of the base classes in geniusql.dbtypes, one new subclass for each named type your database exposes. Mostly, this means giving new names to SQL92 types (that geniusql already provides); sometimes, you will add metadata to a type like 'bytes' or 'synonyms'. For example, here are the DECIMAL types for MySQL:

class DECIMAL(dbtypes.SQL92DECIMAL):
    synonyms = ['DEC', 'NUMERIC']
    
    default_adapters = dbtypes.SQL92DECIMAL.default_adapters.copy()
    default_adapters[datetime.timedelta] = MySQL_timedelta_to_DECIMAL()
    
    # "Before 3.23.6, precision and scale both must be specified explicitly."
    _precision = 10
    max_precision = 16

class DECIMAL503(DECIMAL):
    max_precision = 64

class DECIMAL505(DECIMAL):
    max_precision = 65
Here, we declare that "DEC" and "NUMERIC" are synonyms for "DECIMAL" when using MySQL. We also attached a custom adapter for Python timedelta, so we could use MySQL's builtin "INTERVAL" keyword to store timedeltas. There's also a separate DECIMAL503 for MySQL version 5.0.3 (which increased the maximum precision of the DECIMAL type), and another for version 5.0.5 (which increased it again).

Adapters

Adapters (subclasses of adapters.Adapter) all need to do four things: push(value, dbtype) converts Geniusql values to properly-escaped SQL, pull(value, dbtype) does the reverse, binary_op(op1, op, sqlop, op2) translates binary operations from Python into equivalent SQL, and compare_op(op1, op, sqlop, op2) does the same for comparison operations. In general, you will define a separate adapter for each pytype-dbtype pair (e.g. float_to_SQL92DOUBLE), and then attach them to the dbtype.default_adapters dict.

Deparser

The SQLDeparser is the tricky bit of any provider. You must receive a set of tables and an Expression, and produce valid SQL for your database from both. For example, given:

tables = (schema['Things'], schema['Places'])
expr = logic.Expression(lambda x, y: x.Category == y.Category)
...your deparser should produce something like:
"[djvThings].[Category] = [djvPlaces].[Category]"
...which the caller can then use in a WHERE or ON clause.

The above example may seem trivial to you, but add in proper quoting, diverse datatypes (like dates and decimals), complex operators (like 'in', 'Like', and 'ieq'), logic functions (like today() and iscurrentweek()), binary operators, null queries, and just-in-time keyword args, and it becomes complex very quickly. You are, in effect, writing a mini-ORM.

But, don't despair. Geniusql provides you with tools to make this task easier:

  1. The most important tool is geniusql.deparse.SQLDeparser, a complete base class. You should be able to tweak it for most databases with a couple of SQL syntax changes.
  2. SQLDeparser is built on a simple Visitor-style base class, astwalk.LambdaDeparser. More complicated extensions are easily added to this base class; each node in the Expression (Python lambda) AST gets its own method call.
  3. You don't have to handle globals or cell references within the lambda--when the lambda gets wrapped in an Expression, all free variables are converted to constants.
  4. You aren't forced to handle every possible operator, function, or term in SQL. The base SQLDeparser doesn't; when it encounters a function it can't handle, for example, it punts by flagging the SQL as imperfect. This signals the caller to run each row through the restriction expression (in pure Python) before yielding it back to the caller. In fact, you can start writing your provider without a deparser at all! Just return all rows of the given tables and use the Expression to filter whole Units. Then, when your provider works, add a deparser.
Database/Table/IndexSet

You'll need a subclass of geniusql.Database. Override the container methods (like __setitem__ and __delitem__). For most popular databases, these are pretty straightforward. Some notes:

Use geniusql/test/zoo_fixture.py to test your new provider. Copy one of the (very short) test_* modules for the other providers, and make the necessary changes for yours. All of the heavy lifting of the tests is done in zoo_fixture.

Legacy Database Wrappers

Sometimes you do not have complete control over the database you want to reference. In that case, you should probably still write a custom Database, Schema, Adapters, and a Deparser. Often, you can get away with providing a simple column mapping to use as you deparse. I've built one, for example, to wrap The Raiser's Edge (third-party fundraising software). My Geniusql model manages directory records and income without regard for the underlying database; a custom provider maps between that ideal model and the Raiser's Edge API. This allows me to integrate data from RE with our custom inventory, invoice, and scheduling software.

One of the more important parts of wrapping existing tables is getting your pretty Python names mapped to ugly database names. Do this by making a custom Schema: override the _column_name and table_name methods to do the mapping.