| 1 |
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" |
|---|
| 2 |
"http://www.w3.org/TR/xhtml1/DTD/strict.dtd"> |
|---|
| 3 |
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> |
|---|
| 4 |
|
|---|
| 5 |
<head> |
|---|
| 6 |
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> |
|---|
| 7 |
<title>Geniusql: Advanced Topics</title> |
|---|
| 8 |
<link href='geniusql.css' rel='stylesheet' type='text/css' /> |
|---|
| 9 |
</head> |
|---|
| 10 |
|
|---|
| 11 |
<body> |
|---|
| 12 |
|
|---|
| 13 |
<h2>Advanced Topics</h2> |
|---|
| 14 |
<p>As with all frameworks, Geniusql can't cover every need out-of-the-box. |
|---|
| 15 |
However, Geniusql has been specifically designed to be hackable. In particular, |
|---|
| 16 |
the addition of new providers is a well-defined process. Read on if |
|---|
| 17 |
there's a feature you need that you might consider building yourself.</p> |
|---|
| 18 |
|
|---|
| 19 |
|
|---|
| 20 |
<a name='sql'><h3>Passing through SQL</h3></a> |
|---|
| 21 |
<p>If you <i>want</i> to keep writing SQL, there's nothing stopping you |
|---|
| 22 |
from doing so. If nothing else, it can be a handy way to prototype or |
|---|
| 23 |
migrate an application, and then replace the SQL with Geniusql API calls |
|---|
| 24 |
later on. You'll need to make your deployers aware that you're using SQL |
|---|
| 25 |
directly (and which DB's your SQL runs on), so they don't try to deploy |
|---|
| 26 |
your application with an unsupported store.</p> |
|---|
| 27 |
|
|---|
| 28 |
<p>Use the extension methods built into the Database class to get data:</p> |
|---|
| 29 |
|
|---|
| 30 |
<pre>>>> rows, cols = db.fetch("SELECT djvFields.Value, Count(djvCity.ID) AS NumCities " |
|---|
| 31 |
"FROM djvFields LEFT JOIN djvCity ON djvFields.ID = " |
|---|
| 32 |
"djvCity.Field GROUP BY djvFields.Value") |
|---|
| 33 |
>>> cols # [(name, type), (name, type), ...] |
|---|
| 34 |
[(u'Value', 202), (u'NumCities', 3)] |
|---|
| 35 |
>>> rows |
|---|
| 36 |
[(u'Baja California', 3), (u'Ciudad Juarez', 1), (u'Puerto Penasco', 0), (u'Yucatan Peninsula', 1)] |
|---|
| 37 |
</pre> |
|---|
| 38 |
|
|---|
| 39 |
<p>...or update:</p> |
|---|
| 40 |
|
|---|
| 41 |
<pre>>>> db.execute("UPDATE djvFields SET ShortCode = Left(Value, 1) WHERE ShortCode Is Null;")</pre> |
|---|
| 42 |
|
|---|
| 43 |
<p>There are a <b>lot</b> of other things you can do with the builtin |
|---|
| 44 |
Database, Table, Column, and SQLDeparser classes. Feel free to open them |
|---|
| 45 |
up in an interactive session and explore.</p> |
|---|
| 46 |
|
|---|
| 47 |
<a name='custom'><h3>Custom Providers</h3></a> |
|---|
| 48 |
<p>In most cases, you will add new functionality to Geniusql itself by |
|---|
| 49 |
creating a custom "provider" (set of Database/Table/Column/etc objects) |
|---|
| 50 |
for a new backend. Providers must conform to simple interfaces which are |
|---|
| 51 |
specific to each class. They are free to implement that functionality |
|---|
| 52 |
however they like.</p> |
|---|
| 53 |
|
|---|
| 54 |
|
|---|
| 55 |
<h4>Generic Database Wrappers</h4> |
|---|
| 56 |
<p>Writing a provider for a database is relatively straightforward, |
|---|
| 57 |
mostly because Geniusql doesn't have complicated storage interfaces or |
|---|
| 58 |
demands. If you find your application depends heavily upon using advanced |
|---|
| 59 |
features of a particular database, or upon hand-crafted SQL, then Geniusql |
|---|
| 60 |
is not for you or your application. A Geniusql provider module for a database |
|---|
| 61 |
usually includes:</p> |
|---|
| 62 |
<ol> |
|---|
| 63 |
<li>DatabaseType objects, which model database types and their properties. |
|---|
| 64 |
Base classes can be found in <tt>geniusql.dbtypes</tt>.</li> |
|---|
| 65 |
<li>Adapters, which coerce values from Python types to database types |
|---|
| 66 |
and back again. Base classes can be found in |
|---|
| 67 |
<tt>geniusql.adapters</tt>.</li> |
|---|
| 68 |
<li>An SQLDeparser, which converts geniusql <tt>Expression</tt> objects |
|---|
| 69 |
(essentially, Python lambdas) into SQL.</li> |
|---|
| 70 |
<li>Subclasses of <tt>geniusql.Database</tt>, <tt>Schema</tt>, |
|---|
| 71 |
<tt>Table</tt>, <tt>Column</tt>, and <tt>IndexSet</tt>. |
|---|
| 72 |
These handle requests to SELECT data using |
|---|
| 73 |
the above components, as well as ALTER TABLE, etc.</li> |
|---|
| 74 |
<li>A <tt>ConnectionManager</tt>, which provides pooling (or not), plus |
|---|
| 75 |
transaction management.</li> |
|---|
| 76 |
</ol> |
|---|
| 77 |
|
|---|
| 78 |
|
|---|
| 79 |
<h5>Type Mapping</h5> |
|---|
| 80 |
<p>Any database code in a general-purpose programming language will |
|---|
| 81 |
eventually have to come to terms with the gap between native code types |
|---|
| 82 |
and native storage types. For us, this means matching Python types |
|---|
| 83 |
(like int and datetime) to database types (like INT8 and TEXT). |
|---|
| 84 |
Geniusql provides this layer for databases by using a mapping layer |
|---|
| 85 |
between your model code (Tables and Columns) and the underlying tables |
|---|
| 86 |
and columns. The implementation of that is unimportant (and possibly |
|---|
| 87 |
storage-dependent), but Geniusql needs to know the database types |
|---|
| 88 |
in effect in order to translate data safely.</p> |
|---|
| 89 |
|
|---|
| 90 |
<p>If your application has created all of its own tables using Geniusql, |
|---|
| 91 |
then there is generally nothing to worry about in terms of the "type gap"; |
|---|
| 92 |
Geniusql will default to creating columns of the types it knows best. |
|---|
| 93 |
But if you are building a Geniusql interface onto an existing database, |
|---|
| 94 |
or if you customize/optimize your database by hand, then you should |
|---|
| 95 |
look into creating your own DatabaseType and/or Adapter subclasses.</p> |
|---|
| 96 |
|
|---|
| 97 |
|
|---|
| 98 |
<h5>DatabaseTypes</h5> |
|---|
| 99 |
|
|---|
| 100 |
<p>You model a database by making new subclasses of the base classes |
|---|
| 101 |
in <tt>geniusql.dbtypes</tt>, one new subclass for each named type |
|---|
| 102 |
your database exposes. Mostly, this means giving new names to SQL92 |
|---|
| 103 |
types (that geniusql already provides); sometimes, you will add metadata |
|---|
| 104 |
to a type like 'bytes' or 'synonyms'. For example, here are the DECIMAL |
|---|
| 105 |
types for MySQL: |
|---|
| 106 |
<pre> |
|---|
| 107 |
class DECIMAL(dbtypes.SQL92DECIMAL): |
|---|
| 108 |
synonyms = ['DEC', 'NUMERIC'] |
|---|
| 109 |
|
|---|
| 110 |
default_adapters = dbtypes.SQL92DECIMAL.default_adapters.copy() |
|---|
| 111 |
default_adapters[datetime.timedelta] = MySQL_timedelta_to_DECIMAL() |
|---|
| 112 |
|
|---|
| 113 |
# "Before 3.23.6, precision and scale both must be specified explicitly." |
|---|
| 114 |
_precision = 10 |
|---|
| 115 |
max_precision = 16 |
|---|
| 116 |
|
|---|
| 117 |
class DECIMAL503(DECIMAL): |
|---|
| 118 |
max_precision = 64 |
|---|
| 119 |
|
|---|
| 120 |
class DECIMAL505(DECIMAL): |
|---|
| 121 |
max_precision = 65 |
|---|
| 122 |
</pre> |
|---|
| 123 |
Here, we declare that "DEC" and "NUMERIC" are synonyms for "DECIMAL" when |
|---|
| 124 |
using MySQL. We also attached a custom adapter for Python <tt>timedelta</tt>, |
|---|
| 125 |
so we could use MySQL's builtin "INTERVAL" keyword to store timedeltas. |
|---|
| 126 |
There's also a separate DECIMAL503 for MySQL version 5.0.3 (which increased |
|---|
| 127 |
the maximum precision of the DECIMAL type), and another for version 5.0.5 |
|---|
| 128 |
(which increased it again).</p> |
|---|
| 129 |
|
|---|
| 130 |
<h5>Adapters</h5> |
|---|
| 131 |
<p>Adapters (subclasses of <tt>adapters.Adapter</tt>) all need to do four |
|---|
| 132 |
things: <tt class='def'>push(value, dbtype)</tt> converts Geniusql values |
|---|
| 133 |
to properly-escaped SQL, <tt class='def'>pull(value, dbtype)</tt> |
|---|
| 134 |
does the reverse, <tt class='def'>binary_op(op1, op, sqlop, op2)</tt> |
|---|
| 135 |
translates binary operations from Python into equivalent SQL, and |
|---|
| 136 |
<tt class='def'>compare_op(op1, op, sqlop, op2)</tt> does the same for |
|---|
| 137 |
comparison operations. In general, you will define a separate adapter |
|---|
| 138 |
for each pytype-dbtype pair (e.g. float_to_SQL92DOUBLE), and then attach |
|---|
| 139 |
them to the <tt>dbtype.default_adapters</tt> dict.</p> |
|---|
| 140 |
|
|---|
| 141 |
|
|---|
| 142 |
<h5>Deparser</h5> |
|---|
| 143 |
<p>The SQLDeparser is the tricky bit of any provider. You must receive |
|---|
| 144 |
a set of tables and an Expression, and produce valid SQL for your database |
|---|
| 145 |
from both. For example, given: |
|---|
| 146 |
<pre>tables = (schema['Things'], schema['Places']) |
|---|
| 147 |
expr = logic.Expression(lambda x, y: x.Category == y.Category)</pre> |
|---|
| 148 |
...your deparser should produce something like: |
|---|
| 149 |
<pre>"[djvThings].[Category] = [djvPlaces].[Category]"</pre> |
|---|
| 150 |
...which the caller can then use in a WHERE or ON clause.</p> |
|---|
| 151 |
|
|---|
| 152 |
<p>The above example may seem trivial to you, but add in proper quoting, |
|---|
| 153 |
diverse datatypes (like dates and decimals), complex operators (like 'in', |
|---|
| 154 |
'Like', and 'ieq'), logic functions (like today() and iscurrentweek()), |
|---|
| 155 |
binary operators, null queries, and just-in-time keyword args, and it |
|---|
| 156 |
becomes complex very quickly. You are, in effect, writing a mini-ORM.</p> |
|---|
| 157 |
|
|---|
| 158 |
<p><b>But</b>, don't despair. Geniusql provides you with tools to make this |
|---|
| 159 |
task easier:</p> |
|---|
| 160 |
<ol> |
|---|
| 161 |
<li>The most important tool is <tt>geniusql.deparse.SQLDeparser</tt>, |
|---|
| 162 |
a complete base class. You should be able to tweak it for most |
|---|
| 163 |
databases with a couple of SQL syntax changes.</li> |
|---|
| 164 |
<li><tt>SQLDeparser</tt> is built on a simple Visitor-style base |
|---|
| 165 |
class, <tt>astwalk.LambdaDeparser</tt>. More complicated |
|---|
| 166 |
extensions are easily added to this base class; each node in |
|---|
| 167 |
the Expression (Python lambda) AST gets its own method call.</li> |
|---|
| 168 |
<li>You don't have to handle globals or cell references within the |
|---|
| 169 |
lambda--when the lambda gets wrapped in an Expression, all free |
|---|
| 170 |
variables are converted to constants.</li> |
|---|
| 171 |
<li>You aren't <i>forced</i> to handle every possible operator, function, |
|---|
| 172 |
or term in SQL. The base SQLDeparser doesn't; when it encounters a |
|---|
| 173 |
function it can't handle, for example, it punts by flagging the SQL |
|---|
| 174 |
as <i>imperfect</i>. This signals the caller to run each row through |
|---|
| 175 |
the restriction expression (in pure Python) before yielding it back |
|---|
| 176 |
to the caller. In fact, you can start writing your provider |
|---|
| 177 |
without a deparser at all! Just return all rows of the |
|---|
| 178 |
given tables and use the Expression to filter whole Units. Then, |
|---|
| 179 |
when your provider works, add a deparser.</li> |
|---|
| 180 |
</ol> |
|---|
| 181 |
|
|---|
| 182 |
|
|---|
| 183 |
<h5>Database/Table/IndexSet</h5> |
|---|
| 184 |
<p>You'll need a subclass of <tt>geniusql.Database</tt>. Override the |
|---|
| 185 |
container methods (like <tt>__setitem__</tt> and <tt>__delitem__</tt>). |
|---|
| 186 |
For most popular databases, these are pretty straightforward. Some notes:</p> |
|---|
| 187 |
<ul> |
|---|
| 188 |
<li><b>select/where</b>: If no Expression is supplied, return all rows. |
|---|
| 189 |
Otherwise, use a deparser to produce SQL which you can then use |
|---|
| 190 |
to grab data from storage. Use each value to populate a row |
|---|
| 191 |
(use an Adapter for type coercion), and yield each row back to the |
|---|
| 192 |
caller. In general, it's faster to slurp all the data in at once |
|---|
| 193 |
than to make a separate call for each row.</li> |
|---|
| 194 |
<li><b>_get_tables/_columns/_indices</b>: use your database's |
|---|
| 195 |
schema-inspection tools to tell Geniusql the names, datatypes, |
|---|
| 196 |
and other metadata that actually exists in each deployed |
|---|
| 197 |
database.</li> |
|---|
| 198 |
</ul> |
|---|
| 199 |
|
|---|
| 200 |
<p>Use <tt>geniusql/test/zoo_fixture.py</tt> to test your new provider. |
|---|
| 201 |
Copy one of the (<i>very</i> short) test_* modules for the other |
|---|
| 202 |
providers, and make the necessary changes for yours. All of the heavy |
|---|
| 203 |
lifting of the tests is done in zoo_fixture.</p> |
|---|
| 204 |
|
|---|
| 205 |
<h4>Legacy Database Wrappers</h4> |
|---|
| 206 |
<p>Sometimes you do not have complete control over the database you want to |
|---|
| 207 |
reference. In that case, you should probably still write a custom Database, |
|---|
| 208 |
Schema, Adapters, and a Deparser. Often, you can get away with providing |
|---|
| 209 |
a simple column mapping to use as you deparse. I've built one, for example, |
|---|
| 210 |
to wrap <a href='http://www.blackbaud.com/solutions/raisersedge.asp'>The |
|---|
| 211 |
Raiser's Edge</a> (third-party fundraising software). My Geniusql model |
|---|
| 212 |
manages directory records and income without regard for the underlying |
|---|
| 213 |
database; a custom provider maps between that ideal model and the Raiser's |
|---|
| 214 |
Edge API. This allows me to integrate data from RE with our custom |
|---|
| 215 |
inventory, invoice, and scheduling software.</p> |
|---|
| 216 |
|
|---|
| 217 |
<p>One of the more important parts of wrapping existing tables is getting |
|---|
| 218 |
your pretty Python names mapped to ugly database names. Do this by making |
|---|
| 219 |
a custom Schema: override the <tt>_column_name</tt> and |
|---|
| 220 |
<tt>table_name</tt> methods to do the mapping.</p> |
|---|
| 221 |
|
|---|
| 222 |
</body> |
|---|
| 223 |
</html> |
|---|