Contact: fumanchu@aminus.org

Log in as guest/geniusql to create tickets

root/trunk/geniusql/doc/advanced.html

Revision 136 (checked in by fumanchu, 5 years ago)

Doc updates.

Line 
1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
2    "http://www.w3.org/TR/xhtml1/DTD/strict.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4
5 <head>
6     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
7     <title>Geniusql: Advanced Topics</title>
8     <link href='geniusql.css' rel='stylesheet' type='text/css' />
9 </head>
10
11 <body>
12
13 <h2>Advanced Topics</h2>
14 <p>As with all frameworks, Geniusql can't cover every need out-of-the-box.
15 However, Geniusql has been specifically designed to be hackable. In particular,
16 the addition of new providers is a well-defined process. Read on if
17 there's a feature you need that you might consider building yourself.</p>
18
19
20 <a name='sql'><h3>Passing through SQL</h3></a>
21 <p>If you <i>want</i> to keep writing SQL, there's nothing stopping you
22 from doing so. If nothing else, it can be a handy way to prototype or
23 migrate an application, and then replace the SQL with Geniusql API calls
24 later on. You'll need to make your deployers aware that you're using SQL
25 directly (and which DB's your SQL runs on), so they don't try to deploy
26 your application with an unsupported store.</p>
27
28 <p>Use the extension methods built into the Database class to get data:</p>
29
30 <pre>>>> rows, cols = db.fetch("SELECT djvFields.Value, Count(djvCity.ID) AS NumCities "
31                           "FROM djvFields LEFT JOIN djvCity ON djvFields.ID = "
32                           "djvCity.Field GROUP BY djvFields.Value")
33 >>> cols     # [(name, type), (name, type), ...]
34 [(u'Value', 202), (u'NumCities', 3)]
35 >>> rows
36 [(u'Baja California', 3), (u'Ciudad Juarez', 1), (u'Puerto Penasco', 0), (u'Yucatan Peninsula', 1)]
37 </pre>
38
39 <p>...or update:</p>
40
41 <pre>>>> db.execute("UPDATE djvFields SET ShortCode = Left(Value, 1) WHERE ShortCode Is Null;")</pre>
42
43 <p>There are a <b>lot</b> of other things you can do with the builtin
44 Database, Table, Column, and SQLDeparser classes. Feel free to open them
45 up in an interactive session and explore.</p>
46
47 <a name='custom'><h3>Custom Providers</h3></a>
48 <p>In most cases, you will add new functionality to Geniusql itself by
49 creating a custom "provider" (set of Database/Table/Column/etc objects)
50 for a new backend. Providers must conform to simple interfaces which are
51 specific to each class. They are free to implement that functionality
52 however they like.</p>
53
54
55 <h4>Generic Database Wrappers</h4>
56 <p>Writing a provider for a database is relatively straightforward,
57 mostly because Geniusql doesn't have complicated storage interfaces or
58 demands. If you find your application depends heavily upon using advanced
59 features of a particular database, or upon hand-crafted SQL, then Geniusql
60 is not for you or your application. A Geniusql provider module for a database
61 usually includes:</p>
62 <ol>
63     <li>DatabaseType objects, which model database types and their properties.
64         Base classes can be found in <tt>geniusql.dbtypes</tt>.</li>
65     <li>Adapters, which coerce values from Python types to database types
66         and back again. Base classes can be found in
67         <tt>geniusql.adapters</tt>.</li>
68     <li>An SQLDeparser, which converts geniusql <tt>Expression</tt> objects
69         (essentially, Python lambdas) into SQL.</li>
70     <li>Subclasses of <tt>geniusql.Database</tt>, <tt>Schema</tt>,
71         <tt>Table</tt>, <tt>Column</tt>, and <tt>IndexSet</tt>.
72         These handle requests to SELECT data using
73         the above components, as well as ALTER TABLE, etc.</li>
74     <li>A <tt>ConnectionManager</tt>, which provides pooling (or not), plus
75         transaction management.</li>
76 </ol>
77
78
79 <h5>Type Mapping</h5>
80 <p>Any database code in a general-purpose programming language will
81 eventually have to come to terms with the gap between native code types
82 and native storage types. For us, this means matching Python types
83 (like int and datetime) to database types (like INT8 and TEXT).
84 Geniusql provides this layer for databases by using a mapping layer
85 between your model code (Tables and Columns) and the underlying tables
86 and columns. The implementation of that is unimportant (and possibly
87 storage-dependent), but Geniusql needs to know the database types
88 in effect in order to translate data safely.</p>
89
90 <p>If your application has created all of its own tables using Geniusql,
91 then there is generally nothing to worry about in terms of the "type gap";
92 Geniusql will default to creating columns of the types it knows best.
93 But if you are building a Geniusql interface onto an existing database,
94 or if you customize/optimize your database by hand, then you should
95 look into creating your own DatabaseType and/or Adapter subclasses.</p>
96
97
98 <h5>DatabaseTypes</h5>
99
100 <p>You model a database by making new subclasses of the base classes
101 in <tt>geniusql.dbtypes</tt>, one new subclass for each named type
102 your database exposes. Mostly, this means giving new names to SQL92
103 types (that geniusql already provides); sometimes, you will add metadata
104 to a type like 'bytes' or 'synonyms'. For example, here are the DECIMAL
105 types for MySQL:
106 <pre>
107 class DECIMAL(dbtypes.SQL92DECIMAL):
108     synonyms = ['DEC', 'NUMERIC']
109    
110     default_adapters = dbtypes.SQL92DECIMAL.default_adapters.copy()
111     default_adapters[datetime.timedelta] = MySQL_timedelta_to_DECIMAL()
112    
113     # "Before 3.23.6, precision and scale both must be specified explicitly."
114     _precision = 10
115     max_precision = 16
116
117 class DECIMAL503(DECIMAL):
118     max_precision = 64
119
120 class DECIMAL505(DECIMAL):
121     max_precision = 65
122 </pre>
123 Here, we declare that "DEC" and "NUMERIC" are synonyms for "DECIMAL" when
124 using MySQL. We also attached a custom adapter for Python <tt>timedelta</tt>,
125 so we could use MySQL's builtin "INTERVAL" keyword to store timedeltas.
126 There's also a separate DECIMAL503 for MySQL version 5.0.3 (which increased
127 the maximum precision of the DECIMAL type), and another for version 5.0.5
128 (which increased it again).</p>
129
130 <h5>Adapters</h5>
131 <p>Adapters (subclasses of <tt>adapters.Adapter</tt>) all need to do four
132 things: <tt class='def'>push(value, dbtype)</tt> converts Geniusql values
133 to properly-escaped SQL, <tt class='def'>pull(value, dbtype)</tt>
134 does the reverse, <tt class='def'>binary_op(op1, op, sqlop, op2)</tt>
135 translates binary operations from Python into equivalent SQL, and
136 <tt class='def'>compare_op(op1, op, sqlop, op2)</tt> does the same for
137 comparison operations. In general, you will define a separate adapter
138 for each pytype-dbtype pair (e.g. float_to_SQL92DOUBLE), and then attach
139 them to the <tt>dbtype.default_adapters</tt> dict.</p>
140
141
142 <h5>Deparser</h5>
143 <p>The SQLDeparser is the tricky bit of any provider. You must receive
144 a set of tables and an Expression, and produce valid SQL for your database
145 from both. For example, given:
146 <pre>tables = (schema['Things'], schema['Places'])
147 expr = logic.Expression(lambda x, y: x.Category == y.Category)</pre>
148 ...your deparser should produce something like:
149 <pre>"[djvThings].[Category] = [djvPlaces].[Category]"</pre>
150 ...which the caller can then use in a WHERE or ON clause.</p>
151
152 <p>The above example may seem trivial to you, but add in proper quoting,
153 diverse datatypes (like dates and decimals), complex operators (like 'in',
154 'Like', and 'ieq'), logic functions (like today() and iscurrentweek()),
155 binary operators, null queries, and just-in-time keyword args, and it
156 becomes complex very quickly. You are, in effect, writing a mini-ORM.</p>
157
158 <p><b>But</b>, don't despair. Geniusql provides you with tools to make this
159 task easier:</p>
160 <ol>
161     <li>The most important tool is <tt>geniusql.deparse.SQLDeparser</tt>,
162         a complete base class. You should be able to tweak it for most
163         databases with a couple of SQL syntax changes.</li>
164     <li><tt>SQLDeparser</tt> is built on a simple Visitor-style base
165         class, <tt>astwalk.LambdaDeparser</tt>. More complicated
166         extensions are easily added to this base class; each node in
167         the Expression (Python lambda) AST gets its own method call.</li>
168     <li>You don't have to handle globals or cell references within the
169         lambda--when the lambda gets wrapped in an Expression, all free
170         variables are converted to constants.</li>
171     <li>You aren't <i>forced</i> to handle every possible operator, function,
172         or term in SQL. The base SQLDeparser doesn't; when it encounters a
173         function it can't handle, for example, it punts by flagging the SQL
174         as <i>imperfect</i>. This signals the caller to run each row through
175         the restriction expression (in pure Python) before yielding it back
176         to the caller. In fact, you can start writing your provider
177         without a deparser at all! Just return all rows of the
178         given tables and use the Expression to filter whole Units. Then,
179         when your provider works, add a deparser.</li>
180 </ol>
181
182
183 <h5>Database/Table/IndexSet</h5>
184 <p>You'll need a subclass of <tt>geniusql.Database</tt>. Override the
185 container methods (like <tt>__setitem__</tt> and <tt>__delitem__</tt>).
186 For most popular databases, these are pretty straightforward. Some notes:</p>
187 <ul>
188     <li><b>select/where</b>: If no Expression is supplied, return all rows.
189         Otherwise, use a deparser to produce SQL which you can then use
190         to grab data from storage. Use each value to populate a row
191         (use an Adapter for type coercion), and yield each row back to the
192         caller. In general, it's faster to slurp all the data in at once
193         than to make a separate call for each row.</li>
194     <li><b>_get_tables/_columns/_indices</b>: use your database's
195         schema-inspection tools to tell Geniusql the names, datatypes,
196         and other metadata that actually exists in each deployed
197         database.</li>
198 </ul>
199
200 <p>Use <tt>geniusql/test/zoo_fixture.py</tt> to test your new provider.
201 Copy one of the (<i>very</i> short) test_* modules for the other
202 providers, and make the necessary changes for yours. All of the heavy
203 lifting of the tests is done in zoo_fixture.</p>
204
205 <h4>Legacy Database Wrappers</h4>
206 <p>Sometimes you do not have complete control over the database you want to
207 reference. In that case, you should probably still write a custom Database,
208 Schema, Adapters, and a Deparser. Often, you can get away with providing
209 a simple column mapping to use as you deparse. I've built one, for example,
210 to wrap <a href='http://www.blackbaud.com/solutions/raisersedge.asp'>The
211 Raiser's Edge</a> (third-party fundraising software). My Geniusql model
212 manages directory records and income without regard for the underlying
213 database; a custom provider maps between that ideal model and the Raiser's
214 Edge API. This allows me to integrate data from RE with our custom
215 inventory, invoice, and scheduling software.</p>
216
217 <p>One of the more important parts of wrapping existing tables is getting
218 your pretty Python names mapped to ugly database names. Do this by making
219 a custom Schema: override the <tt>_column_name</tt> and
220 <tt>table_name</tt> methods to do the mapping.</p>
221
222 </body>
223 </html>
Note: See TracBrowser for help on using the browser.