Contact: fumanchu@aminus.org

Log in as guest/geniusql to create tickets

root/trunk/geniusql/doc/modeling.html

Revision 279 (checked in by fumanchu, 3 years ago)

Doc update.

Line 
1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
2    "http://www.w3.org/TR/xhtml1/DTD/strict.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4
5 <head>
6     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
7     <title>Geniusql: Modeling your Application</title>
8     <link href='geniusql.css' rel='stylesheet' type='text/css' />
9 </head>
10
11 <body>
12
13 <h2>Application Designers: Using Geniusql to Construct a Domain Model</h2>
14
15 <div style='text-align: center'><img src='gobjects.gif' alt='Geniusql Component Diagram' /></div>
16
17 <a name='databases'><h3>The Database Object</h3></a>
18 <p>The topmost class in Geniusql is the <tt>Database</tt> class. When
19 building a Geniusql application, you must first create an instance of it.
20 In general, you should ask geniusql to create a Database object for
21 you by using the top-level <tt class='def'>db(provider, **opts)</tt> function.
22 This allows you to pick a backend by name from <tt>providers.registry</tt>
23 (a map from short names to class names). For example, you can open an
24 SQLite database via:
25 <pre>import geniusql
26 db = geniusql.db("sqlite", name=':memory:')
27 db.create()</pre>
28
29 <p>In addition to <tt class='def'>create()</tt>, all Databases
30 also have <tt class='def'>drop()</tt> and <tt class='def'>exists()</tt> methods.</p>
31
32 <a name='schemas'><h3>Schemas</h3></a>
33 <p>Closely tied to the Database object are one or more corresponding
34 <tt>Schema</tt> objects. Usually, you ask the Database for a compatible
35 Schema object via its <tt class='def'>schema(name=None)</tt> method. For example:
36 <pre>schema = db.schema()
37 schema.create()</pre>
38 </p>
39
40 <p>Some database systems, like Firebird and SQLite, use a separate file
41 on disk for each Schema, and require you to refer to the database you want
42 by passing the name of the file. Consequently, they need a separate
43 Database object for each Schema. For such systems, their Database object
44 possesses a <tt>multischema</tt> attribute which is <tt>False</tt>. Many
45 other providers allow you to use a single Database with multiple schemas;
46 these have their <tt>multischema</tt> attribute set to <tt>True</tt>.
47
48 <a name='autoclass'><h4>Automatic Table Classes</h4></a>
49
50 <p>When you create your first Geniusql model, you might be forming it
51 to match some existing database schema. If so, Geniusql has methods to
52 help you. The <tt class='def'>schema.discover(tablename, conn=None)</tt>
53 method will create a Table class for the named database table, or, you can
54 discover all Tables at once via <tt class='def'>discover_all(conn=None)</tt>.
55 These methods automatically attach the tables they find in the database
56 to the schema. [This is fine for development purposes, but when you go
57 into production, it may result in slow startup times. It's usually faster
58 to write complete models yourself using Table and Column declarations.]</p>
59
60 <p>If you want to use a different key for the discovered object(s), use the
61 <tt class='def'>alias(oldname, newname)</tt> method of any Schema, Table, or
62 IndexSet container:
63 <pre>table = self.schema.discover(tablename)
64 self.schema.alias(table.name, newname)</pre>
65 </p>
66
67 <a name='tables'><h3>Tables</h3></a>
68 <p>Unlike more implicit ORM's, Geniusql models actual database objects
69 very explicitly. You begin persisting data, therefore, by creating
70 instances of <tt>geniusql.Table</tt>, one for each table in your
71 database:
72 <pre>Printer = schema.table('Printer')</pre>
73 This is all you need for a fully-functioning table. There is no base
74 class that you are required to override; simply create a Table object.
75 [We actually ask the Schema to create one for us that's specific to the
76 underlying DB.] However, this is a fairly uninteresting table, since it
77 has no columns! The first thing we will probably want to add to our new
78 table is a set of those (see below).</p>
79
80 <h4>Creating Tables in the Database</h4>
81 A schema is a dict-like set of Table objects. When you add or delete items
82 from a Schema object, appropriate CREATE TABLE/DROP TABLE commands are
83 executed. Therefore, a new Table object should have all of its columns
84 populated before adding it to the Schema. Once we've done that (see below),
85 we only need to add the Table to the Schema:
86 <pre>schema['Printer'] = Printer</pre>
87 The key that you use to refer to the table in the schema dict does not
88 need to be the same as the <tt>Table.name</tt> (or <tt>Table.qname</tt>,
89 the "quoted name" for use by the DB). In particular, this allows you
90 to use separate capitalization rules for various layers.</p>
91
92 <a name='columns'><h3>Columns</h3></a>
93 <p>Once you have defined a table (but before attaching it to the Schema),
94 define columns for it by asking the Schema for backend-specific Column
95 objects.</p>
96
97 <p>We might enhance our Printer example thusly:
98 <pre>Printer = schema.table('Printer')
99 Printer['ID'] = schema.column(int, autoincrement=True, key=True)
100 Printer['Manufacturer'] = schema.column(unicode)
101 Printer['ColorCopies'] = schema.column(bool)
102 Printer['PPM'] = schema.column(float)</pre>
103 This adds four columns to our <tt>Printer</tt> table, each with a
104 different datatype.</p>
105
106 <p>You manipulate rows by using <tt>insert</tt>, <tt>save</tt>, and
107 <tt>delete</tt> methods of the Table. The insert method returns a
108 dict of the inserted data:
109 <pre>>>> p = Printer.insert(PPM=25)</pre>
110 which you can then inspect:
111 <pre>
112 >>> p['PPM']
113 25</pre>
114 However, rest assured that the int value we provided was coerced to a float
115 when we inserted it into the database. This is because we specified the PPM
116 attribute as a 'float' type when we created it. The value of a Column is
117 restricted to the type which you specify. The only other valid value for
118 a Column (of any type) is None; any Column may be None at any time,
119 and in fact, all Column values are None until you assign values to them:
120 <pre>>>> p['ColorCopies'] is None
121 True</pre></p>
122
123 <h4>The datetime.datetime type</h4>
124 <p>If you use <tt>datetime.datetime</tt> for the type of a Column,
125 most providers will throw away the microseconds. This is an
126 unfortunate oversight that may be corrected sometime in the future.</p>
127
128 <h4>ID Columns</h4>
129 <p>All Columns possess a <tt>key</tt> attribute; if True, the Column
130 is used to define the uniqueness of a row. Typically, you will use a
131 single 'ID' column, so there will be only one primary key. If you wish
132 to use keys of a different number, types, or names, simply set the 'key'
133 attribute to True for each identifying Column:</p>
134
135 <pre>Printer = schema.table('printer')
136 Printer['Model'] = schema.column(unicode, key=True)
137 Printer['UnitNumber'] = schema.column(int, key=True)
138 </pre>
139
140 <p>In rare cases, you may even desire a Table that has no keys.
141 For example, an activity log that only needs inserts and no updates
142 may not need unique identifiers for each row; leaving them out
143 can make inserts faster, since there are no constraints to satisfy
144 and no indices to update.</p>
145
146 <h4>Column Metadata</h4>
147 <p>Columns are custom Python objects. This is significant, because it
148 allows us to store metadata about the columns themselves:
149 <pre>>>> c = Printer['UnitNumber']
150 >>> c.pytype, c.dbtype, c.default, c.key, c.name, c.qname
151 (&lt;type 'int'>, geniusql.providers.mysql.MEDIUMINT(bytes=4, max_bytes=4,
152 signed=True), 0, True, 'unitnumber', '`unitnumber`')</pre>
153
154 When you define a Column instance, you can pass in these extra
155 attributes. You can do it by hand, but in general, you should ask your
156 Schema object to make a Column object for you using its
157 <tt class='def'>column(pytype=unicode, dbtype=None, default=None, key=False,
158 autoincrement=False, hints=None)</tt> method. This ensures that all of the
159 cooperating objects (like indexes, datatypes, and type adapters) use the
160 correct specialized subclasses for your database of choice.</p>
161
162 <p>Supply any, all, or none of the attributes as needed.
163 The <tt>name</tt> attribute is set for you when you bind the column
164 to a Table (i.e. <tt>Table['name'] = schema.column(...)</tt>).
165 The <tt>pytype</tt> attribute limits column values to instances
166 of that type (or <tt>None</tt>). The <tt>key</tt> attribute
167 declares whether the column is part of the table's primary key
168 or not. Finally, the <tt>hints</tt> dictionary provides hints to backend
169 providers to help optimize storage. If you write a custom provider,
170 you may define and use your own hints. Here are the ones that most
171 builtin providers understand:</p>
172
173 <table>
174 <tr><th>Key</th><th>Values</th><th>Description</th></tr>
175 <tr>
176     <td>bytes</td>
177     <td>b &gt;= 0</td>
178     <td>Inform providers that a particular column will never exceed <i>b</i>
179         bytes. This applies to <tt>long</tt> and <tt>int</tt> columns,
180         as well as <tt>str</tt> and <tt>unicode</tt>. A value of 0 implies
181         no limit. If not specified, a default maximum will be used. Many
182         backends default to 255. Check your provider.</td>
183 </tr>
184 <tr>
185     <td>scale</td>
186     <td>s &gt;= 0</td>
187     <td>Scale is the number of digits to the right of the decimal point
188         in a NUMERIC (fixedpoint or decimal) field. This hint informs
189         providers that would usually store such data at a default scale
190         (usually 2), that the column should use a different scale.</td>
191 </tr>
192 <tr>
193     <td>precision</td>
194     <td>p &gt;= 0</td>
195     <td>Precision is the total number of <b>decimal</b> digits in a NUMERIC
196         (<tt>fixedpoint</tt> or <tt>decimal</tt>) field, or the total
197         number of <b>binary</b> digits in a <tt>float</tt> field.
198         This hint informs providers that the column will never exceed
199         <i>p</i> digits. If missing, the provider will supply a
200         default maximum precision. For example, PostgreSQL can handle
201         1000 decimal digits. If explicitly set to 0 (zero), the
202         provider will allow unlimited precision, if possible.
203         Note that the <tt>fixedpoint</tt> module uses the word
204         "precision" where we use the word "scale"; it actually has
205         unlimited precision (as we use the word). The <tt>decimal</tt>
206         module, in contrast, uses limited precision but no scale.</td>
207 </tr>
208 <tr>
209     <td>signed</td>
210     <td>True/False</td>
211     <td>If True, the column will use a signed numeric type
212     (if available).</td>
213 </tr>
214 </table>
215
216 <h4>Sequencing</h4>
217 <p>Most Tables have one or more keys (identifiers). Typically, there is only
218 one, of type <tt>int</tt>; however, you can use as many of whatever types
219 you need. As long as you provide your own identifier values, nothing will
220 break--you can save and select data without problems. However, if you save
221 a row without providing identifier values, the database may need to provide
222 identifier values for it.</p>
223
224 <p><tt>Column</tt> objects possesses three attributes to define sequencing:
225 <ul>
226     <li><tt>autoincrement</tt>: a <tt>bool</tt> which signals whether values
227     should be auto-generated by providers.</li>
228     <li><tt>sequence_name</tt>: for databases that use separate SQL statements
229     to create and drop sequences, this stores the name of the sequence.</li>
230     <li><tt>initial</tt>: holds the initial value for the sequence.</li>
231 </ul>
232 </p>
233
234 <a name='rows'><h3>Rows</h3></a>
235 <p>In Geniusql, rows are always manipulated as normal Python dictionaries or
236 lists. There's just data, no code; nothing to override, no magic behavior,
237 and no hidden effects. If you want a fancy object per row, you can certainly
238 implement that on top of Geniusql
239 (as <a href='http://projects.amor.org/dejavu'>Dejavu</a> does), but if
240 you don't need it, why bother? Geniusql is much more <i>relational</i>
241 than other ORM's, and therefore more powerful: how many other ORM's can
242 generate backend-specific SQL like
243 <tt><a href='managing.html#insert_into'>INSERT INTO</a> <i>newtable</i>
244 SELECT <i>columns</i> FROM <i>join</i></tt> for you?</p>
245
246 <h4>Single Rows</h4>
247 <p>The Table methods <tt class='def'>insert(**kwargs)</tt>,
248 <tt class='def'>save(**kwargs)</tt>, and <tt class='def'>delete(**kwargs)</tt>
249 all take a (keyword-arg) dict of data. Typically,
250 you will use these methods to manipulate a single row:</p>
251
252 <pre>
253 brian = Person.insert(Name='Brian Norris')
254 brian['Age'] = 39
255 Person.save(**brian)
256 ...
257 Person.delete(**brian)
258 </pre>
259 Geniusql knows which columns in the Person table are primary keys
260 (unique identifiers), so you don't have to. Just pass the whole dict
261 for a given row to each method. We could also have written the
262 <tt>insert</tt> call above like this:
263
264 <pre>
265 brian = {'Name': 'Brian Norris'}
266 brian = Person.insert(**brian)
267 </pre>
268
269 It's important to understand that <tt>insert</tt> returns a <i>copy</i>
270 of the data dict you pass to it, often with new autoincrement values.
271 So we make sure to re-bind 'brian' above to the returned dict
272 since we're going to keep manipulating it.
273 </p>
274
275 <h4>Multiple Rows</h4>
276 <p>There are additional Table methods for manipulating multiple rows.
277 The <tt class='def'>save_all(data, restriction=None, **kwargs)</tt> method
278 allows you to update rows based on <i>any</i> criteria, not just the
279 identifier columns Geniusql knows about. So, for example,
280 you could manage the zip code change for Bethesda, MD via:
281 <pre>Address.save_all(data={'Zip': 20814}, Zip=20014)</pre>
282 Likewise, you can delete multiple rows with
283 <tt class='def'>delete_all(restriction=None, **kwargs)</tt>:
284 <pre>Address.delete_all(Zip=20014)</pre>
285 If you need to update or delete rows based on more complicated
286 comparisons, use the <tt>restriction</tt> arg to pass a lambda
287 (or <a href='managing.html#expressions'><tt>logic.Expression</tt></a>):
288 <pre>Address.delete_all(lambda x: x.Zip >= 20014 and x.Zip &lt;= 20018)</pre>
289 </p>
290
291
292 <a name='references'><h3>References between Tables</h3></a>
293 <p>Once you've put together some Tables, chances are you're going to want
294 to join them. Generally, this is accomplished by creating a column in
295 table B (foreign key) which stores IDs from table A (primary key).
296 <pre>arch = schema.table('Archaeologist')
297 arch['ID'] = schema.column(int, autoincrement=True, key=True)
298 arch['Height'] = schema.column(float)
299
300 bio = schema.table('Biography')
301 bio['ID'] = schema.column(int, autoincrement=True, key=True)
302 bio['ArchID'] = schema.column(int)
303 bio['PubDate'] = schema.column(datetime.date)</pre>
304
305 In this example, each <tt>Biography</tt> row will have an <tt>ArchID</tt>
306 value, which will equal, and refer to, the <tt>ID</tt> value of some
307 <tt>Archaeologist</tt>.</p>
308
309 <p>You could stop at this point in your design, and simply remember what
310 these keys are and how they relate, and manipulate them accordingly. But
311 Geniusql allows you to explicitly declare these references:
312 <pre>bio.references['Archaeologist'] = ('ArchID', 'Archaeologist', 'ID')</pre>
313 You provide the near key, the far table name, and the far key.</p>
314
315 <p>What does an explicit reference buy for you? First, you can associate
316 Tables without having to remember which keys are related. Second, joins
317 discover references and automatically connect known paths for you.</p>
318
319 <p>Note, however, that a single Table A might have multiple relationships
320 with Table B. For example, a Biography might be both <i>written about</i>
321 one Archaeologist and <i>written by</i> another Archaeologist. Therefore,
322 the key you use to add a reference tuple to the references dict doesn't
323 have to be the name of the far table:
324 <pre>bio.references['Author'] = ('AuthorID', 'Archaeologist', 'ID')
325 bio.references['Subject'] = ('SubjectID', 'Archaeologist', 'ID')</pre>
326 In this way, we can both use multiple references, and write tools that are
327 aware of them. By default, the join machinery will use the reference which
328 matches the name of the far table (if present), but there are ways to
329 specify you want to join using an alternate reference.<p>
330
331
332 <a name='indices'><h3>Indices</h3></a>
333 <p>All Table instances, in addition to containing Column objects, also
334 possess an <tt>indices</tt> attribute. This is an instance of an
335 <tt>IndexSet</tt>, a dict-like container for <tt>Index</tt> objects.
336 You can add indices to a Table via its
337 <tt class='def'>add_index(columnkey)</tt> method; pass the key for the
338 column you wish to index.</p>
339
340 <hr />
341
342 </body>
343 </html>
Note: See TracBrowser for help on using the browser.