Contact: fumanchu@aminus.org

root/tags/1.4.0/doc/managing.html

Revision 113 (checked in by fumanchu, 3 years ago)

Fix for #29 (arbitrary primary keys):

  1. New Unit.identifiers, a tuple of its UnitProperty attributes which make Units of that class unique.
  2. New Unit.identity(). Returns a tuple of values matching the keys specified in Unit.identifiers.
  3. Arena and other caches now use Unit.identity() as their lookup keys.
  4. engines.UnitCollection, UnitEngineRule now use Unit.identity() for their keys. This means existing UnitCollection objects (i.e., those persisted in storage) will have to be migrated to the new format.
  5. UnitSequencer API has changed; replace next(ids) with assign(unit, ids).
  6. You can now remove inherited UnitProperties from a subclass by assigning to None. For example, to remove the default ID property, set "ID = None" inside the definition of the subclass (the metaclass does the erasing for you).
  7. New engine tests in zoo_fixture.py.
1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
2    "http://www.w3.org/TR/xhtml1/DTD/strict.dtd">
3 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
4
5 <head>
6     <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
7     <title>Dejavu: Managing Units</title>
8     <link href='dejavu.css' rel='stylesheet' type='text/css' />
9 </head>
10
11 <body>
12
13 <h2>Application Developers: Managing Units</h2>
14
15 <h3><a name='Querying'>Querying</a></h3>
16 <p>When you retrieve Units, you often don't want to load the entire set for
17 a given class. In Dejavu, you filter the set according to the UnitProperty
18 attributes for each object. Naturally, there must be a way to express
19 the filter you intend. Dejavu actually provides three ways, all in the
20 <tt>dejavu.logic</tt> module: <tt>Expression</tt>,
21 <tt>filter</tt>, and <tt>comparison</tt>.</p>
22
23 <h4>The <tt>Expression</tt> class</h4>
24 <p>Regardless of which technique you use to express your filter, you're
25 going to end up with a <tt>logic.Expression</tt> object. You can build
26 an Expression directly, passing a single lambda as an argument:
27 <pre>>>> from dejavu import logic
28 >>> import datetime
29 >>> f = lambda x: x.Date >= datetime.date(2004, 3, 1)
30 >>> e = logic.Expression(f)
31 >>> e
32 logic.Expression(lambda x: x.Date >= datetime.date(2004, 3, 1))</pre>
33 Neat, eh? I worked hard on that __repr__. ;)</p>
34
35 <p>It may be obvious, but we'll be explicit here. The lambda which you pass
36 into an Expression must possess a positional argument, which will
37 always be bound to a Unit instance. In the example above, it's named 'x',
38 but you can use any name you like. Using lambdas as a base means that we
39 can simply call <tt>Expression(unit)</tt>, and receive a boolean
40 value indicating whether our Unit "passes the test". Attribute lookups on
41 our 'x' object will apply to Unit Properties for that Unit object.
42 That is, <tt>x.Date</tt> becomes <tt>unit.Date</tt>.</p>
43
44 <p>You can also do fancier things with Expressions (although the vast
45 majority of the time, you won't need to in order to use Dejavu):</p>
46 <pre>>>> logic.Expression(lambda x, y, z: "Dave" in x.Name and y.Age > 65)
47 logic.Expression(lambda x, y, z: ('Dave' in x.Name) and (y.Age > 65))
48 >>> logic.Expression(lambda *units, **kw: units and
49 ...                  (units[0].Width > units[0].Height or
50 ...                   units[0].Color in kw['Colors']))
51 logic.Expression(lambda *units, **kw: (units) and
52                  ((units[0].Width > units[0].Height) or
53                   (units[0].Color in kw['Colors'])))
54 >>>
55 </pre>
56
57 <h4>Early binding</h4>
58 <p>What is not obvious from the above code snippet is perhaps the <b>most
59 important aspect</b> of Expressions: any globals or cell references (from
60 closures) in the supplied lambda get <b>bound early</b>. Compare the
61 following disassemblies:
62 <pre>>>> import dis
63 >>> dis.dis(f)
64   1           0 LOAD_FAST                0 (x)
65               3 LOAD_ATTR                1 (Date)
66               6 LOAD_GLOBAL              2 (datetime)
67               9 LOAD_ATTR                3 (date)
68              12 LOAD_CONST               1 (2004)
69              15 LOAD_CONST               2 (3)
70              18 LOAD_CONST               3 (1)
71              21 CALL_FUNCTION            3
72              24 COMPARE_OP               5 (>=)
73              27 RETURN_VALUE       
74 >>> dis.dis(e.func)
75   1           0 LOAD_FAST                0 (x)
76               3 LOAD_ATTR                1 (Date)
77               6 LOAD_CONST               6 (datetime.date(2004, 3, 1))
78               9 COMPARE_OP               5 (>=)
79              12 RETURN_VALUE       
80 </pre>
81 As you can see, the function itself references the global 'datetime' module.
82 Once we wrap it in the Expression, however, it becomes a constant! Thanks to
83 Raymond Hettinger for inspiring this solution <a href='#hettinger'>[1]</a>.
84 Early binding, however, implies two consequences:</p>
85
86 <p>First, any globals or cell references must be present in the lambda's
87 scope when it is passed into Expression(). This is the norm and shouldn't
88 require too much thought from you when you write Expressions. In the
89 example above, we simply imported <tt>datetime</tt> as you would expect.</p>
90
91 <p>Second, any globals or cell references must <b>also</b> be present in
92 the <tt>logic</tt> module's globals when the Expression is unpickled.
93 Pickling occurs when Expressions are sent over sockets, and also if
94 Expressions are themselves persisted to storage (for example, see
95 <u>Unit Engines</u>, below). This means your application should inject
96 globals into the <tt>logic</tt> module. Note that the <tt>logic</tt> module
97 already tries to import <tt>datetime</tt>, <tt>fixedpoint</tt> and
98 <tt>decimal</tt>.</p>
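<p>The difference between late and early binding can be seen in plain Python, without Dejavu. The sketch below is only an illustration: <tt>bind_early</tt> is a hypothetical helper that freezes a value by passing it into a closure, whereas <tt>logic.Expression</tt> appears to rewrite the lambda's bytecode (as the disassembly above suggests).</p>

```python
import datetime

cutoff = datetime.date(2004, 3, 1)

# Late binding: the name `cutoff` is looked up every time the lambda runs.
late = lambda x: x >= cutoff


def bind_early(value):
    # Hypothetical helper: `value` is fixed when bind_early() is called,
    # so later changes to `cutoff` cannot affect the returned function.
    return lambda x: x >= value


early = bind_early(cutoff)

probe = datetime.date(2004, 6, 1)
before = (late(probe), early(probe))

# Rebinding the name changes the late-bound test but not the early-bound one.
cutoff = datetime.date(2005, 1, 1)
after = (late(probe), early(probe))
```

<p>Only the late-bound function changes its answer after <tt>cutoff</tt> is rebound. That is the behavior an early-bound Expression avoids.</p>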
99
100 <h4>External functions within Expressions</h4>
101 <p>Dejavu provides additional functions which can be used in Expressions.
102 For example, you can construct an Expression like:
103 <pre>logic.Expression(lambda x: x.Size < 3 and x.Date > dejavu.today())</pre>
104 In this example, the <tt>today()</tt> function breaks convention and is
105 actually <b>bound late</b>. That is, if you construct this Expression now
106 and use it six months later, the value of <tt>today()</tt> will change.
107 Storage Managers "know about" these dejavu functions, and can use them
108 to build more appropriate queries. Here are the functions supplied by
109 the <tt>dejavu</tt> module:</p>
110
111 <table>
112 <tr><th>Function</th><th>Late bound?</th><th>Description</th></tr>
113 <tr>
114     <td><tt>icontains(a, b)</tt></td>
115     <td></td>
116     <td>Case-insensitive test b in a. Note the operand order.</td>
117 </tr>
118 <tr>
119     <td><tt>icontainedby(a, b)</tt></td>
120     <td></td>
121     <td>Case-insensitive test a in b. Note the operand order.</td>
122 </tr>
123 <tr>
124     <td><tt>istartswith(a, b)</tt></td>
125     <td></td>
126     <td>True if a starts with b (case-insensitive), False otherwise.</td>
127 </tr>
128 <tr>
129     <td><tt>iendswith(a, b)</tt></td>
130     <td></td>
131     <td>True if a ends with b (case-insensitive), False otherwise.</td>
132 </tr>
133 <tr>
134     <td><tt>ieq(a, b)</tt></td>
135     <td></td>
136     <td>True if a == b (case-insensitive), False otherwise.</td>
137 </tr>
138 <tr>
139     <td><tt>year(value)</tt></td>
140     <td></td>
141     <td>The year attribute of a date. If value is None, return None.</td>
142 </tr>
143 <tr>
144     <td><tt>now()</tt></td>
145     <td>Y</td>
146     <td>datetime.datetime.now()</td>
147 </tr>
148 <tr>
149     <td><tt>today()</tt></td>
150     <td>Y</td>
151     <td>datetime.date.today()</td>
152 </tr>
153 <tr>
154     <td><tt>iscurrentweek(value)</tt></td>
155     <td>Y</td>
156     <td>If value is in the current week, return True, else False.</td>
157 </tr>
158 </table>
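<p>For readers who want to check the operand order, here are plain-Python equivalents of a few of the table's rows. These are written from the descriptions above, assuming string operands; they are not Dejavu's own implementations, and the late-bound functions are left out.</p>

```python
import datetime


def icontains(a, b):
    # Case-insensitive "b in a": the container comes first.
    return b.lower() in a.lower()


def icontainedby(a, b):
    # Case-insensitive "a in b": the container comes second.
    return a.lower() in b.lower()


def istartswith(a, b):
    return a.lower().startswith(b.lower())


def ieq(a, b):
    return a.lower() == b.lower()


def year(value):
    # None passes through unchanged, as the table describes.
    return None if value is None else value.year


checks = [
    icontains("Hello World", "WORLD"),
    icontainedby("ell", "HELLO"),
    istartswith("Dejavu", "DEJA"),
    ieq("Cat", "cAT"),
]
```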
159
160 <p>It is possible for you, the application developer, to define your
161 own external functions. However, because Storage Managers are unaware
162 of your new functions, they will not be able to optimize their use;
163 instead, they will simply retrieve a larger set of objects from storage,
164 evaluate each one against the function you provide, and return those
165 Units which match your function. This isn't necessarily a bad thing;
166 it provides the same functionality as if you wrote the test inline
167 within your own code. By making that test a logic function, you allow
168 it to be stored in Engine <i>rules</i> (see <u>Unit Engines</u>,
169 below).</p>
170
171 <h4>Using <tt>filter</tt> to form Expressions</h4>
172 <p>The <tt>logic</tt> module also provides convenience functions to
173 create common types of Expression objects via the <tt>filter</tt> and
174 <tt>comparison</tt> factory functions.</p>
175
176 <p>The <tt class='def'>filter(**kwargs)</tt> function produces an Expression by taking
177 the keyword arguments you supply, and rewriting them in lambda form. The
178 only operator allowed is therefore the equals '==' operator. For example:
179 <pre>>>> logic.filter(Type='Cat', Mutation='Atomic')
180 logic.Expression(lambda x: (x.Type == 'Cat') and (x.Mutation == 'Atomic'))</pre>
181 </p>
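<p>The keyword-to-equality rewrite can be pictured with a small stand-in. <tt>equality_filter</tt> is a hypothetical name used only for this sketch; it returns an ordinary function rather than a <tt>logic.Expression</tt>, and it makes no attempt to reproduce the generated lambda.</p>

```python
from types import SimpleNamespace


def equality_filter(**kwargs):
    # Every keyword becomes an "attribute == value" test; all must hold.
    def predicate(x):
        return all(getattr(x, name) == value for name, value in kwargs.items())
    return predicate


is_atomic_cat = equality_filter(Type='Cat', Mutation='Atomic')

cat = SimpleNamespace(Type='Cat', Mutation='Atomic')
dog = SimpleNamespace(Type='Dog', Mutation='Atomic')
results = (is_atomic_cat(cat), is_atomic_cat(dog))
```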
182
183 <h4>Using <tt>comparison</tt> to form Expressions</h4>
184 <p>The <tt class='def'>comparison(attr, cmp_op, criteria)</tt> function allows you to
185 form Expressions with dynamic operators. This can come in handy when you
186 are constructing Expressions on the fly from user input. For example, a
187 search page might prompt users for an attribute name, an operator, and an
188 operand (the criteria).</p>
189
190 <p>Borrowing from <tt>opcode.cmp_op</tt>, the allowed values for our cmp_op
191 argument are as follows:</p>
192 <table>
193 <tr><th>Numeric Value (cmp_op)</th><th>Operator</th></tr>
194 <tr><td>0</td><td>&lt;</td></tr>
195 <tr><td>1</td><td>&lt;=</td></tr>
196 <tr><td>2</td><td>==</td></tr>
197 <tr><td>3</td><td>!=</td></tr>
198 <tr><td>4</td><td>&gt;</td></tr>
199 <tr><td>5</td><td>&gt;=</td></tr>
200 <tr><td>6</td><td>in</td></tr>
201 <tr><td>7</td><td>not in</td></tr>
202 <tr><th colspan='2'>Most SM's don't support the following:</th></tr>
203 <tr><td>8</td><td>is</td></tr>
204 <tr><td>9</td><td>is not</td></tr>
205 </table>
206
207 <p>Here's an example of using <tt>comparison</tt>:
208 <pre>>>> logic.comparison('Name', 3, 'Mr. Kamikaze')
209 logic.Expression(lambda x: x.Name != 'Mr. Kamikaze')</pre>
210 Although the comparison function only allows a single comparison at a time,
211 the resulting Expressions can be combined with the <tt>&</tt> and <tt>|</tt>
212 operators (described below) to produce more complex Expressions.</p>
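<p>A rough sketch of how a numeric <tt>cmp_op</tt> can select an operator at run time follows. <tt>CMP_OPS</tt> and <tt>make_comparison</tt> are hypothetical names; the table is built by hand from the values listed above, and the result is a plain function, not a <tt>logic.Expression</tt>.</p>

```python
import operator
from types import SimpleNamespace

# Hand-built copy of the table above (an assumption of this sketch).
CMP_OPS = {
    0: operator.lt,
    1: operator.le,
    2: operator.eq,
    3: operator.ne,
    4: operator.gt,
    5: operator.ge,
    6: lambda a, b: a in b,
    7: lambda a, b: a not in b,
    8: operator.is_,
    9: operator.is_not,
}


def make_comparison(attr, cmp_op, criteria):
    test = CMP_OPS[cmp_op]
    # The attribute value is the left operand, the criteria the right.
    return lambda x: test(getattr(x, attr), criteria)


not_kamikaze = make_comparison('Name', 3, 'Mr. Kamikaze')
in_allowed = make_comparison('Color', 6, ['red', 'blue'])

unit = SimpleNamespace(Name='Mr. Bruce', Color='red')
results = (not_kamikaze(unit), in_allowed(unit))
```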
213
214 <h4>Combining Expressions</h4>
215 <p>Expressions are combinable; by using the <tt>&</tt> operator, the two
216 expressions are combined with an adjoining logical "and". For example:
217 <pre>>>> a = logic.Expression(lambda x: x.Size > 3)
218 >>> b = logic.Expression(lambda x: x.Size <= 15)
219 >>> c = a & b
220 >>> c
221 logic.Expression(lambda x: (x.Size > 3) and (x.Size <= 15))</pre>
222 The <tt>+</tt> operator works just like the <tt>&</tt> operator. The
223 <tt>|</tt> operator combines the two Expressions with a logical 'or'.</p>
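<p>The operator overloading described here can be imitated in a few lines. <tt>Predicate</tt> is a hypothetical stand-in for this sketch only: it composes two functions by calling both, whereas the repr output above suggests that <tt>logic.Expression</tt> merges them into a single lambda.</p>

```python
from types import SimpleNamespace


class Predicate:
    """Hypothetical combinable test; not dejavu's Expression class."""

    def __init__(self, func):
        self.func = func

    def __call__(self, obj):
        return bool(self.func(obj))

    def __and__(self, other):
        return Predicate(lambda obj: self(obj) and other(obj))

    # "+" behaves exactly like "&".
    __add__ = __and__

    def __or__(self, other):
        return Predicate(lambda obj: self(obj) or other(obj))


a = Predicate(lambda x: x.Size > 3)
b = Predicate(lambda x: x.Size <= 15)

both = a & b
either = a | b
sizes = [2, 10, 20]
and_results = [both(SimpleNamespace(Size=s)) for s in sizes]
or_results = [either(SimpleNamespace(Size=s)) for s in sizes]
```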
224
225 <p>When you combine two Expressions with dissimilar argument lists,
226 what happens? The Expression class doesn't really care what the argument
227 names are, just their order, so the names might not come out as you might
228 expect; however, the logic is preserved:</p>
229
230 <pre>>>> f = logic.filter(Name='Bruce')
231 >>> f
232 logic.Expression(lambda x: x.Name == 'Bruce')
233 >>> g = logic.Expression(lambda a, b, **kw: a.Name + b.Surname == kw['Full Name'])
234 >>>
235 >>> f + g
236 logic.Expression(lambda x, b, **kw: (x.Name == 'Bruce')
237                  and (x.Name + b.Surname == kw['Full Name']))
238 >>> g + f
239 logic.Expression(lambda a, b, **kw: (a.Name + b.Surname == kw['Full Name'])
240                  and (a.Name == 'Bruce'))
241 </pre>
242
243 <h4>Exporting the <tt>logic</tt> module</h4>
244 <p>The <tt>logic</tt> module (and <tt>codewalk</tt>, on which it is built)
245 isn't limited to Dejavu. Feel free to use it in some other framework or
246 script! The only change you may have to make (if you relocate the module
247 outside of the <tt>dejavu</tt> package) would be to the single line:
248 <tt>from dejavu import codewalk</tt>, to point to the new location.</p>
249
250 <p>In particular, <tt>logic.Expression</tt> objects can operate on <i>any</i>
251 Python objects, not just dejavu <tt>Unit</tt> instances. If you wish to
252 provide additional logic functions (as dejavu does), simply inject them
253 into <tt>logic</tt>'s globals.</p>
254
255 <p>You may also find the underlying <tt>codewalk</tt> module useful for
256 other purposes on its own. The <tt>Visitor</tt> base class can be very
257 convenient for building bytecode hacks.</p>
258
259 <p>To make a long story short, Dejavu depends on <tt>logic</tt> throughout,
260 but the reverse is not true.</p>
261
262
263 <h3><a name='Engines'>Unit Engines</a></h3>
264 <p>Once you've created and associated your Unit classes, you can begin to
265 write "business logic" code (mostly inside those classes, we hope), and
266 "presentation logic" code (mostly outside those classes). In most cases,
267 you will construct Expressions within your own code manually to retrieve
268 Units. Sometimes, however, you need to persist query parameters from your
269 users; in other cases, you might store a list of Units which match a query
270 (regardless of who formed the necessary Expression). Finally, you might
271 wish to manipulate lists of Units as sets: differences, intersections,
272 and unions. The <tt>engines</tt> module addresses all of these needs.</p>
273
274 <h4>Collections: Lists of Units</h4>
275 <p>The <tt>UnitCollection</tt> class provides a means of storing a list
276 of Units, or rather, a list of Unit identifiers. You use its <tt>Type</tt>
277 property to indicate the class of the indexed Units. That value should be
278 the <b>name</b> of the Unit Class, <b>not</b> the class object itself
279 (this is different from most other calls in Dejavu). If you need to
280 retrieve the actual Unit class, call <tt>UnitCollection().unit_class()</tt>.</p>
281
282 <p><tt>UnitCollection</tt> itself subclasses <tt>dejavu.Unit</tt>; you can
283 therefore persist Unit Collections via most Dejavu Storage Managers.
284 SM's are encouraged, but not required, to handle Unit Collections, so
285 check whether yours does.</p>
286
287 <p>Each Collection has a thread lock (an RLock, actually) which you should
288 <tt class='def'>acquire()</tt> before you add an ID to the set,
289 and <tt class='def'>release()</tt> afterward. If you use the
290 <tt class='def'>add(ID)</tt> method, this locking is done
291 for you.</p>
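<p>The locking pattern looks roughly like this in plain Python. <tt>IdCollection</tt> is a hypothetical class written for this sketch, assuming the IDs live in a set guarded by a <tt>threading.RLock</tt>; it is not the real <tt>UnitCollection</tt>.</p>

```python
import threading


class IdCollection:
    """Hypothetical stand-in: a set of IDs guarded by a re-entrant lock."""

    def __init__(self):
        self._ids = set()
        self.lock = threading.RLock()

    def acquire(self):
        self.lock.acquire()

    def release(self):
        self.lock.release()

    def add(self, unit_id):
        # The lock is taken and released for the caller.
        with self.lock:
            self._ids.add(unit_id)

    def ids(self):
        with self.lock:
            return sorted(self._ids)


coll = IdCollection()

# Manual locking around a direct change to the underlying set.
coll.acquire()
try:
    coll._ids.update([1, 2])
    coll.add(3)  # safe while already holding the lock: an RLock is re-entrant
finally:
    coll.release()

coll.add(4)  # no explicit locking needed here
```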
292
293 <p>When you need to retrieve the actual Units which are indexed by the
294 Collection, call the <tt class='def'>units(quota=None)</tt> method,
295 which will
296 look up the Units and return them in a list. Since the Collection only
297 stores identifiers, it is possible that one of the indexed Units may have
298 been destroyed since the list was built. The <tt>units</tt> method simply
299 passes over these "phantom" Units. You can inspect the full list of IDs
300 in the Collection (whether they reference existing Units or not) with
301 the <tt class='def'>ids()</tt> method.</p>
302
303 <p>Collections also provide a convenience function for grouping Units
304 by attribute: <tt class='def'>xdict(attr)</tt>. This function will look up
305 each Unit
306 in the Collection, inspect the attribute that you specify, and return
307 a dictionary of the form <tt>{attr_val1: [Unit, Unit, ...]}</tt>.
308 Each distinct attribute value will have its own key, with a list of
309 matching Units as the value.</p>
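<p>The shape of the returned dictionary is easy to reproduce with ordinary objects. <tt>group_by_attr</tt> is a hypothetical helper for this sketch; unlike <tt>xdict</tt>, it takes the objects directly instead of looking them up from stored identifiers.</p>

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_attr(units, attr):
    # One key per distinct attribute value, each holding the matching objects.
    groups = defaultdict(list)
    for unit in units:
        groups[getattr(unit, attr)].append(unit)
    return dict(groups)


animals = [
    SimpleNamespace(Name='Tom', Type='Cat'),
    SimpleNamespace(Name='Rex', Type='Dog'),
    SimpleNamespace(Name='Felix', Type='Cat'),
]
by_type = group_by_attr(animals, 'Type')
names_by_type = {key: [u.Name for u in units] for key, units in by_type.items()}
```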
310
311 <h4>Engines</h4>
312 <p>You can form Collections by hand, but a more powerful technique is
313 the <tt>UnitEngine</tt>, a factory for Collections. Engines are very
314 simple: they possess a set of <i>rules</i> which are executed when
315 you want to take a <i>snapshot</i> of Units. The snapshot which is
316 produced is a <tt>UnitCollection</tt> object. Whenever you call
317 <tt class='def'>take_snapshot()</tt>, the Engine will maintain an association
318 to the resulting Collection. You can access past snapshots with the
319 <tt class='def'>snapshots()</tt> method.</p>
320
321 <p>Engines are themselves Units, and can be persisted via Storage Managers.
322 The only properties they possess are: an <tt class='def'>ID</tt>,
323 a <tt class='def'>Name</tt>, an <tt class='def'>Owner</tt>,
324 a <tt class='def'>FinalClassName</tt>, and <tt class='def'>Created</tt>,
325 the creation date of the Engine.</p>
326
327 <p>The <tt>Owner</tt> property should either be a user name, or one of the
328 reserved names: "Public" and "System". By default, the
329 <tt class='def'>permit()</tt>
330 method allows a user read-access to the Engine if they are the Owner, or
331 the Owner is "Public" or "System". Write-access is permitted if the user
332 is the Owner, or the Owner is "Public". Feel free to override
333 <tt>permit()</tt> in a subclass to provide different behaviors.</p>
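<p>The default policy described above reduces to two small tests. The function names below are hypothetical, and the real <tt>permit()</tt> signature is not shown in this document; this only restates the rules as code.</p>

```python
RESERVED_OWNERS = ("Public", "System")


def can_read(user, owner):
    # Readable by the Owner, or by anyone when the Owner is a reserved name.
    return user == owner or owner in RESERVED_OWNERS


def can_write(user, owner):
    # Writable by the Owner, or by anyone when the Owner is "Public".
    return user == owner or owner == "Public"


read_checks = [
    can_read("alice", "alice"),
    can_read("alice", "System"),
    can_read("alice", "bob"),
]
write_checks = [
    can_write("alice", "Public"),
    can_write("alice", "System"),
    can_write("alice", "bob"),
]
```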
334
335 <p>The <tt>FinalClassName</tt> is set for you as you add Rules to the
336 Engine. You can use the value of this property, for example, to tell
337 your users, "Engine #23569 is an 'Armadillo' engine," when it produces
338 Collections of <tt>Armadillo</tt> Units. The only time you might want to
339 set this value manually is when you first create the Engine, before you
340 have added any Rules.</p>
341
342 <h4><a name='unitenginerules'>Rules</a></h4>
343 <p>Just like Collections and Engines, <tt>UnitEngineRule</tt> is <i>also</i>
344 a subclass of <tt>Unit</tt>, and can be persisted via Storage Managers. All
345 three work together to provide a complete, dynamic, application-level query
346 generator.</p>
347
348 <p>Okay, so what are Rules? You might say they're a "little language",
349 with the following primitives, or "operations":</p>
350 <table>
351 <tr><th>Operation</th><th>Operand(s)</th><th>Description</th></tr>
352 <tr><th colspan='3'>Operations on a single set</th></tr>
353 <tr>
354     <td>CREATE</td>
355     <td>The classname of the new Type</td>
356     <td>Creates a new Set of the specified Type. All Units of that Type
357         are included in the new Set.</td>
358 </tr>
359 <tr>
360     <td>FILTER</td>
361     <td>A <tt>logic.Expression</tt></td>
362     <td>Removes Units from the current Set which do not match the
363         Expression.</td>
364 </tr>
365 <tr>
366     <td>FUNCTION</td>
367     <td>The name of a function in the <tt>Arena.engine_functions</tt>
368         dict</td>
369     <td>Calls the function, passing the current Set. The function
370         should modify the Set.</td>
371 </tr>
372 <tr>
373     <td>TRANSFORM</td>
374     <td>The classname of the new Type</td>
375     <td>Transform the current Set into a Set of associated Units
376         (of another Type). The association must be present in the
377         <tt>Arena.associations</tt> graph.</td>
378 </tr>
379 <tr>
380     <td>RETURN</td>
381     <td></td>
382     <td>Optional. If omitted, the last Set handled is returned as the
383         snapshot. If supplied, the ID of the Set to return.</td>
384 </tr>
385 <tr><th colspan='3'>Operations on two sets</th></tr>
386 <tr>
387     <td>COPY</td>
388     <td>The Set ID of the new Set</td>
389     <td>Copies the current Set to a new Set. The current Set is unchanged.</td>
390 </tr>
391 <tr>
392     <td>DIFFERENCE</td>
393     <td>The ID of the Set to mix in</td>
394     <td>Removes IDs from the current Set which exist in the second Set.</td>
395 </tr>
396 <tr>
397     <td>INTERSECTION</td>
398     <td>The ID of the Set to mix in</td>
399     <td>Removes IDs from the current Set which <i>do not</i> exist in the
400         second Set.</td>
401 </tr>
402 <tr>
403     <td>UNION</td>
404     <td>The ID of the Set to mix in</td>
405     <td>Adds any IDs to the current Set which exist in the second Set.</td>
406 </tr>
407 </table>
408
409 <p>Each Rule has an <tt>Operation</tt> property (a string, one of the above),
410 a <tt>SetID</tt>, and an <tt>Operand</tt>. Here's an example ruleset:</p>
411 <table>
412 <tr><th>Sequence</th><th>Operation</th><th>SetID</th><th>Operand</th></tr>
413 <tr><td>1</td><td>CREATE</td><td>1</td><td>Invoice</td></tr>
414 <tr><td>2</td><td>FILTER</td><td>1</td><td>(Expression)</td></tr>
415 <tr><td>3</td><td>CREATE</td><td>2</td><td>Inventory</td></tr>
416 <tr><td>4</td><td>FILTER</td><td>2</td><td>(Expression)</td></tr>
417 <tr><td>5</td><td>TRANSFORM</td><td>2</td><td>Invoice</td></tr>
418 <tr><td>6</td><td>DIFFERENCE</td><td>1</td><td>2</td></tr>
419 <tr><td>7</td><td>RETURN</td><td>1</td><td></td></tr>
420 </table>
421
422 <p>As you can see, every Rule operates on a <i>Set</i> of Units. The first
423 rule is always to CREATE a set, declaring it to contain a certain Type
424 of Units. In most cases, you will then FILTER that set. If you simply
425 created a set and then returned it, it would contain all Units of the
426 declared Type. When you filter a set, however, you remove Units from
427 the whole which do not match the filter's Expression.</p>
428
429 <p>In the example above, we CREATE a second Set so that we can eventually
430 obtain the DIFFERENCE between Set 1 and Set 2. The second Set contains
431 Units of a different Type than the first. Once we filter Set 2, we then
432 TRANSFORM it; for each Inventory Unit, we look up associated Invoice
433 Units. Then, we find the difference between the two Invoice sets and
434 RETURN it.</p>
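<p>The example ruleset can be traced by hand with Python sets of IDs. Everything in this sketch is made up for illustration: the sample data, the two filter conditions, and the <tt>invoice_ids</tt> field that stands in for the association graph.</p>

```python
# Hypothetical stored Units, keyed by ID.
invoices = {1: {"paid": False}, 2: {"paid": False}, 3: {"paid": True}}
inventory = {
    10: {"in_stock": True, "invoice_ids": [1]},
    11: {"in_stock": False, "invoice_ids": [2]},
}

# 1. CREATE Set 1: all Invoice IDs.
set1 = set(invoices)
# 2. FILTER Set 1: keep unpaid Invoices.
set1 = {i for i in set1 if not invoices[i]["paid"]}
# 3. CREATE Set 2: all Inventory IDs.
set2 = set(inventory)
# 4. FILTER Set 2: keep Inventory that is in stock.
set2 = {i for i in set2 if inventory[i]["in_stock"]}
# 5. TRANSFORM Set 2: replace each Inventory ID with its associated Invoice IDs.
set2 = {inv for i in set2 for inv in inventory[i]["invoice_ids"]}
# 6. DIFFERENCE: remove from Set 1 whatever exists in Set 2.
set1 -= set2
# 7. RETURN Set 1.
snapshot = set1
```

<p>With this data the snapshot contains only Invoice 2: it is unpaid, and it is not associated with any in-stock Inventory.</p>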
435
436 <p>Rules are executed in order according to their <tt>Sequence</tt>
437 attribute (lowest first). When you use the <tt>Engine.add_rule</tt> method,
438 the next <tt>Sequence</tt> value is retrieved for you. Notice that each
439 Rule belongs to one and only one Engine; they are not shared between
440 Engines. Each Rule has its own <tt>EngineID</tt> attribute.</p>
441
442 <h4>Engine Functions</h4>
443 <p>The FUNCTION rule deserves special mention. The Operand of a FUNCTION
444 rule is a string, a key in the <tt>Arena.engine_functions</tt> dictionary.
445 When the rule is executed, that key is used to look up the function, which
446 is then called, passing <tt>(sandbox, set)</tt>. The function should
447 mutate the set directly. Use FUNCTION rules to mutate sets in ways which
448 are more complex than those provided by FILTER and TRANSFORM. For example,
449 you might provide a function which removes all but the first Unit in the
450 Set (according to some ordering algorithm).</p>
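<p>A function of this kind might look like the sketch below. The registry here is a plain dict standing in for <tt>Arena.engine_functions</tt>; <tt>keep_first</tt> and <tt>run_function_rule</tt> are hypothetical names, and "first" simply means the smallest ID.</p>

```python
# Stand-in for the Arena.engine_functions dictionary.
engine_functions = {}


def keep_first(sandbox, unit_ids):
    # Mutate the set in place, keeping only the smallest ID.
    if unit_ids:
        unit_ids.intersection_update({min(unit_ids)})


engine_functions["keep_first"] = keep_first


def run_function_rule(operand, sandbox, unit_ids):
    # The Operand is a key; the function found there receives (sandbox, set).
    engine_functions[operand](sandbox, unit_ids)


current = {5, 2, 9}
run_function_rule("keep_first", None, current)
```

<p>Because the function changes the set it is given and returns nothing, the caller keeps working with the same set object afterward.</p>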
451
452
453 <h3>Analysis Tools</h3>
454 <p>Dejavu includes various tools to help you manipulate groups of Units.</p>
455
456 <h4>Sorting Units</h4>
457 <p>When you recall Units, you receive a list. However, the <tt>recall</tt>
458 method doesn't do any sorting; you must sort your list in your Python code.
459 Dejavu provides a <tt class='def'>sort(attrs, descending=False)</tt>
460 function to assist
461 you in sorting Units. It returns a function, which you can then pass to
462 Python's <tt>list.sort</tt> method (which operates in place). For example, given a list of Units named <tt>people</tt>:
463 <pre>people.sort(dejavu.sort('Size', 'Name'))</pre>
464 The most important issue (and the reason we don't just use Python 2.4's <tt>operator.attrgetter</tt>)
465 is that any Unit property must allow values of None, which tends to raise
466 errors when compared to values of other types. The function which
467 <tt>sort</tt> creates for you treats None as "less than" any other value.</p>
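<p>The same None-first ordering can be written in current Python with a key function. This is a sketch of the idea, not <tt>dejavu.sort</tt> itself, which returns a comparison function; <tt>none_first_key</tt> is a hypothetical name.</p>

```python
from types import SimpleNamespace


def none_first_key(*attrs):
    def key(obj):
        # Each attribute becomes (is_not_None, value). False sorts before
        # True, so None values come first and are never compared to others.
        return tuple(
            (getattr(obj, attr) is not None, getattr(obj, attr))
            for attr in attrs
        )
    return key


people = [
    SimpleNamespace(Size=3, Name='Bo'),
    SimpleNamespace(Size=None, Name='Al'),
    SimpleNamespace(Size=1, Name=None),
    SimpleNamespace(Size=1, Name='Cy'),
]
people.sort(key=none_first_key('Size', 'Name'))
ordered_names = [p.Name for p in people]
```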
468
469 <h4>Cross-tabulation</h4>
470 <p>Cross-tabs (also called <i>aggregate tables</i> or <i>pivot tables</i>)
471 display aggregate information about objects by category. For example,
472 rather than show a list of Safari records, one row per trip, you might
473 wish to show a table where each row represents a Destination, and each
474 column shows the count of Safaris to that Destination for each distinct
475 Year. In this example, we say that the Safaris are "grouped by" their
476 Destination values, and that we "pivot" on the Year values.</p>
477
478 <p>Dejavu helps you form such a table via the <tt>CrossTab</tt> class.
479 You need to specify the group(s) you wish to use, and the pivot attribute.
480 Finally, you must specify the aggregate function. Here's a code example:
481 <pre>
482 >>> data = ["a", "b", "cc", "bddd", "a4", "b6"]
483 >>> group = lambda x: x.isalpha()
484 >>> pivot = lambda x: x[0]
485 >>> ctab = analysis.CrossTab(data, [group], pivot, dejavu.COUNT)
486 >>> data, columns = ctab.results()
487 >>> data
488 {(True,): {"a": 1, "b": 2, "c": 1},
489  (False,): {"a": 1, "b": 1}}
490 >>> columns
491 ["a", "b", "c"]</pre>
492 You may notice that we're not using Units in our example; the
493 <tt>CrossTab</tt> class is designed to work with any objects. Here's one
494 way to lay out that data:</p>
495 <table>
496 <tr><th>Is Alpha</th><th>a</th><th>b</th><th>c</th></tr>
497 <tr><td>Y</td><td>1</td><td>2</td><td>1</td></tr>
498 <tr><td>N</td><td>1</td><td>1</td><td>0</td></tr>
499 </table>
500
501 <p>The <tt>results</tt> method returns two values. The first is the table
502 itself, in the form of a dictionary: each key is a tuple of group values,
503 and the corresponding value is a sub-dictionary. Each sub-dict maps a
504 pivot value to the aggregate for that group and pivot value.
505 I know, that was confusing; look at the example. The second value
506 returned is a list of the pivot column values; you'll notice they're
507 sorted.</p>
508
509 <p>The groups and pivot arguments may be either strings or functions.
510 If strings, they must be the names of attributes of the source objects.
511 The final aggfunc argument defaults to COUNT, but may also be SUM.
512 More aggfuncs may arrive in the future.</p>
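<p>To make the result structure concrete, here is a small counting cross-tab in plain Python that reproduces the example data above. <tt>count_crosstab</tt> is a hypothetical function for this sketch; it accepts only functions for the groups and pivot and covers only the COUNT case.</p>

```python
def count_crosstab(items, groups, pivot):
    table = {}
    columns = set()
    for item in items:
        row_key = tuple(group(item) for group in groups)  # tuple of group values
        column = pivot(item)
        columns.add(column)
        row = table.setdefault(row_key, {})
        row[column] = row.get(column, 0) + 1
    # Pivot values are returned sorted, ready to use as column headers.
    return table, sorted(columns)


data = ["a", "b", "cc", "bddd", "a4", "b6"]
table, columns = count_crosstab(data, [lambda x: x.isalpha()], lambda x: x[0])

# Missing cells are treated as zero when laying out a row.
false_row = [table[(False,)].get(col, 0) for col in columns]
```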
513
514 <hr />
515
516 <p><a name='hettinger'>[1]</a> Python Cookbook,
517 <a href='http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277940'>Binding
518 Constants at compile time</a><br />
519 </p>
520
521 </body>
522 </html>