<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>Dejavu: Managing Units</title>
<link href='dejavu.css' rel='stylesheet' type='text/css' />
</head>

<body>

<h2>Application Developers: Managing Units</h2>

<h3><a name='Querying'>Querying</a></h3>
<p>When you retrieve Units, you often don't want to load the entire set for
a given class. In Dejavu, you filter the set according to the UnitProperty
attributes for each object. Naturally, there must be a way to express
the filter you intend. Dejavu actually provides three ways, all in the
<tt>dejavu.logic</tt> module: <tt>Expression</tt>,
<tt>filter</tt>, and <tt>comparison</tt>.</p>

<h4>The <tt>Expression</tt> class</h4>
<p>Regardless of which technique you use to express your filter, you're
going to end up with a <tt>logic.Expression</tt> object. You can build
an Expression directly, passing a single lambda as an argument:
<pre>>>> from dejavu import logic
>>> import datetime
>>> f = lambda x: x.Date >= datetime.date(2004, 3, 1)
>>> e = logic.Expression(f)
>>> e
logic.Expression(lambda x: x.Date >= datetime.date(2004, 3, 1))</pre>
Neat, eh? I worked hard on that __repr__. ;)</p>

<p>It may be obvious, but we'll be explicit here. The lambda which you pass
into an Expression must possess a positional argument, which will
always be bound to a Unit instance. In the example above, it's named 'x',
but you can use any name you like. Using lambdas as a base means that we
can simply call an Expression instance, as in <tt>e(unit)</tt>, and receive
a boolean value indicating whether our Unit "passes the test". Attribute
lookups on our 'x' object will apply to Unit Properties for that Unit object.
That is, <tt>x.Date</tt> becomes <tt>unit.Date</tt>.</p>
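<p>The calling convention described above can be sketched without Dejavu at
all. The <tt>Expression</tt> and <tt>Invoice</tt> classes below are
hypothetical stand-ins written for this illustration, not Dejavu's own code;
they only show a wrapped lambda being called with an object and returning a
boolean.</p>

```python
import datetime


class Expression:
    """Hypothetical stand-in for logic.Expression: wraps a one-argument test."""

    def __init__(self, func):
        self.func = func

    def __call__(self, unit):
        # Attribute lookups inside func apply to whatever object is passed here.
        return bool(self.func(unit))


class Invoice:
    """Hypothetical stand-in for a Unit with a Date property."""

    def __init__(self, date):
        self.Date = date


e = Expression(lambda x: x.Date >= datetime.date(2004, 3, 1))
old = Invoice(datetime.date(2003, 12, 31))
new = Invoice(datetime.date(2004, 6, 15))

# Keep only the objects that "pass the test".
matches = [unit for unit in (old, new) if e(unit)]
```

<p>In this sketch, <tt>matches</tt> holds only the later invoice, because it
is the only object whose <tt>Date</tt> satisfies the lambda.</p>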

<p>You can also do fancier things with Expressions (although the vast
majority of the time, you won't need to in order to use Dejavu):</p>
<pre>>>> logic.Expression(lambda x, y, z: "Dave" in x.Name and y.Age > 65)
logic.Expression(lambda x, y, z: ('Dave' in x.Name) and (y.Age > 65))
>>> logic.Expression(lambda *units, **kw: units and
... (units[0].Width > units[0].Height or
... units[0].Color in kw['Colors']))
logic.Expression(lambda *units, **kw: (units) and
((units[0].Width > units[0].Height) or
(units[0].Color in kw['Colors'])))
>>>
</pre>

<h4>Early binding</h4>
<p>What is not obvious from the above code snippet is perhaps the <b>most
important aspect</b> of Expressions: any globals or cell references (from
closures) in the supplied lambda get <b>bound early</b>. Compare the
following disassemblies:
<pre>>>> import dis
>>> dis.dis(f)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_ATTR                1 (Date)
              6 LOAD_GLOBAL              2 (datetime)
              9 LOAD_ATTR                3 (date)
             12 LOAD_CONST               1 (2004)
             15 LOAD_CONST               2 (3)
             18 LOAD_CONST               3 (1)
             21 CALL_FUNCTION            3
             24 COMPARE_OP               5 (>=)
             27 RETURN_VALUE
>>> dis.dis(e.func)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_ATTR                1 (Date)
              6 LOAD_CONST               6 (datetime.date(2004, 3, 1))
              9 COMPARE_OP               5 (>=)
             12 RETURN_VALUE
</pre>
As you can see, the function itself references the global 'datetime' module.
Once we wrap it in the Expression, however, the whole
<tt>datetime.date(2004, 3, 1)</tt> call is replaced by a single constant! Thanks to
Raymond Hettinger for inspiring this solution <a href='#hettinger'>[1]</a>.
Early binding has two consequences:</p>

<p>First, any globals or cell references must be present in the lambda's
scope when it is passed into Expression(). This is the norm and shouldn't
require too much thought from you when you write Expressions. In the
example above, we simply imported <tt>datetime</tt> as you would expect.</p>

<p>Second, any globals or cell references must <b>also</b> be present in
the <tt>logic</tt> module's globals when the Expression is unpickled.
Pickling occurs when Expressions are sent over sockets, and also if
Expressions are themselves persisted to storage (for example, see
<u>Unit Engines</u>, below). This means your application should inject
globals into the <tt>logic</tt> module. Note that the <tt>logic</tt> module
already tries to import <tt>datetime</tt>, <tt>fixedpoint</tt> and
<tt>decimal</tt>.</p>
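<p>The difference between early and late binding can be seen in plain Python,
without any bytecode rewriting. The sketch below is only an analogy: it
captures a value through a default argument, which is not how
<tt>logic.Expression</tt> does it, and the names <tt>cutoff</tt>,
<tt>late</tt>, <tt>early</tt> and <tt>probe</tt> are made up for the
illustration.</p>

```python
import datetime

cutoff = datetime.date(2004, 3, 1)

# Late binding: the global name `cutoff` is looked up on every call.
late = lambda x: x >= cutoff

# Early binding (by analogy): the current value is captured once,
# here as a default argument evaluated when the lambda is created.
early = lambda x, _cutoff=cutoff: x >= _cutoff

probe = datetime.date(2004, 6, 1)
before = (late(probe), early(probe))

# Rebinding the global changes the result of the late-bound test only.
cutoff = datetime.date(2005, 1, 1)
after = (late(probe), early(probe))
```

<p>Both tests agree at first; after the global is rebound, only the
late-bound lambda changes its answer.</p>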

<h4>External functions within Expressions</h4>
<p>Dejavu provides additional functions which can be used in Expressions.
For example, you can construct an Expression like:
<pre>logic.Expression(lambda x: x.Size &lt; 3 and x.Date > dejavu.today())</pre>
In this example, the <tt>today()</tt> function breaks convention and is
actually <b>bound late</b>. That is, if you construct this Expression now
and use it six months later, <tt>today()</tt> is evaluated at that later
time and returns the later date.
Storage Managers "know about" these dejavu functions, and can use them
to build more appropriate queries. Here are the functions supplied by
the <tt>dejavu</tt> module:</p>

<table>
<tr><th>Function</th><th>Late bound?</th><th>Description</th></tr>
<tr>
<td><tt>icontains(a, b)</tt></td>
<td></td>
<td>Case-insensitive test b in a. Note the operand order.</td>
</tr>
<tr>
<td><tt>icontainedby(a, b)</tt></td>
<td></td>
<td>Case-insensitive test a in b. Note the operand order.</td>
</tr>
<tr>
<td><tt>istartswith(a, b)</tt></td>
<td></td>
<td>True if a starts with b (case-insensitive), False otherwise.</td>
</tr>
<tr>
<td><tt>iendswith(a, b)</tt></td>
<td></td>
<td>True if a ends with b (case-insensitive), False otherwise.</td>
</tr>
<tr>
<td><tt>ieq(a, b)</tt></td>
<td></td>
<td>True if a == b (case-insensitive), False otherwise.</td>
</tr>
<tr>
<td><tt>year(value)</tt></td>
<td></td>
<td>The year attribute of a date. If value is None, return None.</td>
</tr>
<tr>
<td><tt>now()</tt></td>
<td>Y</td>
<td>datetime.datetime.now()</td>
</tr>
<tr>
<td><tt>today()</tt></td>
<td>Y</td>
<td>datetime.date.today()</td>
</tr>
<tr>
<td><tt>iscurrentweek(value)</tt></td>
<td>Y</td>
<td>If value is in the current week, return True, else False.</td>
</tr>
</table>
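<p>To make the operand order in the table concrete, here is a rough sketch of
what some of these helpers compute. These are stand-in definitions for string
operands written for this illustration; they are assumptions based on the
descriptions above, not the functions shipped in the <tt>dejavu</tt>
module.</p>

```python
import datetime


def icontains(a, b):
    """Case-insensitive test `b in a` (b is the needle, a is the haystack)."""
    return b.lower() in a.lower()


def icontainedby(a, b):
    """Case-insensitive test `a in b` (a is the needle, b is the haystack)."""
    return a.lower() in b.lower()


def istartswith(a, b):
    """True if a starts with b, ignoring case."""
    return a.lower().startswith(b.lower())


def ieq(a, b):
    """True if a equals b, ignoring case."""
    return a.lower() == b.lower()


def year(value):
    """The year of a date, or None if value is None."""
    return None if value is None else value.year


found = icontains("Dave Smith", "dave")       # "dave" is inside "Dave Smith"
reversed_args = icontains("dave", "Dave Smith")  # operands swapped: not found
```

<p>Swapping the operands of <tt>icontains</tt> gives the behavior of
<tt>icontainedby</tt>, which is why the table asks you to note the
order.</p>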

<p>It is possible for you, the application developer, to define your
own external functions. However, because Storage Managers are unaware
of your new functions, they will not be able to optimize their use;
instead, they will simply retrieve a larger set of objects from storage,
evaluate each one against the function you provide, and return those
Units which match your function. This isn't necessarily a bad thing;
it provides the same functionality as if you wrote the test inline
within your own code. By making that test a logic function, you allow
it to be stored in Engine <i>rules</i> (see <u>Unit Engines</u>,
below).</p>

<h4>Using <tt>filter</tt> to form Expressions</h4>
<p>The <tt>logic</tt> module also provides convenient ways to
create common types of Expression objects: the <tt>filter</tt> and
<tt>comparison</tt> factory functions.</p>

<p>The <tt class='def'>filter(**kwargs)</tt> function produces an Expression by taking
the keyword arguments you supply, and rewriting them in lambda form. Each
keyword becomes an equality test, and multiple keywords are joined with
<tt>and</tt>, so the only comparison operator available is '=='. For example:
<pre>>>> logic.filter(Type='Cat', Mutation='Atomic')
logic.Expression(lambda x: (x.Type == 'Cat') and (x.Mutation == 'Atomic'))</pre>
</p>
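<p>The idea behind <tt>filter</tt> can be approximated in a few lines of plain
Python. The <tt>make_filter</tt> function below is a hypothetical stand-in: it
returns an ordinary closure instead of a <tt>logic.Expression</tt>, and it
uses <tt>SimpleNamespace</tt> objects in place of Units.</p>

```python
from types import SimpleNamespace


def make_filter(**kwargs):
    """Return a test that is true when every named attribute equals its value."""
    criteria = dict(kwargs)  # private copy of the keyword arguments

    def test(unit):
        # Equality is the only operator: every keyword must match.
        return all(getattr(unit, name) == value
                   for name, value in criteria.items())

    return test


atomic_cat = make_filter(Type='Cat', Mutation='Atomic')

pets = [
    SimpleNamespace(Type='Cat', Mutation='Atomic'),
    SimpleNamespace(Type='Cat', Mutation='None'),
    SimpleNamespace(Type='Dog', Mutation='Atomic'),
]
selected = [pet for pet in pets if atomic_cat(pet)]
```

<p>Only the first object satisfies both equality tests, so it is the only one
selected.</p>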

<h4>Using <tt>comparison</tt> to form Expressions</h4>
<p>The <tt class='def'>comparison(attr, cmp_op, criteria)</tt> function allows you to
form Expressions with dynamic operators. This can come in handy when you
are constructing Expressions on the fly from user input. For example, a
search page might prompt users for an attribute name, an operator, and an
operand (the criteria).</p>

<p>Borrowing from <tt>opcode.cmp_op</tt>, the allowed values for our cmp_op
argument are as follows:</p>
<table>
<tr><th>Numeric Value (cmp_op)</th><th>Operator</th></tr>
<tr><td>0</td><td>&lt;</td></tr>
<tr><td>1</td><td>&lt;=</td></tr>
<tr><td>2</td><td>==</td></tr>
<tr><td>3</td><td>!=</td></tr>
<tr><td>4</td><td>&gt;</td></tr>
<tr><td>5</td><td>&gt;=</td></tr>
<tr><td>6</td><td>in</td></tr>
<tr><td>7</td><td>not in</td></tr>
<tr><th colspan='2'>Most SMs don't support the following:</th></tr>
<tr><td>8</td><td>is</td></tr>
<tr><td>9</td><td>is not</td></tr>
</table>

<p>Here's an example of using <tt>comparison</tt>:
<pre>>>> logic.comparison('Name', 3, 'Mr. Kamikaze')
logic.Expression(lambda x: x.Name != 'Mr. Kamikaze')</pre>
Although the comparison function only allows a single comparison at a time,
the resulting Expressions can be combined with the <tt>&amp;</tt> and <tt>|</tt>
operators (described below) to produce more complex Expressions.</p>
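<p>A dynamic operator lookup of this kind might be sketched as follows. The
<tt>CMP_OPS</tt> table and <tt>make_comparison</tt> function are hypothetical
names for this illustration; the table is written out by hand to follow the
numbering above rather than read from <tt>opcode.cmp_op</tt>, whose contents
differ between Python versions.</p>

```python
import operator
from types import SimpleNamespace

# Index-to-operator table following the cmp_op values listed above.
CMP_OPS = (
    operator.lt,              # 0: <
    operator.le,              # 1: <=
    operator.eq,              # 2: ==
    operator.ne,              # 3: !=
    operator.gt,              # 4: >
    operator.ge,              # 5: >=
    lambda a, b: a in b,      # 6: in
    lambda a, b: a not in b,  # 7: not in
    operator.is_,             # 8: is
    operator.is_not,          # 9: is not
)


def make_comparison(attr, cmp_op, criteria):
    """Return a test comparing unit.<attr> to criteria with the chosen operator."""
    op = CMP_OPS[cmp_op]
    return lambda unit: op(getattr(unit, attr), criteria)


not_kamikaze = make_comparison('Name', 3, 'Mr. Kamikaze')
in_list = make_comparison('Name', 6, ['Bruce', 'Dave'])

bruce = SimpleNamespace(Name='Bruce')
kamikaze = SimpleNamespace(Name='Mr. Kamikaze')
```

<p>Because the operator is just an index, a search page could pass the user's
choice straight through after validating that it lies in the range 0 to
9.</p>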

<h4>Combining Expressions</h4>
<p>Expressions can be combined. The <tt>&amp;</tt> operator joins two
Expressions with a logical "and". For example:
<pre>>>> a = logic.Expression(lambda x: x.Size > 3)
>>> b = logic.Expression(lambda x: x.Size &lt;= 15)
>>> c = a &amp; b
>>> c
logic.Expression(lambda x: (x.Size > 3) and (x.Size &lt;= 15))</pre>
The <tt>+</tt> operator works just like the <tt>&amp;</tt> operator. The
<tt>|</tt> operator combines the two Expressions with a logical 'or'.</p>
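<p>The operator behavior can be imitated with Python's operator-overloading
hooks. The class below is a simplified stand-in written for this page: it
composes two callables at call time, whereas the output above shows that
<tt>logic.Expression</tt> produces a single merged lambda. The <tt>Box</tt>
class is likewise hypothetical.</p>

```python
class Expression:
    """Hypothetical stand-in that supports &, + and | between tests."""

    def __init__(self, func):
        self.func = func

    def __call__(self, unit):
        return bool(self.func(unit))

    def __and__(self, other):
        # Logical "and": both tests must pass.
        return Expression(lambda unit: self(unit) and other(unit))

    # "+" behaves exactly like "&".
    __add__ = __and__

    def __or__(self, other):
        # Logical "or": either test may pass.
        return Expression(lambda unit: self(unit) or other(unit))


class Box:
    """Hypothetical object with a Size attribute."""

    def __init__(self, size):
        self.Size = size


a = Expression(lambda x: x.Size > 3)
b = Expression(lambda x: x.Size <= 15)
c = a & b
d = a | b
```

<p>Here <tt>c</tt> accepts only sizes in the range 4 to 15, while
<tt>d</tt> accepts every size, since any number is either greater than 3 or
at most 15.</p>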

<p>When you combine two Expressions with dissimilar argument lists,
what happens? The Expression class doesn't really care what the argument
names are, just their order, so the names might not come out as you might
expect; however, the logic is preserved:</p>

<pre>>>> f = logic.filter(Name='Bruce')
>>> f
logic.Expression(lambda x: x.Name == 'Bruce')
>>> g = logic.Expression(lambda a, b, **kw: a.Name + b.Surname == kw['Full Name'])
>>>
>>> f + g
logic.Expression(lambda x, b, **kw: (x.Name == 'Bruce')
and (x.Name + b.Surname == kw['Full Name']))
>>> g + f
logic.Expression(lambda a, b, **kw: (a.Name + b.Surname == kw['Full Name'])
and (a.Name == 'Bruce'))
</pre>

<h4>Exporting the <tt>logic</tt> module</h4>
<p>The <tt>logic</tt> module (and <tt>codewalk</tt>, on which it is built)
isn't limited to Dejavu. Feel free to use it in some other framework or
script! The only change you may have to make (if you relocate the module
outside of the <tt>dejavu</tt> package) would be to the single line:
<tt>from dejavu import codewalk</tt>, to point to the new location.</p>

<p>In particular, <tt>logic.Expression</tt> objects can operate on <i>any</i>
Python objects, not just dejavu <tt>Unit</tt> instances. If you wish to
provide additional logic functions (as dejavu does), simply inject them
into <tt>logic</tt>'s globals.</p>

<p>You may also find the underlying <tt>codewalk</tt> module useful for
other purposes on its own. The <tt>Visitor</tt> base class can be very
convenient for building bytecode hacks.</p>

<p>To make a long story short, Dejavu depends on <tt>logic</tt> throughout,
but the reverse is not true.</p>


<h3><a name='Engines'>Unit Engines</a></h3>
<p>Once you've created and associated your Unit classes, you can begin to
write "business logic" code (mostly inside those classes, we hope), and
"presentation logic" code (mostly outside those classes). In most cases,
you will construct Expressions within your own code manually to retrieve
Units. Sometimes, however, you need to persist query parameters from your
users; in other cases, you might store a list of Units which match a query
(regardless of who formed the necessary Expression). Finally, you might
wish to manipulate lists of Units as sets: differences, intersections,
and unions. The <tt>engines</tt> module addresses all of these needs.</p>

<h4>Collections: Lists of Units</h4>
<p>The <tt>UnitCollection</tt> class provides a means of storing a list
of Units, or rather, a list of Unit identifiers. You use its <tt>Type</tt>
property to indicate the class of the indexed Units. That value should be
the <b>name</b> of the Unit Class, <b>not</b> the class object itself
(this is different from most other calls in Dejavu). If you need to
retrieve the actual Unit class, call <tt>UnitCollection().unit_class()</tt>.</p>

<p><tt>UnitCollection</tt> itself subclasses <tt>dejavu.Unit</tt>; you can
therefore persist Unit Collections via most Dejavu Storage Managers. It's
recommended, but not required, that SMs handle Unit Collections, so check
whether yours does.</p>

<p>Each Collection has a thread lock (an RLock, actually) which you should
<tt class='def'>acquire()</tt> before you add an ID to the set,
and <tt class='def'>release()</tt> afterward. If you use the
<tt class='def'>add(ID)</tt> method, this locking is done
for you.</p>
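<p>The locking pattern might look like the sketch below. The
<tt>IdCollection</tt> class is a hypothetical stand-in built on the standard
<tt>threading.RLock</tt>; its attribute names, and the choice to skip IDs
that are already present, are assumptions for the illustration rather than a
description of <tt>UnitCollection</tt>.</p>

```python
import threading


class IdCollection:
    """Hypothetical stand-in for a collection of Unit identifiers."""

    def __init__(self):
        self.lock = threading.RLock()
        self._ids = []

    def add(self, unit_id):
        # Hold the lock for the whole update; "with" releases it even on error.
        with self.lock:
            if unit_id not in self._ids:
                self._ids.append(unit_id)

    def ids(self):
        with self.lock:
            return list(self._ids)


collection = IdCollection()


def worker():
    for unit_id in range(100):
        collection.add(unit_id)


# Several threads add the same IDs concurrently.
threads = [threading.Thread(target=worker) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

# Manual locking, for code that touches the list without going through add().
collection.lock.acquire()
try:
    collection._ids.append(100)
finally:
    collection.lock.release()
```

<p>Because the lock is re-entrant, code that already holds it can still call
<tt>add()</tt> from the same thread without deadlocking.</p>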

<p>When you need to retrieve the actual Units which are indexed by the
Collection, call the <tt class='def'>units(quota=None)</tt> method,
which will look up the Units and return them in a list. Since the Collection only
stores identifiers, it is possible that one of the indexed Units may have
been destroyed since the list was built. The <tt>units</tt> method simply
passes over these "phantom" Units. You can inspect the full list of IDs
in the Collection (whether they reference existing Units or not) with
the <tt class='def'>ids()</tt> method.</p>

<p>Collections also provide a convenience function for grouping Units
by attribute: <tt class='def'>xdict(attr)</tt>. This function will look up
each Unit in the Collection, inspect the attribute that you specify, and return
a dictionary of the form <tt>{attr_val1: [Unit, Unit, ...]}</tt>.
Each distinct attribute value will have its own key, with a list of
matching Units as the value.</p>
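<p>The shape of the dictionary returned by <tt>xdict</tt> can be reproduced
with a short grouping loop. The <tt>group_by_attr</tt> function below is a
hypothetical stand-in that works on any list of objects; it does not look
anything up in storage.</p>

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_attr(units, attr):
    """Return {attribute value: [units having that value]}."""
    groups = defaultdict(list)
    for unit in units:
        groups[getattr(unit, attr)].append(unit)
    return dict(groups)


rex = SimpleNamespace(Name='Rex', Type='Dog')
tom = SimpleNamespace(Name='Tom', Type='Cat')
fido = SimpleNamespace(Name='Fido', Type='Dog')

by_type = group_by_attr([rex, tom, fido], 'Type')
```

<p>Each distinct <tt>Type</tt> value becomes a key, and the objects keep
their original order within each list.</p>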

<h4>Engines</h4>
<p>You can form Collections by hand, but a more powerful technique is
the <tt>UnitEngine</tt>, a factory for Collections. Engines are very
simple: they possess a set of <i>rules</i> which are executed when
you want to take a <i>snapshot</i> of Units. The snapshot which is
produced is a <tt>UnitCollection</tt> object. Whenever you call
<tt class='def'>take_snapshot()</tt>, the Engine will maintain an association
to the resulting Collection. You can access past snapshots with the
<tt class='def'>snapshots()</tt> method.</p>

<p>Engines are themselves Units, and can be persisted via Storage Managers.
The only properties they possess are: an <tt class='def'>ID</tt>,
a <tt class='def'>Name</tt>, an <tt class='def'>Owner</tt>,
a <tt class='def'>FinalClassName</tt>, and <tt class='def'>Created</tt>,
the creation date of the Engine.</p>

<p>The <tt>Owner</tt> property should either be a user name, or one of the
reserved names: "Public" and "System". By default, the
<tt class='def'>permit()</tt>
method allows a user read-access to the Engine if they are the Owner, or
the Owner is "Public" or "System". Write-access is permitted if the user
is the Owner, or the Owner is "Public". Feel free to override
<tt>permit()</tt> in a subclass to provide different behaviors.</p>

<p>The <tt>FinalClassName</tt> is set for you as you add Rules to the
Engine. You can use the value of this property, for example, to tell
your users, "Engine #23569 is an 'Armadillo' engine," when it produces
Collections of <tt>Armadillo</tt> Units. The only time you might want to
set this value manually is when you first create the Engine, before you
have added any Rules.</p>

<h4><a name='unitenginerules'>Rules</a></h4>
<p>Just like Collections and Engines, <tt>UnitEngineRule</tt> is <i>also</i>
a subclass of <tt>Unit</tt>, and can be persisted via Storage Managers. All
three work together to provide a complete, dynamic, application-level query
generator.</p>

<p>Okay, so what are Rules? You might say they're a "little language",
with the following primitives, or "operations":</p>
<table>
<tr><th>Operation</th><th>Operand(s)</th><th>Description</th></tr>
<tr><th colspan='3'>Operations on a single set</th></tr>
<tr>
<td>CREATE</td>
<td>The classname of the new Type</td>
<td>Creates a new Set of the specified Type. All Units of that Type
are included in the new Set.</td>
</tr>
<tr>
<td>FILTER</td>
<td>A <tt>logic.Expression</tt></td>
<td>Removes Units from the current Set which do not match the
Expression.</td>
</tr>
<tr>
<td>FUNCTION</td>
<td>The name of a function in the <tt>Arena.engine_functions</tt>
dict</td>
<td>Calls the function, passing the current Set. The function
should modify the Set.</td>
</tr>
<tr>
<td>TRANSFORM</td>
<td>The classname of the new Type</td>
<td>Transform the current Set into a Set of associated Units
(of another Type). The association must be present in the
<tt>Arena.associations</tt> graph.</td>
</tr>
<tr>
<td>RETURN</td>
<td></td>
<td>Optional. If omitted, the last Set handled is returned as the
snapshot. If supplied, the ID of the Set to return.</td>
</tr>
<tr><th colspan='3'>Operations on two sets</th></tr>
<tr>
<td>COPY</td>
<td>The Set ID of the new Set</td>
<td>Copies the current Set to a new Set. The current Set is unchanged.</td>
</tr>
<tr>
<td>DIFFERENCE</td>
<td>The ID of the Set to mix in</td>
<td>Removes IDs from the current Set which exist in the second Set.</td>
</tr>
<tr>
<td>INTERSECTION</td>
<td>The ID of the Set to mix in</td>
<td>Removes IDs from the current Set which <i>do not</i> exist in the
second Set.</td>
</tr>
<tr>
<td>UNION</td>
<td>The ID of the Set to mix in</td>
<td>Adds any IDs to the current Set which exist in the second Set.</td>
</tr>
</table>

<p>Each Rule has an <tt>Operation</tt> property (a string, one of the above),
a <tt>SetID</tt>, and an <tt>Operand</tt>. Here's an example ruleset:</p>
<table>
<tr><th>Sequence</th><th>Operation</th><th>SetID</th><th>Operand</th></tr>
<tr><td>1</td><td>CREATE</td><td>1</td><td>Invoice</td></tr>
<tr><td>2</td><td>FILTER</td><td>1</td><td>(Expression)</td></tr>
<tr><td>3</td><td>CREATE</td><td>2</td><td>Inventory</td></tr>
<tr><td>4</td><td>FILTER</td><td>2</td><td>(Expression)</td></tr>
<tr><td>5</td><td>TRANSFORM</td><td>2</td><td>Invoice</td></tr>
<tr><td>6</td><td>DIFFERENCE</td><td>1</td><td>2</td></tr>
<tr><td>7</td><td>RETURN</td><td>1</td><td></td></tr>
</table>

<p>As you can see, every Rule operates on a <i>Set</i> of Units. The first
rule is always to CREATE a set, declaring it to contain a certain Type
of Units. In most cases, you will then FILTER that set. If you simply
created a set and then returned it, it would contain all Units of the
declared Type. When you filter a set, however, you remove Units from
the whole which do not match the filter's Expression.</p>

<p>In the example above, we CREATE a second Set so that we can eventually
obtain the DIFFERENCE between Set 1 and Set 2. The second Set contains
Units of a different Type than the first. Once we filter Set 2, we then
TRANSFORM it; for each Inventory Unit, we look up associated Invoice
Units. Then, we find the difference between the two Invoice sets and
RETURN it.</p>
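<p>The seven rules in the example ruleset map closely onto Python's built-in
set operations. The walk-through below uses invented data (invoice totals,
and inventory items that each point at one invoice) and invented filter
conditions; it only mirrors the sequence of operations and says nothing
about how a <tt>UnitEngine</tt> executes them.</p>

```python
# Invented data: invoice ID -> total
invoices = {1: 250, 2: 80, 3: 400, 4: 120}
# Invented data: item ID -> (name, discontinued?, associated invoice ID)
inventory = {
    10: ("widget", True, 1),
    11: ("gadget", False, 3),
    12: ("gizmo", True, 4),
}

# 1. CREATE Set 1 of Invoice: starts with every invoice ID.
set1 = set(invoices)
# 2. FILTER Set 1: keep invoices whose total exceeds 100.
set1 = {i for i in set1 if invoices[i] > 100}
# 3. CREATE Set 2 of Inventory: starts with every item ID.
set2 = set(inventory)
# 4. FILTER Set 2: keep discontinued items.
set2 = {i for i in set2 if inventory[i][1]}
# 5. TRANSFORM Set 2 to Invoice: replace each item by its associated invoice.
set2 = {inventory[i][2] for i in set2}
# 6. DIFFERENCE: remove from Set 1 every ID that exists in Set 2.
set1 -= set2
# 7. RETURN Set 1 as the snapshot.
snapshot = sorted(set1)
```

<p>With this data the snapshot contains the large invoices that have no
discontinued item attached to them.</p>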

<p>Rules are executed in order according to their <tt>Sequence</tt>
attribute (lowest first). When you use the <tt>Engine.add_rule</tt> method,
the next <tt>Sequence</tt> value is retrieved for you. Notice that each
Rule belongs to one and only one Engine; they are not shared between
Engines. Each Rule has its own <tt>EngineID</tt> attribute.</p>

<h4>Engine Functions</h4>
<p>The FUNCTION rule deserves special mention. The Operand of a FUNCTION
rule is a string, a key in the <tt>Arena.engine_functions</tt> dictionary.
When the rule is executed, that key is used to look up the function, which
is then called, passing <tt>(sandbox, set)</tt>. The function should
mutate the set directly. Use FUNCTION rules to mutate sets in ways which
are more complex than those provided by FILTER and TRANSFORM. For example,
you might provide a function which removes all but the first Unit in the
Set (according to some ordering algorithm).</p>
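<p>A function of that kind might be sketched as follows. Everything here is
an assumption made for illustration: the Set is modeled as a plain Python
<tt>set</tt> of IDs, "first" means the lowest ID, the sandbox argument is
unused, and the <tt>engine_functions</tt> dictionary is a local stand-in for
<tt>Arena.engine_functions</tt>.</p>

```python
def keep_first(sandbox, unit_ids):
    """Remove all but the lowest ID from the set, mutating it in place."""
    if unit_ids:
        first = min(unit_ids)
        # intersection_update changes the existing set object rather than
        # building a new one, so the caller sees the change.
        unit_ids.intersection_update({first})


# Stand-in registry keyed by the string a FUNCTION rule would carry as Operand.
engine_functions = {"keep_first": keep_first}

current = {42, 7, 19}
engine_functions["keep_first"](None, current)

empty = set()
engine_functions["keep_first"](None, empty)
```

<p>The function returns nothing; its effect is visible only through the
mutated set, which matches the requirement that the function mutate the set
directly.</p>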


<h3>Analysis Tools</h3>
<p>Dejavu includes various tools to help you manipulate groups of Units.</p>

<h4>Sorting Units</h4>
<p>When you recall Units, you receive a list. However, the <tt>recall</tt>
method doesn't do any sorting; you must sort your list in your Python code.
Dejavu provides a <tt class='def'>sort(attrs, descending=False)</tt>
function to assist you in sorting Units. It returns a comparison function,
which you can then pass to Python's <tt>list.sort</tt> method (which operates
in place). Continuing our example:
<pre>people.sort(dejavu.sort('Size', 'Name'))</pre>
The most important issue (and the reason we don't just use Python 2.4's
<tt>attrgetter</tt>) is that any Unit property must allow values of None,
which tends to raise errors when compared to values of other types. The
function which <tt>sort</tt> creates for you treats None as "less than" any
other value.</p>
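<p>The "None sorts first" rule can be reproduced in current Python with a
key function, since <tt>list.sort</tt> no longer accepts a comparison
function as its first argument. The <tt>none_first_key</tt> helper below is
a hypothetical equivalent written for this page, not
<tt>dejavu.sort</tt>.</p>

```python
from types import SimpleNamespace


def none_first_key(*attrs):
    """Build a sort key over the named attributes that places None first."""

    def key(unit):
        # Each attribute becomes (is_not_none, value). False sorts before
        # True, so a None value is never compared against a real value.
        return tuple((getattr(unit, attr) is not None, getattr(unit, attr))
                     for attr in attrs)

    return key


people = [
    SimpleNamespace(Size=3, Name="Bo"),
    SimpleNamespace(Size=None, Name="Al"),
    SimpleNamespace(Size=1, Name=None),
    SimpleNamespace(Size=1, Name="Cy"),
]

# Sort in place by Size, then by Name.
people.sort(key=none_first_key('Size', 'Name'))
names = [person.Name for person in people]
```

<p>The person with no <tt>Size</tt> comes first, and among the two people of
size 1 the one with no <tt>Name</tt> comes first.</p>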

<h4>Cross-tabulation</h4>
<p>Cross-tabs (also called <i>aggregate tables</i> or <i>pivot tables</i>)
display aggregate information about objects by category. For example,
rather than show a list of Safari records, one row per trip, you might
wish to show a table where each row represents a Destination, and each
column shows the count of Safaris to that Destination for each distinct
Year. In this example, we say that the Safaris are "grouped by" their
Destination values, and that we "pivot" on the Year values.</p>

<p>Dejavu helps you form such a table via the <tt>CrossTab</tt> class.
You need to specify the group(s) you wish to use, and the pivot attribute.
Finally, you must specify the aggregate function. Here's a code example:
<pre>
>>> data = ["a", "b", "cc", "bddd", "a4", "b6"]
>>> group = lambda x: x.isalpha()
>>> pivot = lambda x: x[0]
>>> ctab = analysis.CrossTab(data, [group], pivot, dejavu.COUNT)
>>> data, columns = ctab.results()
>>> data
{(True,): {"a": 1, "b": 2, "c": 1},
(False,): {"a": 1, "b": 1}}
>>> columns
["a", "b", "c"]</pre>
You may notice that we're not using Units in our example; the
<tt>CrossTab</tt> class is designed to work with any objects. Here's one
way to lay out that data:</p>
<table>
<tr><th>Is Alpha</th><th>a</th><th>b</th><th>c</th></tr>
<tr><td>Y</td><td>1</td><td>2</td><td>1</td></tr>
<tr><td>N</td><td>1</td><td>1</td><td>0</td></tr>
</table>

<p>The <tt>results</tt> method returns two values. First, the table
itself in the form of a dictionary; each key is a tuple of group values,
and the corresponding value is a sub-dictionary. Each sub-dict has keys
which are the pivot attribute, and values which equal the aggregates.
I know, that was confusing; look at the example. The second value to
be returned is a list of the pivot column values; you'll notice they're
sorted.</p>

<p>The groups and pivot arguments may be either strings or functions.
If strings, they must be the names of attributes of the source objects.
The final aggfunc argument defaults to COUNT, but may also be SUM.
More aggfuncs may arrive in the future.</p>
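<p>For readers who want to see the counting logic spelled out, the structure
returned by <tt>results</tt> can be rebuilt with nested dictionaries. The
<tt>crosstab_count</tt> function below is a hypothetical stand-in that
handles only function arguments and only counting; it is not the
<tt>CrossTab</tt> implementation.</p>

```python
from collections import defaultdict


def crosstab_count(items, groups, pivot):
    """Count items per (group values) row and per pivot value column."""
    table = defaultdict(lambda: defaultdict(int))
    columns = set()
    for item in items:
        row_key = tuple(group(item) for group in groups)  # one entry per group
        column = pivot(item)
        table[row_key][column] += 1
        columns.add(column)
    # Convert to plain dicts and return the column values in sorted order.
    return {key: dict(row) for key, row in table.items()}, sorted(columns)


data = ["a", "b", "cc", "bddd", "a4", "b6"]
table, columns = crosstab_count(data, [str.isalpha], lambda x: x[0])

# Lay the table out row by row, filling missing cells with 0.
rows = [[table[key].get(column, 0) for column in columns]
        for key in [(True,), (False,)]]
```

<p>The last step shows where the 0 in the laid-out table comes from: the
non-alphabetic row simply has no "c" key in its sub-dictionary.</p>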

<hr />

<p><a name='hettinger'>[1]</a> Python Cookbook,
<a href='http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277940'>Binding
Constants at compile time</a><br />
</p>

</body>
</html>