<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

<head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Dejavu: Modeling your Application</title>
    <link href='dejavu.css' rel='stylesheet' type='text/css' />
</head>

<body>

<h2>Application Developers: Using Dejavu to Construct a Domain Model</h2>

<h3>Units</h3>
<p>When constructing a Domain Model for your application, you will want
to distinguish between objects that will be persisted and objects that
will not. By registering a subclass of <tt>dejavu.Unit</tt>, you allow
instances of that subclass to be persisted.</p>

<p>Before you can register your Unit class, you must create it:
<pre>import dejavu
class Printer(dejavu.Unit): pass</pre>
This is all you need for a fully-functioning Unit class. There are
no methods or attributes that you are required to override; simply
subclass from <tt>Unit</tt>. However, this is a fairly uninteresting
class. It doesn't provide any functionality other than what <tt>Unit</tt>
already provides. The first thing we will probably want to add to our
new class is persistent data.</p>

<h4>UnitProperty</h4>
<p>Once you have defined a persistent class (by subclassing <tt>Unit</tt>),
you need to make another decision. Rather than persist the entire object
<tt>dict</tt>, you specify a subset of persistent attributes by using
<tt>UnitProperty</tt>, a data descriptor. If you've used Python's builtin
<tt>property()</tt> construct, you've used descriptors before.</p>

<p>We might enhance our Printer example like this:
<pre>from dejavu import Unit, UnitProperty
class Printer(Unit):
    Manufacturer = UnitProperty(unicode)
    ColorCopies = UnitProperty(bool)
    PPM = UnitProperty(float)</pre>
This adds three persistent attributes to our <tt>Printer</tt> objects,
each with a different datatype. In addition, every subclass of <tt>Unit</tt>
inherits an <tt>ID</tt> property, an int.</p>

<p>When you get and set <tt>UnitProperty</tt> attributes, they behave much
like any other attributes:
<pre>&gt;&gt;&gt; p = Printer()
&gt;&gt;&gt; p.PPM = 25
&gt;&gt;&gt; p.PPM
25.0</pre>
However, you will notice right away that the int value we provided has been
coerced to a float behind the scenes. This is because we specified the PPM
attribute as a <tt>float</tt> type when we created it. The value of a Unit
Property is restricted to the type which you specify. The only other valid
value for a Unit Property is <tt>None</tt>; any Property may be <tt>None</tt>
at any time, and in fact, all Properties are <tt>None</tt> until you assign
values to them:
<pre>&gt;&gt;&gt; p.ColorCopies is None
True</pre></p>

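<p>If you are curious how a coercing data descriptor can work, here is a
minimal standalone sketch. It does not use Dejavu at all: the
<tt>TypedProperty</tt> class and its storage in the instance <tt>__dict__</tt>
are assumptions made for illustration, not Dejavu's actual implementation,
and it is written for Python 3.6 or later.</p>

```python
class TypedProperty(object):
    """Hypothetical stand-in for UnitProperty (not Dejavu's code).

    A data descriptor that coerces assigned values to a declared type
    and treats None as always valid.
    """

    def __init__(self, type=str, key=None):
        self.type = type
        self.key = key

    def __set_name__(self, owner, name):
        # Python 3.6+: remember the attribute name if none was supplied.
        if self.key is None:
            self.key = name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class: hand back the descriptor itself.
            return self
        # Properties that were never assigned read as None.
        return instance.__dict__.get(self.key)

    def __set__(self, instance, value):
        if value is not None and not isinstance(value, self.type):
            value = self.type(value)  # coerce, e.g. int -> float
        instance.__dict__[self.key] = value


class Printer(object):
    PPM = TypedProperty(float)
    ColorCopies = TypedProperty(bool)


p = Printer()
p.PPM = 25
print(repr(p.PPM))            # 25.0
print(p.ColorCopies is None)  # True
```

<p>Because the descriptor defines <tt>__set__</tt>, it takes precedence over
the instance dictionary, which is what lets it intercept every assignment.</p>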
<h4>Unit IDs</h4>
<p>The <tt>Unit</tt> base class possesses a single Unit Property, an int
named <tt>ID</tt>. If you wish to use IDs of a different type, simply override
the ID attribute in your subclass:
<pre>class Printer(Unit):
    ID = UnitProperty(unicode)</pre>
Every Unit must possess an ID property. The ID is what allows each Unit
within the system to be identified uniquely.</p>

<h4>Creating and Populating Properties</h4>
<p>In addition to defining Unit Properties within your class body,
you can define them after the class body has been executed via
the classmethod <tt>set_property()</tt>. For example, the following
two classes are equivalent:
<pre>class Publication(Unit):
    Content = UnitProperty(unicode)

class Publication(Unit): pass
Publication.set_property('Content', unicode)</pre>

Declarations outside of the class body allow more dynamic setting of
Unit properties. You can define multiple properties at once via
the <tt>set_properties()</tt> classmethod:

<pre>class Publication(Unit): pass
Publication.set_properties({'Content': unicode,
                            'Publisher': unicode,
                            'Year': int,
                            })</pre>
</p>

<p>You also have options when populating Unit Properties. The standard way
is simply to reference them as normal Python instance attributes. However,
you may also use the <tt>adjust()</tt> method to modify multiple properties
at once; pass in keyword arguments which match the properties you wish to
modify. Keyword arguments also work when instantiating the object. For
example, the following three code snippets are equivalent:

<pre>pub = Publication()
pub.Publisher = 'Walter J. Black'
pub.Year = 1928

pub = Publication()
pub.adjust(Publisher='Walter J. Black', Year=1928)

pub = Publication(Publisher='Walter J. Black', Year=1928)</pre>
</p>

<h4>Unit Properties are First-Class Objects</h4>
<p>Like many descriptors, Unit Properties behave differently when you access
them from the class, rather than from an instance as above. When you access
them from the class, you receive the <tt>UnitProperty</tt> object itself,
rather than its value for a given instance. That is,
<pre>&gt;&gt;&gt; c = Printer.ColorCopies
&gt;&gt;&gt; c
&lt;dejavu.UnitProperty object at 0x01112970&gt;</pre>
This is significant, because it allows us to store metadata about the
property itself:
<pre>&gt;&gt;&gt; c.key, c.index, c.type, c.hints
('ColorCopies', False, &lt;type 'bool'&gt;, {})</pre>
The <tt>key</tt> attribute is merely the property's canonical name. The
<tt>index</tt> value tells Storage Managers whether or not to index the
column. The <tt>type</tt> attribute limits property values to instances
of that type (or <tt>None</tt>). Finally, the <tt>hints</tt> dictionary
provides hints to Storage Managers to help optimize storage. A common use,
for example, is to inform Managers that would usually store unicode strings
as strings of length 255 that a particular value should be a larger object;
this is done with a 'Size' mapping, such as <tt>hints = {u'Size': 0}</tt>,
where 0 implies no limit.</p>

<p>When you define a UnitProperty instance, you can pass in these extra
attributes. The signature for UnitProperty is <tt>(type=unicode,
index=False, hints={}, key=None)</tt>. Supply any, all, or none of them
as needed.</p>

<h4>Triggers</h4>
<p>In addition, each UnitProperty has a <tt>pre</tt> and <tt>post</tt>
attribute, which default to None. If you override these with methods
in a subclass of <tt>UnitProperty</tt>, they will be called when setting
a new value for that property, either before (pre) or after (post) the
new value is set. For example:
<pre>import datetime

class DatedProperty(UnitProperty):
    def post(self, unit, value):
        unit.Date = datetime.datetime.now().replace(microsecond=0)
        parent = unit.first(Forum)
        if parent:
            parent.Date = unit.Date

class Topic(Unit):
    Date = UnitProperty(datetime.date)
    Content = DatedProperty()
    ForumID = UnitProperty(int)

class Forum(Unit):
    Date = UnitProperty(datetime.date)

associate(Topic, 'ForumID', Forum, 'ID')</pre>
In this example, whenever Topic().Content is set, the <tt>post</tt>
method will be called and the object's <tt>Date</tt> attribute will
be modified. Then, the Topic's parent Forum is looked up and <i>its</i>
<tt>Date</tt> is modified.</p>

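<p>To see the call order in isolation, here is a small Dejavu-free sketch
of a descriptor with optional <tt>pre</tt> and <tt>post</tt> hooks. The
<tt>HookedProperty</tt> and <tt>LoggedProperty</tt> names are hypothetical,
and the sketch assumes <tt>pre</tt> takes the same <tt>(unit, value)</tt>
arguments that <tt>post</tt> does in the example above.</p>

```python
class HookedProperty(object):
    """Hypothetical descriptor that calls optional pre/post hooks on set."""

    pre = None
    post = None

    def __init__(self, key):
        self.key = key

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__.get(self.key)

    def __set__(self, instance, value):
        if self.pre is not None:
            self.pre(instance, value)    # runs while the old value is still set
        instance.__dict__[self.key] = value
        if self.post is not None:
            self.post(instance, value)   # runs once the new value is in place


class LoggedProperty(HookedProperty):
    def pre(self, unit, value):
        unit.log.append(("pre", unit.Content, value))

    def post(self, unit, value):
        unit.log.append(("post", unit.Content, value))


class Topic(object):
    Content = LoggedProperty("Content")

    def __init__(self):
        self.log = []


t = Topic()
t.Content = "hello"
print(t.log)  # [('pre', None, 'hello'), ('post', 'hello', 'hello')]
```

<p>The log shows that <tt>pre</tt> still sees the old value (here
<tt>None</tt>) while <tt>post</tt> sees the new one.</p>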
<p>As with any trigger system, you need to be careful not to have triggers
called out of order. For example, if a user changes both the ForumID and
Content properties in a single operation (like a web page submit), the old
Forum will be incorrectly modified if the Content property is applied
first. I don't have any cool tools built into Dejavu to help you with
this, but I'm open to suggestions.</p>

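<p>One way to sidestep the ordering problem in your own code is to apply
submitted values in an explicit order. The sketch below is only a
suggestion and is not part of Dejavu: <tt>apply_changes</tt> and
<tt>Recorder</tt> are hypothetical helpers, and <tt>Recorder</tt> exists
purely to show the order in which attributes get assigned.</p>

```python
class Recorder(object):
    """Records the order in which attributes are assigned (demo only)."""

    def __init__(self):
        object.__setattr__(self, "order", [])

    def __setattr__(self, name, value):
        self.order.append(name)
        object.__setattr__(self, name, value)


def apply_changes(unit, changes, first=()):
    """Assign the names listed in `first` before everything else."""
    remaining = dict(changes)
    for name in first:
        if name in remaining:
            setattr(unit, name, remaining.pop(name))
    # The rest are applied in a stable, sorted order.
    for name in sorted(remaining):
        setattr(unit, name, remaining[name])


topic = Recorder()
apply_changes(topic, {"Content": "New text", "ForumID": 7}, first=["ForumID"])
print(topic.order)  # ['ForumID', 'Content']
```

<p>With a real Topic, setting <tt>ForumID</tt> first means that a trigger on
<tt>Content</tt> would look up the new parent rather than the old one.</p>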
<h4>Registration of Unit Classes</h4>
<p>In addition to defining your Unit class, you must also register that
class with your application's <tt>Arena</tt> object. Each class which
you want Dejavu to manage must be passed to <tt>Arena.register(cls)</tt>.
If you create a module with multiple classes, you can register them all
at once with <tt>Arena.register_all(globals())</tt>. It will grab any
Unit subclasses out of your module's <tt>globals()</tt> (or any other mapping
you pass to <tt>register_all</tt>) and register them.</p>

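<p>The idea behind <tt>register_all</tt> can be sketched without Dejavu as a
scan over a mapping's values. Everything here is a stand-in: the
<tt>Registry</tt> class is hypothetical, <tt>Unit</tt> is a dummy base class,
and skipping the base class itself is an assumption about sensible
behaviour rather than a statement about what <tt>Arena</tt> does.</p>

```python
class Unit(object):
    """Dummy base class for this sketch only."""


class Registry(object):
    """Hypothetical miniature of an Arena's class registry."""

    def __init__(self):
        self.classes = {}

    def register(self, cls):
        self.classes[cls.__name__] = cls
        return cls

    def register_all(self, mapping):
        # Register every strict subclass of Unit found among the values.
        found = []
        for value in list(mapping.values()):
            if (isinstance(value, type)
                    and issubclass(value, Unit)
                    and value is not Unit):
                self.register(value)
                found.append(value)
        return found


class Printer(Unit):
    pass


class Helper(object):  # not a Unit, so it is skipped
    pass


registry = Registry()
registry.register_all({"Printer": Printer, "Helper": Helper,
                       "Unit": Unit, "answer": 42})
print(sorted(registry.classes))  # ['Printer']
```

<p>In a real module you would pass <tt>globals()</tt> instead of the literal
dictionary used here.</p>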
<h3>Sandboxes</h3>
<p>During the life of a client connection, your application should create
and use a <tt>Sandbox</tt> to manage the set of "live" Units. A Sandbox
manages the in-memory lifecycle of Units: creation, identity, mutation, and
destruction. Sandboxes route persistence operations on Units to the correct
Storage Manager.</p>

<p>You can create Sandbox objects directly. They take a single argument, the
top-level <tt>Arena</tt> object. Arenas also provide a convenience method,
<tt>new_sandbox</tt>, which does this for you. The following lines are
equivalent:
<pre>box = Sandbox(myArena)

box = myArena.new_sandbox()</pre>
You might often choose the latter when you have a reference to the Arena
object, and would rather avoid importing dejavu yet again just to obtain
the Sandbox class.</p>

<h4>Memorizing Units</h4>
<p>When you create a Unit instance, it exists in isolation. There is no
connection between that Unit and storage; your Unit will not be persisted,
because Dejavu doesn't yet possess a reference to your Unit. To provide
that link, you <i>memorize</i> your Unit (or rather, you tell your Sandbox
to memorize it):
<pre>class Publisher(Unit):
    ID = UnitProperty(unicode)
    City = UnitProperty(unicode)

p = Publisher(ID='Walter J. Black')
box.memorize(p)</pre></p>

<p>Memorization does several things. First, it places your new Unit into
your Arena. That Unit instance will now be persisted by the appropriate
Storage Manager. It can be recalled from storage when needed, using the
built-in Expression syntax. It may have been given an ID (see
<u>Sequencing</u>, below). Memorization also makes your Unit
<i>concrete</i>; that is, your Unit will now possess a <tt>sandbox</tt>
attribute. Units whose <tt>sandbox</tt> attribute is not set (is None)
have no relationships, and their Unit Property triggers (if any) will
not fire.</p>

<p>You may define special methods on your Units to provide start-of-life
behaviors. If a Unit possesses an <tt>on_memorize</tt> method, it will
be called after the Unit has been 'reserved' in storage, and after the
Unit has been placed in the Sandbox cache.</p>

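<p>As a rough mental model, memorizing can be pictured with the toy code
below. It involves no storage and no Dejavu classes: <tt>MiniSandbox</tt> is
hypothetical, and keying the cache by class and ID is an assumption made
for the sketch.</p>

```python
class MiniSandbox(object):
    """Hypothetical in-memory stand-in for a Sandbox; no storage involved."""

    def __init__(self):
        self.cache = {}

    def memorize(self, unit):
        unit.sandbox = self                       # the unit is now "concrete"
        self.cache[(type(unit), unit.ID)] = unit  # one live object per class and ID
        hook = getattr(unit, "on_memorize", None)
        if hook is not None:
            hook()                                # called after caching


class Publisher(object):
    sandbox = None

    def __init__(self, ID):
        self.ID = ID
        self.events = []

    def on_memorize(self):
        cached = (type(self), self.ID) in self.sandbox.cache
        self.events.append(("on_memorize", cached))


box = MiniSandbox()
p = Publisher(ID="Walter J. Black")
print(p.sandbox is None)  # True
box.memorize(p)
print(p.sandbox is box)   # True
print(p.events)           # [('on_memorize', True)]
```

<p>The recorded event confirms that, in this sketch, the hook runs only
after the object is already in the cache.</p>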
<h4>Sequencing</h4>
<p>Every <tt>Unit</tt> has an <tt>ID</tt> property. The default ID property
is of type <tt>int</tt>; however, you can override that to whatever type
you like. As long as you provide your own IDs for Units, nothing will
break: you can memorize and recall Units without problems. However, if
you memorize a Unit with an ID of <tt>None</tt>, the Sandbox may attempt
to provide an ID for it.</p>

<p>The <tt>Unit</tt> base class possesses a <tt>sequencer</tt> attribute
to help Sandboxes generate new IDs. The default value is an instance of
<tt>UnitSequencerInteger</tt>, which examines all existing Units, finds
the maximum integer ID, adds 1, and uses that value for the new ID.</p>

<p>The other useful Sequencer is <tt>UnitSequencerNull</tt>, which simply
raises an error when asked to generate an ID. If your IDs are strings,
you'll probably want to make that class's <tt>.sequencer</tt> one of
these, and form ID values in your own code.</p>

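<p>The two strategies described above are easy to picture in plain Python.
The classes below are illustrative only: their names, the <tt>next_id</tt>
method, the choice of 1 as the first ID, and the exception type are all
assumptions, not the interface of Dejavu's sequencers.</p>

```python
class IntegerSequencer(object):
    """Next ID is max(existing IDs) + 1; starts at 1 (an assumption)."""

    def next_id(self, existing_ids):
        ids = [i for i in existing_ids if i is not None]
        return max(ids) + 1 if ids else 1


class NullSequencer(object):
    """Refuses to generate IDs; callers must supply their own."""

    def next_id(self, existing_ids):
        raise ValueError("IDs for this class must be assigned explicitly")


seq = IntegerSequencer()
print(seq.next_id([3, 7, 5]))  # 8
print(seq.next_id([]))         # 1

try:
    NullSequencer().next_id(["a", "b"])
except ValueError as exc:
    print("refused:", exc)
```

<p>Note that a "maximum plus one" scheme has to look at every existing ID,
which is one reason you may prefer to supply IDs yourself for large
sets.</p>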
<h4>Recalling</h4>
<p>Once you have memorized a Unit or two, you will probably want to
recall them at some point. Sandboxes possess two methods to
accomplish this.</p>

<h5>recall()</h5>
<p>The first is the appropriately named <tt>recall(cls, expr)</tt> method.
This is the full-blown query method. As a first argument, you pass it the
class (<b>not</b> the name of the class, but the actual class) of which you
expect to retrieve instances. The second argument should be an instance
of <tt>dejavu.logic.Expression</tt>, an object which encapsulates your
specific query (see <u>Expressions</u>, next). An example recall operation:
<pre>&gt;&gt;&gt; e = logic.Expression(lambda x: x.Year == 1928)
&gt;&gt;&gt; units = box.recall(Publication, e)
&gt;&gt;&gt; [x.Title for x in units]
[u'The Giant Horse of Oz', u'Kai Lung Unrolls His Mat',
 u'Tarzan, The Lord of the Jungle']
</pre>
If you do not supply an Expression, all Units of the given Unit class
will be retrieved. Notice that the return value is <b>not</b> a list; it is a
generator (or other iterable). You must iterate over it to retrieve all
values. By returning an iterator, we allow some Storage Managers to load
Units in a more lazy fashion. If this is a huge burden for you, let me
know; I might be convinced to add a <tt>recall_list</tt> method.</p>

<p>The <tt>recall</tt> method will take additional arguments in pairs of
<tt>cls</tt>, <tt>expr</tt>. This feature isn't fully developed yet.
It's designed to emulate JOINs, returning units which match each expr
and are related.</p>

<p>If your Unit class defines an <tt>on_recall()</tt> method, it will be
called when each Unit has been loaded from storage (at the end of the
recall process). Once the unit is loaded into a Sandbox, however,
<tt>on_recall</tt> will not be called again; it's only called at the
Sandbox/Storage Manager boundary. If <tt>on_recall</tt> raises
<tt>UnrecallableError</tt>, the unit will not be yielded back to the caller,
nor placed in the Sandbox cache.</p>

<h5>unit()</h5>
<p>The <tt>recall</tt> method can be verbose. When you want a one-liner
and only expect a single Unit, use the <tt>unit(cls, **kw)</tt> method
of Sandboxes. Again, you pass the class of Units you wish to retrieve
as the first argument. Then, supply keyword arguments of the form
"property_name=value". The method will form an equivalent Expression
for you from the keyword args. For example:
<pre>&gt;&gt;&gt; book = box.unit(Publication, ID=1)
&gt;&gt;&gt; if book:
...     print book.Title
...
Ladies in Hades</pre>
If a Unit is not found that matches the criteria, None is returned.
If multiple Units match the criteria, only the first one is returned
(although the rest are probably loaded into memory).</p>

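<p>The "first match or None" behaviour can be mimicked over an ordinary
list, which may help when reasoning about what <tt>unit()</tt> returns.
This sketch is not Dejavu code: <tt>first_match</tt> and <tt>Book</tt> are
hypothetical, and the sample records are made up for the demonstration.</p>

```python
class Book(object):
    """Plain sample object; not a Dejavu Unit."""

    def __init__(self, ID, Title, Year):
        self.ID, self.Title, self.Year = ID, Title, Year


def first_match(units, **criteria):
    """Return the first object whose attributes equal every criterion."""
    for unit in units:
        if all(getattr(unit, name, None) == value
               for name, value in criteria.items()):
            return unit
    return None  # nothing matched


shelf = [
    Book(1, "The Giant Horse of Oz", 1928),
    Book(2, "Kai Lung Unrolls His Mat", 1928),
]

print(first_match(shelf, ID=2).Title)    # Kai Lung Unrolls His Mat
print(first_match(shelf, Year=1928).ID)  # 1 (only the first of two matches)
print(first_match(shelf, ID=99))         # None
```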
<h4>Forgetting and Repressing</h4>
<p>To <i>forget</i> a Unit is to destroy it forever. You have two options
for forgetting Units: you can call <tt>Sandbox().forget(unit)</tt> or
the simpler version, <tt>Unit().forget()</tt>. Either of these will clear
the Unit from the Sandbox's cache, and the Sandbox will tell the appropriate
Storage Manager to destroy the stored Unit data. If a Unit has not yet
been memorized, you do not need to forget it.</p>

<p>In some circumstances, you may wish to only clear the Unit from the
Sandbox without destroying it. You can do this by calling either
<tt>Sandbox().repress(unit)</tt> or the simpler version,
<tt>Unit().repress()</tt>.</p>

<p>You may define special methods on your Units to provide end-of-life
behaviors. If a Unit possesses an <tt>on_forget</tt> method, it will
be called after the Unit has been destroyed. If a Unit possesses an
<tt>on_repress</tt> method, it will be called <i>before</i> the Unit
has been repressed. I'm sure there was a good reason for this
disparity, but I've forgotten (or perhaps repressed) it.</p>

<h4>Flushing Sandboxes</h4>
<p>When the client connection has closed, you should <i>flush</i> the
Sandbox caches. In general, a single call to <tt>Sandbox().flush_all()</tt>
will do the trick. Notice that flushing calls <tt>repress()</tt> for each
Unit in the Sandbox, and any <tt>on_repress()</tt> triggers will be
executed.</p>

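<p>The difference between forgetting, repressing and flushing, and the
timing of the two end-of-life hooks, can be illustrated with a toy model.
Nothing below is Dejavu code: <tt>MiniSandbox</tt> is hypothetical, and its
<tt>stored</tt> dictionary merely stands in for a Storage Manager.</p>

```python
class MiniSandbox(object):
    """Hypothetical model of cache vs. storage; not Dejavu's Sandbox."""

    def __init__(self):
        self.cache = {}
        self.stored = {}   # stands in for a Storage Manager

    def memorize(self, unit):
        key = (type(unit), unit.ID)
        self.cache[key] = unit
        self.stored[key] = unit.ID

    def repress(self, unit):
        hook = getattr(unit, "on_repress", None)
        if hook is not None:
            hook()                            # before leaving the cache
        self.cache.pop((type(unit), unit.ID), None)

    def forget(self, unit):
        key = (type(unit), unit.ID)
        self.cache.pop(key, None)
        self.stored.pop(key, None)            # destroy the stored data
        hook = getattr(unit, "on_forget", None)
        if hook is not None:
            hook()                            # after destruction

    def flush_all(self):
        for unit in list(self.cache.values()):
            self.repress(unit)


class Note(object):
    def __init__(self, ID, box):
        self.ID, self.box, self.events = ID, box, []

    def _cached(self):
        return (type(self), self.ID) in self.box.cache

    def on_repress(self):
        self.events.append(("on_repress", self._cached()))

    def on_forget(self):
        self.events.append(("on_forget", self._cached()))


box = MiniSandbox()
a, b = Note(1, box), Note(2, box)
box.memorize(a)
box.memorize(b)
box.flush_all()   # represses both; the stored data survives
box.memorize(a)
box.forget(a)     # removes the cache entry and the stored data
print(a.events)                     # [('on_repress', True), ('on_forget', False)]
print(sorted(box.stored.values()))  # [2]
```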
<h4>Aggregate Functions</h4>
<p>Sandboxes also provide a <tt>distinct(cls, attrs, expr=None)</tt>
method. This returns values, rather than Units. Put simply, it returns
all distinct values for the given attribute(s) of the Unit class provided.
If only one attribute is specified, a list of values will be returned.
If more than one attribute is specified, a zipped list will be returned
of all distinct existing combinations. Providing an expr argument (an
<tt>Expression</tt> object, see below) will filter the set of Units before
obtaining distinct values.</p>

<p>The <tt>distinct</tt> method can also be used as a <tt>count</tt>
function by passing <tt>attrs = ['ID']</tt> and taking the length of the
result. Sandboxes provide a <tt>count(cls, expr)</tt> method which does
just this.</p>

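<p>To make the return shapes concrete, here is a list-based imitation of
<tt>distinct</tt> and <tt>count</tt>. It is a sketch under assumptions: it
works on plain objects rather than a Unit class, takes a callable where
Dejavu takes an <tt>Expression</tt>, and preserves first-seen order, which
the real method may not promise.</p>

```python
from collections import namedtuple

Pub = namedtuple("Pub", "ID Publisher Year")


def distinct(units, attrs, predicate=None):
    """Distinct attribute values (one attr) or tuples (several attrs)."""
    attrs = [attrs] if isinstance(attrs, str) else list(attrs)
    seen, rows = set(), []
    for unit in units:
        if predicate is not None and not predicate(unit):
            continue  # filtered out before collecting values
        row = tuple(getattr(unit, name) for name in attrs)
        if row not in seen:
            seen.add(row)
            rows.append(row)
    return [row[0] for row in rows] if len(attrs) == 1 else rows


def count(units, predicate=None):
    # IDs are unique, so the number of distinct IDs is the number of units.
    return len(distinct(units, ["ID"], predicate))


pubs = [Pub(1, "A", 1928), Pub(2, "B", 1928), Pub(3, "A", 1930)]
print(distinct(pubs, ["Publisher"]))          # ['A', 'B']
print(distinct(pubs, ["Publisher", "Year"]))  # [('A', 1928), ('B', 1928), ('A', 1930)]
print(count(pubs, lambda u: u.Year == 1928))  # 2
```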
<h3>Querying</h3>
<p>When you retrieve Units, you often don't want to load the entire set for
a given class. In Dejavu, you filter the set according to the UnitProperty
attributes for each object. Naturally, there must be a way to express
the filter you intend. Dejavu actually provides three ways, all in the
<tt>dejavu.logic</tt> module: <tt>Expression</tt>,
<tt>filter</tt>, and <tt>comparison</tt>.</p>

<h4>The <tt>Expression</tt> class</h4>
<p>Regardless of which technique you use to express your filter, you're
going to end up with a <tt>logic.Expression</tt> object. You can build
an Expression directly, passing a single lambda as an argument:
<pre>&gt;&gt;&gt; from dejavu import logic
&gt;&gt;&gt; import datetime
&gt;&gt;&gt; f = lambda x: x.Date &gt;= datetime.date(2004, 3, 1)
&gt;&gt;&gt; e = logic.Expression(f)
&gt;&gt;&gt; e
logic.Expression(lambda x: x.Date &gt;= datetime.date(2004, 3, 1))</pre>
Neat, eh? I worked hard on that __repr__. ;)</p>

<p>It may be obvious, but we'll be explicit, here. The lambda which you pass
into an Expression must possess a single positional argument, which will
always be bound to a Unit instance. In the example above, it's named 'x',
but you can use any name you like. Using lambdas as a base means that we
can simply call <tt>Expression.evaluate(unit)</tt>, and receive a boolean
value indicating whether our Unit "passes the test". Attribute lookups on
our 'x' object will apply to Unit Properties for that Unit object.
That is, <tt>x.Date</tt> becomes <tt>unit.Date</tt>.</p>

<h4>Early binding</h4>
<p>What is not obvious from the above code snippet is perhaps the <b>most
important aspect</b> of Expressions: any globals or cell references (from
closures) in the supplied lambda get <b>bound early</b>. Compare the
following disassemblies:
<pre>&gt;&gt;&gt; import dis
&gt;&gt;&gt; dis.dis(f)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_ATTR                1 (Date)
              6 LOAD_GLOBAL              2 (datetime)
              9 LOAD_ATTR                3 (date)
             12 LOAD_CONST               1 (2004)
             15 LOAD_CONST               2 (3)
             18 LOAD_CONST               3 (1)
             21 CALL_FUNCTION            3
             24 COMPARE_OP               5 (&gt;=)
             27 RETURN_VALUE
&gt;&gt;&gt; dis.dis(e.func)
  1           0 LOAD_FAST                0 (x)
              3 LOAD_ATTR                1 (Date)
              6 LOAD_CONST               6 (datetime.date(2004, 3, 1))
              9 COMPARE_OP               5 (&gt;=)
             12 RETURN_VALUE
</pre>
As you can see, the function itself references the global 'datetime' module.
Once we wrap it in the Expression, however, the whole
<tt>datetime.date(2004, 3, 1)</tt> call becomes a constant! Thanks to
Raymond Hettinger for inspiring this solution <a href='#hettinger'>[1]</a>.
Early binding, however, implies two consequences:</p>

<p>First, any globals or cell references must be present in the lambda's
scope when it is passed into Expression(). This is the norm and shouldn't
require too much thought from you when you write Expressions. In the
example above, we simply imported <tt>datetime</tt> as you would expect.</p>

<p>Second, any globals or cell references must <b>also</b> be present in
the <tt>logic</tt> module's globals when the Expression is unpickled.
Pickling occurs when Expressions are sent over sockets, and also if
Expressions are themselves persisted to storage (for example, see
<u>Unit Engines</u>, below). This means your application should inject
globals into the <tt>logic</tt> module. Note that the <tt>logic</tt> module
already tries to import <tt>datetime</tt>, <tt>fixedpoint</tt> and
<tt>decimal</tt>.</p>

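<p>If the distinction between early and late binding is new to you, the
following plain-Python comparison may help. It only simulates early
binding by capturing a value in a default argument; Dejavu achieves the
effect by rewriting the lambda's bytecode, which this sketch does not
attempt. The names <tt>late</tt>, <tt>early</tt> and <tt>cutoff</tt> are
made up for the example.</p>

```python
import datetime

cutoff = datetime.date(2004, 3, 1)

# Late binding (Python's default): `cutoff` is looked up on every call.
late = lambda x: x >= cutoff

# Simulated early binding: the current value of `cutoff` is captured
# as a default argument at the moment the lambda is created.
early = lambda x, _cutoff=cutoff: x >= _cutoff

probe = datetime.date(2004, 6, 1)
before = (late(probe), early(probe))

cutoff = datetime.date(2005, 1, 1)   # rebind the name afterwards
after = (late(probe), early(probe))

print(before)  # (True, True)
print(after)   # (False, True)
```

<p>After the name is rebound, only the late-bound function changes its
answer. An early-bound Expression behaves like <tt>early</tt> here: it keeps
the value that was current when it was built.</p>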
<h4>External functions within Expressions</h4>
<p>Dejavu provides additional functions which can be used in Expressions.
For example, you can construct an Expression like:
<pre>logic.Expression(lambda x: x.Size &lt; 3 and x.Date &gt; dejavu.today())</pre>
In this example, the <tt>today()</tt> function breaks convention and is
actually <b>bound late</b>. That is, if you construct this Expression now
and use it six months later, <tt>today()</tt> will yield the date on which
the Expression is evaluated, not the date on which it was constructed.
Storage Managers "know about" these dejavu functions, and can use them
to build more appropriate queries. Here are the functions supplied by
the <tt>dejavu</tt> module:</p>

<table>
<tr><th>Function</th><th>Late bound?</th><th>Description</th></tr>
<tr>
    <td><tt>icontains(a, b)</tt></td>
    <td></td>
    <td>Case-insensitive test b in a. Note the operand order.</td>
</tr>
<tr>
    <td><tt>icontainedby(a, b)</tt></td>
    <td></td>
    <td>Case-insensitive test a in b. Note the operand order.</td>
</tr>
<tr>
    <td><tt>istartswith(a, b)</tt></td>
    <td></td>
    <td>True if a starts with b (case-insensitive), False otherwise.</td>
</tr>
<tr>
    <td><tt>iendswith(a, b)</tt></td>
    <td></td>
    <td>True if a ends with b (case-insensitive), False otherwise.</td>
</tr>
<tr>
    <td><tt>ieq(a, b)</tt></td>
    <td></td>
    <td>True if a == b (case-insensitive), False otherwise.</td>
</tr>
<tr>
    <td><tt>year(value)</tt></td>
    <td></td>
    <td>The year attribute of a date. If value is None, return None.</td>
</tr>
<tr>
    <td><tt>now()</tt></td>
    <td>Y</td>
    <td>datetime.datetime.now()</td>
</tr>
<tr>
    <td><tt>today()</tt></td>
    <td>Y</td>
    <td>datetime.date.today()</td>
</tr>
<tr>
    <td><tt>iscurrentweek(value)</tt></td>
    <td>Y</td>
    <td>If value is in the current week, return True, else False.</td>
</tr>
</table>

<p>It is possible for you, the application developer, to define your
own external functions. However, because Storage Managers are unaware
of your new functions, they will not be able to optimize their use;
instead, they will simply retrieve a larger set of objects from storage,
evaluate each one against the function you provide, and return those
Units which match your function. This isn't necessarily a bad thing;
it provides the same functionality as if you wrote the test inline
within your own code. By making that test a logic function, you allow
it to be stored in Engine <i>rules</i> (see <u>Unit Engines</u>,
below).</p>

<h4>Combining Expressions</h4>
<p>Expressions are combinable; by using the <tt>&amp;</tt> operator, the two
expressions are combined with an adjoining logical "and". For example:
<pre>&gt;&gt;&gt; a = logic.Expression(lambda x: x.Size &gt; 3)
&gt;&gt;&gt; b = logic.Expression(lambda x: x.Size &lt;= 15)
&gt;&gt;&gt; c = a &amp; b
&gt;&gt;&gt; c
logic.Expression(lambda x: (x.Size &gt; 3) and (x.Size &lt;= 15))</pre>
The <tt>+</tt> operator works just like the <tt>&amp;</tt> operator. The
<tt>|</tt> operator combines the two Expressions with a logical 'or'.</p>

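<p>The operator behaviour described above can be imitated with Python's
operator-overloading hooks. This is an analogy rather than Dejavu's
implementation: the hypothetical <tt>Predicate</tt> class simply wraps
callables, whereas <tt>logic.Expression</tt> merges the underlying
lambdas.</p>

```python
from collections import namedtuple


class Predicate(object):
    """Hypothetical combinable test; a stand-in for logic.Expression."""

    def __init__(self, func):
        self.func = func

    def __call__(self, obj):
        return bool(self.func(obj))

    def __and__(self, other):
        # Logical "and" of the two tests.
        return Predicate(lambda obj: self(obj) and other(obj))

    __add__ = __and__   # `+` behaves exactly like `&`

    def __or__(self, other):
        # Logical "or" of the two tests.
        return Predicate(lambda obj: self(obj) or other(obj))


Item = namedtuple("Item", "Size")

a = Predicate(lambda x: x.Size > 3)
b = Predicate(lambda x: x.Size <= 15)
both = a & b
either = a | b

print([both(Item(n)) for n in (2, 10, 20)])    # [False, True, False]
print([either(Item(n)) for n in (2, 10, 20)])  # [True, True, True]
print((a + b)(Item(10)))                       # True
```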
<h4>Using <tt>filter</tt> to form Expressions</h4>
<p>The <tt>logic</tt> module also provides convenient methods to
create common types of Expression objects via the <tt>filter</tt> and
<tt>comparison</tt> factory functions.</p>

<p>The <tt>filter(**kwargs)</tt> function produces an Expression by taking
the keyword arguments you supply, and rewriting them in lambda form. The
only operator allowed is therefore the equals '==' operator. For example:
<pre>&gt;&gt;&gt; logic.filter(Type='Cat', Mutation='Atomic')
logic.Expression(lambda x: (x.Type == 'Cat') and (x.Mutation == 'Atomic'))</pre>
</p>

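<p>In spirit, <tt>filter</tt> turns keyword arguments into a chain of
equality tests joined by "and". A Dejavu-free approximation follows;
<tt>make_filter</tt> is a hypothetical name, and it returns an ordinary
function rather than an Expression.</p>

```python
from collections import namedtuple


def make_filter(**criteria):
    """Build a test that requires every named attribute to equal its value."""
    items = sorted(criteria.items())

    def predicate(obj):
        return all(getattr(obj, name) == value for name, value in items)

    return predicate


Animal = namedtuple("Animal", "Type Mutation")

is_atomic_cat = make_filter(Type="Cat", Mutation="Atomic")
print(is_atomic_cat(Animal("Cat", "Atomic")))    # True
print(is_atomic_cat(Animal("Cat", "Ordinary")))  # False
print(is_atomic_cat(Animal("Dog", "Atomic")))    # False
```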
<h4>Using <tt>comparison</tt> to form Expressions</h4>
<p>The <tt>comparison(attr, cmp_op, criteria)</tt> function allows you to
form Expressions with dynamic operators. This can come in handy when you
are constructing Expressions on the fly from user input. For example, a
search page might prompt users for an attribute name, an operator, and an
operand (the criteria).</p>

<p>Borrowing from <tt>opcode.cmp_op</tt>, the allowed values for our cmp_op
argument are as follows:</p>
<table>
<tr><th>Numeric Value (cmp_op)</th><th>Operator</th></tr>
<tr><td>0</td><td>&lt;</td></tr>
<tr><td>1</td><td>&lt;=</td></tr>
<tr><td>2</td><td>==</td></tr>
<tr><td>3</td><td>!=</td></tr>
<tr><td>4</td><td>&gt;</td></tr>
<tr><td>5</td><td>&gt;=</td></tr>
<tr><td>6</td><td>in</td></tr>
<tr><td>7</td><td>not in</td></tr>
<tr><td>8</td><td>is</td></tr>
<tr><td>9</td><td>is not</td></tr>
</table>

<p>Here's an example of using <tt>comparison</tt>:
<pre>&gt;&gt;&gt; logic.comparison('Name', 3, 'Mr. Kamikaze')
logic.Expression(lambda x: x.Name != 'Mr. Kamikaze')</pre>
Although the comparison function only allows a single comparison at a time,
the resulting Expressions can be combined with the <tt>&amp;</tt> and <tt>|</tt>
operators (described earlier) to produce more complex Expressions.</p>

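<p>For readers who want to experiment with the numeric codes outside
Dejavu, the table can be mapped onto functions from Python's standard
<tt>operator</tt> module. The sketch hardcodes the table rather than reading
<tt>opcode.cmp_op</tt>, because the contents of that tuple differ between
Python versions. <tt>make_comparison</tt> and <tt>CMP_OPS</tt> are
hypothetical names, and the result is a plain function, not an
Expression.</p>

```python
import operator
from collections import namedtuple

# Index positions follow the cmp_op table in this document.
CMP_OPS = (
    operator.lt,               # 0: <
    operator.le,               # 1: <=
    operator.eq,               # 2: ==
    operator.ne,               # 3: !=
    operator.gt,               # 4: >
    operator.ge,               # 5: >=
    lambda a, b: a in b,       # 6: in
    lambda a, b: a not in b,   # 7: not in
    operator.is_,              # 8: is
    operator.is_not,           # 9: is not
)


def make_comparison(attr, cmp_op, criteria):
    """Build a test comparing obj.<attr> to criteria with the chosen operator."""
    op = CMP_OPS[cmp_op]
    return lambda obj: op(getattr(obj, attr), criteria)


Person = namedtuple("Person", "Name")

not_kamikaze = make_comparison("Name", 3, "Mr. Kamikaze")
print(not_kamikaze(Person("Mr. Kamikaze")))  # False
print(not_kamikaze(Person("Ms. Zen")))       # True

in_list = make_comparison("Name", 6, ["Ms. Zen", "Dr. Who"])
print(in_list(Person("Ms. Zen")))            # True
```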
<h4>Exporting the <tt>logic</tt> module</h4>
<p>The <tt>logic</tt> module (and <tt>codewalk</tt>, on which it is built)
isn't limited to Dejavu. Feel free to use it in some other framework or
script! The only change you may have to make (if you relocate the module
outside of the <tt>dejavu</tt> package) would be to the single line:
<tt>from dejavu import codewalk</tt>, to point to the new location.</p>

<p>In particular, <tt>logic.Expression</tt> objects can operate on <i>any</i>
Python object, not just dejavu <tt>Unit</tt> instances. If you wish to
provide additional logic functions (as dejavu does), simply inject them
into <tt>logic</tt>'s globals.</p>

<p>You may also find the underlying <tt>codewalk</tt> module useful for
other purposes on its own. The <tt>Visitor</tt> base class can be very
convenient for building bytecode hacks.</p>

<p>To make a long story short, Dejavu depends on <tt>logic</tt> throughout,
but the reverse is not true.</p>

| 555 |
<h3>Associations between Unit Classes</h3> |
|---|
| 556 |
<p>Once you've put together some Unit classes, chances are you're going to |
|---|
| 557 |
want to associate them. Generally, this is accomplished by creating a |
|---|
| 558 |
property in the Unit_B class which stores IDs of Unit_A objects (which |
|---|
| 559 |
might be called <i>foreign keys</i> in a database context). |
|---|
| 560 |
<pre>class Archaeologist(Unit): |
|---|
| 561 |
Height = UnitProperty(float) |
|---|
| 562 |
|
|---|
| 563 |
class Biography(Unit): |
|---|
| 564 |
ArchID = UnitProperty(int)</pre> |
|---|
| 565 |
In this example, each <tt>Biography</tt> object will have an <tt>ArchID</tt> |
|---|
| 566 |
attribute, which will equal the <tt>ID</tt> of some <tt>Archaeologist</tt>. |
|---|
| 567 |
In Dejavu terms, we say that there is a <i>near class</i> (with a <i>near |
|---|
| 568 |
key</i>) and a <i>far class</i> (with a <i>far key</i>). Associations in |
|---|
| 569 |
Dejavu are not one-way, so it doesn't matter which class you choose for the |
|---|
| 570 |
"near" one and which for the "far" one.</p> |
|---|
| 571 |
|
|---|
| 572 |
<p>You could stop at this point in your design, and simply remember what |
|---|
| 573 |
these keys are and how they relate, and manipulate them accordingly. But |
|---|
| 574 |
Dejavu allows you to explicitly declare these associations: |
|---|
| 575 |
<pre>dejavu.associate(Archaeologist, 'ID', Biography, 'ArchID')</pre> |
|---|
| 576 |
You pass in the near class, the near key, the far class, and the far key. |
|---|
| 577 |
</p> |
|---|
| 578 |
|
|---|
| 579 |
<p>What does an explicit association buy for you? First, Arenas discover them |
|---|
| 580 |
and fill the <tt>Arena.associations</tt> registry, so that smart consumer |
|---|
| 581 |
code (like Unit Engine Rules, below) can automatically follow association |
|---|
| 582 |
paths for you. Second, each Unit class has a private <tt>_associations</tt> |
|---|
| 583 |
attribute, a <tt>dict</tt>. Each Unit involved in the association gains |
|---|
| 584 |
an entry in that dict: the key is the far class itself (not the class name), |
|---|
| 585 |
and the value is a tuple of (near key, far key).</p> |
|---|
| 586 |
|
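<p>The bookkeeping described above can be pictured with a small standalone
sketch. It is not the source of <tt>dejavu.associate</tt>: the
<tt>associate_sketch</tt> function and the stand-in <tt>Unit</tt> class are
hypothetical, and the sketch assumes that "near" always means "the class
that owns the dict".</p>

```python
class Unit:
    """Stand-in base class for illustration only; not dejavu.Unit."""


def associate_sketch(near_class, near_key, far_class, far_key):
    """Record the association on both classes, keyed by the other class."""
    for cls in (near_class, far_class):
        # Give each class its own dict rather than sharing one via the base.
        if '_associations' not in vars(cls):
            cls._associations = {}
    near_class._associations[far_class] = (near_key, far_key)
    far_class._associations[near_class] = (far_key, near_key)


class Archaeologist(Unit):
    pass


class Biography(Unit):
    pass


associate_sketch(Archaeologist, 'ID', Biography, 'ArchID')
```

<p>After the call, each class can answer "which of my attributes points at
that other class?" by looking up the other class object in its own dict.</p>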
|---|
| 587 |
<h4><tt>related_units</tt> methods</h4> |
|---|
| 588 |
<p>In addition, each of the two Unit classes will gain a new |
|---|
| 589 |
<i>related_units</i> method which simplifies looking up related instances |
|---|
| 590 |
of the other class. The new method for Unit_B will have the name of Unit_A, |
|---|
| 591 |
and vice-versa. In our example: |
|---|
| 592 |
<pre>>>> Archaeologist.Biography |
|---|
| 593 |
&lt;unbound method Archaeologist.related_units&gt; |
|---|
| 594 |
>>> Eversley = Archaeologist(Height=(6.417)) |
|---|
| 595 |
>>> Eversley.Biography |
|---|
| 596 |
&lt;bound method Archaeologist.related_units of &lt;__main__.Archaeologist |
|---|
| 597 |
object at 0x011A1930>> |
|---|
| 598 |
>>> bios = Eversley.Biography() |
|---|
| 599 |
>>> bios |
|---|
| 600 |
&lt;listiterator object at 0x012150D0&gt; |
|---|
| 601 |
>>> list(bios) |
|---|
| 602 |
[] |
|---|
| 603 |
</pre> |
|---|
| 604 |
We haven't created any Biographies, so there aren't any to be recalled, |
|---|
| 605 |
which is why we get an empty iterator at this point. At the other extreme |
|---|
| 606 |
(when you have hundreds of Biographies to filter), you can pass an optional |
|---|
| 607 |
<tt>Expression</tt> object to the related_units method. When you do, the |
|---|
| 608 |
list of associated Units will be filtered accordingly.</p> |
|---|
| 609 |
|
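<p>As a rough picture of what a related_units lookup has to do, here is a
sketch that matches keys by hand and applies an optional filter. It assumes
an in-memory list in place of a Sandbox and a plain function in place of an
<tt>Expression</tt>; <tt>related</tt> and the two small classes are
hypothetical stand-ins.</p>

```python
class Archaeologist:
    def __init__(self, ID):
        self.ID = ID


class Biography:
    def __init__(self, ArchID, Title):
        self.ArchID = ArchID
        self.Title = Title


def related(near_unit, near_key, far_units, far_key, predicate=None):
    """Yield far units whose far_key equals the near unit's near_key."""
    wanted = getattr(near_unit, near_key)
    for far in far_units:
        if getattr(far, far_key) != wanted:
            continue
        # The optional predicate narrows the matches further.
        if predicate is None or predicate(far):
            yield far


store = [
    Biography(1, 'Digging Deep'),
    Biography(1, 'Dust'),
    Biography(2, 'Sand'),
]
eversley = Archaeologist(ID=1)

all_titles = [b.Title for b in related(eversley, 'ID', store, 'ArchID')]
du_titles = [b.Title for b in related(eversley, 'ID', store, 'ArchID',
                                      lambda b: b.Title.startswith('Du'))]
```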
|---|
| 610 |
<p>Because the related_units method names are formed automatically, you need |
|---|
| 611 |
to take care not to use the names of Unit classes for your Unit properties. |
|---|
| 612 |
In our example, we used "ArchID" for the name of our "foreign key". |
|---|
| 613 |
If we had used "Archaeologist" instead, we would have had problems; |
|---|
| 614 |
when we associated the classes, the <i>property</i> named "Archaeologist" |
|---|
| 615 |
would have collided with the <i>related_units method</i> named |
|---|
| 616 |
"Archaeologist". Be careful when naming your properties, and plan for the |
|---|
| 617 |
future.</p> |
|---|
| 618 |
|
|---|
| 619 |
<p>Unlike some other ORMs, Dejavu doesn't cache far Units within the near |
|---|
| 620 |
Unit. Each time you call the related_units method, the data is recalled |
|---|
| 621 |
from your Sandbox. It is quite probable that those far Units are still |
|---|
| 622 |
sitting in memory in the Sandbox, but they're not going to persist in |
|---|
| 623 |
the near Unit itself in any way.</p> |
|---|
| 624 |
|
|---|
| 625 |
<p>Finally, some of you may want to override the default related_units |
|---|
| 626 |
methods. Feel free; <tt>associate</tt> takes two optional arguments, which |
|---|
| 627 |
should be callables that create and return the new method(s). See the source |
|---|
| 628 |
code of <tt>dejavu</tt> and the method <tt>dejavu.relation_factory</tt> |
|---|
| 629 |
for more information.</p> |
|---|
| 630 |
|
|---|
| 631 |
<h4><tt>Unit.first()</tt></h4> |
|---|
| 632 |
<p>Associations also enable the <tt>first</tt> method of Units. It's an |
|---|
| 633 |
easy way to get a single related unit. Call it with a far Class and, |
|---|
| 634 |
optionally, keyword arguments. The method will look up the related |
|---|
| 635 |
properties and call sandbox.unit() for you, returning either the first |
|---|
| 636 |
such far Unit or None if not found.</p> |
|---|
| 637 |
|
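<p>The "first match or None" behavior is easy to picture in plain Python.
The sketch below is only an analogy: <tt>first_match</tt> is a hypothetical
helper that searches a list, whereas <tt>Unit.first</tt> works through the
association and the Sandbox.</p>

```python
from types import SimpleNamespace


def first_match(units, **kwargs):
    """Return the first unit whose attributes equal every keyword, else None."""
    matches = (u for u in units
               if all(getattr(u, name) == value
                      for name, value in kwargs.items()))
    return next(matches, None)


bios = [
    SimpleNamespace(ArchID=1, Language='en'),
    SimpleNamespace(ArchID=1, Language='fr'),
]

french = first_match(bios, ArchID=1, Language='fr')
missing = first_match(bios, ArchID=2)
```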
|---|
| 638 |
<h3>Unit Engines</h3> |
|---|
| 639 |
<p>Once you've created and associated your Unit classes, you can begin to |
|---|
| 640 |
write "business logic" code (mostly inside those classes, we hope), and |
|---|
| 641 |
"presentation logic" code (mostly outside those classes). In most cases, |
|---|
| 642 |
you will construct Expressions within your own code manually to retrieve |
|---|
| 643 |
Units. Sometimes, however, you need to persist query parameters from your |
|---|
| 644 |
users; in other cases, you might store a list of Units which match a query |
|---|
| 645 |
(regardless of who formed the necessary Expression). Finally, you might |
|---|
| 646 |
wish to manipulate lists of Units as sets: differences, intersections, |
|---|
| 647 |
and unions. The <tt>engines</tt> module addresses all of these needs.</p> |
|---|
| 648 |
|
|---|
| 649 |
<h4>Collections: Lists of Units</h4> |
|---|
| 650 |
<p>The <tt>UnitCollection</tt> class provides a means of storing a list |
|---|
| 651 |
of Units, or rather, a list of Unit IDs. You use its <tt>Type</tt> |
|---|
| 652 |
property to indicate the class of the indexed Units. That value should be |
|---|
| 653 |
the <b>name</b> of the Unit Class, <b>not</b> the class object itself |
|---|
| 654 |
(this is different from most other calls in Dejavu). If you need to |
|---|
| 655 |
retrieve the actual Unit class, call <tt>UnitCollection().unit_class()</tt>.</p> |
|---|
| 656 |
|
|---|
| 657 |
<p><tt>UnitCollection</tt> itself subclasses <tt>dejavu.Unit</tt>; you can |
|---|
| 658 |
therefore persist Unit Collections via Dejavu Storage Managers (most SM's, |
|---|
| 659 |
anyway; it's recommended that SM's handle Unit Collections, but not |
|---|
| 660 |
required. Check your SM to see if it does).</p> |
|---|
| 661 |
|
|---|
| 662 |
<p>Each Collection has a thread lock (an RLock, actually) which you should |
|---|
| 663 |
<tt>acquire()</tt> before you add an ID to the set, and <tt>release()</tt> |
|---|
| 664 |
afterward. If you use the <tt>add(ID)</tt> method, this locking is done |
|---|
| 665 |
for you.</p> |
|---|
| 666 |
|
|---|
| 667 |
<p>When you need to retrieve the actual Units which are indexed by the |
|---|
| 668 |
Collection, call the <tt>units(quota=None)</tt> method, which will |
|---|
| 669 |
look up the Units and return them in a list. Since the Collection only |
|---|
| 670 |
stores IDs, it is possible that one of the indexed Units may have been |
|---|
| 671 |
destroyed since the list was built. The <tt>units</tt> method simply |
|---|
| 672 |
passes over these "phantom" Units. You can inspect the full list of IDs |
|---|
| 673 |
in the Collection (whether they reference existing Units or not) with |
|---|
| 674 |
the <tt>ids()</tt> method.</p> |
|---|
| 675 |
|
|---|
| 676 |
<p>Collections also provide a convenience function for grouping Units |
|---|
| 677 |
by attribute: <tt>xdict(attr)</tt>. This function will look up each Unit |
|---|
| 678 |
in the Collection, inspect the attribute that you specify, and return |
|---|
| 679 |
a dictionary of the form <tt>{attr_val1: [Unit, Unit, ...]}</tt>. |
|---|
| 680 |
Each distinct attribute value will have its own key, with a list of |
|---|
| 681 |
matching Units as the value.</p> |
|---|
| 682 |
|
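<p>The ideas in this section (locked adds, skipping "phantom" Units, and
grouping by attribute) fit in a few lines of plain Python. The class below
is a hypothetical sketch, not <tt>UnitCollection</tt>: it takes a lookup
callable in place of a Sandbox, and it assumes that <tt>quota</tt> means
"return at most this many Units".</p>

```python
import threading
from collections import defaultdict
from types import SimpleNamespace


class CollectionSketch:
    def __init__(self, lookup):
        self._ids = []
        self._lock = threading.RLock()
        self._lookup = lookup          # callable: ID -> object, or None

    def add(self, ID):
        # Acquire and release the lock around the mutation.
        with self._lock:
            if ID not in self._ids:
                self._ids.append(ID)

    def ids(self):
        """Every stored ID, whether or not its Unit still exists."""
        return list(self._ids)

    def units(self, quota=None):
        found = []
        for ID in self._ids:
            unit = self._lookup(ID)
            if unit is None:           # a "phantom": skip it silently
                continue
            found.append(unit)
            if quota is not None and len(found) >= quota:
                break
        return found

    def xdict(self, attr):
        groups = defaultdict(list)
        for unit in self.units():
            groups[getattr(unit, attr)].append(unit)
        return dict(groups)


store = {
    1: SimpleNamespace(ID=1, Color='red'),
    2: SimpleNamespace(ID=2, Color='blue'),
    4: SimpleNamespace(ID=4, Color='red'),
}
coll = CollectionSketch(store.get)
for unit_id in (1, 2, 3, 4):           # ID 3 has no Unit behind it
    coll.add(unit_id)

by_color = coll.xdict('Color')
```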
|---|
| 683 |
<h4>Engines</h4> |
|---|
| 684 |
<p>You can form Collections by hand, but a more powerful technique is |
|---|
| 685 |
the <tt>UnitEngine</tt>, a factory for Collections. Engines are very |
|---|
| 686 |
simple: they possess a set of <i>rules</i> which are executed when |
|---|
| 687 |
you want to take a <i>snapshot</i> of Units. The snapshot which is |
|---|
| 688 |
produced is a <tt>UnitCollection</tt> object. Whenever you call |
|---|
| 689 |
<tt>take_snapshot()</tt>, the Engine will maintain an association |
|---|
| 690 |
to the resulting Collection. You can access past snapshots with the |
|---|
| 691 |
<tt>snapshots()</tt> method.</p> |
|---|
| 692 |
|
|---|
| 693 |
<p>Engines are themselves Units, and can be persisted via Storage Managers. |
|---|
| 694 |
The only properties they possess are: an <tt>ID</tt>, a <tt>Name</tt>, |
|---|
| 695 |
an <tt>Owner</tt>, a <tt>FinalClassName</tt>, and <tt>Created</tt>, |
|---|
| 696 |
the creation date of the Engine.</p> |
|---|
| 697 |
|
|---|
| 698 |
<p>The <tt>Owner</tt> property should either be a user name, or one of the |
|---|
| 699 |
reserved names: "Public" and "System". By default, the <tt>permit()</tt> |
|---|
| 700 |
method allows a user read-access to the Engine if they are the Owner, or |
|---|
| 701 |
the Owner is "Public" or "System". Write-access is permitted if the user |
|---|
| 702 |
is the Owner, or the Owner is "Public". Feel free to override |
|---|
| 703 |
<tt>permit()</tt> in a subclass to provide different behaviors.</p> |
|---|
| 704 |
|
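<p>The default access rules can be restated as a small function. This is a
sketch of the rules as described above, not the actual <tt>permit()</tt>
method, whose signature isn't shown here; <tt>permit_sketch</tt> and its
arguments are assumptions.</p>

```python
def permit_sketch(owner, user, write=False):
    """Apply the default read/write rules described for Engines."""
    if user == owner:
        return True                      # Owners may always read and write.
    if write:
        return owner == "Public"         # "System" Engines are read-only.
    return owner in ("Public", "System")


can_read_system = permit_sketch("System", "bob")
can_write_system = permit_sketch("System", "bob", write=True)
```

<p>A subclass that wanted, say, group-based access would replace only this
decision and leave the rest of the Engine alone.</p>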
|---|
| 705 |
<p>The <tt>FinalClassName</tt> is set for you as you add Rules to the |
|---|
| 706 |
Engine. You can use the value of this property, for example, to tell |
|---|
| 707 |
your users, "Engine #23569 is an 'Armadillo' engine," when it produces |
|---|
| 708 |
Collections of <tt>Armadillo</tt> Units. The only time you might want to |
|---|
| 709 |
set this value is when you first create the Engine, before you have added |
|---|
| 710 |
any Rules.</p> |
|---|
| 711 |
|
|---|
| 712 |
<h4>Rules</h4> |
|---|
| 713 |
<p>Just like Collections and Engines, <tt>UnitEngineRule</tt> is <i>also</i> |
|---|
| 714 |
a subclass of <tt>Unit</tt>, and can be persisted via Storage Managers. All |
|---|
| 715 |
three work together to provide a complete, dynamic, application-level query |
|---|
| 716 |
generator.</p> |
|---|
| 717 |
|
|---|
| 718 |
<p>Okay, so what are Rules? You might say they're a "little language", |
|---|
| 719 |
with the following primitives, or "operations":</p> |
|---|
| 720 |
<table> |
|---|
| 721 |
<tr><th>Operation</th><th>Operand(s)</th><th>Description</th></tr> |
|---|
| 722 |
<tr><th colspan='3'>Operations on a single set</th></tr> |
|---|
| 723 |
<tr> |
|---|
| 724 |
<td>CREATE</td> |
|---|
| 725 |
<td>The classname of the new Type</td> |
|---|
| 726 |
<td>Creates a new Set of the specified Type. All Units of that Type |
|---|
| 727 |
are included in the new Set.</td> |
|---|
| 728 |
</tr> |
|---|
| 729 |
<tr> |
|---|
| 730 |
<td>FILTER</td> |
|---|
| 731 |
<td>A <tt>logic.Expression</tt></td> |
|---|
| 732 |
<td>Removes Units from the current Set which do not match the |
|---|
| 733 |
Expression.</td> |
|---|
| 734 |
</tr> |
|---|
| 735 |
<tr> |
|---|
| 736 |
<td>FUNCTION</td> |
|---|
| 737 |
<td>The name of a function in the <tt>Arena.engine_functions</tt> |
|---|
| 738 |
dict</td> |
|---|
| 739 |
<td>Calls the function, passing the current Set. The function |
|---|
| 740 |
should modify the Set.</td> |
|---|
| 741 |
</tr> |
|---|
| 742 |
<tr> |
|---|
| 743 |
<td>TRANSFORM</td> |
|---|
| 744 |
<td>The classname of the new Type</td> |
|---|
| 745 |
<td>Transform the current Set into a Set of associated Units |
|---|
| 746 |
(of another Type). The association must be present in the |
|---|
| 747 |
<tt>Arena.associations</tt> graph.</td> |
|---|
| 748 |
</tr> |
|---|
| 749 |
<tr> |
|---|
| 750 |
<td>RETURN</td> |
|---|
| 751 |
<td></td> |
|---|
| 752 |
<td>Optional. If omitted, the last Set handled is returned as the |
|---|
| 753 |
snapshot. If supplied, the ID of the Set to return.</td> |
|---|
| 754 |
</tr> |
|---|
| 755 |
<tr><th colspan='3'>Operations on two sets</th></tr> |
|---|
| 756 |
<tr> |
|---|
| 757 |
<td>COPY</td> |
|---|
| 758 |
<td>The Set ID of the new Set</td> |
|---|
| 759 |
<td>Copies the current Set to a new Set. The current Set is unchanged.</td> |
|---|
| 760 |
</tr> |
|---|
| 761 |
<tr> |
|---|
| 762 |
<td>DIFFERENCE</td> |
|---|
| 763 |
<td>The ID of the Set to mix in</td> |
|---|
| 764 |
<td>Removes IDs from the current Set which exist in the second Set.</td> |
|---|
| 765 |
</tr> |
|---|
| 766 |
<tr> |
|---|
| 767 |
<td>INTERSECTION</td> |
|---|
| 768 |
<td>The ID of the Set to mix in</td> |
|---|
| 769 |
<td>Removes IDs from the current Set which <i>do not</i> exist in the |
|---|
| 770 |
second Set.</td> |
|---|
| 771 |
</tr> |
|---|
| 772 |
<tr> |
|---|
| 773 |
<td>UNION</td> |
|---|
| 774 |
<td>The ID of the Set to mix in</td> |
|---|
| 775 |
<td>Adds any IDs to the current Set which exist in the second Set.</td> |
|---|
| 776 |
</tr> |
|---|
| 777 |
</table> |
|---|
| 778 |
|
|---|
| 779 |
<p>Each Rule has an <tt>Operation</tt> property (a string, one of the above), |
|---|
| 780 |
a <tt>SetID</tt>, and an <tt>Operand</tt>. Here's an example ruleset:</p> |
|---|
| 781 |
<table> |
|---|
| 782 |
<tr><th>Sequence</th><th>Operation</th><th>SetID</th><th>Operand</th></tr> |
|---|
| 783 |
<tr><td>1</td><td>CREATE</td><td>1</td><td>Invoice</td></tr> |
|---|
| 784 |
<tr><td>2</td><td>FILTER</td><td>1</td><td>(Expression)</td></tr> |
|---|
| 785 |
<tr><td>3</td><td>CREATE</td><td>2</td><td>Inventory</td></tr> |
|---|
| 786 |
<tr><td>4</td><td>FILTER</td><td>2</td><td>(Expression)</td></tr> |
|---|
| 787 |
<tr><td>5</td><td>TRANSFORM</td><td>2</td><td>Invoice</td></tr> |
|---|
| 788 |
<tr><td>6</td><td>DIFFERENCE</td><td>1</td><td>2</td></tr> |
|---|
| 789 |
<tr><td>7</td><td>RETURN</td><td>1</td><td></td></tr> |
|---|
| 790 |
</table> |
|---|
| 791 |
|
|---|
| 792 |
<p>As you can see, every Rule operates on a <i>Set</i> of Units. The first |
|---|
| 793 |
rule is always to CREATE a set, declaring it to contain a certain Type |
|---|
| 794 |
of Units. In most cases, you will then FILTER that set. If you simply |
|---|
| 795 |
created a set and then returned it, it would contain all Units of the |
|---|
| 796 |
declared Type. When you filter a set, however, you remove Units from |
|---|
| 797 |
the whole which do not match the filter's Expression.</p> |
|---|
| 798 |
|
|---|
| 799 |
<p>In the example above, we CREATE a second Set so that we can eventually |
|---|
| 800 |
obtain the DIFFERENCE between Set 1 and Set 2. The second Set contains |
|---|
| 801 |
Units of a different Type than the first. Once we filter Set 2, we then |
|---|
| 802 |
TRANSFORM it; for each Inventory Unit, we look up associated Invoice |
|---|
| 803 |
Units. Then, we find the difference between the two Invoice sets and |
|---|
| 804 |
RETURN it.</p> |
|---|
| 805 |
|
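<p>If it helps to see the "little language" in motion, here is a toy
interpreter for the example ruleset. Everything in it is a hypothetical
sketch: Units are plain dicts, Expressions are plain functions, the
<tt>links</tt> mapping stands in for the association graph, and
<tt>run_rules</tt> is not how the <tt>engines</tt> module is written.</p>

```python
def run_rules(rules, store, links):
    """store: {type_name: {ID: unit}}; links: {(from, to): unit -> IDs}."""
    sets = {}                  # SetID -> (type_name, set of IDs)
    last = None
    for _seq, op, set_id, operand in sorted(rules, key=lambda r: r[0]):
        if op == "CREATE":
            sets[set_id] = (operand, set(store[operand]))
        elif op == "RETURN":
            return sets[set_id][1]
        else:
            type_name, ids = sets[set_id]
            if op == "FILTER":
                ids = {i for i in ids if operand(store[type_name][i])}
            elif op == "TRANSFORM":
                follow = links[(type_name, operand)]
                ids = {j for i in ids for j in follow(store[type_name][i])}
                type_name = operand
            elif op == "COPY":
                sets[operand] = (type_name, set(ids))
            elif op == "DIFFERENCE":
                ids = ids - sets[operand][1]
            elif op == "INTERSECTION":
                ids = ids & sets[operand][1]
            elif op == "UNION":
                ids = ids | sets[operand][1]
            else:
                raise ValueError("unknown operation: %r" % (op,))
            sets[set_id] = (type_name, ids)
        last = set_id
    return sets[last][1]       # no RETURN rule: hand back the last Set


store = {
    "Invoice": {1: {"Paid": False}, 2: {"Paid": False}, 3: {"Paid": True}},
    "Inventory": {10: {"InvoiceID": 1, "Backordered": True},
                  11: {"InvoiceID": 2, "Backordered": False}},
}
links = {("Inventory", "Invoice"): lambda item: [item["InvoiceID"]]}
rules = [
    (1, "CREATE", 1, "Invoice"),
    (2, "FILTER", 1, lambda inv: not inv["Paid"]),
    (3, "CREATE", 2, "Inventory"),
    (4, "FILTER", 2, lambda item: item["Backordered"]),
    (5, "TRANSFORM", 2, "Invoice"),
    (6, "DIFFERENCE", 1, 2),
    (7, "RETURN", 1, None),
]
unpaid_and_in_stock = run_rules(rules, store, links)
```

<p>With this made-up data, Set 1 holds the unpaid Invoices, Set 2 becomes
the Invoices that touch backordered Inventory, and the difference is what
remains.</p>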
|---|
| 806 |
<p>Rules are executed in order according to their <tt>Sequence</tt> |
|---|
| 807 |
attribute (lowest first). When you use the <tt>Engine.add_rule</tt> method, |
|---|
| 808 |
the next <tt>Sequence</tt> value is retrieved for you. Notice that each |
|---|
| 809 |
Rule belongs to one and only one Engine; they are not shared between |
|---|
| 810 |
Engines. Each Rule has its own <tt>EngineID</tt> attribute.</p> |
|---|
| 811 |
|
|---|
| 812 |
<h4>Engine Functions</h4> |
|---|
| 813 |
<p>The FUNCTION rule deserves special mention. The Operand of a FUNCTION |
|---|
| 814 |
rule is a string, a key in the <tt>Arena.engine_functions</tt> dictionary. |
|---|
| 815 |
When the rule is executed, that key is used to look up the function, which |
|---|
| 816 |
is then called, passing <tt>(sandbox, set)</tt>. The function should |
|---|
| 817 |
mutate the set directly. Use FUNCTION rules to mutate sets in ways which |
|---|
| 818 |
are more complex than those provided by FILTER and TRANSFORM. For example, |
|---|
| 819 |
you might provide a function which removes all but the first Unit in the |
|---|
| 820 |
Set (according to some ordering algorithm).</p> |
|---|
| 821 |
|
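<p>Here is a sketch of what such a function might look like. It assumes the
set argument behaves like a Python <tt>set</tt> of IDs and uses "lowest ID"
as the ordering; the <tt>keep_lowest_id</tt> name and the plain dict that
stands in for <tt>Arena.engine_functions</tt> are hypothetical.</p>

```python
def keep_lowest_id(sandbox, unit_ids):
    """Mutate unit_ids in place so that only its smallest ID remains."""
    if unit_ids:
        lowest = min(unit_ids)
        # Mutate the set we were given instead of returning a new one.
        unit_ids.intersection_update({lowest})


# A plain dict standing in for the registry of engine functions.
engine_functions = {"keep_lowest_id": keep_lowest_id}

current = {42, 7, 19}
# The sandbox argument is unused by this particular function.
engine_functions["keep_lowest_id"](None, current)
```

<p>Because the rule stores only the string key, the function itself can
change between releases without touching any persisted Rules.</p>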
|---|
| 822 |
|
|---|
| 823 |
<h3>Analysis Tools</h3> |
|---|
| 824 |
<p>Dejavu includes various tools to help you manipulate groups of Units.</p> |
|---|
| 825 |
|
|---|
| 826 |
<h4>Sorting Units</h4> |
|---|
| 827 |
<p>When you recall Units, you receive a generator, and must iterate over |
|---|
| 828 |
the values in some way. Often, this is accomplished with a list |
|---|
| 829 |
comprehension: |
|---|
| 830 |
<pre>f = logic.Expression(lambda x: 'Aa' in x.Name) |
|---|
| 831 |
people = [x for x in sandbox.recall(Person, f)] |
|---|
| 832 |
</pre> |
|---|
| 833 |
However, the <tt>recall</tt> method doesn't do any sorting; you must sort |
|---|
| 834 |
your list in your Python code. Dejavu provides a <tt>sort(attrs, |
|---|
| 835 |
descending=False)</tt> function to assist you. It returns a function, which |
|---|
| 836 |
you can then pass to a Python list's <tt>sort</tt> method. Continuing our example: |
|---|
| 837 |
<pre>people.sort(dejavu.sort('Size', 'Name'))  # sorts the list in place</pre> |
|---|
| 838 |
The most important issue (and the reason we don't just use Python 2.4's attrgetter) |
|---|
| 839 |
is that any Unit property must allow values of None, which tends to raise |
|---|
| 840 |
errors when compared to values of other types. The function which |
|---|
| 841 |
<tt>sort</tt> creates for you treats None as "less than" any other value.</p> |
|---|
| 842 |
|
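<p>The "None sorts first" idea can be reproduced without Dejavu. The sketch
below targets Python 3, where <tt>list.sort</tt> takes a <tt>key</tt>
function rather than a comparison function, so it is an analogue of
<tt>dejavu.sort</tt> and not its implementation; <tt>none_first_key</tt> is
a hypothetical name.</p>

```python
from types import SimpleNamespace


def none_first_key(*attrs):
    """Build a sort key that orders None before every other value."""
    def key(obj):
        parts = []
        for name in attrs:
            value = getattr(obj, name)
            # (False, None) sorts before (True, value), so None is never
            # compared directly against a value of another type.
            parts.append((value is not None, value))
        return tuple(parts)
    return key


people = [
    SimpleNamespace(Size=3, Name='Bo'),
    SimpleNamespace(Size=None, Name='Al'),
    SimpleNamespace(Size=3, Name=None),
]
people.sort(key=none_first_key('Size', 'Name'))
names_in_order = [p.Name for p in people]
```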
|---|
| 843 |
<h4>Cross-tabulation</h4> |
|---|
| 844 |
<p>Cross-tabs (also called <i>aggregate tables</i> or <i>pivot tables</i>) |
|---|
| 845 |
display aggregate information about objects by category. For example, |
|---|
| 846 |
rather than show a list of Safari records, one row per trip, you might |
|---|
| 847 |
wish to show a table where each row represents a Destination, and each |
|---|
| 848 |
column shows the count of Safaris to that Destination for each distinct |
|---|
| 849 |
Year. In this example, we say that the Safaris are "grouped by" their |
|---|
| 850 |
Destination values, and that we "pivot" on the Year values.</p> |
|---|
| 851 |
|
|---|
| 852 |
<p>Dejavu helps you form such a table via the <tt>CrossTab</tt> class. |
|---|
| 853 |
You need to specify the group(s) you wish to use, and the pivot attribute. |
|---|
| 854 |
Finally, you must specify the aggregate function. Here's a code example: |
|---|
| 855 |
<pre> |
|---|
| 856 |
>>> data = ["a", "b", "cc", "bddd", "a4", "b6"] |
|---|
| 857 |
>>> group = lambda x: x.isalpha() |
|---|
| 858 |
>>> pivot = lambda x: x[0] |
|---|
| 859 |
>>> ctab = analysis.CrossTab(data, [group], pivot, dejavu.COUNT) |
|---|
| 860 |
>>> data, columns = ctab.results() |
|---|
| 861 |
>>> data |
|---|
| 862 |
{(True,): {"a": 1, "b": 2, "c": 1}, |
|---|
| 863 |
(False,): {"a": 1, "b": 1}} |
|---|
| 864 |
>>> columns |
|---|
| 865 |
["a", "b", "c"]</pre> |
|---|
| 866 |
You may notice that we're not using Units in our example; the |
|---|
| 867 |
<tt>CrossTab</tt> class is designed to work with any objects. Here's one |
|---|
| 868 |
way to lay out that data:</p> |
|---|
| 869 |
<table> |
|---|
| 870 |
<tr><th>Is Alpha</th><th>a</th><th>b</th><th>c</th></tr> |
|---|
| 871 |
<tr><td>Y</td><td>1</td><td>2</td><td>1</td></tr> |
|---|
| 872 |
<tr><td>N</td><td>1</td><td>1</td><td>0</td></tr> |
|---|
| 873 |
</table> |
|---|
| 874 |
|
|---|
| 875 |
<p>The <tt>results</tt> method returns two values. First, the table |
|---|
| 876 |
itself in the form of a dictionary; each key is a tuple of group values, |
|---|
| 877 |
and the corresponding value is a sub-dictionary. Each sub-dict has keys |
|---|
| 878 |
which are the pivot attribute, and values which equal the aggregates. |
|---|
| 879 |
I know, that was confusing; look at the example. The second value to |
|---|
| 880 |
be returned is a list of the pivot column values; you'll notice they're |
|---|
| 881 |
sorted.</p> |
|---|
| 882 |
|
|---|
| 883 |
<p>The groups and pivot arguments may be either strings or functions. |
|---|
| 884 |
If strings, they must be the names of attributes of the source objects. |
|---|
| 885 |
The final aggfunc argument defaults to COUNT, but may also be SUM. |
|---|
| 886 |
More aggfuncs may arrive in the future.</p> |
|---|
| 887 |
|
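<p>To make the shape of the results concrete, here is a small counting
cross-tab written from scratch. It is a sketch under assumptions, not the
<tt>CrossTab</tt> class: <tt>crosstab_count</tt> is a hypothetical function
that handles only the COUNT case, accepting either attribute names or
functions for the groups and the pivot.</p>

```python
def crosstab_count(objects, groups, pivot):
    """Return ({group_tuple: {pivot_value: count}}, sorted pivot values)."""
    def getter(spec):
        # A string names an attribute; anything callable is used as-is.
        return spec if callable(spec) else (lambda obj: getattr(obj, spec))

    group_fns = [getter(g) for g in groups]
    pivot_fn = getter(pivot)

    table, columns = {}, set()
    for obj in objects:
        row_key = tuple(fn(obj) for fn in group_fns)
        col = pivot_fn(obj)
        columns.add(col)
        row = table.setdefault(row_key, {})
        row[col] = row.get(col, 0) + 1
    return table, sorted(columns)


data = ["a", "b", "cc", "bddd", "a4", "b6"]
table, columns = crosstab_count(data, [lambda x: x.isalpha()],
                                lambda x: x[0])
```

<p>Note that a missing cell (no "c" in the non-alpha row) simply has no key
in the sub-dictionary; it is up to your display code to show it as zero.</p>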
|---|
| 888 |
<h3>The Arena Object</h3> |
|---|
| 889 |
<p>The topmost class in Dejavu is the <tt>Arena</tt> class. When building |
|---|
| 890 |
a Dejavu application, you must first create an instance of this class, |
|---|
| 891 |
and must find a way to persist this object across client connections. |
|---|
| 892 |
This can be achieved in multiple ways; web applications, for example, |
|---|
| 893 |
will typically create a single process to serve all requests. Desktop |
|---|
| 894 |
applications will probably create a single Arena object for each |
|---|
| 895 |
running instance of the program.</p> |
|---|
| 896 |
|
|---|
| 897 |
<h4>Loading Stores</h4> |
|---|
| 898 |
<p>You <b>may</b> manually set up Storage Managers by calling |
|---|
| 899 |
<tt>Arena().add_store(name, store, unitClasses)</tt>. But, you |
|---|
| 900 |
probably shouldn't. Instead, allow your deployers to decide for |
|---|
| 901 |
themselves which storage solution(s) to use. You can do this by calling |
|---|
| 902 |
<tt>load(filename)</tt>; pass it the filename of an INI-style file |
|---|
| 903 |
which your deployers can tweak without screwing up your Python code. |
|---|
| 904 |
The next chapter in this reference is completely devoted to educating |
|---|
| 905 |
deployers; point them to it or copy/modify it in your own release docs.</p> |
|---|
| 906 |
|
|---|
| 907 |
<h4>Registering Unit Classes</h4> |
|---|
| 908 |
<p>The <tt>Arena</tt> object maintains a registry of Unit classes called a |
|---|
| 909 |
<tt>roster</tt>. A roster is like a three-way map between Unit classes, |
|---|
| 910 |
their names, and their assigned StorageManagers. You shouldn't manipulate |
|---|
| 911 |
this structure on your own; instead, use the <tt>register</tt> or |
|---|
| 912 |
<tt>register_all</tt> methods to register each Unit class.</p> |
|---|
| 913 |
|
|---|
| 914 |
<p>The <tt>Arena</tt> object also manages the associations between Unit |
|---|
| 915 |
classes in its <tt>associations</tt> attribute, which is a simple, |
|---|
| 916 |
unweighted, undirected graph. Whenever you register a class, the Arena |
|---|
| 917 |
will add its associations to this graph. The only other common operation |
|---|
| 918 |
is to call <tt>.associations.shortest_path(start, end)</tt>, to retrieve |
|---|
| 919 |
the chain of associations between two Unit classes.</p> |
|---|
| 920 |
|
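<p>Finding a chain of associations in an unweighted, undirected graph is a
job for breadth-first search. The sketch below shows the technique on a
plain dict of neighbor sets; it is not the code behind
<tt>associations.shortest_path</tt>, and the class names in the sample
graph are made up.</p>

```python
from collections import deque


def shortest_path_sketch(graph, start, end):
    """Breadth-first search over {node: set_of_neighbours}; None if no path."""
    if start == end:
        return [start]
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        # Sorting makes the result deterministic when paths tie in length.
        for neighbour in sorted(graph.get(path[-1], ())):
            if neighbour in seen:
                continue
            if neighbour == end:
                return path + [neighbour]
            seen.add(neighbour)
            queue.append(path + [neighbour])
    return None


graph = {
    "Customer": {"Invoice"},
    "Invoice": {"Customer", "Inventory"},
    "Inventory": {"Invoice", "Warehouse"},
    "Warehouse": {"Inventory"},
    "Orphan": set(),
}
chain = shortest_path_sketch(graph, "Customer", "Warehouse")
```

<p>A TRANSFORM rule between two classes that are not directly associated
could walk such a chain one association at a time.</p>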
|---|
| 921 |
<hr /> |
|---|
| 922 |
|
|---|
| 923 |
<p><a name='hettinger'>[1]</a> Python Cookbook, |
|---|
| 924 |
<a href='http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277940'>Binding |
|---|
| 925 |
Constants at compile time</a><br /> |
|---|
| 926 |
</p> |
|---|
| 927 |
|
|---|
| 928 |
</body> |
|---|
| 929 |
</html> |
|---|