Contact: fumanchu@aminus.org

Log in as guest/dejavu to create tickets

I think I've seen this ORM somewhere before...

root/trunk/logic.py

Revision 15 (checked in by fumanchu, 9 years ago)

1. Doc updates (modeling.html)
2. Added or_combine to logic.Aggregator, or to logic.Expression

Line 
1 """First-class Expression objects.
2
3 This work, including the source code, documentation
4 and related data, is placed into the public domain.
5
6 The original author is Robert Brewer, Amor Ministries.
7
8 THIS SOFTWARE IS PROVIDED AS-IS, WITHOUT WARRANTY
9 OF ANY KIND, NOT EVEN THE IMPLIED WARRANTY OF
10 MERCHANTABILITY. THE AUTHOR OF THIS SOFTWARE
11 ASSUMES _NO_ RESPONSIBILITY FOR ANY CONSEQUENCE
12 RESULTING FROM THE USE, MODIFICATION, OR
13 REDISTRIBUTION OF THIS SOFTWARE.
14
15 Python evaluates expressions like any other language; however,
16 the expression itself cannot be 'passed around' easily--that is,
17 the expression itself is a code block, not a callable. In most cases,
18 this is not an issue: if an evaluation step needs to be 'first-class',
19 it's usually wrapped up in a function (sometimes anonymous), and
20 that function is passed. This allows lazy evaluation, for example.
21
22 In some cases, however, we wish to manipulate the actual logic of the
23 expression:
24     1. Inspection. Code might form an expression from user input,
25            then take secondary actions depending upon the operands.
26     2. Modification. For example, correction of an expression
27            if it raises an Exception.
28     3. Translation. A common case is converting Python expressions
29            into SQL.
30
31 It is possible to provide these benefits through some combination
32 of the standard modules parser and compiler, and/or via the builtins
33 eval() and exec(). However, these approaches require placing the
34 expression in a string, which introduces problems of substituting
35 user data; for example, ("x.Name == r'%s'" % user_data) will break if
36 user_data contains quote-marks. This is by far not the only example of
37 the abuses of eval(). Solutions using parser and compiler also tend to
38 be quite slow in pure Python.
39
40 This module takes the approach that the Python developer should be
41 able to form first-class Expressions directly from Python code.
42
43     "This, even if the rest were true, which it isn't, is patently
44      impossible, say the doubters."
45        -- The Restaurant at the End of the Universe, Douglas Adams
46
47 But we can come close.
48
49
50 Expression formation:
51
52     >>> import logic
53     >>> e = logic.Expression(lambda x: not (x.a == 3 and (x.b > 1 or x.b < -10)))
54     >>> e
55     logic.Expression(lambda x: not ((x.a == 3) and ((x.b > 1) or (x.b < -10))))
56     
57     You'll notice, in this first example, some extra parentheses in the final
58     lambda. The lambda has already undergone an explicit compile/decompile
59     step. These differences don't affect the logic in any way, but it's
60     impossible to guess the exact original syntax when decompiling.
61     
62     However, be advised of this IMPORTANT point. When you form an Expression
63     from a lambda, that lambda goes through a transformer which EARLY BINDS
64     everything it can. If we had included global or free variables in our
65     lambda, those would have been replaced with constants when the Expression
66     was formed. See codewalk.EarlyBinder for more details.
67     
68     We *can*, however, use and define arbitrary comparison functions,
69     such as containedby and startswith.
70
71
72 Lazy Evaluation:
73     >>> e = logic.Expression(lambda x: (x.a == 3) and (x.b > 1 or x.b < -10))
74     >>> class DumbObject(object):
75     ...     a = 3
76     ...     b = 5
77     ...     
78     >>> pass # Do some other things...
79     >>> e.evaluate(DumbObject())
80     True
81     
82     The evaluate() method of an Expression accepts any object instance, and
83     returns the truth value of itself, getting any named attributes from
84     the passed-in object. Notice that the passed-in object does not need
85     to be instantiated prior to the formation of the Expression.
86
87
88 Late binding of arguments (lazier yet!):
89     >>> e = logic.Expression(lambda x, **kw: x.a == kw['Size'])
90     >>> class DumbObject(object):
91     ...     a = 3
92     ...     
93     >>> pass # Do some other things...
94     >>> e.bind_args(Size=3)
95     >>> e.evaluate(DumbObject())
96     True
97     
98     If the lambda possesses a **kwargs argument in its signature, that
99     dictionary may be used to pass in late-bound locals. Before calling
100     Expression.evaluate, callers should call .bind_args, passing a dict.
101     Each subscript of kwargs will be looked up in that dictionary, and
102     the mapped value will replace the operand in the evaluation step.
103
104
105 Derivation (Decompilation) and Translation:
106     'Deriving' is the opposite of 'parsing'. The codewalk.LambdaDecompiler
107     class walks a function or code object and produces equivalent Python
108     code in a string.
109     
110     >>> e = logic.Expression(lambda x: x.a == 3 and (x.b > 1 or x.b < -10))
111     >>> codewalk.LambdaDecompiler(e.func).code()
112     'lambda x: not ((x.a == 3) and ((x.b > 1) or (x.b < -10)))'
113     
114     However, we are not limited to Python statements of our Expression!
115     Another decompiler might produce our Expression in another language;
116     this example produces a WHERE clause for SQL (a declarative language!):
117     
118     >>> e = logic.Expression(lambda x: x.Group == '3' and
119                              x.Date > datetime.date(2004, 2, 14) and
120                              x.Name.endswith('_'))
121     >>> ADOSQLDecompiler(e).code()
122     "([Group] = '3' and [Date] > #2/14/2004#) and [Name] Like '%\\_'"
123
124 Pickling:
125     The Expression object includes custom pickling code (__getstate__ and
126     __setstate__). You might notice that the function itself is *not*
127     pickled; instead, its code() method is called, which produces a
128     string representation of the function (decompilation). This makes
129     pickled Expressions much more stable across Python versions than,
130     say, storing the function's co_code. However, this presents a problem
131     when the Expression is unpickled: the function must be eval'ed and
132     run through an EarlyBinder again. When this occurs (in __setstate__),
133     some of the free variables which were present in func_globals at the
134     time of pickling may not be present when the Expression is unpickled.
135     For example, an Expression which is built in myapp.py may include
136     a datetime.date object in its co_consts. When that Expression is
137     unpickled, its function is eval'ed within this module, not within
138     myapp.py; since this module does not import 'datetime', it will not
139     be included in the func_globals of the reconstituted function, and
140     codewalk.EarlyBinder will fail on LOAD_GLOBAL.
141     
142     Therefore, code which uses this module must determine which objects
143     will be referenced as Expressions are unpickled. Any that are neither
144     builtins nor in this module's globals() need to be injected into this
145     module, so they can be referenced in eval() when the Expression is
146     unpickled.
147
148 """
149
150 from dejavu import codewalk
151 from types import CodeType, FunctionType
152
153 # Globals which assist in unpickling. If they're not present (can't be
154 # imported), that's OK--someone might want to build an app which
155 # doesn't use fixedpoints, for example.
156 import datetime
157
158 try:
159     import fixedpoint
160     from fixedpoint import FixedPoint
161 except ImportError:
162     pass
163
164 try:
165     import decimal
166     from decimal import Decimal
167 except ImportError:
168     pass
169
170
171 class Aggregator(codewalk.Rewriter):
172     """Combine two code objects into one."""
173    
174     def __init__(self, obj):
175         codewalk.Rewriter.__init__(self, obj)
176         self.instr_index = [None] * len(self._bytecode)
177    
178     def and_combine(self, obj):
179         obj = codewalk.Rewriter(obj)
180         bytecode = map(ord, obj.co_code)
181         newtarget = len(bytecode)
182        
183         self._bytecode.pop()
184         self._bytecode.extend([111, newtarget & 0xFF, newtarget >> 8,
185                                1])
186         self._bytecode.extend(bytecode)
187         self.instr_index[-1:] = [obj] * (newtarget + 4)
188    
189     def or_combine(self, obj):
190         obj = codewalk.Rewriter(obj)
191         bytecode = map(ord, obj.co_code)
192         newtarget = len(bytecode)
193        
194         self._bytecode.pop()
195         self._bytecode.extend([112, newtarget & 0xFF, newtarget >> 8,
196                                1])
197         self._bytecode.extend(bytecode)
198         self.instr_index[-1:] = [obj] * (newtarget + 4)
199    
200     def visit_LOAD_ATTR(self, lo, hi):
201         src = self.instr_index[self.cursor]
202         if src:
203             value = src.co_names[lo + (hi << 8)]
204             newindex = self.name_index(value)
205             self.newcode[-2:] = [newindex & 0xFF, newindex >> 8]
206    
207     def visit_LOAD_CONST(self, lo, hi):
208         src = self.instr_index[self.cursor]
209         if src:
210             value = src.co_consts[lo + (hi << 8)]
211             newindex = self.const_index(value)
212             self.newcode[-2:] = [newindex & 0xFF, newindex >> 8]
213
214
215 class Expression(object):
216     """A filter for objects."""
217    
218     def __init__(self, func=None, kwtypes=None):
219         """Expression(func, [kwtypes]={}). func(obj, [**kw]) must return bool.
220         
221         func: a function, with one positional arg and optional keyword args,
222             which must return bool. If func is None, it is initialized to
223             lambda x: True
224         kwtypes: a dictionary of {keyword: type} pairs.
225         """
226         if func is None:
227             func = lambda x: True
228         self._load_func(func)
229         if kwtypes is None:
230             kwtypes = {}
231         self.kwtypes = kwtypes
232         self.kwargs = {}
233    
234     def _load_func(self, func):
235         # Early-bind as much as possible.
236         binder = codewalk.EarlyBinder(func, taintlist=['now', 'today',
237                                                        'iscurrentweek'])
238         self.func = binder.function()
239    
240     def code(self):
241         return codewalk.LambdaDecompiler(self.func).code()
242    
243     def __repr__(self):
244         return 'logic.Expression(%s)' % self.code()
245    
246     def __and__(self, other):
247         """Logical-and this Expression with another."""
248         assert isinstance(other, Expression)
249         ag = Aggregator(self.func)
250         ag.and_combine(other.func)
251         agfunc = ag.function()
252         return Expression(agfunc)
253     __add__ = __and__
254    
255     def __or__(self, other):
256         """Logical-or this Expression with another."""
257         assert isinstance(other, Expression)
258         ag = Aggregator(self.func)
259         ag.or_combine(other.func)
260         agfunc = ag.function()
261         return Expression(agfunc)
262    
263     def bind_args(self, **kwargs):
264         self.kwargs = {}
265         self.kwargs.update(kwargs)
266    
267     def evaluate(self, obj):
268         kw = self.kwargs
269         return self.func(obj, **kw)
270    
271     def __getstate__(self):
272         return (self.code(), self.kwtypes, self.kwargs)
273    
274     def __setstate__(self, state):
275         if len(state) == 2:
276             # Older versions of Expression had a 2-tuple.
277             func, self.kwtypes = state
278             self.kwargs = {}
279         else:
280             func, self.kwtypes, self.kwargs = state
281         # The most difficult thing about Expressions is unpickling.
282         # Any func_globals at the time of pickling are lost, so any
283         # late-bound objects must be available at this point. Any
284         # such objects need to be injected into logic's globals()
285         # if you want them to be available here.
286         f = eval(func)
287         self._load_func(f)
288
289
290 def filter(**kwargs):
291     """Form an Expression from keyword arguments.
292     
293     Allows you to write:
294         e = logic.filter(a=3, b=1)
295     ...instead of:
296         e = logic.Expression(lambda x: x.a == 3 and x.b == 1)
297     """
298     co, names, consts = [], ['x', ], [None, ]
299     i = 0
300     for key, val in kwargs.iteritems():
301         i += 1
302         names.append(key)
303         consts.append(val)
304         co += [124, 0, 0,
305                105, i, 0,
306                100, i, 0,
307                106, 2, 0,
308                111, 0, 0,
309                1,
310                ]
311     if kwargs:
312         # pop extraneous final JUMP and POP_TOP.
313         del co[-4:]
314     co.append(83)
315    
316     # Figure JUMP targets
317     for op in range(len(co)):
318         if co[op] == 111:
319             co[op + 1] = (len(co) - 4) - op
320    
321     # Form code object and function.
322     # code(argcount, nlocals, stacksize, flags, codestring,
323     #      constants, names, varnames,
324     #      filename, name, firstlineno, lnotab[, freevars[, cellvars]])
325     co = CodeType(1, 1, 2, 67, ''.join(map(chr, co)),
326                   tuple(consts), tuple(names), ('x', ),
327                   '', '<lambda>', 1, '')
328     func = FunctionType(co, {})
329     return Expression(func)
330
331
332 def comparison(attr, cmp_op, criteria):
333     """Form an Expression lambda x: x.attr cmp_op criteria.
334     
335     Allows you to write:
336         e = logic.comparison('Size', cmp_op_index, 4)
337     ...instead of:
338         e = logic.Expression(lambda x: x.Size <= 4)
339     
340     This allows one to pass dynamic, isolated arguments, without having
341     to construct a lambda out of them first.
342     """
343     # cmp_op (from opcode):
344     # ('<', '<=', '==', '!=', '>', '>=', 'in', 'not in', 'is',
345     #  'is not', 'exception match', 'BAD')
346     if cmp_op < 0 or cmp_op > 11:
347         raise ValueError("The cmp_op argument must be between 0 and 11")
348    
349     if not isinstance(attr, str):
350         attr = str(attr)
351     names = ('x', attr)
352    
353     consts = (None, criteria)
354     co = [124, 0, 0,
355           105, 1, 0,
356           100, 1, 0,
357           106, cmp_op, 0,
358           83,
359           ]
360    
361     # Form code object and function.
362     # code(argcount, nlocals, stacksize, flags, codestring,
363     #      constants, names, varnames,
364     #      filename, name, firstlineno, lnotab[, freevars[, cellvars]])
365     co = CodeType(1, 1, 2, 67, ''.join(map(chr, co)),
366                   consts, names, ('x',), '', '<lambda>', 1, '')
367     func = FunctionType(co, {})
368     return Expression(func)
369
Note: See TracBrowser for help on using the browser.