root/branches/ldap/logic.py

Revision 398 (checked in by fumanchu, 6 years ago)

docstrings

"""First-class Expression objects.

This work, including the source code, documentation
and related data, is placed into the public domain.

The original author is Robert Brewer, Amor Ministries.

THIS SOFTWARE IS PROVIDED AS-IS, WITHOUT WARRANTY
OF ANY KIND, NOT EVEN THE IMPLIED WARRANTY OF
MERCHANTABILITY. THE AUTHOR OF THIS SOFTWARE
ASSUMES _NO_ RESPONSIBILITY FOR ANY CONSEQUENCE
RESULTING FROM THE USE, MODIFICATION, OR
REDISTRIBUTION OF THIS SOFTWARE.

Python evaluates expressions like any other language; however,
the expression itself cannot be 'passed around' easily--that is,
the expression itself is a code block, not a callable. In most cases,
this is not an issue: if an evaluation step needs to be 'first-class',
it's usually wrapped up in a function (sometimes anonymous), and
that function is passed. This allows lazy evaluation, for example.

In some cases, however, we wish to manipulate the actual logic of the
expression:
    1. Inspection. Code might form an expression from user input,
           then take secondary actions depending upon the operands.
    2. Modification. For example, correction of an expression
           if it raises an Exception.
    3. Translation. A common case is converting Python expressions
           into SQL.

It is possible to provide these benefits through some combination
of the standard modules parser and compiler, and/or via the builtins
eval() and exec(). However, these approaches require placing the
expression in a string, which introduces problems of substituting
user data; for example, ("x.Name == r'%s'" % user_data) will break if
user_data contains quote-marks. This is far from the only way to
abuse eval(). Solutions using parser and compiler also tend to
be quite slow in pure Python.
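
The quoting problem described above can be reproduced in a few lines. This is a minimal sketch in Python 3, not the Python 2 this module targets; the helper names (build_source, compiles, name_equals) are made up for the illustration and are not part of this module.

```python
from types import SimpleNamespace


def build_source(user_data):
    # Build the expression as a string, as in the example above.
    return "x.Name == r'%s'" % user_data


def compiles(source):
    # Report whether the string is a valid Python expression.
    try:
        compile(source, "<expr>", "eval")
        return True
    except SyntaxError:
        return False


def name_equals(value):
    # Alternative: keep user data as a Python object inside a callable,
    # so it is never spliced into source text.
    return lambda x: x.Name == value


record = SimpleNamespace(Name="O'Brien")
plain_ok = compiles(build_source("Smith"))     # no quote-marks in the data
quoted_ok = compiles(build_source("O'Brien"))  # the quote ends the literal early
matched = name_equals("O'Brien")(record)
```

The callable version never turns the user data into source code, which is the property the rest of this module tries to preserve.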

This module takes the approach that the Python developer should be
able to form first-class Expressions directly from Python code.

    "This, even if the rest were true, which it isn't, is patently
     impossible, say the doubters."
       -- The Restaurant at the End of the Universe, Douglas Adams

But we can come close.


Expression formation:

    >>> import logic
    >>> e = logic.Expression(lambda x: not (x.a == 3 and (x.b > 1 or x.b < -10)))
    >>> e
    logic.Expression(lambda x: not ((x.a == 3) and ((x.b > 1) or (x.b < -10))))

    You'll notice, in this first example, some extra parentheses in the final
    lambda. The lambda has already undergone an explicit compile/decompile
    step. These differences don't affect the logic in any way, but it's
    impossible to guess the exact original syntax when decompiling.

    However, be advised of this IMPORTANT point. When you form an Expression
    from a lambda, that lambda goes through a transformer which EARLY BINDS
    everything it can. If we had included global or free variables in our
    lambda, those would have been replaced with constants when the Expression
    was formed. See codewalk.EarlyBinder for more details.

    We *can*, however, use and define arbitrary comparison functions,
    such as containedby and startswith.
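
To illustrate what early binding changes, here is a rough sketch in Python 3. It does not use codewalk.EarlyBinder; it only imitates the effect by capturing a value in a closure when the function is formed. The names demo, threshold, late and early are hypothetical.

```python
def demo():
    class Obj:
        a = 3

    threshold = 3

    # An ordinary lambda looks up its free variable each time it is called.
    late = lambda x: x.a == threshold

    # Passing the current value through an outer lambda freezes it at
    # formation time, roughly what early binding does to free variables.
    early = (lambda t: (lambda x: x.a == t))(threshold)

    threshold = 99  # rebind after both functions exist

    # late now compares against 99; early still compares against 3.
    return late(Obj()), early(Obj())


late_result, early_result = demo()
```

After the rebinding, the late-bound function no longer matches the object while the early-bound one still does, which is why values that must stay dynamic are passed through **kwargs instead.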


Lazy Evaluation:
    >>> e = logic.Expression(lambda x: (x.a == 3) and (x.b > 1 or x.b < -10))
    >>> class DumbObject(object):
    ...     a = 3
    ...     b = 5
    ...
    >>> pass # Do some other things...
    >>> e(DumbObject())
    True

    When called, an Expression accepts any object instance(s) and
    returns its own truth value, getting any named attributes
    from the passed-in object(s). Notice that the passed-in objects do not
    need to be instantiated prior to the construction of the Expression.


Late binding of arguments (lazier yet!):
    >>> e = logic.Expression(lambda x, **kw: x.a == kw['Size'])
    >>> class DumbObject(object):
    ...     a = 3
    ...
    >>> pass # Do some other things...
    >>> e(DumbObject(), Size=3)
    True
    >>> e.bind_args(Size=3)
    >>> e(DumbObject())
    True

    If the lambda possesses a **kwargs argument in its signature, that
    dictionary may be used to pass in late-bound locals. They may either
    be passed when calling the Expression, or may be bound to the
    Expression using the 'bind_args' method. If both are provided,
    the passed-in kwargs will overwrite any bound kwargs.
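
The precedence rule in the last sentence can be seen in isolation. The sketch below is Python 3 and uses a hypothetical stand-in class, BoundCallable, that copies only the keyword handling of bind_args and evaluate from this module; it performs no early binding or decompilation.

```python
class BoundCallable:
    # Hypothetical stand-in that mirrors only the keyword handling
    # of Expression.bind_args and Expression.evaluate.

    def __init__(self, func):
        self.func = func
        self.kwargs = {}

    def bind_args(self, **kwargs):
        # Replace any previously bound keywords with a fresh copy.
        self.kwargs = dict(kwargs)

    def __call__(self, *args, **kwargs):
        # Start from the bound keywords, then let call-time ones win.
        kw = self.kwargs.copy()
        kw.update(kwargs)
        return self.func(*args, **kw)


class DumbObject:
    a = 3


e = BoundCallable(lambda x, **kw: x.a == kw["Size"])
e.bind_args(Size=3)
bound_result = e(DumbObject())             # uses the bound Size=3
override_result = e(DumbObject(), Size=4)  # call-time Size=4 overrides it
```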


Derivation (Decompilation) and Translation:
    'Deriving' is the opposite of 'parsing'. The codewalk.LambdaDecompiler
    class walks a function or code object and produces equivalent Python
    code in a string.

    >>> from dejavu import codewalk
    >>> e = logic.Expression(lambda x: not (x.a == 3 and (x.b > 1 or x.b < -10)))
    >>> codewalk.LambdaDecompiler(e.func).code()
    'lambda x: not ((x.a == 3) and ((x.b > 1) or (x.b < -10)))'

    However, we are not limited to Python renderings of our Expression!
    Another decompiler might produce our Expression in another language;
    this example produces a WHERE clause for SQL (a declarative language!):

    >>> e = logic.Expression(lambda x: x.Group == '3' and
    ...                      x.Date > datetime.date(2004, 2, 14) and
    ...                      x.Name.endswith('_'))
    >>> ADOSQLDecompiler(e).code()
    "([Group] = '3' and [Date] > #2/14/2004#) and [Name] Like '%\\_'"
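
For readers who want to see the translation idea without bytecode walking, here is a small sketch in Python 3 (3.8 or later). It swaps in a different technique: it parses a lambda's source text with the standard ast module, which is exactly the string-based route this module avoids, so treat it only as an illustration of the output shape. The function names to_where and _render are hypothetical, and the bracketed column style simply copies the example above.

```python
import ast

# Python comparison nodes mapped to SQL operators (a small subset).
_CMP = {ast.Eq: "=", ast.NotEq: "<>", ast.Lt: "<",
        ast.LtE: "<=", ast.Gt: ">", ast.GtE: ">="}


def _render(node):
    if isinstance(node, ast.BoolOp):
        word = " and " if isinstance(node.op, ast.And) else " or "
        return "(" + word.join(_render(v) for v in node.values) + ")"
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        op = _CMP[type(node.ops[0])]
        return "%s %s %s" % (_render(node.left), op,
                             _render(node.comparators[0]))
    if isinstance(node, ast.Attribute):
        # x.Group becomes the column [Group]; the object name is dropped.
        return "[%s]" % node.attr
    if isinstance(node, ast.Constant):
        if isinstance(node.value, str):
            # Double any single quotes so the SQL literal stays closed.
            return "'%s'" % node.value.replace("'", "''")
        return repr(node.value)
    raise ValueError("unsupported node: %s" % type(node).__name__)


def to_where(source):
    # Expect the source of a single lambda and render its body.
    tree = ast.parse(source, mode="eval")
    if not isinstance(tree.body, ast.Lambda):
        raise ValueError("expected a lambda expression")
    return _render(tree.body.body)


clause = to_where("lambda x: x.Group == '3' and x.Size > 4")
```

Anything outside the handful of node types handled here (method calls such as endswith, dates, negation) raises ValueError rather than guessing at a translation.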

Pickling:
    The Expression object includes custom pickling code (__getstate__ and
    __setstate__). You might notice that the function itself is *not*
    pickled; instead, the Expression's code() method is called, which
    produces a string representation of the function (decompilation). This
    makes pickled Expressions much more stable across Python versions than,
    say, storing the function's co_code. However, this presents a problem
    when the Expression is unpickled: the function must be eval'ed and
    run through an EarlyBinder again. When this occurs (in __setstate__),
    some of the free variables which were present in func_globals at the
    time of pickling may not be present when the Expression is unpickled.
    For example, an Expression which is built in myapp.py may include
    a Numarray object in its co_consts. When that Expression is
    unpickled, its function is eval'ed within this module, not within
    myapp.py; since this module does not import Numarray, it will not
    be included in the func_globals of the reconstituted function, and
    codewalk.EarlyBinder will fail on LOAD_GLOBAL.

    Therefore, code which uses this module must determine which objects
    will be referenced as Expressions are unpickled. Any that are neither
    builtins nor in this module's globals() need to be injected into this
    module, so they can be referenced in eval() when the Expression is
    unpickled.
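
The pickle-by-source idea and the need to inject names can be sketched as follows. This is Python 3 with hypothetical names throughout (SourceExpression, EVAL_GLOBALS, roundtrip); it stores the source text directly instead of decompiling. Because it has no EarlyBinder, a missing name surfaces when the function is called rather than during unpickling.

```python
import pickle
from decimal import Decimal

# Namespace used by eval() when a function is rebuilt from source.
# It plays the role of the module globals discussed above.
EVAL_GLOBALS = {}


class SourceExpression:
    # Hypothetical stand-in: it keeps the lambda's source text and
    # pickles only that text, never the function object.

    def __init__(self, source):
        self._load(source)

    def _load(self, source):
        self.source = source
        self.func = eval(source, EVAL_GLOBALS)

    def __call__(self, obj):
        return self.func(obj)

    def __getstate__(self):
        return self.source

    def __setstate__(self, state):
        self._load(state)


class Item:
    price = Decimal("2.00")


def roundtrip(expr):
    # Pickle and unpickle; only the source string travels.
    return pickle.loads(pickle.dumps(expr))


expr = SourceExpression("lambda x: x.price > Decimal('1.50')")

try:
    expr(Item())
    failed_without_injection = False
except NameError:
    # Decimal is not in EVAL_GLOBALS yet, so the lookup fails.
    failed_without_injection = True

# Inject the missing name, as the text above advises.
EVAL_GLOBALS["Decimal"] = Decimal
works_after_injection = expr(Item())
```

In an application, the injection step would run once at startup, before any stored expressions are loaded.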

"""

from compiler.consts import *
import sys
from types import CodeType, FunctionType

# Globals which assist in unpickling. If they're not present (can't be
# imported), that's OK--someone might want to build an app which
# doesn't use fixedpoints, for example.
import datetime

try:
    import fixedpoint
    from fixedpoint import FixedPoint
except ImportError:
    pass

try:
    import decimal
    from decimal import Decimal
except ImportError:
    pass

from dejavu import codewalk


class Aggregator(codewalk.Rewriter):
    """Combine two code objects into one."""

    def __init__(self, obj):
        codewalk.Rewriter.__init__(self, obj)
        self.instr_index = [None] * len(self._bytecode)

    def combine(self, obj, conjunction):
        obj = codewalk.Rewriter(obj)
        bytecode = map(ord, obj.co_code)
        newtarget = len(bytecode)

        # Replace our final RETURN_VALUE with a short-circuit jump over
        # the second function's bytecode, then append that bytecode.
        self._bytecode.pop()      # RETURN_VALUE
        self._bytecode.extend([conjunction, newtarget & 0xFF, newtarget >> 8])
        self._bytecode.append(1)  # POP_TOP
        self._bytecode.extend(bytecode)
        self.instr_index[-1:] = [obj] * (newtarget + 4)

        # Expand self.co_argcount, co_nlocals if needed.
        self.co_argcount = max(self.co_argcount, obj.co_argcount)
        self.co_nlocals = max(self.co_nlocals, obj.co_nlocals)

        # Expand self.co_varnames list if needed.
        for i, name in enumerate(obj.co_varnames):
            if i >= len(self.co_varnames):
                self.co_varnames.append(name)

        # Add the **kwargs flag if present
        if obj.co_flags & CO_VARKEYWORDS:
            self.co_flags |= CO_VARKEYWORDS

        # Add the *args flag if present
        if obj.co_flags & CO_VARARGS:
            self.co_flags |= CO_VARARGS

    def and_combine(self, obj):
        self.combine(obj, 111)    # JUMP_IF_FALSE

    def or_combine(self, obj):
        self.combine(obj, 112)    # JUMP_IF_TRUE

    def visit_LOAD_ATTR(self, lo, hi):
        # Re-index names that came from the appended code object.
        src = self.instr_index[self.cursor]
        if src:
            value = src.co_names[lo + (hi << 8)]
            newindex = self.name_index(value)
            self.newcode[-2:] = [newindex & 0xFF, newindex >> 8]

    def visit_LOAD_CONST(self, lo, hi):
        # Re-index constants that came from the appended code object.
        src = self.instr_index[self.cursor]
        if src:
            value = src.co_consts[lo + (hi << 8)]
            newindex = self.const_index(value)
            self.newcode[-2:] = [newindex & 0xFF, newindex >> 8]


class Expression(object):
    """A filter for objects."""

    def __init__(self, func=None, kwtypes=None, earlybind=True):
        """Expression(func, [kwtypes={}]). func(obj, [**kw]) must return bool.

        func: a function, with one positional arg and optional keyword args,
            which must return bool. If func is None, it is initialized to
            "lambda x: True".
        kwtypes: a dictionary of {keyword: type} pairs.
        earlybind: if True (the default), the given function will be
            rewritten, binding as many constants as possible into co_consts.
            The only reason to ever set it to False is for performance,
            and you must be *certain* there are no global or cell refs
            in your function.
        """
        if func is None:
            self.func = lambda x: True
        else:
            if earlybind:
                self._load_func(func)
            else:
                self.func = func

        if kwtypes is None:
            kwtypes = {}
        self.kwtypes = kwtypes
        self.kwargs = {}

    def _load_func(self, func):
        # Early-bind as much as possible.
        binder = codewalk.EarlyBinder(func, bind_late=[datetime.datetime.now,
                                                       datetime.date.today])
        self.func = binder.function()

    def code(self):
        """Return source code for self.func."""
        if hasattr(self, 'func'):
            return codewalk.LambdaDecompiler(self.func).code()
        else:
            return 'function not yet loaded'

    def __repr__(self):
        return 'logic.Expression(%s)' % self.code()

    def __and__(self, other):
        """Logical-and this Expression with another."""
        if not isinstance(other, Expression):
            other = Expression(other)
        ag = Aggregator(self.func)
        ag.and_combine(other.func)
        agfunc = ag.function()
        newkwtypes = self.kwtypes.copy()
        newkwtypes.update(other.kwtypes)
        return Expression(agfunc, newkwtypes)
    __add__ = __and__

    def __or__(self, other):
        """Logical-or this Expression with another."""
        if not isinstance(other, Expression):
            other = Expression(other)
        ag = Aggregator(self.func)
        ag.or_combine(other.func)
        agfunc = ag.function()
        newkwtypes = self.kwtypes.copy()
        newkwtypes.update(other.kwtypes)
        return Expression(agfunc, newkwtypes)

    def bind_args(self, **kwargs):
        """Set self.kwargs to a shallow copy of the given kwargs."""
        self.kwargs = {}
        self.kwargs.update(kwargs)

    def evaluate(self, *args, **kwargs):
        """Return self.func(*args, **kwargs + self.kwargs)."""
        kw = self.kwargs.copy()
        kw.update(kwargs)
        return self.func(*args, **kw)
    __call__ = evaluate

    def __getstate__(self):
        return (self.code(), self.kwtypes, self.kwargs)

    def __setstate__(self, state):
        if len(state) == 2:
            # Older versions of Expression had a 2-tuple.
            func, self.kwtypes = state
            self.kwargs = {}
        else:
            func, self.kwtypes, self.kwargs = state
        # The most difficult thing about Expressions is unpickling.
        # Any func_globals at the time of pickling are lost, so any
        # late-bound objects must be available at this point. Any
        # such objects need to be injected into logic's globals()
        # if you want them to be available here.
        f = eval(func)
        self._load_func(f)


def filter(**kwargs):
    """Form an Expression from keyword arguments.

    Allows you to write:
        e = logic.filter(a=3, b=1)
    ...instead of:
        e = logic.Expression(lambda x: x.a == 3 and x.b == 1)

    With no keyword arguments, the result is Expression(), which
    accepts every object.
    """
    if not kwargs:
        # Without at least one comparison, the bytecode below would be a
        # bare RETURN_VALUE with nothing on the stack.
        return Expression()

    co, names, consts = [], ['x', ], [None, ]
    i = 0
    for key, val in kwargs.iteritems():
        i += 1
        names.append(key)
        consts.append(val)
        co += [124, 0, 0,    # LOAD_FAST x
               105, i, 0,    # LOAD_ATTR names[i]
               100, i, 0,    # LOAD_CONST consts[i]
               106, 2, 0,    # COMPARE_OP ==
               111, 0, 0,    # JUMP_IF_FALSE (target is figured below)
               1,            # POP_TOP
               ]
    # Pop the extraneous final JUMP_IF_FALSE and POP_TOP.
    del co[-4:]
    co.append(83)            # RETURN_VALUE

    # Figure JUMP targets: each jump is relative and lands on RETURN_VALUE.
    for op in range(len(co)):
        if co[op] == 111:
            co[op + 1] = (len(co) - 4) - op

    # Form code object and function.
    # code(argcount, nlocals, stacksize, flags, codestring,
    #      constants, names, varnames,
    #      filename, name, firstlineno, lnotab[, freevars[, cellvars]])
    # flags 67 == CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE
    co = CodeType(1, 1, 2, 67, ''.join(map(chr, co)),
                  tuple(consts), tuple(names), ('x', ),
                  '', '<lambda>', 1, '')
    func = FunctionType(co, {})
    return Expression(func, earlybind=False)


def comparison(attr, cmp_op, criteria):
    """Form an Expression lambda x: x.attr cmp_op criteria.

    Allows you to write:
        e = logic.comparison('Size', 1, 4)
    ...instead of:
        e = logic.Expression(lambda x: x.Size <= 4)

    cmp_op is an index into the opcode.cmp_op tuple (listed in the
    comment below); in the example above, 1 selects '<='.

    This allows one to pass dynamic, isolated arguments, without having
    to construct a lambda out of them first.
    """
    # cmp_op (from opcode):
    # ('<', '<=', '==', '!=', '>', '>=', 'in', 'not in', 'is',
    #  'is not', 'exception match', 'BAD')
    if cmp_op < 0 or cmp_op > 11:
        raise ValueError("The cmp_op argument must be between 0 and 11")

    if not isinstance(attr, str):
        attr = str(attr)

    co_flags = codewalk.CO_NOFREE | codewalk.CO_OPTIMIZED | codewalk.CO_NEWLOCALS

    if sys.version_info >= (2, 5):
        # 2.5 doesn't include arguments in co_names anymore.
        idx = 0
        names = (attr,)
        # It also stopped prepending None to co_consts...
        consts = (criteria,)
        # ...and nested_scopes are now "always on".
        co_flags |= codewalk.CO_NESTED
    else:
        idx = 1
        names = ('x', attr)
        consts = (None, criteria)

    co = [124, 0, 0,        # LOAD_FAST x
          105, idx, 0,      # LOAD_ATTR names[idx]
          100, idx, 0,      # LOAD_CONST consts[idx]
          106, cmp_op, 0,   # COMPARE_OP cmp_op
          83,               # RETURN_VALUE
          ]

    # Form code object and function.
    # code(argcount, nlocals, stacksize, flags, codestring,
    #      constants, names, varnames,
    #      filename, name, firstlineno, lnotab[, freevars[, cellvars]])
    co = CodeType(1, 1, 2, co_flags, ''.join(map(chr, co)),
                  consts, names, ('x',), '', '<lambda>', 1, '')
    func = FunctionType(co, {})
    return Expression(func, earlybind=False)

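
The two helpers above hand-assemble Python 2.4/2.5 bytecode. For comparison, the same evaluation behaviour can be sketched in Python 3 with plain closures and the standard operator module. The names closure_comparison and closure_filter are hypothetical, and only the first six cmp_op indexes are covered. These closures only evaluate; they are not a substitute for the bytecode the module builds, which is presumably kept in that form so that codewalk can decompile it.

```python
import operator

# Same order as the first six entries of the cmp_op tuple listed above.
CMP_FUNCS = (operator.lt, operator.le, operator.eq,
             operator.ne, operator.gt, operator.ge)


def closure_comparison(attr, cmp_op, criteria):
    # Build "lambda x: x.attr <op> criteria" from isolated arguments.
    if not 0 <= cmp_op < len(CMP_FUNCS):
        raise ValueError("cmp_op must be between 0 and %d"
                         % (len(CMP_FUNCS) - 1))
    compare = CMP_FUNCS[cmp_op]
    return lambda x: compare(getattr(x, attr), criteria)


def closure_filter(**kwargs):
    # Build "lambda x: x.a == 3 and x.b == 1" from a=3, b=1.
    # With no keywords, all() of an empty sequence is True,
    # so every object matches.
    items = list(kwargs.items())
    return lambda x: all(getattr(x, key) == val for key, val in items)


class Box:
    Size = 4
    a = 3
    b = 1


at_most_four = closure_comparison("Size", 1, 4)(Box())   # Size <= 4
less_than_four = closure_comparison("Size", 0, 4)(Box())  # Size < 4
both_match = closure_filter(a=3, b=1)(Box())
one_differs = closure_filter(a=3, b=2)(Box())
```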