Contact: fumanchu@aminus.org

root/tags/1.4.0/logic.py

Revision 114 (checked in by fumanchu, 3 years ago)

logic: function Aggregator needed to take max(co_argcount) and max(co_nlocals).

  • Property svn:eol-style set to native
"""First-class Expression objects.

This work, including the source code, documentation
and related data, is placed into the public domain.

The original author is Robert Brewer, Amor Ministries.

THIS SOFTWARE IS PROVIDED AS-IS, WITHOUT WARRANTY
OF ANY KIND, NOT EVEN THE IMPLIED WARRANTY OF
MERCHANTABILITY. THE AUTHOR OF THIS SOFTWARE
ASSUMES _NO_ RESPONSIBILITY FOR ANY CONSEQUENCE
RESULTING FROM THE USE, MODIFICATION, OR
REDISTRIBUTION OF THIS SOFTWARE.

Python evaluates expressions like any other language; however,
the expression itself cannot be 'passed around' easily--that is,
the expression itself is a code block, not a callable. In most cases,
this is not an issue: if an evaluation step needs to be 'first-class',
it's usually wrapped up in a function (sometimes anonymous), and
that function is passed. This allows lazy evaluation, for example.

In some cases, however, we wish to manipulate the actual logic of the
expression:
    1. Inspection. Code might form an expression from user input,
           then take secondary actions depending upon the operands.
    2. Modification. For example, correction of an expression
           if it raises an Exception.
    3. Translation. A common case is converting Python expressions
           into SQL.

It is possible to provide these benefits through some combination
of the standard modules parser and compiler, and/or via the builtin
eval() and the exec statement. However, these approaches require
placing the expression in a string, which introduces problems of
substituting user data; for example, ("x.Name == r'%s'" % user_data)
will break if user_data contains quote-marks. This is far from the only
way in which eval() can be abused. Solutions using parser and compiler
also tend to be quite slow in pure Python.
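
The quoting problem described above can be reproduced in a few lines.
This is only an illustrative sketch, not part of this module: Record,
matches_via_eval and matches_via_callable are hypothetical names, and the
callable version simply shows that ordinary Python needs no quoting.

```python
class Record(object):
    # Hypothetical stand-in for an object being filtered.
    Name = "O'Brien"


def matches_via_eval(record, user_data):
    # The user data is spliced into source text before it is evaluated.
    source = "x.Name == r'%s'" % user_data
    return eval(source, {}, {"x": record})


def matches_via_callable(record, user_data):
    # The comparison stays ordinary Python, so no quoting is involved.
    return (lambda x: x.Name == user_data)(record)


user_data = "O'Brien"

try:
    eval_outcome = matches_via_eval(Record(), user_data)
except SyntaxError:
    # r'O'Brien' is not a valid string literal.
    eval_outcome = "SyntaxError"

callable_outcome = matches_via_callable(Record(), user_data)
```

The string-built version fails as soon as the data contains a quote-mark,
while the callable compares the same value without trouble.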

This module takes the approach that the Python developer should be
able to form first-class Expressions directly from Python code.

    "This, even if the rest were true, which it isn't, is patently
     impossible, say the doubters."
       -- The Restaurant at the End of the Universe, Douglas Adams

But we can come close.


Expression formation:

    >>> import logic
    >>> e = logic.Expression(lambda x: not (x.a == 3 and (x.b > 1 or x.b < -10)))
    >>> e
    logic.Expression(lambda x: not ((x.a == 3) and ((x.b > 1) or (x.b < -10))))

    You'll notice, in this first example, some extra parentheses in the final
    lambda. The lambda has already undergone an explicit compile/decompile
    step. These differences don't affect the logic in any way, but it's
    impossible to guess the exact original syntax when decompiling.

    However, be advised of this IMPORTANT point. When you form an Expression
    from a lambda, that lambda goes through a transformer which EARLY BINDS
    everything it can. If we had included global or free variables in our
    lambda, those would have been replaced with constants when the Expression
    was formed. See codewalk.EarlyBinder for more details.

    We *can*, however, use and define arbitrary comparison functions,
    such as containedby and startswith.
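
The practical effect of early binding can be approximated in plain Python.
The sketch below does not use codewalk.EarlyBinder; it freezes a value with
a default argument instead, which is an assumption made purely for
illustration. The names threshold, late and early are hypothetical.

```python
threshold = 10

# A plain lambda looks up the global on every call (late binding).
late = lambda x: x > threshold

# Freezing the value at definition time, here via a default argument,
# approximates what early binding does to globals and free variables.
early = lambda x, bound=threshold: x > bound

threshold = 100

late_result = late(50)    # compares against the current value, 100
early_result = early(50)  # compares against the frozen value, 10
```

Rebinding threshold after the fact changes the late-bound result but not
the early-bound one, which is the behaviour to expect from an Expression.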


Lazy Evaluation:
    >>> e = logic.Expression(lambda x: (x.a == 3) and (x.b > 1 or x.b < -10))
    >>> class DumbObject(object):
    ...     a = 3
    ...     b = 5
    ...
    >>> pass # Do some other things...
    >>> e(DumbObject())
    True

    When called, an Expression accepts any object instance(s) and
    returns its truth value, getting any named attributes from the
    passed-in object(s). Notice that the passed-in objects do not
    need to be instantiated prior to the construction of the Expression.


Late binding of arguments (lazier yet!):
    >>> e = logic.Expression(lambda x, **kw: x.a == kw['Size'])
    >>> class DumbObject(object):
    ...     a = 3
    ...
    >>> pass # Do some other things...
    >>> e(DumbObject(), Size=3)
    True
    >>> e.bind_args(Size=3)
    >>> e(DumbObject())
    True

    If the lambda possesses a **kwargs argument in its signature, that
    dictionary may be used to pass in late-bound locals. They may either
    be passed when calling the Expression, or may be bound to the
    Expression using the 'bind_args' method. Each call to bind_args
    replaces any previously bound kwargs. If both are provided,
    the passed-in kwargs will overwrite any bound kwargs.
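
The precedence rule above (call-time kwargs win over bound kwargs) can be
seen in isolation with a stripped-down stand-in. BoundPredicate is a
hypothetical class written for this sketch; it mirrors only the bind_args
and call logic and does none of the early binding that Expression performs.

```python
class BoundPredicate(object):
    # Hypothetical, simplified stand-in for logic.Expression.

    def __init__(self, func):
        self.func = func
        self.kwargs = {}

    def bind_args(self, **kwargs):
        # Replace (not merge with) any previously bound kwargs.
        self.kwargs = dict(kwargs)

    def __call__(self, *args, **kwargs):
        kw = dict(self.kwargs)   # start from the bound values
        kw.update(kwargs)        # call-time values overwrite them
        return self.func(*args, **kw)


class DumbObject(object):
    a = 3


p = BoundPredicate(lambda x, **kw: x.a == kw['Size'])
p.bind_args(Size=3)

bound_only = p(DumbObject())          # uses the bound Size=3
overridden = p(DumbObject(), Size=4)  # the call-time Size=4 wins
```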


Derivation (Decompilation) and Translation:
    'Deriving' is the opposite of 'parsing'. The codewalk.LambdaDecompiler
    class walks a function or code object and produces equivalent Python
    code in a string.

    >>> e = logic.Expression(lambda x: not (x.a == 3 and (x.b > 1 or x.b < -10)))
    >>> codewalk.LambdaDecompiler(e.func).code()
    'lambda x: not ((x.a == 3) and ((x.b > 1) or (x.b < -10)))'

    However, we are not limited to Python renderings of our Expression!
    Another decompiler might produce our Expression in another language;
    this example produces a WHERE clause for SQL (a declarative language!):

    >>> e = logic.Expression(lambda x: x.Group == '3' and
    ...                      x.Date > datetime.date(2004, 2, 14) and
    ...                      x.Name.endswith('_'))
    >>> ADOSQLDecompiler(e).code()
    "([Group] = '3' and [Date] > #2/14/2004#) and [Name] Like '%\\_'"
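
To give a feel for what translation involves, here is a small sketch that
renders a lambda as a WHERE clause. It is not how ADOSQLDecompiler works:
it swaps in the standard ast module and parses source text, whereas
codewalk walks bytecode. SQL_OPS, to_sql and where_clause are hypothetical
names, and only a handful of operators are handled.

```python
import ast

# Python comparison operators mapped to SQL spellings (a deliberately
# small, hypothetical subset).
SQL_OPS = {ast.Eq: "=", ast.NotEq: "<>", ast.Lt: "<",
           ast.LtE: "<=", ast.Gt: ">", ast.GtE: ">="}


def to_sql(node):
    # Recursively render one expression node as SQL text.
    if isinstance(node, ast.BoolOp):
        joiner = " and " if isinstance(node.op, ast.And) else " or "
        return "(" + joiner.join(to_sql(v) for v in node.values) + ")"
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        op = SQL_OPS[type(node.ops[0])]
        return "%s %s %s" % (to_sql(node.left), op,
                             to_sql(node.comparators[0]))
    if isinstance(node, ast.Attribute):
        # x.Group becomes the column reference [Group].
        return "[%s]" % node.attr
    if isinstance(node, ast.Constant):
        if isinstance(node.value, str):
            return "'%s'" % node.value.replace("'", "''")
        return repr(node.value)
    raise ValueError("unsupported syntax: %s" % type(node).__name__)


def where_clause(lambda_source):
    # Parse the lambda's source text and translate its body.
    tree = ast.parse(lambda_source, mode="eval")
    if not isinstance(tree.body, ast.Lambda):
        raise ValueError("expected a lambda expression")
    return to_sql(tree.body.body)


clause = where_clause(
    "lambda x: x.Group == '3' and (x.Size > 1 or x.Size < 10)")
```

Under these assumptions, clause comes out as
"([Group] = '3' and ([Size] > 1 or [Size] < 10))". Anything outside the
supported subset, such as method calls or date literals, raises an error
rather than being guessed at.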

Pickling:
    The Expression object includes custom pickling code (__getstate__ and
    __setstate__). You might notice that the function itself is *not*
    pickled; instead, the Expression's code() method is called, which
    produces a string representation of the function (decompilation).
    This makes pickled Expressions much more stable across Python versions
    than, say, storing the function's co_code. However, this presents a
    problem when the Expression is unpickled: the function must be eval'ed
    and run through an EarlyBinder again. When this occurs (in __setstate__),
    some of the free variables which were present in func_globals at the
    time of pickling may not be present when the Expression is unpickled.
    For example, an Expression which is built in myapp.py may include
    a datetime.date object in its co_consts. When that Expression is
    unpickled, its function is eval'ed within this module, not within
    myapp.py; since this module does not import 'datetime', it will not
    be included in the func_globals of the reconstituted function, and
    codewalk.EarlyBinder will fail on LOAD_GLOBAL.

    Therefore, code which uses this module must determine which objects
    will be referenced as Expressions are unpickled. Any that are neither
    builtins nor in this module's globals() need to be injected into this
    module, so they can be referenced in eval() when the Expression is
    unpickled.
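
The need to inject names before unpickling can be demonstrated with a
simplified stand-in. SourceExpression, UNPICKLE_GLOBALS and Row are
hypothetical names for this sketch, and the state hooks are called directly
instead of going through pickle. One difference from this module: without
an EarlyBinder, the missing name only shows up as a NameError when the
restored function is called, not at restore time.

```python
import datetime

# Hypothetical registry standing in for "this module's globals()".
UNPICKLE_GLOBALS = {}


class SourceExpression(object):
    # Simplified stand-in for logic.Expression: only the source is state.

    def __init__(self, source):
        self.__setstate__(source)

    def __call__(self, obj):
        return self.func(obj)

    def __getstate__(self):
        # Store source text instead of the function object.
        return self.source

    def __setstate__(self, state):
        self.source = state
        # Rebuild the function against whatever names are registered now.
        self.func = eval(state, dict(UNPICKLE_GLOBALS))


class Row(object):
    Date = datetime.date(2004, 3, 1)


UNPICKLE_GLOBALS["datetime"] = datetime
original = SourceExpression("lambda x: x.Date > datetime.date(2004, 2, 14)")
state = original.__getstate__()

# Simulate restoring where datetime was never made available.
del UNPICKLE_GLOBALS["datetime"]
broken = SourceExpression.__new__(SourceExpression)
broken.__setstate__(state)
try:
    broken(Row())
    failed_without_datetime = False
except NameError:
    failed_without_datetime = True

# Inject the missing name, then restore again.
UNPICKLE_GLOBALS["datetime"] = datetime
restored = SourceExpression.__new__(SourceExpression)
restored.__setstate__(state)
restored_result = restored(Row())
```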

"""

from dejavu import codewalk
from types import CodeType, FunctionType

# Globals which assist in unpickling. If they're not present (can't be
# imported), that's OK--someone might want to build an app which
# doesn't use fixedpoints, for example.
import datetime

try:
    import fixedpoint
    from fixedpoint import FixedPoint
except ImportError:
    pass

try:
    import decimal
    from decimal import Decimal
except ImportError:
    pass


class Aggregator(codewalk.Rewriter):
    """Combine two code objects into one."""

    def __init__(self, obj):
        codewalk.Rewriter.__init__(self, obj)
        # One entry per bytecode byte: None for bytes from the original
        # object, or the Rewriter whose co_names/co_consts the byte uses.
        self.instr_index = [None] * len(self._bytecode)

    def combine(self, obj, conjunction):
        """Append obj's bytecode, joined by the given jump opcode."""
        obj = codewalk.Rewriter(obj)
        bytecode = map(ord, obj.co_code)
        newtarget = len(bytecode)

        # Replace our RETURN_VALUE with a relative jump that skips the
        # POP_TOP and lands on the final byte of obj's bytecode.
        self._bytecode.pop()      # RETURN_VALUE
        self._bytecode.extend([conjunction, newtarget & 0xFF, newtarget >> 8])
        self._bytecode.append(1)  # POP_TOP
        self._bytecode.extend(bytecode)
        self.instr_index[-1:] = [obj] * (newtarget + 4)

        # Expand self.co_argcount, co_nlocals if needed.
        self.co_argcount = max(self.co_argcount, obj.co_argcount)
        self.co_nlocals = max(self.co_nlocals, obj.co_nlocals)

        # Expand self.co_varnames list if needed.
        for i, name in enumerate(obj.co_varnames):
            if i >= len(self.co_varnames):
                self.co_varnames.append(name)

        # Add the **kwargs flag if present
        if obj.co_flags & 0x08:
            self.co_flags |= 0x08

        # Add the *args flag if present
        if obj.co_flags & 0x04:
            self.co_flags |= 0x04

    def and_combine(self, obj):
        self.combine(obj, 111)    # JUMP_IF_FALSE

    def or_combine(self, obj):
        self.combine(obj, 112)    # JUMP_IF_TRUE

    def visit_LOAD_ATTR(self, lo, hi):
        # Remap a name index taken from the appended code object.
        src = self.instr_index[self.cursor]
        if src:
            value = src.co_names[lo + (hi << 8)]
            newindex = self.name_index(value)
            self.newcode[-2:] = [newindex & 0xFF, newindex >> 8]

    def visit_LOAD_CONST(self, lo, hi):
        # Remap a constant index taken from the appended code object.
        src = self.instr_index[self.cursor]
        if src:
            value = src.co_consts[lo + (hi << 8)]
            newindex = self.const_index(value)
            self.newcode[-2:] = [newindex & 0xFF, newindex >> 8]


class Expression(object):
    """A filter for objects."""

    def __init__(self, func=None, kwtypes=None):
        """Expression(func, [kwtypes]={}). func(obj, [**kw]) must return bool.

        func: a function, with one positional arg and optional keyword args,
            which must return bool. If func is None, it is initialized to
            lambda x: True
        kwtypes: a dictionary of {keyword: type} pairs.
        """
        if func is None:
            func = lambda x: True
        self._load_func(func)
        if kwtypes is None:
            kwtypes = {}
        self.kwtypes = kwtypes
        self.kwargs = {}

    def _load_func(self, func):
        # Early-bind as much as possible.
        binder = codewalk.EarlyBinder(func, bind_late=[datetime.datetime.now,
                                                       datetime.date.today])
        self.func = binder.function()

    def code(self):
        """Return the decompiled source of this Expression as a string."""
        return codewalk.LambdaDecompiler(self.func).code()

    def __repr__(self):
        return 'logic.Expression(%s)' % self.code()

    def __and__(self, other):
        """Logical-and this Expression with another."""
        if not isinstance(other, Expression):
            # Wrap in a tuple so that a tuple operand formats correctly.
            raise TypeError("'%s' is not an Expression" % (other,))
        ag = Aggregator(self.func)
        ag.and_combine(other.func)
        agfunc = ag.function()
        return Expression(agfunc)
    __add__ = __and__

    def __or__(self, other):
        """Logical-or this Expression with another."""
        if not isinstance(other, Expression):
            # Wrap in a tuple so that a tuple operand formats correctly.
            raise TypeError("'%s' is not an Expression" % (other,))
        ag = Aggregator(self.func)
        ag.or_combine(other.func)
        agfunc = ag.function()
        return Expression(agfunc)

    def bind_args(self, **kwargs):
        """Bind late-bound kwargs, replacing any bound previously."""
        self.kwargs = {}
        self.kwargs.update(kwargs)

    def evaluate(self, *args, **kwargs):
        """Call the function; passed-in kwargs override bound kwargs."""
        kw = {}
        kw.update(self.kwargs)
        kw.update(kwargs)
        return self.func(*args, **kw)
    __call__ = evaluate

    def __getstate__(self):
        # Pickle the decompiled source, not the function object itself.
        return (self.code(), self.kwtypes, self.kwargs)

    def __setstate__(self, state):
        if len(state) == 2:
            # Older versions of Expression had a 2-tuple.
            func, self.kwtypes = state
            self.kwargs = {}
        else:
            func, self.kwtypes, self.kwargs = state
        # The most difficult thing about Expressions is unpickling.
        # Any func_globals at the time of pickling are lost, so any
        # late-bound objects must be available at this point. Any
        # such objects need to be injected into logic's globals()
        # if you want them to be available here.
        f = eval(func)
        self._load_func(f)


def filter(**kwargs):
    """Form an Expression from keyword arguments.

    Allows you to write:
        e = logic.filter(a=3, b=1)
    ...instead of:
        e = logic.Expression(lambda x: x.a == 3 and x.b == 1)
    """
    if not kwargs:
        # With no criteria, the bytecode below would be a bare
        # RETURN_VALUE on an empty stack, so match everything instead.
        return Expression()

    co, names, consts = [], ['x', ], [None, ]
    i = 0
    for key, val in kwargs.iteritems():
        i += 1
        names.append(key)
        consts.append(val)
        co += [124, 0, 0,      # LOAD_FAST x
               105, i, 0,      # LOAD_ATTR names[i]
               100, i, 0,      # LOAD_CONST consts[i]
               106, 2, 0,      # COMPARE_OP '=='
               111, 0, 0,      # JUMP_IF_FALSE (target filled in below)
               1,              # POP_TOP
               ]
    if kwargs:
        # pop extraneous final JUMP and POP_TOP.
        del co[-4:]
    co.append(83)              # RETURN_VALUE

    # Figure JUMP targets. Each complete term is 16 bytes long with its
    # JUMP_IF_FALSE 12 bytes in, so step by instruction instead of
    # scanning every byte (an argument byte may also equal 111). The
    # target is written as two bytes so that long chains do not overflow.
    for op in range(12, len(co) - 1, 16):
        target = (len(co) - 4) - op
        co[op + 1] = target & 0xFF
        co[op + 2] = target >> 8

    # Form code object and function.
    # code(argcount, nlocals, stacksize, flags, codestring,
    #      constants, names, varnames,
    #      filename, name, firstlineno, lnotab[, freevars[, cellvars]])
    # flags 67 = CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE
    co = CodeType(1, 1, 2, 67, ''.join(map(chr, co)),
                  tuple(consts), tuple(names), ('x', ),
                  '', '<lambda>', 1, '')
    func = FunctionType(co, {})
    return Expression(func)


def comparison(attr, cmp_op, criteria):
    """Form an Expression lambda x: x.attr cmp_op criteria.

    Allows you to write:
        e = logic.comparison('Size', 1, 4)
    ...instead of:
        e = logic.Expression(lambda x: x.Size <= 4)

    Here 1 is the index of '<=' in the cmp_op tuple listed below.
    This allows one to pass dynamic, isolated arguments, without having
    to construct a lambda out of them first.
    """
    # cmp_op (from opcode):
    # ('<', '<=', '==', '!=', '>', '>=', 'in', 'not in', 'is',
    #  'is not', 'exception match', 'BAD')
    if cmp_op < 0 or cmp_op > 11:
        raise ValueError("The cmp_op argument must be between 0 and 11")

    if not isinstance(attr, str):
        attr = str(attr)
    names = ('x', attr)

    consts = (None, criteria)
    co = [124, 0, 0,           # LOAD_FAST x
          105, 1, 0,           # LOAD_ATTR attr
          100, 1, 0,           # LOAD_CONST criteria
          106, cmp_op, 0,      # COMPARE_OP cmp_op
          83,                  # RETURN_VALUE
          ]

    # Form code object and function.
    # code(argcount, nlocals, stacksize, flags, codestring,
    #      constants, names, varnames,
    #      filename, name, firstlineno, lnotab[, freevars[, cellvars]])
    # flags 67 = CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE
    co = CodeType(1, 1, 2, 67, ''.join(map(chr, co)),
                  consts, names, ('x',), '', '<lambda>', 1, '')
    func = FunctionType(co, {})
    return Expression(func)