| 1 |
"""First-class Expression objects. |
|---|
| 2 |
|
|---|
| 3 |
This work, including the source code, documentation |
|---|
| 4 |
and related data, is placed into the public domain. |
|---|
| 5 |
|
|---|
| 6 |
The original author is Robert Brewer, Amor Ministries. |
|---|
| 7 |
|
|---|
| 8 |
THIS SOFTWARE IS PROVIDED AS-IS, WITHOUT WARRANTY |
|---|
| 9 |
OF ANY KIND, NOT EVEN THE IMPLIED WARRANTY OF |
|---|
| 10 |
MERCHANTABILITY. THE AUTHOR OF THIS SOFTWARE |
|---|
| 11 |
ASSUMES _NO_ RESPONSIBILITY FOR ANY CONSEQUENCE |
|---|
| 12 |
RESULTING FROM THE USE, MODIFICATION, OR |
|---|
| 13 |
REDISTRIBUTION OF THIS SOFTWARE. |
|---|
| 14 |
|
|---|
| 15 |
Python evaluates expressions like any other language; however, |
|---|
| 16 |
the expression itself cannot be 'passed around' easily--that is, |
|---|
| 17 |
the expression itself is a code block, not a callable. In most cases, |
|---|
| 18 |
this is not an issue: if an evaluation step needs to be 'first-class', |
|---|
| 19 |
it's usually wrapped up in a function (sometimes anonymous), and |
|---|
| 20 |
that function is passed. This allows lazy evaluation, for example. |
|---|
| 21 |
|
|---|
| 22 |
In some cases, however, we wish to manipulate the actual logic of the |
|---|
| 23 |
expression: |
|---|
| 24 |
1. Inspection. Code might form an expression from user input, |
|---|
| 25 |
then take secondary actions depending upon the operands. |
|---|
| 26 |
2. Modification. For example, correction of an expression |
|---|
| 27 |
if it raises an Exception. |
|---|
| 28 |
3. Translation. A common case is converting Python expressions |
|---|
| 29 |
into SQL. |
|---|
| 30 |
|
|---|
| 31 |
It is possible to provide these benefits through some combination |
|---|
| 32 |
of the standard modules parser and compiler, and/or via the builtins |
|---|
| 33 |
eval() and exec(). However, these approaches require placing the |
|---|
| 34 |
expression in a string, which introduces problems of substituting |
|---|
| 35 |
user data; for example, ("x.Name == r'%s'" % user_data) will break if |
|---|
| 36 |
user_data contains quote-marks. This is by far not the only example of |
|---|
| 37 |
the abuses of eval(). Solutions using parser and compiler also tend to |
|---|
| 38 |
be quite slow in pure Python. |
|---|
| 39 |
|
|---|
| 40 |
This module takes the approach that the Python developer should be |
|---|
| 41 |
able to form first-class Expressions directly from Python code. |
|---|
| 42 |
|
|---|
| 43 |
"This, even if the rest were true, which it isn't, is patently |
|---|
| 44 |
impossible, say the doubters." |
|---|
| 45 |
-- The Restaurant at the End of the Universe, Douglas Adams |
|---|
| 46 |
|
|---|
| 47 |
But we can come close. |
|---|
| 48 |
|
|---|
| 49 |
|
|---|
| 50 |
Expression formation: |
|---|
| 51 |
|
|---|
| 52 |
>>> import logic |
|---|
| 53 |
>>> e = logic.Expression(lambda x: not (x.a == 3 and (x.b > 1 or x.b < -10))) |
|---|
| 54 |
>>> e |
|---|
| 55 |
logic.Expression(lambda x: not ((x.a == 3) and ((x.b > 1) or (x.b < -10)))) |
|---|
| 56 |
|
|---|
| 57 |
You'll notice, in this first example, some extra parentheses in the final |
|---|
| 58 |
lambda. The lambda has already undergone an explicit compile/decompile |
|---|
| 59 |
step. These differences don't affect the logic in any way, but it's |
|---|
| 60 |
impossible to guess the exact original syntax when decompiling. |
|---|
| 61 |
|
|---|
| 62 |
However, be advised of this IMPORTANT point. When you form an Expression |
|---|
| 63 |
from a lambda, that lambda goes through a transformer which EARLY BINDS |
|---|
| 64 |
everything it can. If we had included global or free variables in our |
|---|
| 65 |
lambda, those would have been replaced with constants when the Expression |
|---|
| 66 |
was formed. See codewalk.EarlyBinder for more details. |
|---|
| 67 |
|
|---|
| 68 |
We *can*, however, use and define arbitrary comparison functions, |
|---|
| 69 |
such as containedby and startswith. |
|---|
| 70 |
|
|---|
| 71 |
|
|---|
| 72 |
Lazy Evaluation: |
|---|
| 73 |
>>> e = logic.Expression(lambda x: (x.a == 3) and (x.b > 1 or x.b < -10)) |
|---|
| 74 |
>>> class DumbObject(object): |
|---|
| 75 |
... a = 3 |
|---|
| 76 |
... b = 5 |
|---|
| 77 |
... |
|---|
| 78 |
>>> pass # Do some other things... |
|---|
| 79 |
>>> e.evaluate(DumbObject()) |
|---|
| 80 |
True |
|---|
| 81 |
|
|---|
| 82 |
The evaluate() method of an Expression accepts any object instance, and |
|---|
| 83 |
returns the truth value of itself, getting any named attributes from |
|---|
| 84 |
the passed-in object. Notice that the passed-in object does not need |
|---|
| 85 |
to be instantiated prior to the formation of the Expression. |
|---|
| 86 |
|
|---|
| 87 |
|
|---|
| 88 |
Late binding of arguments (lazier yet!): |
|---|
| 89 |
>>> e = logic.Expression(lambda x, **kw: x.a == kw['Size']) |
|---|
| 90 |
>>> class DumbObject(object): |
|---|
| 91 |
... a = 3 |
|---|
| 92 |
... |
|---|
| 93 |
>>> pass # Do some other things... |
|---|
| 94 |
>>> e.bind_args(Size=3) |
|---|
| 95 |
>>> e.evaluate(DumbObject()) |
|---|
| 96 |
True |
|---|
| 97 |
|
|---|
| 98 |
If the lambda possesses a **kwargs argument in its signature, that |
|---|
| 99 |
dictionary may be used to pass in late-bound locals. Before calling |
|---|
| 100 |
Expression.evaluate, callers should call .bind_args, passing a dict. |
|---|
| 101 |
Each subscript of kwargs will be looked up in that dictionary, and |
|---|
| 102 |
the mapped value will replace the operand in the evaluation step. |
|---|
| 103 |
|
|---|
| 104 |
|
|---|
| 105 |
Derivation (Decompilation) and Translation: |
|---|
| 106 |
'Deriving' is the opposite of 'parsing'. The codewalk.LambdaDecompiler |
|---|
| 107 |
class walks a function or code object and produces equivalent Python |
|---|
| 108 |
code in a string. |
|---|
| 109 |
|
|---|
| 110 |
>>> e = logic.Expression(lambda x: x.a == 3 and (x.b > 1 or x.b < -10)) |
|---|
| 111 |
>>> codewalk.LambdaDecompiler(e.func).code() |
|---|
| 112 |
'lambda x: not ((x.a == 3) and ((x.b > 1) or (x.b < -10)))' |
|---|
| 113 |
|
|---|
| 114 |
However, we are not limited to Python statements of our Expression! |
|---|
| 115 |
Another decompiler might produce our Expression in another language; |
|---|
| 116 |
this example produces a WHERE clause for SQL (a declarative language!): |
|---|
| 117 |
|
|---|
| 118 |
>>> e = logic.Expression(lambda x: x.Group == '3' and |
|---|
| 119 |
x.Date > datetime.date(2004, 2, 14) and |
|---|
| 120 |
x.Name.endswith('_')) |
|---|
| 121 |
>>> ADOSQLDecompiler(e).code() |
|---|
| 122 |
"([Group] = '3' and [Date] > #2/14/2004#) and [Name] Like '%\\_'" |
|---|
| 123 |
|
|---|
| 124 |
Pickling: |
|---|
| 125 |
The Expression object includes custom pickling code (__getstate__ and |
|---|
| 126 |
__setstate__). You might notice that the function itself is *not* |
|---|
| 127 |
pickled; instead, its code() method is called, which produces a |
|---|
| 128 |
string representation of the function (decompilation). This makes |
|---|
| 129 |
pickled Expressions much more stable across Python versions than, |
|---|
| 130 |
say, storing the function's co_code. However, this presents a problem |
|---|
| 131 |
when the Expression is unpickled: the function must be eval'ed and |
|---|
| 132 |
run through an EarlyBinder again. When this occurs (in __setstate__), |
|---|
| 133 |
some of the free variables which were present in func_globals at the |
|---|
| 134 |
time of pickling may not be present when the Expression is unpickled. |
|---|
| 135 |
For example, an Expression which is built in myapp.py may include |
|---|
| 136 |
a datetime.date object in its co_consts. When that Expression is |
|---|
| 137 |
unpickled, its function is eval'ed within this module, not within |
|---|
| 138 |
myapp.py; since this module does not import 'datetime', it will not |
|---|
| 139 |
be included in the func_globals of the reconstituted function, and |
|---|
| 140 |
codewalk.EarlyBinder will fail on LOAD_GLOBAL. |
|---|
| 141 |
|
|---|
| 142 |
Therefore, code which uses this module must determine which objects |
|---|
| 143 |
will be referenced as Expressions are unpickled. Any that are neither |
|---|
| 144 |
builtins nor in this module's globals() need to be injected into this |
|---|
| 145 |
module, so they can be referenced in eval() when the Expression is |
|---|
| 146 |
unpickled. |
|---|
| 147 |
|
|---|
| 148 |
""" |
|---|
| 149 |
|
|---|
| 150 |
from dejavu import codewalk |
|---|
| 151 |
from types import CodeType, FunctionType |
|---|
| 152 |
|
|---|
| 153 |
|
|---|
| 154 |
|
|---|
| 155 |
|
|---|
| 156 |
import datetime |
|---|
| 157 |
|
|---|
| 158 |
try: |
|---|
| 159 |
import fixedpoint |
|---|
| 160 |
from fixedpoint import FixedPoint |
|---|
| 161 |
except ImportError: |
|---|
| 162 |
pass |
|---|
| 163 |
|
|---|
| 164 |
try: |
|---|
| 165 |
import decimal |
|---|
| 166 |
from decimal import Decimal |
|---|
| 167 |
except ImportError: |
|---|
| 168 |
pass |
|---|
| 169 |
|
|---|
| 170 |
|
|---|
| 171 |
class Aggregator(codewalk.Rewriter): |
|---|
| 172 |
"""Combine two code objects into one.""" |
|---|
| 173 |
|
|---|
| 174 |
def __init__(self, obj): |
|---|
| 175 |
codewalk.Rewriter.__init__(self, obj) |
|---|
| 176 |
self.instr_index = [None] * len(self._bytecode) |
|---|
| 177 |
|
|---|
| 178 |
def and_combine(self, obj): |
|---|
| 179 |
obj = codewalk.Rewriter(obj) |
|---|
| 180 |
bytecode = map(ord, obj.co_code) |
|---|
| 181 |
newtarget = len(bytecode) |
|---|
| 182 |
|
|---|
| 183 |
self._bytecode.pop() |
|---|
| 184 |
self._bytecode.extend([111, newtarget & 0xFF, newtarget >> 8, |
|---|
| 185 |
1]) |
|---|
| 186 |
self._bytecode.extend(bytecode) |
|---|
| 187 |
self.instr_index[-1:] = [obj] * (newtarget + 4) |
|---|
| 188 |
|
|---|
| 189 |
def or_combine(self, obj): |
|---|
| 190 |
obj = codewalk.Rewriter(obj) |
|---|
| 191 |
bytecode = map(ord, obj.co_code) |
|---|
| 192 |
newtarget = len(bytecode) |
|---|
| 193 |
|
|---|
| 194 |
self._bytecode.pop() |
|---|
| 195 |
self._bytecode.extend([112, newtarget & 0xFF, newtarget >> 8, |
|---|
| 196 |
1]) |
|---|
| 197 |
self._bytecode.extend(bytecode) |
|---|
| 198 |
self.instr_index[-1:] = [obj] * (newtarget + 4) |
|---|
| 199 |
|
|---|
| 200 |
def visit_LOAD_ATTR(self, lo, hi): |
|---|
| 201 |
src = self.instr_index[self.cursor] |
|---|
| 202 |
if src: |
|---|
| 203 |
value = src.co_names[lo + (hi << 8)] |
|---|
| 204 |
newindex = self.name_index(value) |
|---|
| 205 |
self.newcode[-2:] = [newindex & 0xFF, newindex >> 8] |
|---|
| 206 |
|
|---|
| 207 |
def visit_LOAD_CONST(self, lo, hi): |
|---|
| 208 |
src = self.instr_index[self.cursor] |
|---|
| 209 |
if src: |
|---|
| 210 |
value = src.co_consts[lo + (hi << 8)] |
|---|
| 211 |
newindex = self.const_index(value) |
|---|
| 212 |
self.newcode[-2:] = [newindex & 0xFF, newindex >> 8] |
|---|
| 213 |
|
|---|
| 214 |
|
|---|
| 215 |
class Expression(object): |
|---|
| 216 |
"""A filter for objects.""" |
|---|
| 217 |
|
|---|
| 218 |
def __init__(self, func=None, kwtypes=None): |
|---|
| 219 |
"""Expression(func, [kwtypes]={}). func(obj, [**kw]) must return bool. |
|---|
| 220 |
|
|---|
| 221 |
func: a function, with one positional arg and optional keyword args, |
|---|
| 222 |
which must return bool. If func is None, it is initialized to |
|---|
| 223 |
lambda x: True |
|---|
| 224 |
kwtypes: a dictionary of {keyword: type} pairs. |
|---|
| 225 |
""" |
|---|
| 226 |
if func is None: |
|---|
| 227 |
func = lambda x: True |
|---|
| 228 |
self._load_func(func) |
|---|
| 229 |
if kwtypes is None: |
|---|
| 230 |
kwtypes = {} |
|---|
| 231 |
self.kwtypes = kwtypes |
|---|
| 232 |
self.kwargs = {} |
|---|
| 233 |
|
|---|
| 234 |
def _load_func(self, func): |
|---|
| 235 |
|
|---|
| 236 |
binder = codewalk.EarlyBinder(func, taintlist=['now', 'today', |
|---|
| 237 |
'iscurrentweek']) |
|---|
| 238 |
self.func = binder.function() |
|---|
| 239 |
|
|---|
| 240 |
def code(self): |
|---|
| 241 |
return codewalk.LambdaDecompiler(self.func).code() |
|---|
| 242 |
|
|---|
| 243 |
def __repr__(self): |
|---|
| 244 |
return 'logic.Expression(%s)' % self.code() |
|---|
| 245 |
|
|---|
| 246 |
def __and__(self, other): |
|---|
| 247 |
"""Logical-and this Expression with another.""" |
|---|
| 248 |
assert isinstance(other, Expression) |
|---|
| 249 |
ag = Aggregator(self.func) |
|---|
| 250 |
ag.and_combine(other.func) |
|---|
| 251 |
agfunc = ag.function() |
|---|
| 252 |
return Expression(agfunc) |
|---|
| 253 |
__add__ = __and__ |
|---|
| 254 |
|
|---|
| 255 |
def __or__(self, other): |
|---|
| 256 |
"""Logical-or this Expression with another.""" |
|---|
| 257 |
assert isinstance(other, Expression) |
|---|
| 258 |
ag = Aggregator(self.func) |
|---|
| 259 |
ag.or_combine(other.func) |
|---|
| 260 |
agfunc = ag.function() |
|---|
| 261 |
return Expression(agfunc) |
|---|
| 262 |
|
|---|
| 263 |
def bind_args(self, **kwargs): |
|---|
| 264 |
self.kwargs = {} |
|---|
| 265 |
self.kwargs.update(kwargs) |
|---|
| 266 |
|
|---|
| 267 |
def evaluate(self, obj): |
|---|
| 268 |
kw = self.kwargs |
|---|
| 269 |
return self.func(obj, **kw) |
|---|
| 270 |
|
|---|
| 271 |
def __getstate__(self): |
|---|
| 272 |
return (self.code(), self.kwtypes, self.kwargs) |
|---|
| 273 |
|
|---|
| 274 |
def __setstate__(self, state): |
|---|
| 275 |
if len(state) == 2: |
|---|
| 276 |
|
|---|
| 277 |
func, self.kwtypes = state |
|---|
| 278 |
self.kwargs = {} |
|---|
| 279 |
else: |
|---|
| 280 |
func, self.kwtypes, self.kwargs = state |
|---|
| 281 |
|
|---|
| 282 |
|
|---|
| 283 |
|
|---|
| 284 |
|
|---|
| 285 |
|
|---|
| 286 |
f = eval(func) |
|---|
| 287 |
self._load_func(f) |
|---|
| 288 |
|
|---|
| 289 |
|
|---|
| 290 |
def filter(**kwargs): |
|---|
| 291 |
"""Form an Expression from keyword arguments. |
|---|
| 292 |
|
|---|
| 293 |
Allows you to write: |
|---|
| 294 |
e = logic.filter(a=3, b=1) |
|---|
| 295 |
...instead of: |
|---|
| 296 |
e = logic.Expression(lambda x: x.a == 3 and x.b == 1) |
|---|
| 297 |
""" |
|---|
| 298 |
co, names, consts = [], ['x', ], [None, ] |
|---|
| 299 |
i = 0 |
|---|
| 300 |
for key, val in kwargs.iteritems(): |
|---|
| 301 |
i += 1 |
|---|
| 302 |
names.append(key) |
|---|
| 303 |
consts.append(val) |
|---|
| 304 |
co += [124, 0, 0, |
|---|
| 305 |
105, i, 0, |
|---|
| 306 |
100, i, 0, |
|---|
| 307 |
106, 2, 0, |
|---|
| 308 |
111, 0, 0, |
|---|
| 309 |
1, |
|---|
| 310 |
] |
|---|
| 311 |
if kwargs: |
|---|
| 312 |
|
|---|
| 313 |
del co[-4:] |
|---|
| 314 |
co.append(83) |
|---|
| 315 |
|
|---|
| 316 |
|
|---|
| 317 |
for op in range(len(co)): |
|---|
| 318 |
if co[op] == 111: |
|---|
| 319 |
co[op + 1] = (len(co) - 4) - op |
|---|
| 320 |
|
|---|
| 321 |
|
|---|
| 322 |
|
|---|
| 323 |
|
|---|
| 324 |
|
|---|
| 325 |
co = CodeType(1, 1, 2, 67, ''.join(map(chr, co)), |
|---|
| 326 |
tuple(consts), tuple(names), ('x', ), |
|---|
| 327 |
'', '<lambda>', 1, '') |
|---|
| 328 |
func = FunctionType(co, {}) |
|---|
| 329 |
return Expression(func) |
|---|
| 330 |
|
|---|
| 331 |
|
|---|
| 332 |
def comparison(attr, cmp_op, criteria): |
|---|
| 333 |
"""Form an Expression lambda x: x.attr cmp_op criteria. |
|---|
| 334 |
|
|---|
| 335 |
Allows you to write: |
|---|
| 336 |
e = logic.comparison('Size', cmp_op_index, 4) |
|---|
| 337 |
...instead of: |
|---|
| 338 |
e = logic.Expression(lambda x: x.Size <= 4) |
|---|
| 339 |
|
|---|
| 340 |
This allows one to pass dynamic, isolated arguments, without having |
|---|
| 341 |
to construct a lambda out of them first. |
|---|
| 342 |
""" |
|---|
| 343 |
|
|---|
| 344 |
|
|---|
| 345 |
|
|---|
| 346 |
if cmp_op < 0 or cmp_op > 11: |
|---|
| 347 |
raise ValueError("The cmp_op argument must be between 0 and 11") |
|---|
| 348 |
|
|---|
| 349 |
if not isinstance(attr, str): |
|---|
| 350 |
attr = str(attr) |
|---|
| 351 |
names = ('x', attr) |
|---|
| 352 |
|
|---|
| 353 |
consts = (None, criteria) |
|---|
| 354 |
co = [124, 0, 0, |
|---|
| 355 |
105, 1, 0, |
|---|
| 356 |
100, 1, 0, |
|---|
| 357 |
106, cmp_op, 0, |
|---|
| 358 |
83, |
|---|
| 359 |
] |
|---|
| 360 |
|
|---|
| 361 |
|
|---|
| 362 |
|
|---|
| 363 |
|
|---|
| 364 |
|
|---|
| 365 |
co = CodeType(1, 1, 2, 67, ''.join(map(chr, co)), |
|---|
| 366 |
consts, names, ('x',), '', '<lambda>', 1, '') |
|---|
| 367 |
func = FunctionType(co, {}) |
|---|
| 368 |
return Expression(func) |
|---|
| 369 |
|
|---|