Monday, 23 July 2007

How to use the between predicate

The between predicate is an invaluable tool in your troubleshooting arsenal but it can also be quite tricky to use. The format of the predicate is:
between(a,z,x,b)

a and z should be code units that represent the start and end of the between block.
x is the code between a and z. If x is an output, the results arrive one line of Soot analysis at a time, in the form of units. If x is an input, it should be a unit (or unit box) covering one or more lines of code.
b is a boolean that selects between must (true) and may (false) analysis. For example, code inside an if block between a and z may not execute on every path from a to z, so it is included in a may analysis but not in a must analysis.
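
To make the must/may distinction concrete, here is a rough Python sketch on a toy control-flow graph. It is not how the between predicate is implemented; the graph, the node names, and the between function below are all assumptions made up for illustration. It treats "may" as "on at least one path from a to z" and "must" as "on every path from a to z".

```python
from collections import deque


def reachable(graph, start, blocked=frozenset()):
    """Return every node reachable from start without entering a blocked node."""
    if start in blocked:
        return set()
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for succ in graph.get(node, ()):
            if succ not in seen and succ not in blocked:
                seen.add(succ)
                queue.append(succ)
    return seen


def reverse(graph):
    """Return the graph with every edge flipped."""
    rev = {}
    for node, succs in graph.items():
        rev.setdefault(node, [])
        for succ in succs:
            rev.setdefault(succ, []).append(node)
    return rev


def between(graph, a, z, must):
    """Hypothetical stand-in for between(a, z, X, b) on a toy graph."""
    # May: reachable from a and able to reach z, excluding the endpoints.
    may = (reachable(graph, a) & reachable(reverse(graph), z)) - {a, z}
    if not must:
        return may
    # Must: blocking the node cuts every path from a to z.
    return {n for n in may if z not in reachable(graph, a, blocked={n})}


# Toy method body: "log" sits inside an if block and can be skipped.
cfg = {
    "begin": ["check"],
    "check": ["log", "save"],
    "log": ["save"],
    "save": ["commit"],
    "commit": ["done"],
}

print(sorted(between(cfg, "begin", "commit", must=False)))  # ['check', 'log', 'save']
print(sorted(between(cfg, "begin", "commit", must=True)))   # ['check', 'save']
```

The log node shows up only in the may result because one path from begin to commit skips it.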

To use between to find the x variable, the most important thing is to ensure that you specify a and z as two non-equal code units. However, units are not necessarily the most instinctive way to specify a and z. For example, you may want to specify a or z in terms of a call. The wrong way to do this is:
between(call(<-,*,*), z, X, false)

This has a number of faults including:
  • There's nothing to stop the result of your call from being the same unit as z (an arbitrary unit), which could cause a null exception
  • Your input is likely to match multiple locations, which gives multiple results from between that may not be easy to tell apart
Between is very powerful, but you need to be strict about the input you give it. Here is an example of how to use between to find all the code that may lie between two method calls, one called begin, the other called commit.
getUnit(call(<-,p,*), y), getUnit(call(<-,q,*), z), methodMatches(p, "* begin(..)"), methodMatches(q, "* commit(..)"), between(y, z, X, false)

This works because call's middle parameter returns a method. The method's name can then be checked with methodMatches to ensure that it is the one we expect. Since the two match patterns are different, this also ensures the two results won't be equal. Then we get the unit from each result so that we are sure between receives a unit for both parameters. X is the result, which may be multiple lines of code that we can then test.
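
The same idea of pinning down both endpoints before asking what lies between them can be sketched in Python. This is only an illustration under simplifying assumptions: the code is treated as a straight-line list of calls, and the Unit class, calls_to, and between_pairs are hypothetical names, not part of the tool. Keying each result by its (start, end) pair is one way to keep multiple matches distinguishable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """Hypothetical stand-in for a code unit: its position and the method it calls."""
    index: int
    method: str


def calls_to(units, name):
    """Select the units that call the named method (a crude methodMatches)."""
    return [u for u in units if u.method == name]


def between_pairs(units, start_name, end_name):
    """Map each valid (start, end) index pair to the units strictly between them."""
    results = {}
    for start in calls_to(units, start_name):
        for end in calls_to(units, end_name):
            # Guard: the endpoints must be distinct and in order.
            if start == end or start.index >= end.index:
                continue
            results[(start.index, end.index)] = units[start.index + 1:end.index]
    return results


names = ["begin", "load", "update", "commit", "close"]
units = [Unit(i, name) for i, name in enumerate(names)]

found = between_pairs(units, "begin", "commit")
print({pair: [u.method for u in body] for pair, body in found.items()})
# {(0, 3): ['load', 'update']}
```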
