Hi again,
So it's really not a domain-specific language at all, then? You can do that without a specific parser and without Stem: just feed the data subset into your favorite analysis tool. Stem and parsers by themselves are basically useless for analysis. Without an integrated method of performing semantic analysis (specific to the data), you end up with excessive implementation complexity and exponential processing time for even trivial tasks.
You can get the data there fast, sure, but parsing is inherently naive for all data types. The gold standard for efficiency is automata-based recognition. To my knowledge, none of the parsers do this, so none can be considered efficient. Even if they did, though, it wouldn't matter. Semantic analysis is hard. So even if the parsers were realized ideally, and you had your pick of the best, the processing would still end up being exponential. Just some food for thought.
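To make "automata-based recognition" a bit more concrete, here is a rough sketch of what I mean: a table-driven DFA that does one lookup per input character. The line format ("<keyword> <digits>") and all of the names (char_class, TRANSITIONS, recognize) are made up for illustration; none of this comes from Stem or from any real descriptor spec.

```python
# Hypothetical example: a table-driven DFA that accepts lines of the form
# "<keyword> <digits>", e.g. "bandwidth 1024". The format is invented for
# illustration and is not taken from any real descriptor specification.

def char_class(ch):
    """Map a character to one of the DFA's input classes."""
    if ch.isascii() and ch.isalpha():
        return "alpha"
    if ch.isascii() and ch.isdigit():
        return "digit"
    if ch == " ":
        return "space"
    return "other"

# States: 0 = start, 1 = in keyword, 2 = after the space, 3 = in number.
TRANSITIONS = {
    (0, "alpha"): 1,
    (1, "alpha"): 1,
    (1, "space"): 2,
    (2, "digit"): 3,
    (3, "digit"): 3,
}
ACCEPTING = {3}

def recognize(line):
    """Return True if the DFA accepts the line (one table lookup per character)."""
    state = 0
    for ch in line:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:  # no transition defined: reject immediately
            return False
    return state in ACCEPTING
```

The recognizer runs in time linear in the length of the line and never backtracks. Note that it only answers "is this line well formed?"; it says nothing about what the data means, which is the semantic analysis part.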
In any case, good luck, and I'll probably bring this up at some future measurement team meeting.
Regards,
--leeroy