Sunday, May 1, 2011

C#: How to parse arbitrary strings into expression trees?

In a project that I'm working on I have to work with a rather weird data source. I can give it a "query" and it will return me a DataTable. But the query is not a traditional string. It's more like... a set of method calls that define the criteria that I want. Something along these lines:

var tbl = MySource.GetObject("TheTable");
tbl.AddFilterRow(new FilterRow("Column1", 123, FilterRow.Expression.Equals));
tbl.AddFilterRow(new FilterRow("Column2", 456, FilterRow.Expression.LessThan));
var result = tbl.GetDataTable();

In essence, it supports all the standard stuff (boolean operators, parentheses, a few functions, etc.) but the syntax for writing it is quite verbose and uncomfortable for everyday use.

I wanted to make a little parser that would parse a given expression (like "Column1 = 123 AND Column2 < 456") and convert it to the above function calls. Also, it would be nice if I could add parameters there, so I would be protected against injection attacks. The last little piece of sugar on the top would be if it could cache the parse results and reuse them when the same query is to be re-executed on another object.
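To make the idea concrete, here is a minimal sketch of the three pieces described above: parsing, parameters, and caching. It is an illustration under stated assumptions, not a finished design. The names FilterOp, Condition, FilterQuery and ValueOf are hypothetical stand-ins (the real FilterRow API is not shown here beyond the snippet above), and the sketch only understands integer literals, @name parameters, and conditions joined by AND.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the data source's FilterRow.Expression values.
public enum FilterOp { Equals, LessThan, GreaterThan }

// One parsed condition: either an integer literal or a named parameter.
public sealed class Condition
{
    public string Column;
    public FilterOp Op;
    public int Literal;        // used when Parameter is null
    public string Parameter;   // "max" for @max, or null for a literal
}

public static class FilterQuery
{
    // Parse results are cached by query text, so a repeated query is parsed once.
    private static readonly ConcurrentDictionary<string, IList<Condition>> Cache =
        new ConcurrentDictionary<string, IList<Condition>>();

    // column, operator, then either @parameter or an integer literal.
    private static readonly Regex Pattern =
        new Regex(@"^\s*(\w+)\s*(=|<|>)\s*(?:@(\w+)|([0-9]+))\s*$");

    public static IList<Condition> Parse(string query)
    {
        return Cache.GetOrAdd(query, ParseUncached);
    }

    private static IList<Condition> ParseUncached(string query)
    {
        var result = new List<Condition>();
        // Assumption: conditions are only ever combined with AND.
        foreach (var part in Regex.Split(query, @"\s+AND\s+", RegexOptions.IgnoreCase))
        {
            var m = Pattern.Match(part);
            if (!m.Success) throw new FormatException("Cannot parse condition: " + part);

            var c = new Condition { Column = m.Groups[1].Value };
            switch (m.Groups[2].Value)
            {
                case "=": c.Op = FilterOp.Equals; break;
                case "<": c.Op = FilterOp.LessThan; break;
                default: c.Op = FilterOp.GreaterThan; break;
            }
            if (m.Groups[3].Success) c.Parameter = m.Groups[3].Value;
            else c.Literal = int.Parse(m.Groups[4].Value);
            result.Add(c);
        }
        return result.AsReadOnly();
    }

    // Parameter values are looked up at execution time and never pass through the parser.
    public static object ValueOf(Condition c, IDictionary<string, object> args)
    {
        return c.Parameter == null ? (object)c.Literal : args[c.Parameter];
    }
}
```

With something like this, FilterQuery.Parse("Column1 = 123 AND Column2 < @max") would yield two conditions, and each one could then be turned into a tbl.AddFilterRow(...) call, using ValueOf to supply the value. Because parameter values bypass the parser entirely, a caller-supplied value cannot change the structure of the query, which is the injection protection asked for above.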

So I was wondering - are there any existing solutions that I could use for this, or will I have to roll my own expression parser? It's not too complicated, but if I can save myself two or three days of coding and a heapload of bugs to fix, it would be worth it.

From stackoverflow
  • Try out Irony. Though the documentation is lacking, the samples will get you up and running very quickly. Irony is a project for parsing code and building abstract syntax trees, but you might have to write a little logic to create a form that suits your needs. The DLR may be the complement for this, since it can dynamically generate / execute code from abstract syntax trees (it's used for IronPython and IronRuby). The two should make a good pair.

    Oh, and they're both first-class .NET solutions and open source.

    Vilx- : Looks huge. My need is tiny. :P
    OregonGhost : The Irony assembly is 171 KB in size (debug version). And you can compile it into your app, if you need to, since the source code is available. For your needs, it should be quite simple (i.e. not much code) to use it. I use it in an expression parser / evaluator project, and the code is just a few hundred lines, though my expression language is much more complex than what you described. The DLR, on the other hand, really is a bit larger, but it's not really necessary for you :)
  • Bison or JavaCC or the like will generate a parser from a grammar. You can then augment the nodes of the tree with your own code to transform the expression.

    OP comments: I really don't want to ship 3rd party executables with my soft. I want it to be compiled in my code.

    Both tools generate source code, which you link with.

    Vilx- : I really don't want to ship 3rd party executables with my soft. I want it to be compiled in my code.
  • Check also this link. Seems appropriate for your goal:

    Parsing Expression Grammar

  • I wrote a parser for exactly this usage and complexity level by hand. It took about 2 days. I'm glad I did it, but I wouldn't do it again. I'd use ANTLR or F#'s FsLex.
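For a sense of what a hand-written parser at this complexity level involves, below is a minimal recursive-descent sketch that builds a small expression tree with AND, OR and parentheses. It is not the parser from the answer above. The node classes (Node, Compare, Bool) and the ExprParser class are hypothetical names invented for the example, values are kept as raw token strings, and the tokenizer silently skips characters it does not recognise, which a real implementation would want to reject.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Tree nodes: a comparison leaf, or a boolean node with two children.
public abstract class Node { }

public sealed class Compare : Node
{
    public string Column, Op, Value;
    public override string ToString() { return Column + " " + Op + " " + Value; }
}

public sealed class Bool : Node
{
    public string Op;
    public Node Left, Right;
    public override string ToString() { return "(" + Left + " " + Op + " " + Right + ")"; }
}

public sealed class ExprParser
{
    private static readonly string[] Operators = { "=", "<", ">", "<=", ">=", "<>" };
    private readonly List<string> _tokens = new List<string>();
    private int _pos;

    private ExprParser(string text)
    {
        // Tokens: parentheses, comparison operators, @parameters, words and numbers.
        foreach (Match m in Regex.Matches(text, @"\(|\)|<=|>=|<>|=|<|>|@?\w+"))
            _tokens.Add(m.Value);
    }

    public static Node Parse(string text)
    {
        var p = new ExprParser(text);
        var node = p.ParseOr();
        if (p._pos != p._tokens.Count) throw new FormatException("Unexpected token: " + p.Peek());
        return node;
    }

    private string Peek() { return _pos < _tokens.Count ? _tokens[_pos] : null; }

    private string Next()
    {
        if (_pos >= _tokens.Count) throw new FormatException("Unexpected end of input");
        return _tokens[_pos++];
    }

    private bool IsKeyword(string word)
    {
        return string.Equals(Peek(), word, StringComparison.OrdinalIgnoreCase);
    }

    // or := and ("OR" and)*
    private Node ParseOr()
    {
        var left = ParseAnd();
        while (IsKeyword("OR")) { Next(); left = new Bool { Op = "OR", Left = left, Right = ParseAnd() }; }
        return left;
    }

    // and := primary ("AND" primary)*   (AND binds tighter than OR)
    private Node ParseAnd()
    {
        var left = ParsePrimary();
        while (IsKeyword("AND")) { Next(); left = new Bool { Op = "AND", Left = left, Right = ParsePrimary() }; }
        return left;
    }

    // primary := "(" or ")" | column operator value
    private Node ParsePrimary()
    {
        if (Peek() == "(")
        {
            Next();
            var inner = ParseOr();
            if (Next() != ")") throw new FormatException("Expected ')'");
            return inner;
        }
        var c = new Compare { Column = Next(), Op = Next(), Value = Next() };
        if (Array.IndexOf(Operators, c.Op) < 0) throw new FormatException("Expected operator, got: " + c.Op);
        return c;
    }
}
```

Calling ExprParser.Parse("Column1 = 123 AND (Column2 < 456 OR Column3 = @p)").ToString() gives "(Column1 = 123 AND (Column2 < 456 OR Column3 = @p))". One method per precedence level keeps the grammar readable, and walking the resulting tree is where the calls into the data source's filter API would go.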
