Forum > Database

Parsing expressions and convert them to SQL

(1/1)

dseligo:
I have application with multiple forms where database data is listed in DBGrids.
When I want to filter certain database data by some criteria, I usually put some controls on form (TEdit, TComboBox, TDateEdit, ...) where user can input or choose data to be filtered upon, make OnChange or similar events, change SQL queries according to entered data, ...
It works but it is labour intensive to add each new field to be filtered, hard to maintain and still rather limited for filtering.

I am working on a solution to use same code across my application: I provide fields which are allowed to be filtered for some DBGrid, let user input filtering data and create SQL code to be put below 'where' and 'having' clauses in SQL.

I've solved this and I would like to give users considerably more flexibilty to filter data.
Now I mainly let them type filter criteria with only a couple possibilites: filter exact word, date or amount (resulting SQL like: where somefield = 'John Doe'; or: where somedate = '2022-04-09'), filter words including some text (resulting SQL like: where somefield like '%John%') or some range (resulting SQL like: somedate between '2022-03-01' and '2022-03-31').

I would like to allow entering expressions to be filtered upon and then convert entered expression into SQL.

This is where I would like advice in which direction or how to proceed.

Expressions I had in mind would look like this (all entered data in SQL would be passed as parameters, but I didn't want to complicate it here):

Entered criteriaDescriptionResulting SQLJohn DoeField contains exact stringsomefield = 'John Doe'*John*field contains entered stringsomefield like '%John%'120Field containing exact numbersomefield = 120120..160Interval between 120 and 160somefield between 120 and 160<>100Field not equal to entered value (less than, greater that also should work)somefield <> 100<100&>20And operatersomefield < 100 and somefield > 20%John%|%Paul%|%Mary%Or operater(somefield like '%John%' or somefield like '%Paul%' or somefield like '%Mary%')100|(>30&<50)Use of parentheses((somefield = 100) or (somefield > 30 and somefield < 50))'a*'Single quote filters exact expression so special characters could also be searchedsomefield = 'a*'''Two single quotes enable search of empty expressionssomefield = ''12|1103|210321Date fields expansion: if one or two digits is entered it refers to current month, if four digits are entered it refers to current year(somedate = '2022-04-12' or somedate = '2022-03-11' or somedate = '2021-03-21')
So, what would be the way to parse this expressions so I can build SQL expressions the way I want to?

MarkMLl:
Looking at what you have in mind, I'd make two suggestions.

The first is that you don't want to consider e.g. * as an operator, but as a natural part of a string, i.e. /every/ string, after expressions have been parsed, is checked to see whether comparison is exact or with wildcards.

The second is that I think you'd make life easier for yourself if you had some explicit marker representing "the field under consideration". If you don't do that you will, sooner or later, find yourself painted into a corner when adding facilities but unable to change the defined format because somebody other than yourself is using it.

At that point you should be able to use a simple recursive descent or shunting yard approach.

From my own experience of having done this sort of thing multiple times, I'd suggest that you'd find it useful to scan strings using your own code (i.e. rather than a prebuilt lexer), since knowing the type of quote a string was actually using can be useful (e.g. allow both "" and '' quoting, but treat wildcards literally in a single-quoted string).

Also I'd caution that simple lexers and parsers can be fooled if they don't look far enough ahead particularly if they don't have a backtracking capability: that's one of the reasons that I think you need a "current field" marker.

I'm currently tinkering with a pattern-driven disassembler for captures from an elderly logic analyser, and because I was decided to write a simple pattern parser from scratch rather than reusing the Meta-2 implementation I've had on the go since the mid 80s I suddenly found myself with an ambiguity between ! (negation) and != (inequality) which I couldn't resolve because in the interest of simplicity I hadn't added support for backtracking.

MarkMLl

Thaddy:
Why would you try to parse from a visual control? that is the world upside down.
FPC comes as standard with an SQL parser, by the way. It is in the fcl-db package under SQL.

Plz never confuse a control with proper coding.

MarkMLl:
He isn't trying to parse SQL, he's trying to parse "query by example" instructions from users... which is an entirely reasonable thing to do.

MarkMLl

Zvoni:

--- Quote from: dseligo on April 09, 2022, 04:45:43 pm ---Entered criteriaDescriptionResulting SQLJohn DoeField contains exact stringsomefield = 'John Doe'*John*field contains entered stringsomefield like '%John%'120Field containing exact numbersomefield = 120120..160Interval between 120 and 160somefield between 120 and 160<>100Field not equal to entered value (less than, greater that also should work)somefield <> 100<100&>20And operatersomefield < 100 and somefield > 20%John%|%Paul%|%Mary%Or operater(somefield like '%John%' or somefield like '%Paul%' or somefield like '%Mary%')100|(>30&<50)Use of parentheses((somefield = 100) or (somefield > 30 and somefield < 50))'a*'Single quote filters exact expression so special characters could also be searchedsomefield = 'a*'''Two single quotes enable search of empty expressionssomefield = ''12|1103|210321Date fields expansion: if one or two digits is entered it refers to current month, if four digits are entered it refers to current year(somedate = '2022-04-12' or somedate = '2022-03-11' or somedate = '2021-03-21')
So, what would be the way to parse this expressions so I can build SQL expressions the way I want to?

--- End quote ---
Some rules i came up with for myself, since i had to do something along those lines:
1) Never ever let the User enter a Operator (in a TEdit).
2) Never ever let the User define an Interval in a single TEdit
3) If a User enters a Wildcard, strip it out. It's not his business to "wildcard" a Query
4) As a starting point: Look at the Custom Autofilter-Option in Excel
5) Expanding a Date....*sigh*.... i despise lazy Users. We have that feature in our ERP-System, and i don't know how many times i had to go hunting through the Database to find the "wrong" dates, just because the user entered the first two digits for the Day, completely missing it's the wrong month or year.
Use a masked TEdit defaulting to "today"

Navigation

[0] Message Index

Go to full version