Parsing SQL
Parsing is done by the \sad_spirit\pg_builder\Parser class, backed by \sad_spirit\pg_builder\Lexer.
The latter splits the SQL string into tokens; the former goes over the tokens, building the Abstract Syntax Tree.
This section describes usage of these classes. An additional section describes implementation details that may be of interest to those trying to extend pg_builder in some way.
Parser API
Tip
It is generally not necessary to manually call methods that parse SQL statements or their parts: use
StatementFactory::createFromString() to parse complete statements
or one of its builder methods to create Statement instances from scratch.
Statement instances created by StatementFactory will automatically accept strings for their properties
and call relevant Parser methods.
It may be necessary to configure the Parser instance, however. The Parser constructor accepts
an instance of Lexer (see below for its configuration options) and an optional instance
of a class implementing CacheItemPoolInterface from PSR-6.
parseSomething() methods
All public Parser methods that have a parse prefix and process (parts of) SQL statements are
actually overloaded via the __call() magic method. It contains code for getting / setting the cache if available,
tokenizing strings with Lexer, and forwarding the call to a protected method that does the actual parsing work.
Several dozen such methods are defined, e.g.
parseStatement(string|TokenStream $input): Statement
    Parses a complete SQL statement. Used internally by StatementFactory::createFromString().
parseTypeName(string|TokenStream $input): nodes\TypeName
    Parses a type name. Used by converters\BuilderSupportDecorator so that it can handle any type name Postgres itself can.
Other parse*() methods are used by Node implementations that accept strings for their properties or array
offsets.
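The dispatch through __call() described above can be pictured with a small self-contained sketch. Everything in it is a hypothetical stand-in rather than pg_builder code: the SketchParser class, its doColumnList() method, the whitespace tokenizer, and the cache key scheme are assumptions made for illustration. The real Parser relies on Lexer and an optional PSR-6 pool, and builds Node objects instead of arrays.

```php
<?php

// Hypothetical stand-in class, NOT the real pg_builder Parser: it only
// illustrates routing parse*() calls through the __call() magic method.
final class SketchParser
{
    /** @var array<string, mixed> naive in-memory cache (the real Parser uses a PSR-6 pool) */
    private array $cache = [];

    /** Number of times actual parsing work was done, kept for demonstration */
    public int $parseCount = 0;

    public function __call(string $name, array $arguments)
    {
        // Accept only names like parseColumnList() that map to a protected doColumnList()
        if (!preg_match('/^parse([A-Z]\w*)$/', $name, $m) || !method_exists($this, 'do' . $m[1])) {
            throw new BadMethodCallException("Undefined method {$name}()");
        }
        $input = (string)($arguments[0] ?? '');

        // Key scheme assumed for this sketch; the real key format may differ
        $key = 'parsetree-' . strtolower($m[1]) . '-' . md5($input);
        if (array_key_exists($key, $this->cache)) {
            return $this->cache[$key];
        }

        // Tokenize, then forward to the protected method doing the parsing work
        $tokens = $this->tokenize($input);
        $result = $this->{'do' . $m[1]}($tokens);
        $this->parseCount++;

        return $this->cache[$key] = $result;
    }

    /** Toy tokenizer splitting on whitespace, a stand-in for Lexer::tokenize() */
    private function tokenize(string $sql): array
    {
        return preg_split('/\s+/', trim($sql), -1, PREG_SPLIT_NO_EMPTY);
    }

    /** Toy "parser" returning a nested array instead of AST nodes */
    protected function doColumnList(array $tokens): array
    {
        return [
            'type'  => 'column_list',
            'items' => array_map(fn (string $t): string => rtrim($t, ','), $tokens),
        ];
    }
}

$parser = new SketchParser();
$first  = $parser->parseColumnList('foo, bar');
$second = $parser->parseColumnList('foo, bar'); // second call is answered from the cache
```

In this sketch the second call never reaches doColumnList(), which is the effect caching is meant to have on repeated parsing of the same string.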
Caching of ASTs
Parser can automatically cache ASTs generated by its parseSomething() methods.
You only need to provide an instance of a class implementing CacheItemPoolInterface from
PSR-6 either to Parser constructor or
to its setCache(CacheItemPoolInterface $cache): void method.
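use sad_spirit\pg_builder\{Lexer, Parser};
// CacheImplementation and AnotherCacheImplementation stand for any PSR-6 pool classes
$parser = new Parser(new Lexer(), new CacheImplementation());
$parser->setCache(new AnotherCacheImplementation());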
ASTs will be stored in the cache under keys that have the parsetree- prefix.
Tip
Unserializing an AST is at least 4-5 times faster than creating it from SQL. Use a cache if possible.
Lexer API
The class has only one public method:
tokenize(string $sql): \sad_spirit\pg_builder\TokenStream
    Tokenizes the input string. Usually you don’t need to call it yourself as it is automatically called by Parser when a string is passed to any of its parse*() methods.
You may need to set options via Lexer’s constructor, however:
'standard_conforming_strings'
    Has the same meaning as the postgresql.conf parameter of the same name: when true (default), backslashes in '...' strings are treated literally; when false, they are treated as escape characters. Backslashes in e'...' strings are always treated as escape characters, of course.
use sad_spirit\pg_builder\Lexer;
$strings = <<<TEST
'foo\\\\bar' e'foo\\\\bar'
TEST;
$lexerStandard = new Lexer([
'standard_conforming_strings' => true
]);
$lexerNonStandard = new Lexer([
'standard_conforming_strings' => false
]);
echo $lexerStandard->tokenize($strings)
. "\n\n"
. $lexerNonStandard->tokenize($strings);
will output
string literal 'foo\\bar' at position 0
string literal 'foo\bar' at position 11
end of input
string literal 'foo\bar' at position 0
string literal 'foo\bar' at position 11
end of input
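The rule behind this output can be restated as a tiny self-contained sketch. The literalValue() function is a hypothetical helper, not part of pg_builder. It models only the decision of whether backslashes act as escapes, and it uses PHP's stripcslashes() as a rough approximation of Postgres escape processing. Doubled quotes and Unicode escapes are deliberately left out.

```php
<?php

/**
 * Hypothetical helper, not part of pg_builder: returns the value of a string
 * literal body (the text between the quotes) under the given settings.
 * Only backslash handling is modelled here.
 */
function literalValue(string $body, bool $escapeString, bool $standardConformingStrings = true): string
{
    // e'...' strings always process escapes; plain '...' strings do so
    // only when standard_conforming_strings is off
    $backslashIsEscape = $escapeString || !$standardConformingStrings;

    // stripcslashes() only approximates the Postgres escape rules
    return $backslashIsEscape ? stripcslashes($body) : $body;
}

$body = 'foo\\\\bar'; // PHP single quotes: the value contains two backslashes

$plainStandard    = literalValue($body, false, true);  // backslashes kept as-is
$plainNonStandard = literalValue($body, false, false); // backslashes treated as escapes
$escapeStandard   = literalValue($body, true, true);   // e'...' always treats them as escapes
```

Only the plain literal under standard_conforming_strings = true keeps both backslashes. The other two cases collapse them into one, mirroring the Lexer output shown above.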