author Mattias Engdegård <mattiase@acm.org> 2023-06-18 10:37:53 +0200
committer Mattias Engdegård <mattiase@acm.org> 2023-06-18 10:42:44 +0200
commit 8f62e7b85f69bb4026e9cf2971668b0d77077792
tree 5d7c37590d9f9a5a3183c4f56eab2be4eae93fd6
parent eacd75df4e475c3d2483c64f32e3edb3be5c7785
Describe primarily the Emacs s-exp dialect for treesit queries
* doc/lispref/parsing.texi (Pattern Matching, Multiple Languages): Writing tree-sitter queries as Emacs s-expressions is much more convenient than using the native query notation inside a string, so it makes sense to base the documentation on the former dialect (bug#64017).
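
A minimal sketch of the convenience argument, reusing the @code{type_declaration} pattern from the diff below. The variable names are hypothetical and chosen only for illustration; the snippet assumes Emacs 29 or later built with tree-sitter support. In the string dialect every string literal inside the query needs escaped quotes, while the s-expression dialect reads as ordinary quoted Lisp data.

```elisp
;; Hypothetical variable names; not part of the commit.
;; S-expression dialect: plain Lisp data, quantifier written as `:?'.
(defvar my-long-type-query-sexp
  '((type_declaration "long" :?) @long-type)
  "Example query in the Emacs s-expression dialect.")

;; Native string dialect: the same query, with inner quotes escaped
;; and the quantifier written as `?'.
(defvar my-long-type-query-string
  "(type_declaration \"long\"?) @long-type"
  "The same example query in tree-sitter's native notation.")

;; `treesit-query-expand' turns an s-expression query into its string
;; form, which is handy for comparing the two dialects.  The guard
;; keeps this harmless on builds without tree-sitter.
(when (and (fboundp 'treesit-available-p) (treesit-available-p))
  (message "%s" (treesit-query-expand my-long-type-query-sexp)))
```

Either variable could be passed as the @var{query} argument of @code{treesit-query-capture}; the diff only changes which form the manual presents first.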
-rw-r--r-- doc/lispref/parsing.texi 132
1 files changed, 66 insertions, 66 deletions
diff --git a/doc/lispref/parsing.texi b/doc/lispref/parsing.texi
index 3906ca0118a..9e1df07d25c 100644
--- a/doc/lispref/parsing.texi
+++ b/doc/lispref/parsing.texi
@@ -1084,9 +1084,9 @@ Now we can introduce the @dfn{query functions}.
@defun treesit-query-capture node query &optional beg end node-only
This function matches patterns in @var{query} within @var{node}. The
-argument @var{query} can be either a string, an s-expression, or a
-compiled query object. For now, we focus on the string syntax;
-s-expression syntax and compiled queries are described at the end of
+argument @var{query} can be either an s-expression, a string, or a
+compiled query object. For now, we focus on the s-expression syntax;
+string syntax and compiled queries are described at the end of
the section.
The argument @var{node} can also be a parser or a language symbol. A
@@ -1118,8 +1118,8 @@ For example, suppose @var{node}'s text is @code{1 + 2}, and
@example
@group
(setq query
- "(binary_expression
- (number_literal) @@number-in-exp) @@biexp")
+ '((binary_expression
+ (number_literal) @@number-in-exp) @@biexp)
@end group
@end example
@@ -1140,8 +1140,8 @@ For example, it could have two top-level patterns:
@example
@group
(setq query
- "(binary_expression) @@biexp
- (number_literal) @@number @@biexp")
+ '((binary_expression) @@biexp
+ (number_literal) @@number @@biexp)
@end group
@end example
@@ -1199,23 +1199,23 @@ field, say, a @code{function_definition} without a @code{body} field:
@subheading Quantify node
@cindex quantify node, tree-sitter
-Tree-sitter recognizes quantification operators @samp{*}, @samp{+},
-and @samp{?}. Their meanings are the same as in regular expressions:
-@samp{*} matches the preceding pattern zero or more times, @samp{+}
-matches one or more times, and @samp{?} matches zero or one times.
+Tree-sitter recognizes quantification operators @samp{:*}, @samp{:+},
+and @samp{:?}. Their meanings are the same as in regular expressions:
+@samp{:*} matches the preceding pattern zero or more times, @samp{:+}
+matches one or more times, and @samp{:?} matches zero or one times.
For example, the following pattern matches @code{type_declaration}
nodes that have @emph{zero or more} @code{long} keywords.
@example
-(type_declaration "long"*) @@long-type
+(type_declaration "long" :*) @@long-type
@end example
The following pattern matches a type declaration that may or may not
have a @code{long} keyword:
@example
-(type_declaration "long"?) @@long-type
+(type_declaration "long" :?) @@long-type
@end example
@subheading Grouping
@@ -1225,15 +1225,14 @@ groups and apply quantification operators to them. For example, to
express a comma-separated list of identifiers, one could write
@example
-(identifier) ("," (identifier))*
+(identifier) ("," (identifier)) :*
@end example
@subheading Alternation
Again, similar to regular expressions, we can express ``match any one
-of these patterns'' in a pattern. The syntax is a list of patterns
-enclosed in square brackets. For example, to capture some keywords in
-C, the pattern would be
+of these patterns'' in a pattern. The syntax is a vector of patterns.
+For example, to capture some keywords in C, the pattern would be
@example
@group
@@ -1248,7 +1247,7 @@ C, the pattern would be
@subheading Anchor
-The anchor operator @samp{.} can be used to enforce juxtaposition,
+The anchor operator @code{:anchor} can be used to enforce juxtaposition,
i.e., to enforce two things to be directly next to each other. The
two ``things'' can be two nodes, or a child and the end of its parent.
For example, to capture the first child, the last child, or two
@@ -1257,19 +1256,19 @@ adjacent children:
@example
@group
;; Anchor the child with the end of its parent.
-(compound_expression (_) @@last-child .)
+(compound_expression (_) @@last-child :anchor)
@end group
@group
;; Anchor the child with the beginning of its parent.
-(compound_expression . (_) @@first-child)
+(compound_expression :anchor (_) @@first-child)
@end group
@group
;; Anchor two adjacent children.
(compound_expression
(_) @@prev-child
- .
+ :anchor
(_) @@next-child)
@end group
@end example
@@ -1285,8 +1284,8 @@ example, with the following pattern:
@example
@group
(
- (array . (_) @@first (_) @@last .)
- (#equal @@first @@last)
+ (array :anchor (_) @@first (_) @@last :anchor)
+ (:equal @@first @@last)
)
@end group
@end example
@@ -1294,22 +1293,22 @@ example, with the following pattern:
@noindent
tree-sitter only matches arrays where the first element is equal to
the last element. To attach a predicate to a pattern, we need to
-group them together. A predicate always starts with a @samp{#}.
-Currently there are three predicates: @code{#equal}, @code{#match},
-and @code{#pred}.
+group them together. Currently there are three predicates:
+@code{:equal}, @code{:match}, and @code{:pred}.
-@deffn Predicate equal arg1 arg2
+@deffn Predicate :equal arg1 arg2
Matches if @var{arg1} is equal to @var{arg2}. Arguments can be either
strings or capture names. Capture names represent the text that the
captured node spans in the buffer.
@end deffn
-@deffn Predicate match regexp capture-name
+@deffn Predicate :match regexp capture-name
Matches if the text that @var{capture-name}'s node spans in the buffer
-matches regular expression @var{regexp}. Matching is case-sensitive.
+matches regular expression @var{regexp}, given as a string literal.
+Matching is case-sensitive.
@end deffn
-@deffn Predicate pred fn &rest nodes
+@deffn Predicate :pred fn &rest nodes
Matches if function @var{fn} returns non-@code{nil} when passed each
node in @var{nodes} as arguments.
@end deffn
@@ -1318,23 +1317,23 @@ Note that a predicate can only refer to capture names that appear in
the same pattern. Indeed, it makes little sense to refer to capture
names in other patterns.
-@heading S-expression patterns
+@heading String patterns
-@cindex tree-sitter patterns as sexps
-@cindex patterns, tree-sitter, in sexp form
-Besides strings, Emacs provides an s-expression based syntax for
-tree-sitter patterns. It largely resembles the string-based syntax.
-For example, the following query
+@cindex tree-sitter patterns as strings
+@cindex patterns, tree-sitter, in string form
+Besides s-expressions, Emacs allows the tree-sitter's native query
+syntax to be used by writing them as strings. It largely resembles
+the s-expression syntax. For example, the following query
@example
@group
(treesit-query-capture
- node "(addition_expression
- left: (_) @@left
- \"+\" @@plus-sign
- right: (_) @@right) @@addition
+ node '((addition_expression
+ left: (_) @@left
+ "+" @@plus-sign
+ right: (_) @@right) @@addition
- [\"return\" \"break\"] @@keyword")
+ ["return" "break"] @@keyword))
@end group
@end example
@@ -1344,52 +1343,53 @@ is equivalent to
@example
@group
(treesit-query-capture
- node '((addition_expression
- left: (_) @@left
- "+" @@plus-sign
- right: (_) @@right) @@addition
+ node "(addition_expression
+ left: (_) @@left
+ \"+\" @@plus-sign
+ right: (_) @@right) @@addition
- ["return" "break"] @@keyword))
+ [\"return\" \"break\"] @@keyword")
@end group
@end example
-Most patterns can be written directly as strange but nevertheless
-valid s-expressions. Only a few of them need modification:
+Most patterns can be written directly as s-expressions inside a string.
+Only a few of them need modification:
@itemize
@item
-Anchor @samp{.} is written as @code{:anchor}.
+Anchor @code{:anchor} is written as @samp{.}.
@item
-@samp{?} is written as @samp{:?}.
+@samp{:?} is written as @samp{?}.
@item
-@samp{*} is written as @samp{:*}.
+@samp{:*} is written as @samp{*}.
@item
-@samp{+} is written as @samp{:+}.
+@samp{:+} is written as @samp{+}.
@item
-@code{#equal} is written as @code{:equal}. In general, predicates
-change their @samp{#} to @samp{:}.
+@code{:equal}, @code{:match} and @code{:pred} are written as
+@code{#equal}, @code{#match} and @code{#pred}, respectively.
+In general, predicates change their @samp{:} to @samp{#}.
@end itemize
For example,
@example
@group
-"(
- (compound_expression . (_) @@first (_)* @@rest)
- (#match \"love\" @@first)
- )"
+'((
+ (compound_expression :anchor (_) @@first (_) :* @@rest)
+ (:match "love" @@first)
+ ))
@end group
@end example
@noindent
-is written in s-expression syntax as
+is written in string form as
@example
@group
-'((
- (compound_expression :anchor (_) @@first (_) :* @@rest)
- (:match "love" @@first)
- ))
+"(
+ (compound_expression . (_) @@first (_)* @@rest)
+ (#match \"love\" @@first)
+ )"
@end group
@end example
@@ -1413,7 +1413,7 @@ validate and debug the query.
@end defun
@defun treesit-query-language query
-This function return the language of @var{query}.
+This function returns the language of @var{query}.
@end defun
@defun treesit-query-expand query
@@ -1605,7 +1605,7 @@ ranges for @acronym{CSS} and JavaScript parsers:
(setq css-range
(treesit-query-range
'html
- "(style_element (raw_text) @@capture)"))
+ '((style_element (raw_text) @@capture))))
(treesit-parser-set-included-ranges css css-range)
@end group
@@ -1614,7 +1614,7 @@ ranges for @acronym{CSS} and JavaScript parsers:
(setq js-range
(treesit-query-range
'html
- "(script_element (raw_text) @@capture)"))
+ '((script_element (raw_text) @@capture))))
(treesit-parser-set-included-ranges js js-range)
@end group
@end example