commit python-lark for openSUSE:Factory
Script 'mail_helper' called by obssrc

Hello community,

here is the log from the commit of package python-lark for openSUSE:Factory
checked in at 2024-10-04 17:08:27
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Comparing /work/SRC/openSUSE:Factory/python-lark (Old)
 and      /work/SRC/openSUSE:Factory/.python-lark.new.19354 (New)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Package is "python-lark"

Fri Oct 4 17:08:27 2024 rev:9 rq:1205380 version:1.2.2

Changes:
--------
--- /work/SRC/openSUSE:Factory/python-lark/python-lark.changes 2024-01-15 22:11:00.316794908 +0100
+++ /work/SRC/openSUSE:Factory/.python-lark.new.19354/python-lark.changes 2024-10-04 17:08:33.983879725 +0200
@@ -1,0 +2,25 @@
+Thu Oct 3 08:30:59 UTC 2024 - Dirk Müller <dmueller@suse.com>
+
+- update to 1.2.2:
+  * Bugfix: Earley now respects ambiguity='resolve' again.
+- update to 1.2.1:
+  * Dropped support for Python versions lower than 3.8
+  * Several bugfixes in the Earley algorithm, related to suppressed ambiguities
+  * Improved performance in `InteractiveParser.accepts()`
+  * Give "Shaping the tree" clear sub-headings
+  * Fix for when providing a transformer with a Token
+  * Pin types-regex to a working version
+  * Add Outlines to list of projects using Lark
+  * Code coverage: Update Python version
+  * Attempt to solve performance problems in accepts()
+  * Docs: Added Indenter
+  * Clean up test_parser.py, use xFail instead of skip where appropriate
+  * Update config and drop python < 3.8
+  * BUGFIX Earley: Now yielding a previously repressed ambiguity
+  * Fix SymbolNode.end for completed tokens
+  * Disable ForestToParseTree cache when ambiguity='resolve'
+  * Bugfix for issue #1434
+
+-------------------------------------------------------------------

Old:
----
  lark-1.1.9.tar.gz

New:
----
  lark-1.2.2.tar.gz

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Other differences:
------------------
++++++ python-lark.spec ++++++
--- /var/tmp/diff_new_pack.OmrZCt/_old  2024-10-04 17:08:34.483900614 +0200
+++ /var/tmp/diff_new_pack.OmrZCt/_new  2024-10-04 17:08:34.483900614 +0200
@@ -18,7 +18,7 @@
 %{?sle15_python_module_pythons}
 Name:           python-lark
-Version:        1.1.9
+Version:        1.2.2
 Release:        0
 Summary:        A parsing library for Python
 License:        MIT

++++++ lark-1.1.9.tar.gz -> lark-1.2.2.tar.gz ++++++
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/.github/workflows/codecov.yml new/lark-1.2.2/.github/workflows/codecov.yml
--- old/lark-1.1.9/.github/workflows/codecov.yml        2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/.github/workflows/codecov.yml        2024-08-13 21:47:06.000000000 +0200
@@ -8,7 +8,7 @@
         os: [ubuntu-latest, macos-latest, windows-latest]
     env:
       OS: ${{ matrix.os }}
-      PYTHON: '3.7'
+      PYTHON: '3.8'
     steps:
     - uses: actions/checkout@v3
       name: Download with submodules
@@ -17,7 +17,7 @@
     - name: Setup Python
      uses: actions/setup-python@v3
      with:
-       python-version: "3.7"
+       python-version: "3.8"
    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
@@ -35,6 +35,6 @@
        flags: unittests
        env_vars: OS,PYTHON
        name: codecov-umbrella
-       fail_ci_if_error: true
+       fail_ci_if_error: false
        path_to_write_report: ./coverage/codecov_report.txt
        verbose: true
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/.github/workflows/tests.yml new/lark-1.2.2/.github/workflows/tests.yml
--- old/lark-1.1.9/.github/workflows/tests.yml  2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/.github/workflows/tests.yml  2024-08-13 21:47:06.000000000 +0200
@@ -3,11 +3,10 @@
 jobs:
   build:
-    # runs-on: ubuntu-latest
-    runs-on: ubuntu-20.04 # See https://github.com/actions/setup-python/issues/544
+    runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ["3.6", "3.7", "3.8", "3.9", "3.10", "3.11", "3.12", "pypy-3.7"]
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13-dev", "pypy-3.10"]

     steps:
     - uses: actions/checkout@v3
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/README.md new/lark-1.2.2/README.md
--- old/lark-1.1.9/README.md    2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/README.md    2024-08-13 21:47:06.000000000 +0200
@@ -125,7 +125,7 @@
 Check out the [JSON tutorial](/docs/json_tutorial.md#conclusion) for more details on how the comparison was made.

-For a more thorough and objective comparison, checkout the [Python Parsing Benchmarks](https://github.com/goodmami/python-parsing-benchmarks) repo.
+For thorough 3rd-party benchmarks, checkout the [Python Parsing Benchmarks](https://github.com/goodmami/python-parsing-benchmarks) repo.

 #### Feature comparison
@@ -164,6 +164,7 @@
 - [harmalysis](https://github.com/napulen/harmalysis) - A language for harmonic analysis and music theory
 - [gersemi](https://github.com/BlankSpruce/gersemi) - A CMake code formatter
 - [MistQL](https://github.com/evinism/mistql) - A query language for JSON-like structures
+ - [Outlines](https://github.com/outlines-dev/outlines) - Structured generation with Large Language Models

 [Full list](https://github.com/lark-parser/lark/network/dependents?package_id=UGFja2FnZS...)
@@ -179,8 +180,8 @@
 Big thanks to everyone who contributed so far:

-<a href="https://github.com/lark-parser/lark/graphs/contributors">
 <img src="https://contributors-img.web.app/image?repo=lark-parser/lark" />
+<a href="https://github.com/lark-parser/lark/graphs/contributors">
 </a>

 ## Sponsor
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/docs/classes.rst new/lark-1.2.2/docs/classes.rst
--- old/lark-1.1.9/docs/classes.rst     2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/docs/classes.rst     2024-08-13 21:47:06.000000000 +0200
@@ -90,3 +90,9 @@
 .. autofunction:: lark.ast_utils.create_transformer

 .. _/examples/advanced/create_ast.py: examples/advanced/create_ast.html
+
+Indenter
+--------
+
+.. autoclass:: lark.indenter.Indenter
+.. autoclass:: lark.indenter.PythonIndenter
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/docs/parsers.md new/lark-1.2.2/docs/parsers.md
--- old/lark-1.1.9/docs/parsers.md      2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/docs/parsers.md      2024-08-13 21:47:06.000000000 +0200
@@ -23,7 +23,7 @@
 2) Users may choose to receive the set of all possible parse-trees (using ambiguity='explicit'), and choose the best derivation themselves. While simple and flexible, it comes at the cost of space and performance, and so it isn't recommended for highly ambiguous grammars, or very long inputs.

-3) As an advanced feature, users may use specialized visitors to iterate the SPPF themselves.
+3) As an advanced feature, users may use specialized visitors to iterate the SPPF themselves. There is also [a 3rd party utility for iterating over the SPPF](https://github.com/chanicpanic/lark-ambig-tools).
 **lexer="dynamic_complete"**
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/docs/tree_construction.md new/lark-1.2.2/docs/tree_construction.md
--- old/lark-1.1.9/docs/tree_construction.md    2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/docs/tree_construction.md    2024-08-13 21:47:06.000000000 +0200
@@ -78,8 +78,9 @@
 Users can alter the automatic construction of the tree using a collection of grammar features.

+### Inlining rules with `_`

-* Rules whose name begins with an underscore will be inlined into their containing rule.
+Rules whose name begins with an underscore will be inlined into their containing rule.

 **Example:**
@@ -94,8 +95,9 @@
     "hello"
     "world"

+### Conditionally inlining rules with `?`

-* Rules that receive a question mark (?) at the beginning of their definition, will be inlined if they have a single child, after filtering.
+Rules that receive a question mark (?) at the beginning of their definition, will be inlined if they have a single child, after filtering.

 **Example:**
@@ -113,7 +115,9 @@
     "world"
     "planet"

-* Rules that begin with an exclamation mark will keep all their terminals (they won't get filtered).
+### Pinning rule terminals with `!`
+
+Rules that begin with an exclamation mark will keep all their terminals (they won't get filtered).

 ```perl
 !expr: "(" expr ")"
@@ -136,7 +140,9 @@
 Using the `!` prefix is usually a "code smell", and may point to a flaw in your grammar design.

-* Aliases - options in a rule can receive an alias. It will be then used as the branch name for the option, instead of the rule name.
+### Aliasing rules
+
+Aliases - options in a rule can receive an alias. It will be then used as the branch name for the option, instead of the rule name.

 **Example:**
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/__init__.py new/lark-1.2.2/lark/__init__.py
--- old/lark-1.1.9/lark/__init__.py     2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/__init__.py     2024-08-13 21:47:06.000000000 +0200
@@ -14,7 +14,7 @@
 from .utils import logger
 from .visitors import Discard, Transformer, Transformer_NonRecursive, Visitor, v_args

-__version__: str = "1.1.9"
+__version__: str = "1.2.2"

 __all__ = (
     "GrammarError",
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/common.py new/lark-1.2.2/lark/common.py
--- old/lark-1.1.9/lark/common.py       2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/common.py       2024-08-13 21:47:06.000000000 +0200
@@ -8,10 +8,7 @@
     from .lexer import Lexer
     from .grammar import Rule
     from typing import Union, Type
-    if sys.version_info >= (3, 8):
-        from typing import Literal
-    else:
-        from typing_extensions import Literal
+    from typing import Literal
     if sys.version_info >= (3, 10):
         from typing import TypeAlias
     else:
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/indenter.py new/lark-1.2.2/lark/indenter.py
--- old/lark-1.1.9/lark/indenter.py     2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/indenter.py     2024-08-13 21:47:06.000000000 +0200
@@ -13,6 +13,19 @@
     pass

 class Indenter(PostLex, ABC):
+    """This is a postlexer that "injects" indent/dedent tokens based on indentation.
+
+    It keeps track of the current indentation, as well as the current level of parentheses.
+    Inside parentheses, the indentation is ignored, and no indent/dedent tokens get generated.
+
+    Note: This is an abstract class. To use it, inherit and implement all its abstract methods:
+        - tab_len
+        - NL_type
+        - OPEN_PAREN_types, CLOSE_PAREN_types
+        - INDENT_type, DEDENT_type
+
+    See also: the ``postlex`` option in `Lark`.
+    """
     paren_level: int
     indent_level: List[int]
@@ -73,35 +86,53 @@
     @property
     @abstractmethod
     def NL_type(self) -> str:
+        "The name of the newline token"
         raise NotImplementedError()

     @property
     @abstractmethod
     def OPEN_PAREN_types(self) -> List[str]:
+        "The names of the tokens that open a parenthesis"
         raise NotImplementedError()

     @property
     @abstractmethod
     def CLOSE_PAREN_types(self) -> List[str]:
+        """The names of the tokens that close a parenthesis
+        """
         raise NotImplementedError()

     @property
     @abstractmethod
     def INDENT_type(self) -> str:
+        """The name of the token that starts an indentation in the grammar.
+
+        See also: %declare
+        """
         raise NotImplementedError()

     @property
     @abstractmethod
     def DEDENT_type(self) -> str:
+        """The name of the token that end an indentation in the grammar.
+
+        See also: %declare
+        """
         raise NotImplementedError()

     @property
     @abstractmethod
     def tab_len(self) -> int:
+        """How many spaces does a tab equal"""
         raise NotImplementedError()

 class PythonIndenter(Indenter):
+    """A postlexer that "injects" _INDENT/_DEDENT tokens based on indentation, according to the Python syntax.
+
+    See also: the ``postlex`` option in `Lark`.
+    """
+
     NL_type = '_NEWLINE'
     OPEN_PAREN_types = ['LPAR', 'LSQB', 'LBRACE']
     CLOSE_PAREN_types = ['RPAR', 'RSQB', 'RBRACE']
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/lark.py new/lark-1.2.2/lark/lark.py
--- old/lark-1.1.9/lark/lark.py 2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/lark.py 2024-08-13 21:47:06.000000000 +0200
@@ -12,14 +12,11 @@
     from .parsers.lalr_interactive_parser import InteractiveParser
     from .tree import ParseTree
     from .visitors import Transformer
-    if sys.version_info >= (3, 8):
-        from typing import Literal
-    else:
-        from typing_extensions import Literal
+    from typing import Literal
     from .parser_frontends import ParsingFrontend

 from .exceptions import ConfigurationError, assert_config, UnexpectedInput
-from .utils import Serialize, SerializeMemoizer, FS, isascii, logger
+from .utils import Serialize, SerializeMemoizer, FS, logger
 from .load_grammar import load_grammar, FromPackageLoader, Grammar, verify_used_files, PackageResource, sha256_digest
 from .tree import Tree
 from .common import LexerConf, ParserConf, _ParserArgType, _LexerArgType
@@ -303,7 +300,7 @@
         if isinstance(grammar, str):
             self.source_grammar = grammar
             if self.options.use_bytes:
-                if not isascii(grammar):
+                if not grammar.isascii():
                     raise ConfigurationError("Grammar must be ascii only, when use_bytes=True")

         if self.options.cache:
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/parsers/earley.py new/lark-1.2.2/lark/parsers/earley.py
--- old/lark-1.1.9/lark/parsers/earley.py       2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/parsers/earley.py       2024-08-13 21:47:06.000000000 +0200
@@ -15,7 +15,7 @@
 from ..lexer import Token
 from ..tree import Tree
 from ..exceptions import UnexpectedEOF, UnexpectedToken
-from ..utils import logger, OrderedSet
+from ..utils import logger, OrderedSet, dedup_list
 from .grammar_analysis import GrammarAnalyzer
 from ..grammar import NonTerminal
 from .earley_common import Item
@@ -169,6 +169,7 @@
                 items.append(new_item)

     def _parse(self, lexer, columns, to_scan, start_symbol=None):
+
         def is_quasi_complete(item):
             if item.is_complete:
                 return True
@@ -281,7 +282,7 @@
         # If the parse was successful, the start
         # symbol should have been completed in the last step of the Earley cycle, and will be in
         # this column. Find the item for the start_symbol, which is the root of the SPPF tree.
-        solutions = [n.node for n in columns[-1] if n.is_complete and n.node is not None and n.s == start_symbol and n.start == 0]
+        solutions = dedup_list(n.node for n in columns[-1] if n.is_complete and n.node is not None and n.s == start_symbol and n.start == 0)
         if not solutions:
             expected_terminals = [t.expect.name for t in to_scan]
             raise UnexpectedEOF(expected_terminals, state=frozenset(i.s for i in to_scan))
@@ -293,16 +294,24 @@
             except ImportError:
                 logger.warning("Cannot find dependency 'pydot', will not generate sppf debug image")
             else:
-                debug_walker.visit(solutions[0], "sppf.png")
-
+                for i, s in enumerate(solutions):
+                    debug_walker.visit(s, f"sppf{i}.png")

-        if len(solutions) > 1:
-            assert False, 'Earley should not generate multiple start symbol items!'

         if self.Tree is not None:
             # Perform our SPPF -> AST conversion
-            transformer = ForestToParseTree(self.Tree, self.callbacks, self.forest_sum_visitor and self.forest_sum_visitor(), self.resolve_ambiguity)
-            return transformer.transform(solutions[0])
+            # Disable the ForestToParseTree cache when ambiguity='resolve'
+            # to prevent a tree construction bug. See issue #1283
+            use_cache = not self.resolve_ambiguity
+            transformer = ForestToParseTree(self.Tree, self.callbacks, self.forest_sum_visitor and self.forest_sum_visitor(), self.resolve_ambiguity, use_cache)
+            solutions = [transformer.transform(s) for s in solutions]
+
+            if len(solutions) > 1 and not self.resolve_ambiguity:
+                t: Tree = self.Tree('_ambig', solutions)
+                t.expand_kids_by_data('_ambig')  # solutions may themselves be _ambig nodes
+                return t
+            return solutions[0]

         # return the root of the SPPF
         # TODO return a list of solutions, or join them together somehow
         return solutions[0]
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/parsers/earley_common.py new/lark-1.2.2/lark/parsers/earley_common.py
--- old/lark-1.1.9/lark/parsers/earley_common.py        2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/parsers/earley_common.py        2024-08-13 21:47:06.000000000 +0200
@@ -20,13 +20,13 @@
         self.s = (rule, ptr)
         self.expect = rule.expansion[ptr]
         self.previous = rule.expansion[ptr - 1] if ptr > 0 and len(rule.expansion) else None
-        self._hash = hash((self.s, self.start))
+        self._hash = hash((self.s, self.start, self.rule))

     def advance(self):
         return Item(self.rule, self.ptr + 1, self.start)

     def __eq__(self, other):
-        return self is other or (self.s == other.s and self.start == other.start)
+        return self is other or (self.s == other.s and self.start == other.start and self.rule == other.rule)

     def __hash__(self):
         return self._hash
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/parsers/earley_forest.py new/lark-1.2.2/lark/parsers/earley_forest.py
--- old/lark-1.1.9/lark/parsers/earley_forest.py        2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/parsers/earley_forest.py        2024-08-13 21:47:06.000000000 +0200
@@ -38,15 +38,15 @@
     Parameters:
         s: A Symbol, or a tuple of (rule, ptr) for an intermediate node.
-        start: The index of the start of the substring matched by this symbol (inclusive).
-        end: The index of the end of the substring matched by this symbol (exclusive).
+        start: For dynamic lexers, the index of the start of the substring matched by this symbol (inclusive).
+        end: For dynamic lexers, the index of the end of the substring matched by this symbol (exclusive).

     Properties:
         is_intermediate: True if this node is an intermediate node.
         priority: The priority of the node's symbol.
     """
     Set: Type[AbstractSet] = set  # Overridden by StableSymbolNode
-    __slots__ = ('s', 'start', 'end', '_children', 'paths', 'paths_loaded', 'priority', 'is_intermediate', '_hash')
+    __slots__ = ('s', 'start', 'end', '_children', 'paths', 'paths_loaded', 'priority', 'is_intermediate')
     def __init__(self, s, start, end):
         self.s = s
         self.start = start
@@ -59,7 +59,6 @@
         # unlike None or float('NaN'), and sorts appropriately.
         self.priority = float('-inf')
         self.is_intermediate = isinstance(s, tuple)
-        self._hash = hash((self.s, self.start, self.end))

     def add_family(self, lr0, rule, start, left, right):
         self._children.add(PackedNode(self, lr0, rule, start, left, right))
@@ -93,14 +92,6 @@
     def __iter__(self):
         return iter(self._children)

-    def __eq__(self, other):
-        if not isinstance(other, SymbolNode):
-            return False
-        return self is other or (type(self.s) == type(other.s) and self.s == other.s and self.start == other.start and self.end is other.end)
-
-    def __hash__(self):
-        return self._hash
-
     def __repr__(self):
         if self.is_intermediate:
             rule = self.s[0]
@@ -618,9 +609,10 @@
             children.append(data.left)
         if data.right is not PackedData.NO_DATA:
             children.append(data.right)
-        if node.parent.is_intermediate:
-            return self._cache.setdefault(id(node), children)
-        return self._cache.setdefault(id(node), self._call_rule_func(node, children))
+        transformed = children if node.parent.is_intermediate else self._call_rule_func(node, children)
+        if self._use_cache:
+            self._cache[id(node)] = transformed
+        return transformed

     def visit_symbol_node_in(self, node):
         super(ForestToParseTree, self).visit_symbol_node_in(node)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/parsers/grammar_analysis.py new/lark-1.2.2/lark/parsers/grammar_analysis.py
--- old/lark-1.1.9/lark/parsers/grammar_analysis.py     2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/parsers/grammar_analysis.py     2024-08-13 21:47:06.000000000 +0200
@@ -3,7 +3,7 @@
 from collections import Counter, defaultdict
 from typing import List, Dict, Iterator, FrozenSet, Set

-from ..utils import bfs, fzset, classify
+from ..utils import bfs, fzset, classify, OrderedSet
 from ..exceptions import GrammarError
 from ..grammar import Rule, Terminal, NonTerminal, Symbol
 from ..common import ParserConf
@@ -177,13 +177,13 @@
         self.FIRST, self.FOLLOW, self.NULLABLE = calculate_sets(rules)

-    def expand_rule(self, source_rule: NonTerminal, rules_by_origin=None) -> State:
+    def expand_rule(self, source_rule: NonTerminal, rules_by_origin=None) -> OrderedSet[RulePtr]:
         "Returns all init_ptrs accessible by rule (recursive)"
         if rules_by_origin is None:
             rules_by_origin = self.rules_by_origin

-        init_ptrs = set()
+        init_ptrs = OrderedSet[RulePtr]()

         def _expand_rule(rule: NonTerminal) -> Iterator[NonTerminal]:
             assert not rule.is_term, rule
@@ -200,4 +200,4 @@
         for _ in bfs([source_rule], _expand_rule):
             pass

-        return fzset(init_ptrs)
+        return init_ptrs
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/parsers/lalr_interactive_parser.py new/lark-1.2.2/lark/parsers/lalr_interactive_parser.py
--- old/lark-1.1.9/lark/parsers/lalr_interactive_parser.py      2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/parsers/lalr_interactive_parser.py      2024-08-13 21:47:06.000000000 +0200
@@ -6,6 +6,7 @@

 from lark.exceptions import UnexpectedToken
 from lark.lexer import Token, LexerThread
+from .lalr_parser_state import ParserState

 ###{standalone
@@ -14,7 +15,7 @@
     For a simpler interface, see the ``on_error`` argument to ``Lark.parse()``.
     """
-    def __init__(self, parser, parser_state, lexer_thread: LexerThread):
+    def __init__(self, parser, parser_state: ParserState, lexer_thread: LexerThread):
         self.parser = parser
         self.parser_state = parser_state
         self.lexer_thread = lexer_thread
@@ -63,15 +64,15 @@
         Calls to feed_token() won't affect the old instance, and vice-versa.
         """
+        return self.copy()
+
+    def copy(self, deepcopy_values=True):
         return type(self)(
             self.parser,
-            copy(self.parser_state),
+            self.parser_state.copy(deepcopy_values=deepcopy_values),
             copy(self.lexer_thread),
         )

-    def copy(self):
-        return copy(self)
-
     def __eq__(self, other):
         if not isinstance(other, InteractiveParser):
             return False
@@ -109,7 +110,7 @@
         conf_no_callbacks.callbacks = {}
         for t in self.choices():
             if t.isupper():  # is terminal?
-                new_cursor = copy(self)
+                new_cursor = self.copy(deepcopy_values=False)
                 new_cursor.parser_state.parse_conf = conf_no_callbacks
                 try:
                     new_cursor.feed_token(self.lexer_thread._Token(t, ''))
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/parsers/lalr_parser_state.py new/lark-1.2.2/lark/parsers/lalr_parser_state.py
--- old/lark-1.1.9/lark/parsers/lalr_parser_state.py    2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/parsers/lalr_parser_state.py    2024-08-13 21:47:06.000000000 +0200
@@ -54,16 +54,16 @@
         return len(self.state_stack) == len(other.state_stack) and self.position == other.position

     def __copy__(self):
+        return self.copy()
+
+    def copy(self, deepcopy_values=True) -> 'ParserState[StateT]':
         return type(self)(
             self.parse_conf,
             self.lexer,  # XXX copy
             copy(self.state_stack),
-            deepcopy(self.value_stack),
+            deepcopy(self.value_stack) if deepcopy_values else copy(self.value_stack),
         )

-    def copy(self) -> 'ParserState[StateT]':
-        return copy(self)
-
     def feed_token(self, token: Token, is_end=False) -> Any:
         state_stack = self.state_stack
         value_stack = self.value_stack
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/parsers/xearley.py new/lark-1.2.2/lark/parsers/xearley.py
--- old/lark-1.1.9/lark/parsers/xearley.py      2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/parsers/xearley.py      2024-08-13 21:47:06.000000000 +0200
@@ -104,7 +104,7 @@
                     token.end_pos = i + 1

                     new_item = item.advance()
-                    label = (new_item.s, new_item.start, i)
+                    label = (new_item.s, new_item.start, i + 1)
                     token_node = TokenNode(token, terminals[token.type])
                     new_item.node = node_cache[label] if label in node_cache else node_cache.setdefault(label, self.SymbolNode(*label))
                     new_item.node.add_family(new_item.s, item.rule, new_item.start, item.node, token_node)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/tools/__init__.py new/lark-1.2.2/lark/tools/__init__.py
--- old/lark-1.1.9/lark/tools/__init__.py       2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/tools/__init__.py       2024-08-13 21:47:06.000000000 +0200
@@ -28,9 +28,8 @@
 lalr_argparser.add_argument('-v', '--verbose', action='count', default=0, help="Increase Logger output level, up to three times")
 lalr_argparser.add_argument('-s', '--start', action='append', default=[])
 lalr_argparser.add_argument('-l', '--lexer', default='contextual', choices=('basic', 'contextual'))
-encoding: Optional[str] = 'utf-8' if sys.version_info > (3, 4) else None
-lalr_argparser.add_argument('-o', '--out', type=FileType('w', encoding=encoding), default=sys.stdout, help='the output file (default=stdout)')
-lalr_argparser.add_argument('grammar_file', type=FileType('r', encoding=encoding), help='A valid .lark file')
+lalr_argparser.add_argument('-o', '--out', type=FileType('w', encoding='utf-8'), default=sys.stdout, help='the output file (default=stdout)')
+lalr_argparser.add_argument('grammar_file', type=FileType('r', encoding='utf-8'), help='A valid .lark file')

 for flag in flags:
     if isinstance(flag, tuple):
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/tree.py new/lark-1.2.2/lark/tree.py
--- old/lark-1.1.9/lark/tree.py 2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/tree.py 2024-08-13 21:47:06.000000000 +0200
@@ -9,13 +9,9 @@
         import rich
     except ImportError:
         pass
-    if sys.version_info >= (3, 8):
-        from typing import Literal
-    else:
-        from typing_extensions import Literal
+    from typing import Literal

 ###{standalone
-from collections import OrderedDict

 class Meta:
@@ -140,11 +136,10 @@
         Iterates over all the subtrees, never returning to the same node twice (Lark's parse-tree is actually a DAG).
         """
         queue = [self]
-        subtrees = OrderedDict()
+        subtrees = dict()
         for subtree in queue:
             subtrees[id(subtree)] = subtree
-            # Reason for type ignore https://github.com/python/mypy/issues/10999
-            queue += [c for c in reversed(subtree.children)  # type: ignore[misc]
+            queue += [c for c in reversed(subtree.children)
                       if isinstance(c, Tree) and id(c) not in subtrees]

         del queue
@@ -242,7 +237,7 @@
     possible attributes, see https://www.graphviz.org/doc/info/attrs.html.
     """

-    import pydot  # type: ignore[import]
+    import pydot  # type: ignore[import-not-found]
     graph = pydot.Dot(graph_type='digraph', rankdir=rankdir, **kwargs)
     i = [0]
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/utils.py new/lark-1.2.2/lark/utils.py
--- old/lark-1.1.9/lark/utils.py        2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/utils.py        2024-08-13 21:47:06.000000000 +0200
@@ -68,7 +68,7 @@
         res = {f: _serialize(getattr(self, f), memo) for f in fields}
         res['__type__'] = type(self).__name__
         if hasattr(self, '_serialize'):
-            self._serialize(res, memo)  # type: ignore[attr-defined]
+            self._serialize(res, memo)
         return res

     @classmethod
@@ -89,7 +89,7 @@
             raise KeyError("Cannot find key for class", cls, e)

         if hasattr(inst, '_deserialize'):
-            inst._deserialize()  # type: ignore[attr-defined]
+            inst._deserialize()

         return inst
@@ -141,7 +141,7 @@
         regexp_final = expr
     try:
         # Fixed in next version (past 0.960) of typeshed
-        return [int(x) for x in sre_parse.parse(regexp_final).getwidth()]  # type: ignore[attr-defined]
+        return [int(x) for x in sre_parse.parse(regexp_final).getwidth()]
     except sre_constants.error:
         if not _has_regex:
             raise ValueError(expr)
@@ -188,11 +188,7 @@
     """Given a list (l) will removing duplicates from the list,
       preserving the original order of the list.
       Assumes that the list entries are hashable."""
-    dedup = set()
-    # This returns None, but that's expected
-    return [x for x in l if not (x in dedup or dedup.add(x))]  # type: ignore[func-returns-value]
-    # 2x faster (ordered in PyPy and CPython 3.6+, guaranteed to be ordered in Python 3.7+)
-    # return list(dict.fromkeys(l))
+    return list(dict.fromkeys(l))

 class Enumerator(Serialize):
@@ -234,8 +230,7 @@
     return list(product(*lists))

 try:
-    # atomicwrites doesn't have type bindings
-    import atomicwrites  # type: ignore[import]
+    import atomicwrites
     _has_atomicwrites = True
 except ImportError:
     _has_atomicwrites = False
@@ -251,19 +246,6 @@
     return open(name, mode, **kwargs)

-
-def isascii(s: str) -> bool:
-    """ str.isascii only exists in python3.7+ """
-    if sys.version_info >= (3, 7):
-        return s.isascii()
-    else:
-        try:
-            s.encode('ascii')
-            return True
-        except (UnicodeDecodeError, UnicodeEncodeError):
-            return False
-
-
 class fzset(frozenset):
     def __repr__(self):
         return '{%s}' % ', '.join(map(repr, self))
@@ -359,3 +341,6 @@

     def __len__(self) -> int:
         return len(self.d)
+
+    def __repr__(self):
+        return f"{type(self).__name__}({', '.join(map(repr,self))})"
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/lark/visitors.py new/lark-1.2.2/lark/visitors.py
--- old/lark-1.1.9/lark/visitors.py     2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/lark/visitors.py     2024-08-13 21:47:06.000000000 +0200
@@ -158,7 +158,11 @@

     def transform(self, tree: Tree[_Leaf_T]) -> _Return_T:
         "Transform the given tree, and return the final result"
-        return self._transform_tree(tree)
+        res = list(self._transform_children([tree]))
+        if not res:
+            return None  # type: ignore[return-value]
+        assert len(res) == 1
+        return res[0]

     def __mul__(
             self: 'Transformer[_Leaf_T, Tree[_Leaf_U]]',
@@ -470,8 +474,7 @@
     def __init__(self, func: Callable, visit_wrapper: Callable[[Callable, str, list, Any], Any]):
         if isinstance(func, _VArgsWrapper):
             func = func.base_func
-        # https://github.com/python/mypy/issues/708
-        self.base_func = func  # type: ignore[assignment]
+        self.base_func = func
         self.visit_wrapper = visit_wrapper

         update_wrapper(self, func)
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/pyproject.toml new/lark-1.2.2/pyproject.toml
--- old/lark-1.1.9/pyproject.toml       2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/pyproject.toml       2024-08-13 21:47:06.000000000 +0200
@@ -17,7 +17,7 @@
     "Topic :: Text Processing :: Linguistic",
     "License :: OSI Approved :: MIT License",
 ]
-requires-python = ">=3.6"
+requires-python = ">=3.8"
 dependencies = []
 dynamic = ["version"]
@@ -41,7 +41,7 @@
 - Import grammars from Nearley.js
 - Extensive test suite
 - And much more!
-Since version 1.0, only Python versions 3.6 and up are supported."""
+Since version 1.2, only Python versions 3.8 and up are supported."""
 content-type = "text/markdown"

 [project.urls]
@@ -76,9 +76,9 @@
 [tool.mypy]
 files = "lark"
-python_version = "3.6"
+python_version = "3.8"
 show_error_codes = true
-enable_error_code = ["ignore-without-code"]
+enable_error_code = ["ignore-without-code", "unused-ignore"]
 exclude = [
     "^lark/__pyinstaller",
 ]
@@ -95,3 +95,11 @@
 ]

 [tool.pyright]
 include = ["lark"]
+
+[tool.pytest.ini_options]
+minversion = 6.0
+addopts = "-ra -q"
+testpaths =[
+    "tests"
+]
+python_files = "__main__.py"
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/pytest.ini new/lark-1.2.2/pytest.ini
--- old/lark-1.1.9/pytest.ini   2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/pytest.ini   1970-01-01 01:00:00.000000000 +0100
@@ -1,6 +0,0 @@
-[pytest]
-minversion = 6.0
-addopts = -ra -q
-testpaths =
-    tests
-python_files = __main__.py
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/tests/test_cache.py new/lark-1.2.2/tests/test_cache.py
--- old/lark-1.1.9/tests/test_cache.py  2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/tests/test_cache.py  2024-08-13 21:47:06.000000000 +0200
@@ -7,17 +7,14 @@
 from lark.lexer import Lexer, Token
 import lark.lark as lark_module

-try:
-    from StringIO import StringIO
-except ImportError:
-    from io import BytesIO as StringIO
+from io import BytesIO

 try:
     import regex
 except ImportError:
     regex = None

-class MockFile(StringIO):
+class MockFile(BytesIO):
     def close(self):
         pass
     def __enter__(self):
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/tests/test_logger.py new/lark-1.2.2/tests/test_logger.py
--- old/lark-1.1.9/tests/test_logger.py 2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/tests/test_logger.py 2024-08-13 21:47:06.000000000 +0200
@@ -3,10 +3,7 @@
 from lark import Lark, logger
 from unittest import TestCase, main, skipIf

-try:
-    from StringIO import StringIO
-except ImportError:
-    from io import StringIO
+from io import StringIO

 try:
     import interegular
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/tests/test_parser.py new/lark-1.2.2/tests/test_parser.py
--- old/lark-1.1.9/tests/test_parser.py 2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/tests/test_parser.py 2024-08-13 21:47:06.000000000 +0200
@@ -7,15 +7,8 @@
 import sys
 from copy import copy, deepcopy

-from lark.utils import isascii
-
 from lark import Token, Transformer_NonRecursive, LexError

-try:
-    from cStringIO import StringIO as cStringIO
-except ImportError:
-    # Available only in Python 2.x, 3.x only has io.StringIO from below
-    cStringIO = None
 from io import (
         StringIO as uStringIO,
         BytesIO,
@@ -28,6 +21,7 @@
 except ImportError:
     regex = None

+
 import lark
 from lark import logger
 from lark.lark import Lark
@@ -399,6 +393,8 @@
         self.assertEqual( g.parse('abc').children[0], 'abc')

+
+    @unittest.skipIf(LEXER=='basic', "Requires dynamic lexer")
     def test_earley(self):
         g = Lark("""start: A "b" c
                     A: "a"+
@@ -421,8 +417,7 @@
         l = Lark(grammar, parser='earley', lexer=LEXER)
         l.parse(program)

-
-    @unittest.skipIf(LEXER=='dynamic', "Only relevant for the dynamic_complete parser")
+    @unittest.skipIf(LEXER != 'dynamic_complete', "Only relevant for the dynamic_complete parser")
     def test_earley3(self):
         """Tests prioritization and disambiguation for pseudo-terminals (there should be only one result)
@@ -758,6 +753,8 @@
         self.assertEqual(ambig_tree.data, '_ambig')
         self.assertEqual(set(ambig_tree.children), expected)

+
+    @unittest.skipIf(LEXER=='basic', "Requires dynamic lexer")
     def test_fruitflies_ambig(self):
         grammar = """
             start: noun verb noun        -> simple
@@ -828,6 +825,27 @@
         tree = parser.parse(text)
         self.assertEqual(tree.children, ['foo', 'bar'])

+    def test_multiple_start_solutions(self):
+        grammar = r"""
+            !start: a | A
+            !a: A
+            A: "x"
+        """
+
+        l = Lark(grammar, ambiguity='explicit', lexer=LEXER)
+        tree = l.parse('x')
+
+        expected = Tree('_ambig', [
+            Tree('start', ['x']),
+            Tree('start', [Tree('a', ['x'])])]
+        )
+        self.assertEqual(tree, expected)
+
+        l = Lark(grammar, ambiguity='resolve', lexer=LEXER)
+        tree = l.parse('x')
+        assert tree == Tree('start', ['x'])
+
+
     def test_cycle(self):
         grammar = """
             start: start?
@@ -843,16 +861,24 @@

     def test_cycle2(self):
         grammar = """
-            start: _operation
-            _operation: value
-            value: "b"
-                 | "a" value
-                 | _operation
+            start: _recurse
+            _recurse: v
+            v: "b"
+             | "a" v
+             | _recurse
         """

         l = Lark(grammar, ambiguity="explicit", lexer=LEXER)
         tree = l.parse("ab")
-        self.assertEqual(tree, Tree('start', [Tree('value', [Tree('value', [])])]))
+        expected = (
+            Tree('start', [
+                Tree('_ambig', [
+                    Tree('v', [Tree('v', [])]),
+                    Tree('v', [Tree('v', [Tree('v', [])])])
+                ])
+            ])
+        )
+        self.assertEqual(tree, expected)

     def test_cycles(self):
         grammar = """
@@ -912,24 +938,60 @@
         tree = l.parse('');
         self.assertEqual(tree, Tree('a', [Tree('x', [Tree('b', [])])]))

+    @unittest.skipIf(LEXER=='basic', "start/end values work differently for the basic lexer")
+    def test_symbol_node_start_end_dynamic_lexer(self):
+        grammar = """
+            start: "ABC"
+        """
+
+        l = Lark(grammar, ambiguity='forest', lexer=LEXER)
+        node = l.parse('ABC')
+        self.assertEqual(node.start, 0)
+        self.assertEqual(node.end, 3)
+
+        grammar2 = """
+            start: abc
+            abc: "ABC"
+        """
+
+        l = Lark(grammar2, ambiguity='forest', lexer=LEXER)
+        node = l.parse('ABC')
+        self.assertEqual(node.start, 0)
+        self.assertEqual(node.end, 3)
+
     def test_resolve_ambiguity_with_shared_node(self):
+        grammar = """
+            start: (a+)*
+            !a.1: "A" |
+        """
+
+        l = Lark(grammar, ambiguity='resolve', lexer=LEXER)
+        tree = l.parse("A")
+        self.assertEqual(tree, Tree('start', [Tree('a', []), Tree('a', []), Tree('a', ['A'])]))

     def test_resolve_ambiguity_with_shared_node2(self):
+        grammar = """
+            start: _s x _s
+            x: "X"?
+            _s: " "?
+        """
+
+        l = Lark(grammar, ambiguity='resolve', lexer=LEXER)
+        tree = l.parse("")
+        self.assertEqual(tree, Tree('start', [Tree('x', [])]))

-    # @unittest.skipIf(LEXER=='dynamic', "Not implemented in Dynamic Earley yet")  # TODO
-    # def test_not_all_derivations(self):
-    #     grammar = """
-    #     start: cd+ "e"
-
-    #     !cd: "c"
-    #        | "d"
-    #        | "cd"
-
-    #     """
-    #     l = Lark(grammar, parser='earley', ambiguity='explicit', lexer=LEXER, earley__all_derivations=False)
-    #     x = l.parse('cde')
-    #     assert x.data != '_ambig', x
-    #     assert len(x.children) == 1

+    def test_consistent_derivation_order1(self):
+        # Should return the same result for any hash-seed
+        parser = Lark('''
+            start: a a
+            a: "." | b
+            b: "."
+        ''', lexer=LEXER)
+
+        tree = parser.parse('..')
+        n = Tree('a', [Tree('b', [])])
+        assert tree == Tree('start', [n, n])

 _NAME = "TestFullEarley" + LEXER.capitalize()
 _TestFullEarley.__name__ = _NAME
@@ -987,7 +1049,7 @@
     def __init__(self, g, *args, **kwargs):
         self.text_lexer = Lark(g, *args, use_bytes=False, **kwargs)
         g = self.text_lexer.grammar_source.lower()
-        if '\\u' in g or not isascii(g):
+        if '\\u' in g or not g.isascii():
             # Bytes re can't deal with uniode escapes
             self.bytes_lark = None
         else:
@@ -996,7 +1058,7 @@

     def parse(self, text, start=None):
         # TODO: Easy workaround, more complex checks would be beneficial
-        if not isascii(text) or self.bytes_lark is None:
+        if not text.isascii() or self.bytes_lark is None:
             return self.text_lexer.parse(text, start)
         try:
             rv = self.text_lexer.parse(text, start)
@@ -1086,11 +1148,6 @@

         assert x.data == 'start' and x.children == ['12', '2'], x

-    @unittest.skipIf(cStringIO is None, "cStringIO not available")
-    def test_stringio_bytes(self):
-        """Verify that a Lark can be created from file-like objects other than Python's standard 'file' object"""
-        _Lark(cStringIO(b'start: a+ b a* "b" a*\n b: "b"\n a: "a" '))
-
     def test_stringio_unicode(self):
         """Verify that a Lark can be created from file-like objects other than Python's standard 'file' object"""
         _Lark(uStringIO(u'start: a+ b a* "b" a*\n b: "b"\n a: "a" '))
@@ -1140,7 +1197,7 @@
         """)
         g.parse('abc')

-    @unittest.skipIf(sys.version_info < (3, 3), "re package did not support 32bit unicode escape sequence before Python 3.3")
+
     def test_unicode_literal_range_escape2(self):
         g = _Lark(r"""start: A+
                       A: "\U0000FFFF".."\U00010002"
@@ -1153,8 +1210,7 @@
         """)
         g.parse('\x01\x02\x03')

-    @unittest.skipIf(sys.version_info[0]==2 or sys.version_info[:2]==(3, 4),
-                     "bytes parser isn't perfect in Python2, exceptions don't work correctly")
+
     def test_bytes_utf8(self):
         g = r"""
         start: BOM? char+
@@ -1305,49 +1361,6 @@
         [list] = r.children
         self.assertSequenceEqual([item.data for item in list.children], ())

-    @unittest.skipIf(True, "Flattening list isn't implemented (and may never be)")
-    def test_single_item_flatten_list(self):
-        g = _Lark(r"""start: list
-                        list: | item "," list
-                        item : A
-                        A: "a"
-                    """)
-        r = g.parse("a,")
-
-        # Because 'list' is a flatten rule it's top-level element should *never* be expanded
-        self.assertSequenceEqual([subtree.data for subtree in r.children], ('list',))
-
-        # Sanity check: verify that 'list' contains exactly the one 'item' we've given it
-        [list] = r.children
-        self.assertSequenceEqual([item.data for item in list.children], ('item',))
-
-    @unittest.skipIf(True, "Flattening list isn't implemented (and may never be)")
-    def test_multiple_item_flatten_list(self):
-        g = _Lark(r"""start: list
-                        #list: | item "," list
-                        item : A
-                        A: "a"
-                    """)
-        r = g.parse("a,a,")
-
-        # Because 'list' is a flatten rule it's top-level element should *never* be expanded
-        self.assertSequenceEqual([subtree.data for subtree in r.children], ('list',))
-
-        # Sanity check: verify that 'list' contains exactly the two 'item's we've given it
-        [list] = r.children
-        self.assertSequenceEqual([item.data for item in list.children], ('item', 'item'))
-
-    @unittest.skipIf(True, "Flattening list isn't implemented (and may never be)")
-    def test_recurse_flatten(self):
-        """Verify that stack depth doesn't get exceeded on recursive rules marked for flattening."""
-        g = _Lark(r"""start: a | start a
-                      a : A
-                      A : "a" """)
-
-        # Force PLY to write to the debug log, but prevent writing it to the terminal (uses repr() on the half-built
-        # STree data structures, which uses recursion).
-        g.parse("a" * (sys.getrecursionlimit() // 4))
-
     def test_token_collision(self):
         g = _Lark(r"""start: "Hello" NAME
                       NAME: /\w/+
@@ -1459,20 +1472,6 @@
         x1 = g.parse("ABBc")
         x2 = g.parse("abdE")

-    # def test_string_priority(self):
-    #     g = _Lark("""start: (A | /a?bb/)+
-    #                  A: "a" """)
-    #     x = g.parse('abb')
-    #     self.assertEqual(len(x.children), 2)
-
-    #     # This parse raises an exception because the lexer will always try to consume
-    #     # "a" first and will never match the regular expression
-    #     # This behavior is subject to change!!
-    #     # This won't happen with ambiguity handling.
-    #     g = _Lark("""start: (A | /a?ab/)+
-    #                  A: "a" """)
-    #     self.assertRaises(LexError, g.parse, 'aab')
-
     def test_rule_collision(self):
         g = _Lark("""start: "a"+ "b"
                             | "a"+ """)
@@ -1561,13 +1560,6 @@
         """)
         x = g.parse('\n')

-
-    # def test_token_recurse(self):
-    #     g = _Lark("""start: A
-    #                  A: B
-    #                  B: A
-    #              """)
-
     @unittest.skipIf(PARSER == 'cyk', "No empty rules")
     def test_empty(self):
         # Fails an Earley implementation without special handling for empty rules,
@@ -1649,13 +1641,6 @@
         tree = l.parse('aA')
         self.assertEqual(tree.children, ['a', 'A'])

-        # g = """!start: "a"i "a"
-        #     """
-        # self.assertRaises(GrammarError, _Lark, g)
-
-        # g = """!start: /a/i /a/
-        #     """
-        # self.assertRaises(GrammarError, _Lark, g)

         g = """start: NAME "," "a"
                NAME: /[a-z_]/i /[a-z0-9_]/i*
@@ -1666,6 +1651,25 @@
         tree = l.parse('AB,a')
         self.assertEqual(tree.children, ['AB'])

+    @unittest.skipIf(LEXER in ('basic', 'custom_old', 'custom_new'), "Requires context sensitive terminal selection")
+    def test_token_flags_collision(self):
+
+        g = """!start: "a"i "a"
+            """
+        l = _Lark(g)
+        self.assertEqual(l.parse('aa').children, ['a', 'a'])
+        self.assertEqual(l.parse('Aa').children, ['A', 'a'])
+        self.assertRaises(UnexpectedInput, l.parse, 'aA')
+        self.assertRaises(UnexpectedInput, l.parse, 'AA')
+
+        g = """!start: /a/i /a/
+            """
+        l = _Lark(g)
+        self.assertEqual(l.parse('aa').children, ['a', 'a'])
+        self.assertEqual(l.parse('Aa').children, ['A', 'a'])
+        self.assertRaises(UnexpectedInput, l.parse, 'aA')
+        self.assertRaises(UnexpectedInput, l.parse, 'AA')
+
     def test_token_flags3(self):
         l = _Lark("""!start: ABC+
                      ABC: "abc"i
@@ -1754,7 +1758,7 @@

         self.assertEqual(len(tree.children), 2)

-    @unittest.skipIf(LEXER != 'basic', "basic lexer prioritization differs from dynamic lexer prioritization")
+    @unittest.skipIf('dynamic' in LEXER, "basic lexer prioritization differs from dynamic lexer prioritization")
     def test_lexer_prioritization(self):
         "Tests effect of priority on result"
@@ -2274,7 +2278,6 @@

-    @unittest.skipIf(PARSER=='earley', "Priority not handled correctly right now")  # TODO XXX
     def test_priority_vs_embedded(self):
         g = """
         A.2: "a"
@@ -2407,7 +2410,7 @@

         parser = _Lark(grammar)

-    @unittest.skipIf(PARSER!='lalr' or 'custom' in LEXER, "Serialize currently only works for LALR parsers without custom lexers (though it should be easy to extend)")
+    @unittest.skipIf(PARSER!='lalr' or LEXER == 'custom_old', "Serialize currently only works for LALR parsers without custom lexers (though it should be easy to extend)")
     def test_serialize(self):
         grammar = """
         start: _ANY b "C"
@@ -2512,7 +2515,7 @@
         """
         self.assertRaises((GrammarError, LexError, re.error), _Lark, g, regex=True)

-    @unittest.skipIf(PARSER!='lalr', "interactive_parser is only implemented for LALR at the moment")
+    @unittest.skipIf(PARSER != 'lalr', "interactive_parser is only implemented for LALR at the moment")
     def test_parser_interactive_parser(self):

         g = _Lark(r'''
@@ -2549,7 +2552,7 @@
         res = ip_copy.feed_eof()
         self.assertEqual(res, Tree('start', ['a', 'b', 'b']))

-    @unittest.skipIf(PARSER!='lalr', "interactive_parser error handling only works with LALR for now")
+    @unittest.skipIf(PARSER != 'lalr', "interactive_parser error handling only works with LALR for now")
     def test_error_with_interactive_parser(self):
         def ignore_errors(e):
             if isinstance(e, UnexpectedCharacters):
@@ -2584,10 +2587,10 @@
         s = "[0 1, 2,@, 3,,, 4, 5 6 ]$"
         tree = g.parse(s, on_error=ignore_errors)

-    @unittest.skipIf(PARSER!='lalr', "interactive_parser error handling only works with LALR for now")
+    @unittest.skipIf(PARSER != 'lalr', "interactive_parser error handling only works with LALR for now")
     def test_iter_parse(self):
         ab_grammar = '!start: "a"* "b"*'
-        parser = Lark(ab_grammar, parser="lalr")
+        parser = _Lark(ab_grammar)
         ip = parser.parse_interactive("aaabb")
         i = ip.iter_parse()
         assert next(i) == 'a'
@@ -2595,7 +2598,7 @@
         assert next(i) == 'a'
         assert next(i) == 'b'

-    @unittest.skipIf(PARSER!='lalr', "interactive_parser is only implemented for LALR at the moment")
+    @unittest.skipIf(PARSER != 'lalr', "interactive_parser is only implemented for LALR at the moment")
     def test_interactive_treeless_transformer(self):
         grammar = r"""
         start: SYM+
@@ -2617,7 +2620,7 @@
         res = ip.feed_eof()
         self.assertEqual(res.children, [1, 2, 1])

-    @unittest.skipIf(PARSER!='lalr', "Tree-less mode is only supported in lalr")
+    @unittest.skipIf(PARSER == 'earley', "Tree-less mode is not supported in earley")
     def test_default_in_treeless_mode(self):
         grammar = r"""
         start: expr
@@ -2643,7 +2646,7 @@
         b = parser.parse(s)
         assert a == b

-    @unittest.skipIf(PARSER!='lalr', "strict mode is only supported in lalr for now")
+    @unittest.skipIf(PARSER != 'lalr', "strict mode is only supported in lalr for now")
     def test_strict(self):
         # Test regex collision
         grammar = r"""
@@ -2687,7 +2690,7 @@
 for _LEXER, _PARSER in _TO_TEST:
     _make_parser_test(_LEXER, _PARSER)

-for _LEXER in ('dynamic', 'dynamic_complete'):
+for _LEXER in ('basic', 'dynamic', 'dynamic_complete'):
     _make_full_earley_test(_LEXER)

 if __name__ == '__main__':
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/tests/test_reconstructor.py new/lark-1.2.2/tests/test_reconstructor.py
--- old/lark-1.1.9/tests/test_reconstructor.py  2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/tests/test_reconstructor.py  2024-08-13 21:47:06.000000000 +0200
@@ -154,7 +154,6 @@
         for code in examples:
             self.assert_reconstruct(g, code, keep_all_tokens=True)

-    @unittest.skipIf(sys.version_info < (3, 0), "Python 2 does not play well with Unicode.")
     def test_switch_grammar_unicode_terminal(self):
         """
         This test checks that a parse tree built with a grammar containing only ascii characters can be reconstructed
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/tests/test_trees.py new/lark-1.2.2/tests/test_trees.py
--- old/lark-1.1.9/tests/test_trees.py  2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/tests/test_trees.py  2024-08-13 21:47:06.000000000 +0200
@@ -447,5 +447,20 @@
         with self.assertRaises(AttributeError):
             merge_transformers(T1(), module=T3())

+    def test_transform_token(self):
+        class MyTransformer(Transformer):
+            def INT(self, value):
+                return int(value)
+
+        t = Token('INT', '123')
+        assert MyTransformer().transform(t) == 123
+
+        class MyTransformer(Transformer):
+            def INT(self, value):
+                return Discard
+
+        assert MyTransformer().transform(t) is None
+
+
 if __name__ == '__main__':
     unittest.main()
diff -urN '--exclude=CVS' '--exclude=.cvsignore' '--exclude=.svn' '--exclude=.svnignore' old/lark-1.1.9/tox.ini new/lark-1.2.2/tox.ini
--- old/lark-1.1.9/tox.ini      2024-01-10 09:30:23.000000000 +0100
+++ new/lark-1.2.2/tox.ini      2024-08-13 21:47:06.000000000 +0200
@@ -1,5 +1,5 @@
 [tox]
-envlist = lint, type, py36, py37, py38, py39, py310, py311, py312, pypy3
+envlist = lint, type, py38, py39, py310, py311, py312, py313, pypy3
 skip_missing_interpreters = true

 [testenv]
@@ -25,8 +25,8 @@
 skip_install = true
 recreate = false
 deps =
-    mypy==0.950
-    interegular>=0.2.4
+    mypy==1.10
+    interegular>=0.3.1,<0.4.0
     types-atomicwrites
     types-regex
     rich<=13.4.1
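
-------------------------------------------------------------------

The Earley changes above (deduplicating start-symbol solutions, wrapping them
in an '_ambig' node, and the ForestToParseTree cache toggle) are easiest to see
end-to-end. A minimal sketch, with the grammar and expected trees taken
verbatim from the new test_multiple_start_solutions test in this diff; the
only other assumption is lark 1.2.2 with its default Earley parser:

    # Sketch: grammar and expectations come from tests/test_parser.py above.
    from lark import Lark, Tree

    grammar = r'''
    !start: a | A
    !a: A
    A: "x"
    '''

    # ambiguity='explicit': both derivations of the start symbol are now kept,
    # wrapped in an _ambig node (1.1.9 asserted this case could not happen).
    tree = Lark(grammar, ambiguity='explicit').parse('x')
    assert tree == Tree('_ambig', [
        Tree('start', ['x']),
        Tree('start', [Tree('a', ['x'])]),
    ])

    # ambiguity='resolve' (the default): a single derivation is chosen again,
    # which is the 1.2.2 headline fix ("Earley now respects ambiguity='resolve'").
    assert Lark(grammar, ambiguity='resolve').parse('x') == Tree('start', ['x'])

The newly documented Indenter/PythonIndenter postlexer can be exercised as
below. The toy grammar is an illustrative assumption (not from the commit);
what the docstrings above do pin down is that PythonIndenter names its newline
terminal _NEWLINE and that _INDENT/_DEDENT must be introduced with %declare:

    # Sketch of the ``postlex`` option; the grammar is hypothetical, modeled
    # on lark's indentation examples.
    from lark import Lark
    from lark.indenter import PythonIndenter

    grammar = r'''
    start: stmt+
    stmt: NAME "=" NAME _NEWLINE
        | "if" NAME ":" _NEWLINE _INDENT stmt+ _DEDENT

    %import common.CNAME -> NAME
    %import common.WS_INLINE
    %ignore WS_INLINE
    %declare _INDENT _DEDENT
    _NEWLINE: /\r?\n[\t ]*/
    '''

    # The postlexer tracks indentation in the _NEWLINE tokens and injects
    # _INDENT/_DEDENT tokens for the parser to consume.
    parser = Lark(grammar, parser='lalr', postlex=PythonIndenter())
    print(parser.parse("if a:\n    b = c\n").pretty())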