madduck's git repository

Every one of the projects in this repository is available at the canonical URL git://git.madduck.net/madduck/pub/<projectpath> — see each project's metadata for the exact URL.

All patches and comments are welcome. Please squash your changes to logical commits before using git-format-patch and git-send-email to patches@git.madduck.net. If you'd read over the Git project's submission guidelines and adhered to them, I'd be especially grateful.

SSH access, as well as push access, can be individually arranged.

If you use my repositories frequently, consider adding the following snippet to ~/.gitconfig and using the third clone URL listed for each project:

[url "git://git.madduck.net/madduck/"]
  insteadOf = madduck:

Improve f-string expression detection regex so ... (#2437)
author Richard Si <63936253+ichard26@users.noreply.github.com>
Mon, 23 Aug 2021 02:52:19 +0000 (22:52 -0400)
committer GitHub <noreply@github.com>
Mon, 23 Aug 2021 02:52:19 +0000 (19:52 -0700)
we don't accidentally add backslashes to them when normalizing quotes
because that's invalid syntax!

The problem this commit fixes is that matches would eat too much,
preventing other important matches from occurring. For example,
here's one f-string body:

    {a}{b}{c}

I know there's no risk of introducing backslashes here, but the regex
already goes sideways with this. Throwing this example at regex101
I get:

    {a}{b}{c}   # The As and Bs are the two matches, and the upper
    ---- ----   # case letters are the groups within those matches.
    aAaa bbBb

... we've missed the middle expression (so if any backslashes were
introduced there in a more complex example, we wouldn't bail out
even though we should -- hence the bug). As it stands, the regex
needs some sort of extra character (or the start/end of the body)
around each expression, but that isn't always the case, as shown
above.
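
The regex101 result above can be reproduced with a minimal sketch,
assuming the pre-fix pattern exactly as it appears in the removed lines
of the src/black/strings.py hunk. The names OLD_PATTERN, body and
old_matches are illustrative only and are not part of Black.

```python
import re

# Pre-fix pattern, copied from the removed ("-") lines of the diff.
# OLD_PATTERN is an illustrative name, not an identifier used by Black.
OLD_PATTERN = re.compile(
    r"""
    (?:[^{]|^)\{  # start of the string or a non-{ followed by a single {
        ([^{].*?)  # contents of the brackets except if begins with {{
    \}(?:[^}]|$)  # A } followed by end of the string or a non-}
    """,
    re.VERBOSE,
)

body = "{a}{b}{c}"  # the f-string body used as the example in this message

# Pair each full match with its captured expression.
old_matches = [(m.group(0), m.group(1)) for m in OLD_PATTERN.finditer(body)]
print(old_matches)  # [('{a}{', 'a'), ('}{c}', 'c')]
```

The first match consumes the `{` that opens `{b}`, so the scan resumes
too late to see the middle expression at all.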

The fix implemented here is to turn the "eat a surrounding non-curly
bracket character" groups, i.e. `(?:[^{]|^)` and `(?:[^}]|$)`, into a
negative lookbehind and a negative lookahead respectively. This
still enforces the existing rules, but without consuming the
surrounding characters ^^
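
As a rough check of the lookaround version, here is a minimal sketch
assuming the post-fix pattern exactly as it appears in the added lines
of the src/black/strings.py hunk. NEW_PATTERN and
has_backslash_in_expression are hypothetical names used for
illustration; the helper only approximates the kind of bail-out check
described above and is not Black's actual code.

```python
import re

# Post-fix pattern, copied from the added ("+") lines of the diff.
# NEW_PATTERN is an illustrative name, not an identifier used by Black.
NEW_PATTERN = re.compile(
    r"""
    (?:(?<!\{)|^)\{  # a single { that is not preceded by another {
        ([^{].*?)  # contents of the brackets except if begins with {{
    \}(?:(?!\})|$)  # a } that is not followed by another }
    """,
    re.VERBOSE,
)


def has_backslash_in_expression(body: str) -> bool:
    """Hypothetical helper: True if any detected expression has a backslash."""
    return any("\\" in expr for expr in NEW_PATTERN.findall(body))


# Lookarounds are zero-width, so adjacent expressions are all found.
simple = NEW_PATTERN.findall("{a}{b}{c}")
print(simple)  # ['a', 'b', 'c']

# Escaped braces ({{ and }}) are still not treated as expressions.
escaped = NEW_PATTERN.findall("{{x}}")
print(escaped)  # []

# A backslash inside the middle expression is now detected.
risky = has_backslash_in_expression(r'{a}{\"hello\" * b}{c}')
print(risky)  # True
```

Because the lookbehind and lookahead assert on neighbouring characters
without consuming them, the `}` that closes one expression can sit
directly next to the `{` that opens the next.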

CHANGES.md
src/black/strings.py
tests/data/string_quotes.py

index 3a96029bf5cd3ccfcb481ad717a5aa8462839962..22ddc423e557e3f46031bd83a5cbb6b4bf2bf1d5 100644 (file)
@@ -7,6 +7,8 @@
 - Add support for formatting Jupyter Notebook files (#2357)
 - Move from `appdirs` dependency to `platformdirs` (#2375)
 - Present a more user-friendly error if .gitignore is invalid (#2414)
+- The failsafe for accidentally added backslashes in f-string expressions has been
+  hardened to handle more edge cases during quote normalization (#2437)
 
 ### Integrations
 
index 80f588f5119c4432c4a43ce8414b014893b4be2b..d7b6c240e80215ef1c268634433074d82e4c5aeb 100644 (file)
@@ -190,9 +190,9 @@ def normalize_string_quotes(s: str) -> str:
     if "f" in prefix.casefold():
         matches = re.findall(
             r"""
-            (?:[^{]|^)\{  # start of the string or a non-{ followed by a single {
+            (?:(?<!\{)|^)\{  # start of the string or a non-{ followed by a single {
                 ([^{].*?)  # contents of the brackets except if begins with {{
-            \}(?:[^}]|$)  # A } followed by end of the string or a non-}
+            \}(?:(?!\})|$)  # A } followed by end of the string or a non-}
             """,
             new_body,
             re.VERBOSE,
index 5a4bc5d0b119098d3097348d8131a49f84586e6a..3384241f4adaf7ad8e1dbff5be6da70e32a40558 100644 (file)
@@ -51,6 +51,11 @@ f'{y * x} \'{z}\''
 '\'{z}\' {y * " "}'
 '{y * x} \'{z}\''
 
+# We must bail out if changing the quotes would introduce backslashes in f-string
+# expressions. xref: https://github.com/psf/black/issues/2348
+f"\"{b}\"{' ' * (long-len(b)+1)}: \"{sts}\",\n"
+f"\"{a}\"{'hello' * b}\"{c}\""
+
 # output
 
 """"""
@@ -100,3 +105,8 @@ f'\'{z}\' {y * " "}'
 f"{y * x} '{z}'"
 "'{z}' {y * \" \"}"
 "{y * x} '{z}'"
+
+# We must bail out if changing the quotes would introduce backslashes in f-string
+# expressions. xref: https://github.com/psf/black/issues/2348
+f"\"{b}\"{' ' * (long-len(b)+1)}: \"{sts}\",\n"
+f"\"{a}\"{'hello' * b}\"{c}\""