Jupyter notebook support (#2357)
author: Marco Edward Gorelli <marcogorelli@protonmail.com>
Fri, 6 Aug 2021 20:57:46 +0000 (21:57 +0100)
committer: GitHub <noreply@github.com>
Fri, 6 Aug 2021 20:57:46 +0000 (16:57 -0400)
To summarise, based on what was discussed in that issue:

* due to not being able to parse automagics (e.g. pip install black)
  without a running IPython kernel, cells with syntax which is parseable
  by neither ast.parse nor IPython will be skipped
* cells with multiline magics will be skipped
* trailing semicolons will be preserved, as they are often put there
  intentionally in Jupyter Notebooks to suppress unnecessary output
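
The trailing-semicolon rule can be sketched with the standard library alone. This is an illustrative simplification, not the code from this commit: the commit uses tokenize_rt, whereas the sketch below assumes the stdlib tokenize module is enough to locate the semicolon, and the name strip_trailing_semicolon is hypothetical.

```python
import io
import tokenize

# Token types skipped while searching backwards for the last
# significant token of the cell.
IGNORED = {
    tokenize.ENDMARKER,
    tokenize.NL,
    tokenize.NEWLINE,
    tokenize.COMMENT,
    tokenize.DEDENT,
}


def strip_trailing_semicolon(src):
    """Return (source without its trailing semicolon, whether one was found)."""
    tokens = list(tokenize.generate_tokens(io.StringIO(src).readline))
    for tok in reversed(tokens):
        if tok.type in IGNORED:
            continue
        if tok.type == tokenize.OP and tok.string == ";":
            row, col = tok.start  # row is 1-based, col is 0-based
            lines = src.split("\n")
            line = lines[row - 1]
            # Cut out exactly one character so the rest of the cell,
            # including any comment after the semicolon, is untouched.
            lines[row - 1] = line[:col] + line[col + 1 :]
            return "\n".join(lines), True
        break  # last significant token is not a semicolon
    return src, False


cell = "fig, ax = plt.subplots()\nax.plot(x_data, y_data);  # plot data"
stripped, had_semicolon = strip_trailing_semicolon(cell)
```

Returning the flag alongside the stripped source lets the caller put the semicolon back once the cell has been formatted, so the notebook keeps suppressing that cell's output.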

Commit history before merge (excluding merge commits):

* wip
* fixup tests
* skip tests if no IPython
* install test requirements in ipynb tests
* if --ipynb format all as ipynb
* wip
* add some whole-notebook tests
* docstrings
* skip multiline magics
* add test for nested cell magic
* remove ipynb_test.yml, put ipynb tests in tox.ini
* add changelog entry
* typo
* make token same length as magic it replaces
* only include .ipynb by default if jupyter dependencies are found
* remove logic from const
* fixup
* fixup
* re.compile
* noop
* clear up
* new_src -> dst
* early exit for non-python notebooks
* add non-python test notebook
* add repo with many notebooks to black-primer
* install extra dependencies for black-primer
* fix planetary computer examples url
* dont run on ipynb files by default
* add scikit-lego (Expected to change) to black-primer
* add ipynb-specific diff
* fixup
* run on all (including ipynb) by default
* remove --include .ipynb from scikit-lego black-primer
* use tokenize so as to mirror the exact logic in IPython.core.displayhooks quiet
* fixup
* :art:
* clarify docstring
* add test for when comment is after trailing semicolon
* enumerate(reversed) instead of [::-1]
* clarify docstrings
* wip
* use jupyter and no_jupyter marks
* use THIS_DIR
* windows fixup
* perform safe check cell-by-cell for ipynb
* only perform safe check in ipynb if not fast
* remove redundant Optional
* :art:
* use typeguard
* dont process cell containing transformed magic
* require typing extensions before 3.10 so as to have TypeGuard
* use dataclasses
* mention black[jupyter] in docs as well as in README
* add faq
* add message to assertion error
* add test for indented quieted cell
* use tokenize_rt else we cant roundtrip
* make frozenset for tokens to ignore when looking for trailing semicolon
* remove planetary code examples as recent commits result in changes
* use dataclasses which inherit from ast.NodeVisitor
* bump typing-extensions so that TypeGuard is available
* bump typing-extensions in Pipfile
* add test with notebook with empty metadata
* pipenv lock
* deprivatise validate_cell
* Update README.md
* Update docs/getting_started.md
* dont cache notebooks if jupyter dependencies arent found
* dont write to cache if jupyter deps are not installed
* add notebook which cant be parsed
* use clirunner
* remove other subprocess calls
* add docstring
* make verbose and quiet keyword only
* :art:
* run second many test on directory, not on file
* test for warning message when running on directory
* early return from non-python cell magics
* move NothingChanged to report to avoid circular import
* remove circular import
* reinstate --ipynb flag

Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
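
Several of the commits above concern masking IPython magics so that a cell becomes parseable Python ("make token same length as magic it replaces", "skip multiline magics"). The following is a minimal sketch of that idea only. It assumes a magic is simply a line starting with a single `%`, whereas the commit itself relies on IPython's TransformerManager to find magics. The helper names make_mask, mask_line_magics and unmask are hypothetical.

```python
import ast
import secrets


def make_mask(src, magic):
    """Return a quoted random hex token that does not occur in `src`.

    For magics of four or more characters the mask has the same
    length as the magic it stands in for.
    """
    nbytes = max(len(magic) // 2 - 1, 1)
    token = secrets.token_hex(nbytes)
    while token in src:  # extremely unlikely, but avoid collisions
        token = secrets.token_hex(nbytes)
    if len(token) + 2 < len(magic):
        token += "."  # pad odd-length magics by one character
    return f'"{token}"'


def mask_line_magics(src):
    """Replace each line magic with a string literal; return the pairs used."""
    replacements = []
    masked_lines = []
    for line in src.splitlines():
        # Simplifying assumption: only unindented `%magic` lines are handled.
        if line.startswith("%") and not line.startswith("%%"):
            mask = make_mask(src, line)
            replacements.append((mask, line))
            line = mask
        masked_lines.append(line)
    return "\n".join(masked_lines), replacements


def unmask(src, replacements):
    """Put the original magics back in place of their masks."""
    for mask, magic in replacements:
        src = src.replace(mask, magic)
    return src


cell = "%matplotlib inline\nx = 1"
masked, replacements = mask_line_magics(cell)
ast.parse(masked)  # the masked cell is now valid Python
restored = unmask(masked, replacements)
```

Because the mask is an ordinary string literal, a formatter can process the masked cell like any other Python source, and a plain string replacement afterwards restores the magic.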
28 files changed:
.github/workflows/primer.yml
.gitignore
.pre-commit-hooks.yaml
CHANGES.md
Pipfile
README.md
docs/faq.md
docs/getting_started.md
pyproject.toml
setup.py
src/black/__init__.py
src/black/const.py
src/black/files.py
src/black/handle_ipynb_magics.py [new file with mode: 0644]
src/black/mode.py
src/black/output.py
src/black/report.py
src/black_primer/primer.json
tests/data/non_python_notebook.ipynb [new file with mode: 0644]
tests/data/notebook_empty_metadata.ipynb [new file with mode: 0644]
tests/data/notebook_no_trailing_newline.ipynb [new file with mode: 0644]
tests/data/notebook_trailing_newline.ipynb [new file with mode: 0644]
tests/data/notebook_which_cant_be_parsed.ipynb [new file with mode: 0644]
tests/data/notebook_without_changes.ipynb [new file with mode: 0644]
tests/test_black.py
tests/test_ipynb.py [new file with mode: 0644]
tests/test_no_ipynb.py [new file with mode: 0644]
tox.ini

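diff --git a/.github/workflows/primer.yml b/.github/workflows/primer.yml
index 8f7c11c824efe26279be4a05cbeb6bc2b51ba8f2..01eb4ef61871926cd1d0335b0a7dbf6f27209275 100644
--- a/.github/workflows/primer.yml
+++ b/.github/workflows/primer.yml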
@@ -38,7 +38,7 @@ jobs:
       - name: Install dependencies
         run: |
           python -m pip install --upgrade pip
-          python -m pip install -e ".[d]"
+          python -m pip install -e ".[d,jupyter]"
 
       - name: Primer run
         env:
diff --git a/.gitignore b/.gitignore
index ab796ce4cd03a4d33e3b46807f192c0353007559..f81bce8fd4e1117e092851aa0ec327decc5d5dff 100644
--- a/.gitignore
+++ b/.gitignore
@@ -18,3 +18,4 @@ src/_black_version.py
 *.swp
 .hypothesis/
 venv/
+.ipynb_checkpoints/
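diff --git a/.pre-commit-hooks.yaml b/.pre-commit-hooks.yaml
index de2eb674e0d2e1d1076c2f38a7b6d0f8055aea31..81848d7dcf73043bad53457b28b7ba03c355f10e 100644
--- a/.pre-commit-hooks.yaml
+++ b/.pre-commit-hooks.yaml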
@@ -7,3 +7,14 @@
   minimum_pre_commit_version: 2.9.2
   require_serial: true
   types_or: [python, pyi]
+- id: black-jupyter
+  name: black-jupyter
+  description:
+    "Black: The uncompromising Python code formatter (with Jupyter Notebook support)"
+  entry: black
+  language: python
+  language_version: python3
+  minimum_pre_commit_version: 2.9.2
+  require_serial: true
+  types_or: [python, pyi, jupyter]
+  additional_dependencies: [".[jupyter]"]
diff --git a/CHANGES.md b/CHANGES.md
index 6714a9d9eb26960c08da30f1262f7d32960d66fe..a678aaefc2acf1f027bc86db1086def322297003 100644
--- a/CHANGES.md
+++ b/CHANGES.md
@@ -2,7 +2,10 @@
 
 ## Unreleased
 
-- Moved from `appdirs` dependency to `platformdirs` (#2375)
+### _Black_
+
+- Add support for formatting Jupyter Notebook files (#2357)
+- Move from `appdirs` dependency to `platformdirs` (#2375)
 
 ## 21.7b0
 
diff --git a/Pipfile b/Pipfile
index 6527958c4254ccc74bc0994414692d2e192d655f..433a3b392f72b4c57b0c346a20c60a12204ce436 100644
--- a/Pipfile
+++ b/Pipfile
@@ -33,6 +33,6 @@ pathspec = ">=0.8.1"
 regex = ">=2020.1.8"
 tomli = ">=0.2.6, <2.0.0"
 typed-ast = "==1.4.2"
-typing_extensions = {"python_version <" = "3.8","version >=" = "3.7.4"}
+typing_extensions = {"python_version <" = "3.10","version >=" = "3.10.0.0"}
 black = {editable = true,extras = ["d"],path = "."}
 dataclasses = {"python_version <" = "3.7","version >" = "0.1.3"}
diff --git a/README.md b/README.md
index 7b87a03fc3808ae914628d88a701a3136db606ab..709478e1d58aba7b93d1d15d24eb2975f4edc833 100644
--- a/README.md
+++ b/README.md
@@ -41,7 +41,8 @@ Try it out now using the [Black Playground](https://black.vercel.app). Watch the
 
 _Black_ can be installed by running `pip install black`. It requires Python 3.6.2+ to
 run. If you want to format Python 2 code as well, install with
-`pip install black[python2]`.
+`pip install black[python2]`. If you want to format Jupyter Notebooks, install with
+`pip install black[jupyter]`.
 
 If you can't wait for the latest _hotness_ and want to install from GitHub, use:
 
diff --git a/docs/faq.md b/docs/faq.md
index ac5ba937c1c2f4501dc7bb87428ba90687fb5048..d7e6a16351fb2e901de0c28faea3f667d588c589 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -37,6 +37,29 @@ Most likely because it is ignored in `.gitignore` or excluded with configuration
 [file collection and discovery](usage_and_configuration/file_collection_and_discovery.md)
 for details.
 
+## Why is my Jupyter Notebook cell not formatted?
+
+_Black_ is timid about formatting Jupyter Notebooks. Cells containing any of the
+following will not be formatted:
+
+- automagics (e.g. `pip install black`)
+- multiline magics, e.g.:
+
+  ```python
+  %timeit f(1, \
+          2, \
+          3)
+  ```
+
+- code which `IPython`'s `TransformerManager` would transform magics into, e.g.:
+
+  ```python
+  get_ipython().system('ls')
+  ```
+
+- invalid syntax, as it can't be safely distinguished from automagics in the absence of
+  a running `IPython` kernel.
+
 ## Why are Flake8's E203 and W503 violated?
 
 Because they go against PEP 8. E203 falsely triggers on list
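diff --git a/docs/getting_started.md b/docs/getting_started.md
index a509d34e903907a27195db3f160fd4802502f4af..c79dc607c4afc6aa744b104f7c9002d62888f6c9 100644
--- a/docs/getting_started.md
+++ b/docs/getting_started.md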
@@ -18,7 +18,8 @@ Also, you can try out _Black_ online for minimal fuss on the
 
 _Black_ can be installed by running `pip install black`. It requires Python 3.6.2+ to
 run, but can format Python 2 code too. Python 2 support needs the `typed_ast`
-dependency, which be installed with `pip install black[python2]`.
+dependency, which be installed with `pip install black[python2]`. If you want to format
+Jupyter Notebooks, install with `pip install black[jupyter]`.
 
 If you can't wait for the latest _hotness_ and want to install from GitHub, use:
 
diff --git a/pyproject.toml b/pyproject.toml
index 79060fc422dfef8b7e08946d7f4be26c45c562d2..d085c0ddc62bd4e985d3468c07ea3f92a781b859 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -31,4 +31,5 @@ build-backend = "setuptools.build_meta"
 optional-tests = [
   "no_python2: run when `python2` extra NOT installed",
   "no_blackd: run when `d` extra NOT installed",
+  "no_jupyter: run when `jupyter` extra NOT installed",
 ]
diff --git a/setup.py b/setup.py
index 0a00638108c4f07a29c041ec0297e99cc8ad5ffc..92b78f1abe116f9060ed53375f4cf0172ec28d00 100644
--- a/setup.py
+++ b/setup.py
@@ -79,7 +79,7 @@ setup(
         "regex>=2020.1.8",
         "pathspec>=0.8.1, <1",
         "dataclasses>=0.6; python_version < '3.7'",
-        "typing_extensions>=3.7.4; python_version < '3.8'",
+        "typing_extensions>=3.10.0.0; python_version < '3.10'",
         "mypy_extensions>=0.4.3",
     ],
     extras_require={
@@ -87,6 +87,7 @@ setup(
         "colorama": ["colorama>=0.4.3"],
         "python2": ["typed-ast>=1.4.2"],
         "uvloop": ["uvloop>=0.15.2"],
+        "jupyter": ["ipython>=7.8.0", "tokenize-rt>=3.2.0"],
     },
     test_suite="tests.test_black",
     classifiers=[
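diff --git a/src/black/__init__.py b/src/black/__init__.py
index 51384fb08da5edadc3d783425cbfb448f7306afe..29fb244f8b751651238fd99fefb055a9ada26eb1 100644
--- a/src/black/__init__.py
+++ b/src/black/__init__.py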
@@ -1,4 +1,6 @@
 import asyncio
+from json.decoder import JSONDecodeError
+import json
 from concurrent.futures import Executor, ThreadPoolExecutor, ProcessPoolExecutor
 from contextlib import contextmanager
 from datetime import datetime
@@ -18,6 +20,7 @@ from typing import (
     Generator,
     Iterator,
     List,
+    MutableMapping,
     Optional,
     Pattern,
     Set,
@@ -39,13 +42,21 @@ from black.mode import Mode, TargetVersion
 from black.mode import Feature, supports_feature, VERSION_TO_FEATURES
 from black.cache import read_cache, write_cache, get_cache_info, filter_cached, Cache
 from black.concurrency import cancel, shutdown, maybe_install_uvloop
-from black.output import dump_to_file, diff, color_diff, out, err
-from black.report import Report, Changed
+from black.output import dump_to_file, ipynb_diff, diff, color_diff, out, err
+from black.report import Report, Changed, NothingChanged
 from black.files import find_project_root, find_pyproject_toml, parse_pyproject_toml
 from black.files import gen_python_files, get_gitignore, normalize_path_maybe_ignore
 from black.files import wrap_stream_for_windows
 from black.parsing import InvalidInput  # noqa F401
 from black.parsing import lib2to3_parse, parse_ast, stringify_ast
+from black.handle_ipynb_magics import (
+    mask_cell,
+    unmask_cell,
+    remove_trailing_semicolon,
+    put_trailing_semicolon_back,
+    TRANSFORMED_MAGICS,
+    jupyter_dependencies_are_installed,
+)
 
 
 # lib2to3 fork
@@ -60,10 +71,6 @@ Encoding = str
 NewLine = str
 
 
-class NothingChanged(UserWarning):
-    """Raised when reformatted code is the same as source."""
-
-
 class WriteBack(Enum):
     NO = 0
     YES = 1
@@ -196,6 +203,14 @@ def validate_regex(
         " when piping source on standard input)."
     ),
 )
+@click.option(
+    "--ipynb",
+    is_flag=True,
+    help=(
+        "Format all input files like Jupyter Notebooks regardless of file extension "
+        "(useful when piping source on standard input)."
+    ),
+)
 @click.option(
     "-S",
     "--skip-string-normalization",
@@ -355,6 +370,7 @@ def main(
     color: bool,
     fast: bool,
     pyi: bool,
+    ipynb: bool,
     skip_string_normalization: bool,
     skip_magic_trailing_comma: bool,
     experimental_string_processing: bool,
@@ -380,6 +396,9 @@ def main(
             f" the running version `{__version__}`!"
         )
         ctx.exit(1)
+    if ipynb and pyi:
+        err("Cannot pass both `pyi` and `ipynb` flags!")
+        ctx.exit(1)
 
     write_back = WriteBack.from_configuration(check=check, diff=diff, color=color)
     if target_version:
@@ -391,6 +410,7 @@ def main(
         target_versions=versions,
         line_length=line_length,
         is_pyi=pyi,
+        is_ipynb=ipynb,
         string_normalization=not skip_string_normalization,
         magic_trailing_comma=not skip_magic_trailing_comma,
         experimental_string_processing=experimental_string_processing,
@@ -504,6 +524,11 @@ def get_sources(
             if is_stdin:
                 p = Path(f"{STDIN_PLACEHOLDER}{str(p)}")
 
+            if p.suffix == ".ipynb" and not jupyter_dependencies_are_installed(
+                verbose=verbose, quiet=quiet
+            ):
+                continue
+
             sources.add(p)
         elif p.is_dir():
             sources.update(
@@ -516,6 +541,8 @@ def get_sources(
                     force_exclude,
                     report,
                     gitignore,
+                    verbose=verbose,
+                    quiet=quiet,
                 )
             )
         elif s == "-":
@@ -585,6 +612,8 @@ def reformat_one(
         if is_stdin:
             if src.suffix == ".pyi":
                 mode = replace(mode, is_pyi=True)
+            elif src.suffix == ".ipynb":
+                mode = replace(mode, is_ipynb=True)
             if format_stdin_to_stdout(fast=fast, write_back=write_back, mode=mode):
                 changed = Changed.YES
         else:
@@ -733,6 +762,8 @@ def format_file_in_place(
     """
     if src.suffix == ".pyi":
         mode = replace(mode, is_pyi=True)
+    elif src.suffix == ".ipynb":
+        mode = replace(mode, is_ipynb=True)
 
     then = datetime.utcfromtimestamp(src.stat().st_mtime)
     with open(src, "rb") as buf:
@@ -741,6 +772,8 @@ def format_file_in_place(
         dst_contents = format_file_contents(src_contents, fast=fast, mode=mode)
     except NothingChanged:
         return False
+    except JSONDecodeError:
+        raise ValueError(f"File '{src}' cannot be parsed as valid Jupyter notebook.")
 
     if write_back == WriteBack.YES:
         with open(src, "w", encoding=encoding, newline=newline) as f:
@@ -749,7 +782,10 @@ def format_file_in_place(
         now = datetime.utcnow()
         src_name = f"{src}\t{then} +0000"
         dst_name = f"{src}\t{now} +0000"
-        diff_contents = diff(src_contents, dst_contents, src_name, dst_name)
+        if mode.is_ipynb:
+            diff_contents = ipynb_diff(src_contents, dst_contents, src_name, dst_name)
+        else:
+            diff_contents = diff(src_contents, dst_contents, src_name, dst_name)
 
         if write_back == WriteBack.COLOR_DIFF:
             diff_contents = color_diff(diff_contents)
@@ -819,6 +855,29 @@ def format_stdin_to_stdout(
         f.detach()
 
 
+def check_stability_and_equivalence(
+    src_contents: str, dst_contents: str, *, mode: Mode
+) -> None:
+    """Perform stability and equivalence checks.
+
+    Raise AssertionError if source and destination contents are not
+    equivalent, or if a second pass of the formatter would format the
+    content differently.
+    """
+    assert_equivalent(src_contents, dst_contents)
+
+    # Forced second pass to work around optional trailing commas (becoming
+    # forced trailing commas on pass 2) interacting differently with optional
+    # parentheses.  Admittedly ugly.
+    dst_contents_pass2 = format_str(dst_contents, mode=mode)
+    if dst_contents != dst_contents_pass2:
+        dst_contents = dst_contents_pass2
+        assert_equivalent(src_contents, dst_contents, pass_num=2)
+        assert_stable(src_contents, dst_contents, mode=mode)
+    # Note: no need to explicitly call `assert_stable` if `dst_contents` was
+    # the same as `dst_contents_pass2`.
+
+
 def format_file_contents(src_contents: str, *, fast: bool, mode: Mode) -> FileContent:
     """Reformat contents of a file and return new contents.
 
@@ -829,26 +888,116 @@ def format_file_contents(src_contents: str, *, fast: bool, mode: Mode) -> FileCo
     if not src_contents.strip():
         raise NothingChanged
 
-    dst_contents = format_str(src_contents, mode=mode)
+    if mode.is_ipynb:
+        dst_contents = format_ipynb_string(src_contents, fast=fast, mode=mode)
+    else:
+        dst_contents = format_str(src_contents, mode=mode)
     if src_contents == dst_contents:
         raise NothingChanged
 
-    if not fast:
-        assert_equivalent(src_contents, dst_contents)
-
-        # Forced second pass to work around optional trailing commas (becoming
-        # forced trailing commas on pass 2) interacting differently with optional
-        # parentheses.  Admittedly ugly.
-        dst_contents_pass2 = format_str(dst_contents, mode=mode)
-        if dst_contents != dst_contents_pass2:
-            dst_contents = dst_contents_pass2
-            assert_equivalent(src_contents, dst_contents, pass_num=2)
-            assert_stable(src_contents, dst_contents, mode=mode)
-        # Note: no need to explicitly call `assert_stable` if `dst_contents` was
-        # the same as `dst_contents_pass2`.
+    if not fast and not mode.is_ipynb:
+        # Jupyter notebooks will already have been checked above.
+        check_stability_and_equivalence(src_contents, dst_contents, mode=mode)
     return dst_contents
 
 
+def validate_cell(src: str) -> None:
+    """Check that cell does not already contain TransformerManager transformations.
+
+    If a cell contains ``!ls``, then it'll be transformed to
+    ``get_ipython().system('ls')``. However, if the cell originally contained
+    ``get_ipython().system('ls')``, then it would get transformed in the same way:
+
+        >>> TransformerManager().transform_cell("get_ipython().system('ls')")
+        "get_ipython().system('ls')\n"
+        >>> TransformerManager().transform_cell("!ls")
+        "get_ipython().system('ls')\n"
+
+    Due to the impossibility of safely roundtripping in such situations, cells
+    containing transformed magics will be ignored.
+    """
+    if any(transformed_magic in src for transformed_magic in TRANSFORMED_MAGICS):
+        raise NothingChanged
+
+
+def format_cell(src: str, *, fast: bool, mode: Mode) -> str:
+    """Format code in given cell of Jupyter notebook.
+
+    General idea is:
+
+      - if cell has trailing semicolon, remove it;
+      - if cell has IPython magics, mask them;
+      - format cell;
+      - reinstate IPython magics;
+      - reinstate trailing semicolon (if originally present);
+      - strip trailing newlines.
+
+    Cells with syntax errors will not be processed, as they
+    could potentially be automagics or multi-line magics, which
+    are currently not supported.
+    """
+    validate_cell(src)
+    src_without_trailing_semicolon, has_trailing_semicolon = remove_trailing_semicolon(
+        src
+    )
+    try:
+        masked_src, replacements = mask_cell(src_without_trailing_semicolon)
+    except SyntaxError:
+        raise NothingChanged
+    masked_dst = format_str(masked_src, mode=mode)
+    if not fast:
+        check_stability_and_equivalence(masked_src, masked_dst, mode=mode)
+    dst_without_trailing_semicolon = unmask_cell(masked_dst, replacements)
+    dst = put_trailing_semicolon_back(
+        dst_without_trailing_semicolon, has_trailing_semicolon
+    )
+    dst = dst.rstrip("\n")
+    if dst == src:
+        raise NothingChanged
+    return dst
+
+
+def validate_metadata(nb: MutableMapping[str, Any]) -> None:
+    """If notebook is marked as non-Python, don't format it.
+
+    All notebook metadata fields are optional, see
+    https://nbformat.readthedocs.io/en/latest/format_description.html. So
+    if a notebook has empty metadata, we will try to parse it anyway.
+    """
+    language = nb.get("metadata", {}).get("language_info", {}).get("name", None)
+    if language is not None and language != "python":
+        raise NothingChanged
+
+
+def format_ipynb_string(src_contents: str, *, fast: bool, mode: Mode) -> FileContent:
+    """Format Jupyter notebook.
+
+    Operate cell-by-cell, only on code cells, only for Python notebooks.
+    If the ``.ipynb`` originally had a trailing newline, it'll be preserved.
+    """
+    trailing_newline = src_contents[-1] == "\n"
+    modified = False
+    nb = json.loads(src_contents)
+    validate_metadata(nb)
+    for cell in nb["cells"]:
+        if cell.get("cell_type", None) == "code":
+            try:
+                src = "".join(cell["source"])
+                dst = format_cell(src, fast=fast, mode=mode)
+            except NothingChanged:
+                pass
+            else:
+                cell["source"] = dst.splitlines(keepends=True)
+                modified = True
+    if modified:
+        dst_contents = json.dumps(nb, indent=1, ensure_ascii=False)
+        if trailing_newline:
+            dst_contents = dst_contents + "\n"
+        return dst_contents
+    else:
+        raise NothingChanged
+
+
 def format_str(src_contents: str, *, mode: Mode) -> FileContent:
     """Reformat a string and return new contents.
 
diff --git a/src/black/const.py b/src/black/const.py
index 821258588ab150b102b944e4368260edbb740975..dbb4826be0e9f9d4e1694dc32fe1b385240fbda5 100644
--- a/src/black/const.py
+++ b/src/black/const.py
@@ -1,4 +1,4 @@
 DEFAULT_LINE_LENGTH = 88
 DEFAULT_EXCLUDES = r"/(\.direnv|\.eggs|\.git|\.hg|\.mypy_cache|\.nox|\.tox|\.venv|venv|\.svn|_build|buck-out|build|dist)/"  # noqa: B950
-DEFAULT_INCLUDES = r"\.pyi?$"
+DEFAULT_INCLUDES = r"(\.pyi?|\.ipynb)$"
 STDIN_PLACEHOLDER = "__BLACK_STDIN_FILENAME__"
diff --git a/src/black/files.py b/src/black/files.py
index 213bbf82a58e79fc45e4cf67d1773f05953034b7..ba60c84a27528ed378aa2f7073ebd7c4d830aef5 100644
--- a/src/black/files.py
+++ b/src/black/files.py
@@ -22,6 +22,7 @@ import tomli
 
 from black.output import err
 from black.report import Report
+from black.handle_ipynb_magics import jupyter_dependencies_are_installed
 
 if TYPE_CHECKING:
     import colorama  # noqa: F401
@@ -165,6 +166,9 @@ def gen_python_files(
     force_exclude: Optional[Pattern[str]],
     report: Report,
     gitignore: Optional[PathSpec],
+    *,
+    verbose: bool,
+    quiet: bool,
 ) -> Iterator[Path]:
     """Generate all files under `path` whose paths are not excluded by the
     `exclude_regex`, `extend_exclude`, or `force_exclude` regexes,
@@ -216,9 +220,15 @@ def gen_python_files(
                 force_exclude,
                 report,
                 gitignore + get_gitignore(child) if gitignore is not None else None,
+                verbose=verbose,
+                quiet=quiet,
             )
 
         elif child.is_file():
+            if child.suffix == ".ipynb" and not jupyter_dependencies_are_installed(
+                verbose=verbose, quiet=quiet
+            ):
+                continue
             include_match = include.search(normalized_path) if include else True
             if include_match:
                 yield child
diff --git a/src/black/handle_ipynb_magics.py b/src/black/handle_ipynb_magics.py
new file mode 100644
index 0000000..ad93c44
--- /dev/null
+++ b/src/black/handle_ipynb_magics.py
@@ -0,0 +1,457 @@
+"""Functions to process IPython magics with."""
+from functools import lru_cache
+import dataclasses
+import ast
+from typing import Dict
+
+import secrets
+from typing import List, Tuple
+import collections
+
+from typing import Optional
+from typing_extensions import TypeGuard
+from black.report import NothingChanged
+from black.output import out
+
+
+TRANSFORMED_MAGICS = frozenset(
+    (
+        "get_ipython().run_cell_magic",
+        "get_ipython().system",
+        "get_ipython().getoutput",
+        "get_ipython().run_line_magic",
+    )
+)
+TOKENS_TO_IGNORE = frozenset(
+    (
+        "ENDMARKER",
+        "NL",
+        "NEWLINE",
+        "COMMENT",
+        "DEDENT",
+        "UNIMPORTANT_WS",
+        "ESCAPED_NL",
+    )
+)
+NON_PYTHON_CELL_MAGICS = frozenset(
+    (
+        "%%bash",
+        "%%html",
+        "%%javascript",
+        "%%js",
+        "%%latex",
+        "%%markdown",
+        "%%perl",
+        "%%ruby",
+        "%%script",
+        "%%sh",
+        "%%svg",
+        "%%writefile",
+    )
+)
+
+
+@dataclasses.dataclass(frozen=True)
+class Replacement:
+    mask: str
+    src: str
+
+
+@lru_cache()
+def jupyter_dependencies_are_installed(*, verbose: bool, quiet: bool) -> bool:
+    try:
+        import IPython  # noqa:F401
+        import tokenize_rt  # noqa:F401
+    except ModuleNotFoundError:
+        if verbose or not quiet:
+            msg = (
+                "Skipping .ipynb files as Jupyter dependencies are not installed.\n"
+                "You can fix this by running ``pip install black[jupyter]``"
+            )
+            out(msg)
+        return False
+    else:
+        return True
+
+
+def remove_trailing_semicolon(src: str) -> Tuple[str, bool]:
+    """Remove trailing semicolon from Jupyter notebook cell.
+
+    For example,
+
+        fig, ax = plt.subplots()
+        ax.plot(x_data, y_data);  # plot data
+
+    would become
+
+        fig, ax = plt.subplots()
+        ax.plot(x_data, y_data)  # plot data
+
+    Mirrors the logic in `quiet` from `IPython.core.displayhook`, but uses
+    ``tokenize_rt`` so that round-tripping works fine.
+    """
+    from tokenize_rt import (
+        src_to_tokens,
+        tokens_to_src,
+        reversed_enumerate,
+    )
+
+    tokens = src_to_tokens(src)
+    trailing_semicolon = False
+    for idx, token in reversed_enumerate(tokens):
+        if token.name in TOKENS_TO_IGNORE:
+            continue
+        if token.name == "OP" and token.src == ";":
+            del tokens[idx]
+            trailing_semicolon = True
+        break
+    if not trailing_semicolon:
+        return src, False
+    return tokens_to_src(tokens), True
+
+
+def put_trailing_semicolon_back(src: str, has_trailing_semicolon: bool) -> str:
+    """Put trailing semicolon back if cell originally had it.
+
+    Mirrors the logic in `quiet` from `IPython.core.displayhook`, but uses
+    ``tokenize_rt`` so that round-tripping works fine.
+    """
+    if not has_trailing_semicolon:
+        return src
+    from tokenize_rt import src_to_tokens, tokens_to_src, reversed_enumerate
+
+    tokens = src_to_tokens(src)
+    for idx, token in reversed_enumerate(tokens):
+        if token.name in TOKENS_TO_IGNORE:
+            continue
+        tokens[idx] = token._replace(src=token.src + ";")
+        break
+    else:  # pragma: nocover
+        raise AssertionError(
+            "INTERNAL ERROR: Was not able to reinstate trailing semicolon. "
+            "Please report a bug on https://github.com/psf/black/issues.  "
+        ) from None
+    return str(tokens_to_src(tokens))
+
+
+def mask_cell(src: str) -> Tuple[str, List[Replacement]]:
+    """Mask IPython magics so content becomes parseable Python code.
+
+    For example,
+
+        %matplotlib inline
+        'foo'
+
+    becomes
+
+        "25716f358c32750e"
+        'foo'
+
+    The replacements are returned, along with the transformed code.
+    """
+    replacements: List[Replacement] = []
+    try:
+        ast.parse(src)
+    except SyntaxError:
+        # Might have IPython magics, will process below.
+        pass
+    else:
+        # Syntax is fine, nothing to mask, early return.
+        return src, replacements
+
+    from IPython.core.inputtransformer2 import TransformerManager
+
+    transformer_manager = TransformerManager()
+    transformed = transformer_manager.transform_cell(src)
+    transformed, cell_magic_replacements = replace_cell_magics(transformed)
+    replacements += cell_magic_replacements
+    transformed = transformer_manager.transform_cell(transformed)
+    transformed, magic_replacements = replace_magics(transformed)
+    if len(transformed.splitlines()) != len(src.splitlines()):
+        # Multi-line magic, not supported.
+        raise NothingChanged
+    replacements += magic_replacements
+    return transformed, replacements
+
+
+def get_token(src: str, magic: str) -> str:
+    """Return randomly generated token to mask IPython magic with.
+
+    For example, if 'magic' was `%matplotlib inline`, then a possible
+    token to mask it with would be `"43fdd17f7e5ddc83"`. The token
+    will be the same length as the magic, and we make sure that it was
+    not already present anywhere else in the cell.
+    """
+    assert magic
+    nbytes = max(len(magic) // 2 - 1, 1)
+    token = secrets.token_hex(nbytes)
+    counter = 0
+    while token in src:  # pragma: nocover
+        token = secrets.token_hex(nbytes)
+        counter += 1
+        if counter > 100:
+            raise AssertionError(
+                "INTERNAL ERROR: Black was not able to replace IPython magic. "
+                "Please report a bug on https://github.com/psf/black/issues.  "
+                f"The magic might be helpful: {magic}"
+            ) from None
+    if len(token) + 2 < len(magic):
+        token = f"{token}."
+    return f'"{token}"'
+
+
+def replace_cell_magics(src: str) -> Tuple[str, List[Replacement]]:
+    """Replace cell magic with token.
+
+    Note that 'src' will already have been processed by IPython's
+    TransformerManager().transform_cell.
+
+    Example,
+
+        get_ipython().run_cell_magic('t', '-n1', 'ls =!ls\\n')
+
+    becomes
+
+        "a794."
+        ls =!ls
+
+    The replacement, along with the transformed code, is returned.
+    """
+    replacements: List[Replacement] = []
+
+    tree = ast.parse(src)
+
+    cell_magic_finder = CellMagicFinder()
+    cell_magic_finder.visit(tree)
+    if cell_magic_finder.cell_magic is None:
+        return src, replacements
+    if cell_magic_finder.cell_magic.header.split()[0] in NON_PYTHON_CELL_MAGICS:
+        raise NothingChanged
+    mask = get_token(src, cell_magic_finder.cell_magic.header)
+    replacements.append(Replacement(mask=mask, src=cell_magic_finder.cell_magic.header))
+    return f"{mask}\n{cell_magic_finder.cell_magic.body}", replacements
+
+
+def replace_magics(src: str) -> Tuple[str, List[Replacement]]:
+    """Replace magics within body of cell.
+
+    Note that 'src' will already have been processed by IPython's
+    TransformerManager().transform_cell.
+
+    Example, this
+
+        get_ipython().run_line_magic('matplotlib', 'inline')
+        'foo'
+
+    becomes
+
+        "5e67db56d490fd39"
+        'foo'
+
+    The replacement, along with the transformed code, are returned.
+    """
+    replacements = []
+    magic_finder = MagicFinder()
+    magic_finder.visit(ast.parse(src))
+    new_srcs = []
+    for i, line in enumerate(src.splitlines(), start=1):
+        if i in magic_finder.magics:
+            offsets_and_magics = magic_finder.magics[i]
+            if len(offsets_and_magics) != 1:  # pragma: nocover
+                raise AssertionError(
+                    f"Expecting one magic per line, got: {offsets_and_magics}\n"
+                    "Please report a bug on https://github.com/psf/black/issues."
+                )
+            col_offset, magic = (
+                offsets_and_magics[0].col_offset,
+                offsets_and_magics[0].magic,
+            )
+            mask = get_token(src, magic)
+            replacements.append(Replacement(mask=mask, src=magic))
+            line = line[:col_offset] + mask
+        new_srcs.append(line)
+    return "\n".join(new_srcs), replacements
+
+
+def unmask_cell(src: str, replacements: List[Replacement]) -> str:
+    """Remove replacements from cell.
+
+    For example
+
+        "9b20"
+        foo = bar
+
+    becomes
+
+        %%time
+        foo = bar
+    """
+    for replacement in replacements:
+        src = src.replace(replacement.mask, replacement.src)
+    return src
+
+
+def _is_ipython_magic(node: ast.expr) -> TypeGuard[ast.Attribute]:
+    """Check if attribute is IPython magic.
+
+    Note that the source of the abstract syntax tree
+    will already have been processed by IPython's
+    TransformerManager().transform_cell.
+    """
+    return (
+        isinstance(node, ast.Attribute)
+        and isinstance(node.value, ast.Call)
+        and isinstance(node.value.func, ast.Name)
+        and node.value.func.id == "get_ipython"
+    )
+
+
+@dataclasses.dataclass(frozen=True)
+class CellMagic:
+    header: str
+    body: str
+
+
+@dataclasses.dataclass
+class CellMagicFinder(ast.NodeVisitor):
+    """Find cell magics.
+
+    Note that the source of the abstract syntax tree
+    will already have been processed by IPython's
+    TransformerManager().transform_cell.
+
+    For example,
+
+        %%time\nfoo()
+
+    would have been transformed to
+
+        get_ipython().run_cell_magic('time', '', 'foo()\\n')
+
+    and we look for instances of the latter.
+    """
+
+    cell_magic: Optional[CellMagic] = None
+
+    def visit_Expr(self, node: ast.Expr) -> None:
+        """Find cell magic, extract header and body."""
+        if (
+            isinstance(node.value, ast.Call)
+            and _is_ipython_magic(node.value.func)
+            and node.value.func.attr == "run_cell_magic"
+        ):
+            args = []
+            for arg in node.value.args:
+                assert isinstance(arg, ast.Str)
+                args.append(arg.s)
+            header = f"%%{args[0]}"
+            if args[1]:
+                header += f" {args[1]}"
+            self.cell_magic = CellMagic(header=header, body=args[2])
+        self.generic_visit(node)
+
+
+@dataclasses.dataclass(frozen=True)
+class OffsetAndMagic:
+    col_offset: int
+    magic: str
+
+
+@dataclasses.dataclass
+class MagicFinder(ast.NodeVisitor):
+    """Visit cell to look for get_ipython calls.
+
+    Note that the source of the abstract syntax tree
+    will already have been processed by IPython's
+    TransformerManager().transform_cell.
+
+    For example,
+
+        %matplotlib inline
+
+    would have been transformed to
+
+        get_ipython().run_line_magic('matplotlib', 'inline')
+
+    and we look for instances of the latter (and likewise for other
+    types of magics).
+    """
+
+    magics: Dict[int, List[OffsetAndMagic]] = dataclasses.field(
+        default_factory=lambda: collections.defaultdict(list)
+    )
+
+    def visit_Assign(self, node: ast.Assign) -> None:
+        """Look for system assign magics.
+
+        For example,
+
+            black_version = !black --version
+
+        would have been transformed to
+
+            black_version = get_ipython().getoutput('black --version')
+
+        and we look for instances of the latter.
+        """
+        if (
+            isinstance(node.value, ast.Call)
+            and _is_ipython_magic(node.value.func)
+            and node.value.func.attr == "getoutput"
+        ):
+            args = []
+            for arg in node.value.args:
+                assert isinstance(arg, ast.Str)
+                args.append(arg.s)
+            assert args
+            src = f"!{args[0]}"
+            self.magics[node.value.lineno].append(
+                OffsetAndMagic(node.value.col_offset, src)
+            )
+        self.generic_visit(node)
+
+    def visit_Expr(self, node: ast.Expr) -> None:
+        """Look for magics in body of cell.
+
+        For example,
+
+            !ls
+            !!ls
+            ?ls
+            ??ls
+
+        would (respectively) get transformed to
+
+            get_ipython().system('ls')
+            get_ipython().getoutput('ls')
+            get_ipython().run_line_magic('pinfo', 'ls')
+            get_ipython().run_line_magic('pinfo2', 'ls')
+
+        and we look for instances of any of the latter.
+        """
+        if isinstance(node.value, ast.Call) and _is_ipython_magic(node.value.func):
+            args = []
+            for arg in node.value.args:
+                assert isinstance(arg, ast.Str)
+                args.append(arg.s)
+            assert args
+            if node.value.func.attr == "run_line_magic":
+                if args[0] == "pinfo":
+                    src = f"?{args[1]}"
+                elif args[0] == "pinfo2":
+                    src = f"??{args[1]}"
+                else:
+                    src = f"%{args[0]}"
+                    if args[1]:
+                        assert src is not None
+                        src += f" {args[1]}"
+            elif node.value.func.attr == "system":
+                src = f"!{args[0]}"
+            elif node.value.func.attr == "getoutput":
+                src = f"!!{args[0]}"
+            else:
+                raise NothingChanged  # unsupported magic.
+            self.magics[node.value.lineno].append(
+                OffsetAndMagic(node.value.col_offset, src)
+            )
+        self.generic_visit(node)
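Reviewer's sketch (not part of the patch): the mask/unmask round trip described in the docstrings above can be illustrated in isolation. The names make_token, mask_line and unmask are hypothetical stand-ins invented for this sketch, and the line number and column offset are passed in by hand rather than found by an ast visitor. It assumes only what the diff shows: a magic is swapped for a quoted token that does not occur in the source, and later swapped back by plain string replacement.

```python
import secrets
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Replacement:
    mask: str  # placeholder string literal that a Python formatter can parse
    src: str   # original magic text, e.g. "!ls"


def make_token(src: str, magic: str) -> str:
    """Hypothetical stand-in for get_token: a quoted hex string absent from src."""
    nbytes = max(len(magic) // 2 - 1, 1)
    token = f'"{secrets.token_hex(nbytes)}"'
    while token in src:  # retry until the token cannot clash with the cell
        token = f'"{secrets.token_hex(nbytes)}"'
    return token


def mask_line(
    src: str, lineno: int, col_offset: int, magic: str
) -> Tuple[str, List[Replacement]]:
    """Replace everything from col_offset on line `lineno` (1-based) with a mask."""
    lines = src.splitlines()
    mask = make_token(src, magic)
    lines[lineno - 1] = lines[lineno - 1][:col_offset] + mask
    return "\n".join(lines), [Replacement(mask=mask, src=magic)]


def unmask(src: str, replacements: List[Replacement]) -> str:
    """Put the original magics back in place of their masks."""
    for replacement in replacements:
        src = src.replace(replacement.mask, replacement.src)
    return src


# The transformed form of "ls = !ls", as shown in the visit_Assign docstring.
transformed = "ls = get_ipython().getoutput('ls')\n'foo'"
masked, replacements = mask_line(transformed, lineno=1, col_offset=5, magic="!ls")
restored = unmask(masked, replacements)
```

In this sketch the masked text is ordinary Python (an assignment of a string literal), so a formatter can process it before the magic is restored; how the real get_token picks its token is not shown in this part of the diff.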
index e2ce322da5cf5a65519f5d197fe10d56fca8c845..0b7624eaf8ae5fad9f81d7cabde094709119847f 100644 (file)
@@ -101,6 +101,7 @@ class Mode:
     line_length: int = DEFAULT_LINE_LENGTH
     string_normalization: bool = True
     is_pyi: bool = False
+    is_ipynb: bool = False
     magic_trailing_comma: bool = True
     experimental_string_processing: bool = False
 
@@ -117,6 +118,7 @@ class Mode:
             str(self.line_length),
             str(int(self.string_normalization)),
             str(int(self.is_pyi)),
+            str(int(self.is_ipynb)),
             str(int(self.magic_trailing_comma)),
             str(int(self.experimental_string_processing)),
         ]
index c253c85e90e87353b75d47afc35b015395792dea..fd3dbb376279497c2fc23f6eb7f03d6db96c4fc7 100644 (file)
@@ -3,6 +3,7 @@
 The double calls are for patching purposes in tests.
 """
 
+import json
 from typing import Any, Optional
 from mypy_extensions import mypyc_attr
 import tempfile
@@ -34,6 +35,23 @@ def err(message: Optional[str] = None, nl: bool = True, **styles: Any) -> None:
     _err(message, nl=nl, **styles)
 
 
+def ipynb_diff(a: str, b: str, a_name: str, b_name: str) -> str:
+    """Return a unified diff string between each cell in notebooks `a` and `b`."""
+    a_nb = json.loads(a)
+    b_nb = json.loads(b)
+    diff_lines = [
+        diff(
+            "".join(a_nb["cells"][cell_number]["source"]) + "\n",
+            "".join(b_nb["cells"][cell_number]["source"]) + "\n",
+            f"{a_name}:cell_{cell_number}",
+            f"{b_name}:cell_{cell_number}",
+        )
+        for cell_number, cell in enumerate(a_nb["cells"])
+        if cell["cell_type"] == "code"
+    ]
+    return "".join(diff_lines)
+
+
 def diff(a: str, b: str, a_name: str, b_name: str) -> str:
     """Return a unified diff string between strings `a` and `b`."""
     import difflib
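Reviewer's sketch (not part of the patch): the per-cell diffing that ipynb_diff performs can be tried on its own. The function name cell_diffs is hypothetical, and it calls difflib.unified_diff directly instead of the project's diff helper, whose body is not shown in this hunk. Like the patch, it assumes both notebooks have the same number of cells in the same order.

```python
import difflib
import json


def cell_diffs(a: str, b: str, a_name: str, b_name: str) -> str:
    """Return unified diffs for each code cell of two notebook JSON strings."""
    a_cells = json.loads(a)["cells"]
    b_cells = json.loads(b)["cells"]
    chunks = []
    for number, cell in enumerate(a_cells):
        if cell["cell_type"] != "code":
            continue  # markdown and raw cells are never reformatted
        # "source" is a list of lines; join it and terminate with a newline.
        a_src = "".join(cell["source"]) + "\n"
        b_src = "".join(b_cells[number]["source"]) + "\n"
        chunks.extend(
            difflib.unified_diff(
                a_src.splitlines(keepends=True),
                b_src.splitlines(keepends=True),
                f"{a_name}:cell_{number}",
                f"{b_name}:cell_{number}",
            )
        )
    return "".join(chunks)


def notebook(last_line: str) -> str:
    """Build a one-cell notebook string for the demonstration below."""
    cell = {"cell_type": "code", "source": ["%%time\n", "\n", last_line]}
    return json.dumps({"cells": [cell]})


before = notebook("print('foo')")
after = notebook('print("foo")')
result = cell_diffs(before, after, "nb.ipynb", "nb.ipynb")
```

Labelling each diff with the cell number keeps the output readable, since line numbers in a hunk are relative to the cell rather than to the notebook file.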
index 8fc5da2e1672d831ac9ef93636b6fdd467c28383..7e1c8b4b87f273bf6b99868fcdc641afadb8c4d6 100644 (file)
@@ -16,6 +16,10 @@ class Changed(Enum):
     YES = 2
 
 
+class NothingChanged(UserWarning):
+    """Raised when reformatted code is the same as source."""
+
+
 @dataclass
 class Report:
     """Provides a reformatting counter. Can be rendered with `str(report)`."""
index 9665b491a814fbd3c1728cce798e8a540224448a..edbed3f33dd3a353ae5d8555e4150f66db7a8359 100644 (file)
       "long_checkout": false,
       "py_versions": ["all"]
     },
+    "scikit-lego": {
+      "cli_arguments": ["--experimental-string-processing"],
+      "expect_formatting_changes": true,
+      "git_clone_url": "https://github.com/koaning/scikit-lego",
+      "long_checkout": false,
+      "py_versions": ["all"]
+    },
     "sqlalchemy": {
       "cli_arguments": ["--experimental-string-processing"],
       "expect_formatting_changes": true,
diff --git a/tests/data/non_python_notebook.ipynb b/tests/data/non_python_notebook.ipynb
new file mode 100644 (file)
index 0000000..da5cdd8
--- /dev/null
@@ -0,0 +1 @@
+{"metadata":{"kernelspec":{"name":"ir","display_name":"R","language":"R"},"language_info":{"name":"R","codemirror_mode":"r","pygments_lexer":"r","mimetype":"text/x-r-source","file_extension":".r","version":"4.0.5"}},"nbformat_minor":4,"nbformat":4,"cells":[{"cell_type":"code","source":"library(tidyverse) ","metadata":{"_uuid":"051d70d956493feee0c6d64651c6a088724dca2a","_execution_state":"idle"},"execution_count":null,"outputs":[]}]}
\ No newline at end of file
diff --git a/tests/data/notebook_empty_metadata.ipynb b/tests/data/notebook_empty_metadata.ipynb
new file mode 100644 (file)
index 0000000..7dc1f80
--- /dev/null
@@ -0,0 +1,27 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "%%time\n",
+    "\n",
+    "print('foo')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {},
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/tests/data/notebook_no_trailing_newline.ipynb b/tests/data/notebook_no_trailing_newline.ipynb
new file mode 100644 (file)
index 0000000..79f95be
--- /dev/null
@@ -0,0 +1,39 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "%%time\n",
+    "\n",
+    "print('foo')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "interpreter": {
+   "hash": "e758f3098b5b55f4d87fe30bbdc1367f20f246b483f96267ee70e6c40cb185d8"
+  },
+  "kernelspec": {
+   "display_name": "Python 3.8.10 64-bit ('black': venv)",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python",
+   "version": ""
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
\ No newline at end of file
diff --git a/tests/data/notebook_trailing_newline.ipynb b/tests/data/notebook_trailing_newline.ipynb
new file mode 100644 (file)
index 0000000..4f82869
--- /dev/null
@@ -0,0 +1,39 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "%%time\n",
+    "\n",
+    "print('foo')"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "interpreter": {
+   "hash": "e758f3098b5b55f4d87fe30bbdc1367f20f246b483f96267ee70e6c40cb185d8"
+  },
+  "kernelspec": {
+   "display_name": "Python 3.8.10 64-bit ('black': venv)",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python",
+   "version": ""
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
diff --git a/tests/data/notebook_which_cant_be_parsed.ipynb b/tests/data/notebook_which_cant_be_parsed.ipynb
new file mode 100644 (file)
index 0000000..257cc56
--- /dev/null
@@ -0,0 +1 @@
+foo
diff --git a/tests/data/notebook_without_changes.ipynb b/tests/data/notebook_without_changes.ipynb
new file mode 100644 (file)
index 0000000..ac6c7e6
--- /dev/null
@@ -0,0 +1,46 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "%%time\n",
+    "\n",
+    "print(\"foo\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This notebook should not be reformatted"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "interpreter": {
+   "hash": "e758f3098b5b55f4d87fe30bbdc1367f20f246b483f96267ee70e6c40cb185d8"
+  },
+  "kernelspec": {
+   "display_name": "Python 3.8.10 64-bit ('black': venv)",
+   "name": "python3"
+  },
+  "language_info": {
+   "name": "python",
+   "version": ""
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
\ No newline at end of file
index 998ecfcbdebfe8b1968046ec62ef206030609e0b..942446ec3435d03d0aa580f9d7fccbd867bf3de9 100644 (file)
@@ -1379,6 +1379,8 @@ class BlackTestCase(BlackBaseTestCase):
                 None,
                 report,
                 gitignore,
+                verbose=False,
+                quiet=False,
             )
         )
         self.assertEqual(sorted(expected), sorted(sources))
@@ -1690,6 +1692,8 @@ class BlackTestCase(BlackBaseTestCase):
                 None,
                 report,
                 gitignore,
+                verbose=False,
+                quiet=False,
             )
         )
         self.assertEqual(sorted(expected), sorted(sources))
@@ -1717,6 +1721,8 @@ class BlackTestCase(BlackBaseTestCase):
                 None,
                 report,
                 root_gitignore,
+                verbose=False,
+                quiet=False,
             )
         )
         self.assertEqual(sorted(expected), sorted(sources))
@@ -1751,6 +1757,8 @@ class BlackTestCase(BlackBaseTestCase):
                 None,
                 report,
                 gitignore,
+                verbose=False,
+                quiet=False,
             )
         )
         self.assertEqual(sorted(expected), sorted(sources))
@@ -1775,6 +1783,8 @@ class BlackTestCase(BlackBaseTestCase):
                 None,
                 report,
                 gitignore,
+                verbose=False,
+                quiet=False,
             )
         )
         self.assertEqual(sorted(expected), sorted(sources))
@@ -1847,6 +1857,8 @@ class BlackTestCase(BlackBaseTestCase):
                     None,
                     report,
                     gitignore,
+                    verbose=False,
+                    quiet=False,
                 )
             )
         except ValueError as ve:
@@ -1868,6 +1880,8 @@ class BlackTestCase(BlackBaseTestCase):
                     None,
                     report,
                     gitignore,
+                    verbose=False,
+                    quiet=False,
                 )
             )
         path.iterdir.assert_called()
diff --git a/tests/test_ipynb.py b/tests/test_ipynb.py
new file mode 100644 (file)
index 0000000..038155e
--- /dev/null
@@ -0,0 +1,455 @@
+import pathlib
+from click.testing import CliRunner
+from black.handle_ipynb_magics import jupyter_dependencies_are_installed
+from black import (
+    main,
+    NothingChanged,
+    format_cell,
+    format_file_contents,
+    format_file_in_place,
+)
+import os
+import pytest
+from black import Mode
+from _pytest.monkeypatch import MonkeyPatch
+from py.path import local
+
+pytestmark = pytest.mark.jupyter
+pytest.importorskip("IPython", reason="IPython is an optional dependency")
+pytest.importorskip("tokenize_rt", reason="tokenize-rt is an optional dependency")
+
+JUPYTER_MODE = Mode(is_ipynb=True)
+
+runner = CliRunner()
+
+
+def test_noop() -> None:
+    src = 'foo = "a"'
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+@pytest.mark.parametrize("fast", [True, False])
+def test_trailing_semicolon(fast: bool) -> None:
+    src = 'foo = "a" ;'
+    result = format_cell(src, fast=fast, mode=JUPYTER_MODE)
+    expected = 'foo = "a";'
+    assert result == expected
+
+
+def test_trailing_semicolon_with_comment() -> None:
+    src = 'foo = "a" ;  # bar'
+    result = format_cell(src, fast=True, mode=JUPYTER_MODE)
+    expected = 'foo = "a";  # bar'
+    assert result == expected
+
+
+def test_trailing_semicolon_with_comment_on_next_line() -> None:
+    src = "import black;\n\n# this is a comment"
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_trailing_semicolon_indented() -> None:
+    src = "with foo:\n    plot_bar();"
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_trailing_semicolon_noop() -> None:
+    src = 'foo = "a";'
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_cell_magic() -> None:
+    src = "%%time\nfoo =bar"
+    result = format_cell(src, fast=True, mode=JUPYTER_MODE)
+    expected = "%%time\nfoo = bar"
+    assert result == expected
+
+
+def test_cell_magic_noop() -> None:
+    src = "%%time\n2 + 2"
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+@pytest.mark.parametrize(
+    "src, expected",
+    (
+        pytest.param("ls =!ls", "ls = !ls", id="System assignment"),
+        pytest.param("!ls\n'foo'", '!ls\n"foo"', id="System call"),
+        pytest.param("!!ls\n'foo'", '!!ls\n"foo"', id="Other system call"),
+        pytest.param("?str\n'foo'", '?str\n"foo"', id="Help"),
+        pytest.param("??str\n'foo'", '??str\n"foo"', id="Other help"),
+        pytest.param(
+            "%matplotlib inline\n'foo'",
+            '%matplotlib inline\n"foo"',
+            id="Line magic with argument",
+        ),
+        pytest.param("%time\n'foo'", '%time\n"foo"', id="Line magic without argument"),
+    ),
+)
+def test_magic(src: str, expected: str) -> None:
+    result = format_cell(src, fast=True, mode=JUPYTER_MODE)
+    assert result == expected
+
+
+@pytest.mark.parametrize(
+    "src",
+    (
+        "%%bash\n2+2",
+        "%%html --isolated\n2+2",
+    ),
+)
+def test_non_python_magics(src: str) -> None:
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_set_input() -> None:
+    src = "a = b??"
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_input_already_contains_transformed_magic() -> None:
+    src = '%time foo()\nget_ipython().run_cell_magic("time", "", "foo()\\n")'
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_magic_noop() -> None:
+    src = "ls = !ls"
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_cell_magic_with_magic() -> None:
+    src = "%%t -n1\nls =!ls"
+    result = format_cell(src, fast=True, mode=JUPYTER_MODE)
+    expected = "%%t -n1\nls = !ls"
+    assert result == expected
+
+
+def test_cell_magic_nested() -> None:
+    src = "%%time\n%%time\n2+2"
+    result = format_cell(src, fast=True, mode=JUPYTER_MODE)
+    expected = "%%time\n%%time\n2 + 2"
+    assert result == expected
+
+
+def test_cell_magic_with_magic_noop() -> None:
+    src = "%%t -n1\nls = !ls"
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_automagic() -> None:
+    src = "pip install black"
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_multiline_magic() -> None:
+    src = "%time 1 + \\\n2"
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_multiline_no_magic() -> None:
+    src = "1 + \\\n2"
+    result = format_cell(src, fast=True, mode=JUPYTER_MODE)
+    expected = "1 + 2"
+    assert result == expected
+
+
+def test_cell_magic_with_invalid_body() -> None:
+    src = "%%time\nif True"
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_empty_cell() -> None:
+    src = ""
+    with pytest.raises(NothingChanged):
+        format_cell(src, fast=True, mode=JUPYTER_MODE)
+
+
+def test_entire_notebook_empty_metadata() -> None:
+    with open(
+        os.path.join("tests", "data", "notebook_empty_metadata.ipynb"), "rb"
+    ) as fd:
+        content_bytes = fd.read()
+    content = content_bytes.decode()
+    result = format_file_contents(content, fast=True, mode=JUPYTER_MODE)
+    expected = (
+        "{\n"
+        ' "cells": [\n'
+        "  {\n"
+        '   "cell_type": "code",\n'
+        '   "execution_count": null,\n'
+        '   "metadata": {\n'
+        '    "tags": []\n'
+        "   },\n"
+        '   "outputs": [],\n'
+        '   "source": [\n'
+        '    "%%time\\n",\n'
+        '    "\\n",\n'
+        '    "print(\\"foo\\")"\n'
+        "   ]\n"
+        "  },\n"
+        "  {\n"
+        '   "cell_type": "code",\n'
+        '   "execution_count": null,\n'
+        '   "metadata": {},\n'
+        '   "outputs": [],\n'
+        '   "source": []\n'
+        "  }\n"
+        " ],\n"
+        ' "metadata": {},\n'
+        ' "nbformat": 4,\n'
+        ' "nbformat_minor": 4\n'
+        "}\n"
+    )
+    assert result == expected
+
+
+def test_entire_notebook_trailing_newline() -> None:
+    with open(
+        os.path.join("tests", "data", "notebook_trailing_newline.ipynb"), "rb"
+    ) as fd:
+        content_bytes = fd.read()
+    content = content_bytes.decode()
+    result = format_file_contents(content, fast=True, mode=JUPYTER_MODE)
+    expected = (
+        "{\n"
+        ' "cells": [\n'
+        "  {\n"
+        '   "cell_type": "code",\n'
+        '   "execution_count": null,\n'
+        '   "metadata": {\n'
+        '    "tags": []\n'
+        "   },\n"
+        '   "outputs": [],\n'
+        '   "source": [\n'
+        '    "%%time\\n",\n'
+        '    "\\n",\n'
+        '    "print(\\"foo\\")"\n'
+        "   ]\n"
+        "  },\n"
+        "  {\n"
+        '   "cell_type": "code",\n'
+        '   "execution_count": null,\n'
+        '   "metadata": {},\n'
+        '   "outputs": [],\n'
+        '   "source": []\n'
+        "  }\n"
+        " ],\n"
+        ' "metadata": {\n'
+        '  "interpreter": {\n'
+        '   "hash": "e758f3098b5b55f4d87fe30bbdc1367f20f246b483f96267ee70e6c40cb185d8"\n'  # noqa:B950
+        "  },\n"
+        '  "kernelspec": {\n'
+        '   "display_name": "Python 3.8.10 64-bit (\'black\': venv)",\n'
+        '   "name": "python3"\n'
+        "  },\n"
+        '  "language_info": {\n'
+        '   "name": "python",\n'
+        '   "version": ""\n'
+        "  }\n"
+        " },\n"
+        ' "nbformat": 4,\n'
+        ' "nbformat_minor": 4\n'
+        "}\n"
+    )
+    assert result == expected
+
+
+def test_entire_notebook_no_trailing_newline() -> None:
+    with open(
+        os.path.join("tests", "data", "notebook_no_trailing_newline.ipynb"), "rb"
+    ) as fd:
+        content_bytes = fd.read()
+    content = content_bytes.decode()
+    result = format_file_contents(content, fast=True, mode=JUPYTER_MODE)
+    expected = (
+        "{\n"
+        ' "cells": [\n'
+        "  {\n"
+        '   "cell_type": "code",\n'
+        '   "execution_count": null,\n'
+        '   "metadata": {\n'
+        '    "tags": []\n'
+        "   },\n"
+        '   "outputs": [],\n'
+        '   "source": [\n'
+        '    "%%time\\n",\n'
+        '    "\\n",\n'
+        '    "print(\\"foo\\")"\n'
+        "   ]\n"
+        "  },\n"
+        "  {\n"
+        '   "cell_type": "code",\n'
+        '   "execution_count": null,\n'
+        '   "metadata": {},\n'
+        '   "outputs": [],\n'
+        '   "source": []\n'
+        "  }\n"
+        " ],\n"
+        ' "metadata": {\n'
+        '  "interpreter": {\n'
+        '   "hash": "e758f3098b5b55f4d87fe30bbdc1367f20f246b483f96267ee70e6c40cb185d8"\n'  # noqa: B950
+        "  },\n"
+        '  "kernelspec": {\n'
+        '   "display_name": "Python 3.8.10 64-bit (\'black\': venv)",\n'
+        '   "name": "python3"\n'
+        "  },\n"
+        '  "language_info": {\n'
+        '   "name": "python",\n'
+        '   "version": ""\n'
+        "  }\n"
+        " },\n"
+        ' "nbformat": 4,\n'
+        ' "nbformat_minor": 4\n'
+        "}"
+    )
+    assert result == expected
+
+
+def test_entire_notebook_without_changes() -> None:
+    with open(
+        os.path.join("tests", "data", "notebook_without_changes.ipynb"), "rb"
+    ) as fd:
+        content_bytes = fd.read()
+    content = content_bytes.decode()
+    with pytest.raises(NothingChanged):
+        format_file_contents(content, fast=True, mode=JUPYTER_MODE)
+
+
+def test_non_python_notebook() -> None:
+    with open(os.path.join("tests", "data", "non_python_notebook.ipynb"), "rb") as fd:
+        content_bytes = fd.read()
+    content = content_bytes.decode()
+    with pytest.raises(NothingChanged):
+        format_file_contents(content, fast=True, mode=JUPYTER_MODE)
+
+
+def test_empty_string() -> None:
+    with pytest.raises(NothingChanged):
+        format_file_contents("", fast=True, mode=JUPYTER_MODE)
+
+
+def test_unparseable_notebook() -> None:
+    msg = (
+        r"File 'tests[/\\]data[/\\]notebook_which_cant_be_parsed\.ipynb' "
+        r"cannot be parsed as valid Jupyter notebook\."
+    )
+    with pytest.raises(ValueError, match=msg):
+        format_file_in_place(
+            pathlib.Path("tests") / "data/notebook_which_cant_be_parsed.ipynb",
+            fast=True,
+            mode=JUPYTER_MODE,
+        )
+
+
+def test_ipynb_diff_with_change() -> None:
+    result = runner.invoke(
+        main,
+        [
+            os.path.join("tests", "data", "notebook_trailing_newline.ipynb"),
+            "--diff",
+        ],
+    )
+    expected = "@@ -1,3 +1,3 @@\n %%time\n \n-print('foo')\n" '+print("foo")\n'
+    assert expected in result.output
+
+
+def test_ipynb_diff_with_no_change() -> None:
+    result = runner.invoke(
+        main,
+        [
+            os.path.join("tests", "data", "notebook_without_changes.ipynb"),
+            "--diff",
+        ],
+    )
+    expected = "1 file would be left unchanged."
+    assert expected in result.output
+
+
+def test_cache_isnt_written_if_no_jupyter_deps_single(
+    monkeypatch: MonkeyPatch, tmpdir: local
+) -> None:
+    # Check that the cache isn't written to if Jupyter dependencies aren't installed.
+    jupyter_dependencies_are_installed.cache_clear()
+    nb = os.path.join("tests", "data", "notebook_trailing_newline.ipynb")
+    tmp_nb = tmpdir / "notebook.ipynb"
+    with open(nb) as src, open(tmp_nb, "w") as dst:
+        dst.write(src.read())
+    monkeypatch.setattr(
+        "black.jupyter_dependencies_are_installed", lambda verbose, quiet: False
+    )
+    result = runner.invoke(main, [str(tmpdir / "notebook.ipynb")])
+    assert "No Python files are present to be formatted. Nothing to do" in result.output
+    jupyter_dependencies_are_installed.cache_clear()
+    monkeypatch.setattr(
+        "black.jupyter_dependencies_are_installed", lambda verbose, quiet: True
+    )
+    result = runner.invoke(main, [str(tmpdir / "notebook.ipynb")])
+    assert "reformatted" in result.output
+
+
+def test_cache_isnt_written_if_no_jupyter_deps_dir(
+    monkeypatch: MonkeyPatch, tmpdir: local
+) -> None:
+    # Check that the cache isn't written to if Jupyter dependencies aren't installed.
+    jupyter_dependencies_are_installed.cache_clear()
+    nb = os.path.join("tests", "data", "notebook_trailing_newline.ipynb")
+    tmp_nb = tmpdir / "notebook.ipynb"
+    with open(nb) as src, open(tmp_nb, "w") as dst:
+        dst.write(src.read())
+    monkeypatch.setattr(
+        "black.files.jupyter_dependencies_are_installed", lambda verbose, quiet: False
+    )
+    result = runner.invoke(main, [str(tmpdir)])
+    assert "No Python files are present to be formatted. Nothing to do" in result.output
+    jupyter_dependencies_are_installed.cache_clear()
+    monkeypatch.setattr(
+        "black.files.jupyter_dependencies_are_installed", lambda verbose, quiet: True
+    )
+    result = runner.invoke(main, [str(tmpdir)])
+    assert "reformatted" in result.output
+
+
+def test_ipynb_flag(tmpdir: local) -> None:
+    nb = os.path.join("tests", "data", "notebook_trailing_newline.ipynb")
+    tmp_nb = tmpdir / "notebook.a_file_extension_which_is_definitely_not_ipynb"
+    with open(nb) as src, open(tmp_nb, "w") as dst:
+        dst.write(src.read())
+    result = runner.invoke(
+        main,
+        [
+            str(tmp_nb),
+            "--diff",
+            "--ipynb",
+        ],
+    )
+    expected = "@@ -1,3 +1,3 @@\n %%time\n \n-print('foo')\n" '+print("foo")\n'
+    assert expected in result.output
+
+
+def test_ipynb_and_pyi_flags() -> None:
+    nb = os.path.join("tests", "data", "notebook_trailing_newline.ipynb")
+    result = runner.invoke(
+        main,
+        [
+            nb,
+            "--pyi",
+            "--ipynb",
+            "--diff",
+        ],
+    )
+    assert isinstance(result.exception, SystemExit)
+    expected = "Cannot pass both `pyi` and `ipynb` flags!\n"
+    assert result.output == expected
diff --git a/tests/test_no_ipynb.py b/tests/test_no_ipynb.py
new file mode 100644 (file)
index 0000000..bcda2d5
--- /dev/null
@@ -0,0 +1,37 @@
+import pytest
+import os
+
+from tests.util import THIS_DIR
+from black import main, jupyter_dependencies_are_installed
+from click.testing import CliRunner
+from _pytest.tmpdir import tmpdir
+
+pytestmark = pytest.mark.no_jupyter
+
+runner = CliRunner()
+
+
+def test_ipynb_diff_with_no_change_single() -> None:
+    jupyter_dependencies_are_installed.cache_clear()
+    path = THIS_DIR / "data/notebook_trailing_newline.ipynb"
+    result = runner.invoke(main, [str(path)])
+    expected_output = (
+        "Skipping .ipynb files as Jupyter dependencies are not installed.\n"
+        "You can fix this by running ``pip install black[jupyter]``\n"
+    )
+    assert expected_output in result.output
+
+
+def test_ipynb_diff_with_no_change_dir(tmpdir: tmpdir) -> None:
+    jupyter_dependencies_are_installed.cache_clear()
+    runner = CliRunner()
+    nb = os.path.join("tests", "data", "notebook_trailing_newline.ipynb")
+    tmp_nb = tmpdir / "notebook.ipynb"
+    with open(nb) as src, open(tmp_nb, "w") as dst:
+        dst.write(src.read())
+    result = runner.invoke(main, [str(tmpdir)])
+    expected_output = (
+        "Skipping .ipynb files as Jupyter dependencies are not installed.\n"
+        "You can fix this by running ``pip install black[jupyter]``\n"
+    )
+    assert expected_output in result.output
diff --git a/tox.ini b/tox.ini
index 3767f98a73b00c07489a9f9be66125959a7e523c..57f41acb3d17a04845dc69c336339f21d3640e2d 100644 (file)
--- a/tox.ini
+++ b/tox.ini
@@ -16,10 +16,17 @@ commands =
     pip install -e .[d]
     coverage erase
     pytest tests --run-optional no_python2 \
+        --run-optional no_jupyter \
         !ci: --numprocesses auto \
         --cov {posargs}
     pip install -e .[d,python2]
     pytest tests --run-optional python2 \
+        --run-optional no_jupyter \
+        !ci: --numprocesses auto \
+        --cov --cov-append {posargs}
+    pip install -e .[jupyter]
+    pytest tests --run-optional jupyter \
+        -m jupyter \
         !ci: --numprocesses auto \
         --cov --cov-append {posargs}
     coverage report