Skip to content
This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

Less INDENT/DEDENT? #59

Open
jimbaker opened this issue Jun 5, 2020 · 8 comments
Open

Less INDENT/DEDENT? #59

jimbaker opened this issue Jun 5, 2020 · 8 comments

Comments

@jimbaker
Copy link

jimbaker commented Jun 5, 2020

Arguably indenting case statements under the head match should not be done:

  1. Other "decision statements" - if/elif/else and try/except - line up vertically. Code like the following from asyncio resembles closely the use of match in the use of isinstance for match, but of course it is not indented:
        try:
            nbytes = sock.recv_into(buf)
        except (BlockingIOError, InterruptedError):
            return  # try again next time
        except (SystemExit, KeyboardInterrupt):
            raise
        except BaseException as exc:
            fut.set_exception(exc)
        else:
            fut.set_result(nbytes)
  1. Additional indentation in decision statements is a visual indication of complexity. Match statements change this assumption. And of course frequently Python developers, myself included, try to avoid increasing indentation by returning early.

  2. Minimization of diffs is a good thing. This is especially true when we consider that we want to encourage refactoring, at the very least in the stdlib. Or the occasional need to blame. Compare the following:

2,3c2,4
<     if isinstance(value, str):
<         self.simple_element("string", value)
---
>     match value:
>         case str():
>             self.simple_element("string", value)
5,6c6,7
<     elif value is True:
<         self.simple_element("true")
---
>         case  True:
>             self.simple_element("true")
8,9c9,10
<     elif value is False:
<         self.simple_element("false")
---
>         case False:
>             self.simple_element("false")
11,15c12,16
<     elif isinstance(value, int):
<         if -1 << 63 <= value < 1 << 64:
<             self.simple_element("integer", "%d" % value)
<         else:
<             raise OverflowError(value)
---
>         case int():
>             if -1 << 63 <= value < 1 << 64:
>                 self.simple_element("integer", "%d" % value)
>             else:
>                 raise OverflowError(value)
17,18c18,19
<     elif isinstance(value, float):
<         self.simple_element("real", repr(value))
---
>         case float():
>             self.simple_element("real", repr(value))
20,21c21,22
<     elif isinstance(value, dict):
<         self.write_dict(value)
---
>         case dict():
>             self.write_dict(value)
23,24c24,25
<     elif isinstance(value, (bytes, bytearray)):
<         self.write_bytes(value)
---
>         case bytes() | bytearray():
>             self.write_bytes(value)
26,27c27,28
<     elif isinstance(value, datetime.datetime):
<         self.simple_element("date", _date_to_string(value))
---
>         case datetime.datetime():
>             self.simple_element("date", _date_to_string(value))
29,30c30,31
<     elif isinstance(value, (tuple, list)):
<         self.write_array(value)
---
>         case tuple() | list():
>             self.write_array(value)
32,33c33,34
<     else:
<         raise TypeError("unsupported type: %s" % type(value))
---
>         case _:
>             raise TypeError("unsupported type: %s" % type(value))

versus

2c2,3
<     if isinstance(value, str):
---
>     match value:
>     case str():
5c6
<     elif value is True:
---
>     case  True:
8c9
<     elif value is False:
---
>     case False:
11c12
<     elif isinstance(value, int):
---
>     case int():
17c18
<     elif isinstance(value, float):
---
>     case float():
20c21
<     elif isinstance(value, dict):
---
>     case dict():
23c24
<     elif isinstance(value, (bytes, bytearray)):
---
>     case bytes() | bytearray():
26c27
<     elif isinstance(value, datetime.datetime):
---
>     case datetime.datetime():
29c30
<     elif isinstance(value, (tuple, list)):
---
>     case tuple() | list():
32c33
<     else:
---
>     case _:

Looking at the evolution of the codebase with the latter diff, we are much more confident on what has happened - the new match statement has been adopted, with corresponding simplification.

@gvanrossum
Copy link
Owner

Yup, it's a conundrum. Note that we debated this in #2, though perhaps not enough (also note the very last comment there). It's also mentioned in the PEP under "Use a flat indentation scheme" in "Rejected Ideas".

@gvanrossum
Copy link
Owner

The diff is actually a pretty strong argument.

@jimbaker
Copy link
Author

jimbaker commented Jun 5, 2020

Yes, I saw it was rejected, but I wanted to make sure we considered the above args before finally rejecting it :) I missed issue #2 's comments.

It does seem to be a matter of the following, and where the line is drawn:

Special cases aren't special enough to break the rules.
Although practicality beats purity.

My feeling is in practice, no one will really notice that the match statement doesn't quite follow the same indentation, because of its symmetry with if/elif/else and try/except. Then the two other args, especially the diff argument (a pragmatic reflection of symmetry) will follow.

We also see similar practice in the C code of Python, at least in ceval.c, which heavily uses switch/case. PEP 7 is silent on switch specifically, but it seems to otherwise encourage braces even when not mandatory and indentation after braces. But we do see examples like this in ceval.c:

        case TARGET(RAISE_VARARGS): {
            PyObject *cause = NULL, *exc = NULL;
            switch (oparg) {
            case 2:
                cause = POP(); /* cause */
                /* fall through */
            case 1:
                exc = POP(); /* exc */
                /* fall through */
            case 0:
                if (do_raise(tstate, exc, cause)) {
                    goto exception_unwind;
                }
                break;
            default:
                _PyErr_SetString(tstate, PyExc_SystemError,
                                 "bad RAISE_VARARGS oparg");
                break;
            }
            goto error;
        }

which in turn is part of an enormous switch. Arguing from C usage is likely a bit of a stretch, but certainly the above convention is a practicality argument.

@viridia
Copy link
Collaborator

viridia commented Jun 5, 2020

I would argue in favor of the additional indentation (i.e. having case clauses be indented rather than at the same level as match), for two reasons:

  1. Having a statement that ends with a colon, but no indented statement immediately below it, looks very wrong to me.

  2. Overall I think the net effect of introducing match will be to shorten average line length rather than lengthening it. In the example code I have been writing, the extra 4 spaces of indentation are more than compensated by the succinctness of the code overall.

@Tobias-Kohn
Copy link
Collaborator

I concur with @viridia on this.

The diffs are certainly a good argument. However, only up to a certain point: the introduction of pattern matching is not primarily motivated by the need to simplify or rewrite the Python standard library. IMHO, the "Although practicality beats purity." aspect just does not seem to fully apply, because this refactoring is very short-lived and more or less a singular point in time.

Conceptually, the case clauses "belong" to the match block: they implicitly refer to the expression <expr> evaluated in match <expr>. In other words, the match-block keeps a hidden variable alive, which is removed once the match-block is left.

The suggestion to make the match-statement flat effectively means to remove the end-marker from the match-statement. Although the example in C seems to also be flat, the switch-statement still has a clear end-marker }. Python has a bit less leeway in this because the DEDENT is basically the only end-marker.

Finally, I also think the grammar is much more stable, particularly since case is a soft keyword. Consider a silly example like this:

match x:
case 1:
    ...
case 2:
    ...
case (3)   # <- is this still a case clause?

Is the case(3) here a valid function call or a case clause with the colon missing? Yes, the new parser can surely handle this, but I think it adds unnecessary complexity.

@gvanrossum
Copy link
Owner

If case is a soft keyword, the parser has no choice but to treat that last line as a call to a function named case.

That's actually a good argument for indenting or for using as...

Personally in such cases if a small diff is really important I would resort to using a "half indent" for the newly-inserted case level. Yes, that's cheating, but PBP and such. :-)

Or ignore whitespace in the diff.

@viridia
Copy link
Collaborator

viridia commented Jun 7, 2020

I am fairly skeptical that the 'half-indent' strategy for minimizing indentation levels would be widely adopted, for several reasons:

  • It would difficult for editors that auto-indent to make an exception based on context. Remember, most editors aren't as smart as an auto-formatter, but are smart enough to indent and unindent.
  • People who establish standards for coding style tend to be fairly pedantic in their tastes, and having an anomalous indentation level would seem inconsistent.

And honestly I don't think it's necessary.

@gvanrossum
Copy link
Owner

I don't contest any of that. I'm just reporting what I have done in similar situations. :-)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants