Less INDENT/DEDENT? #59

jimbaker · 2020-06-05T03:56:42Z

Arguably indenting case statements under the head match should not be done:

Other "decision statements" - if/elif/else and try/except - line up vertically. Code like the following from asyncio resembles closely the use of match in the use of isinstance for match, but of course it is not indented:

        try:
            nbytes = sock.recv_into(buf)
        except (BlockingIOError, InterruptedError):
            return  # try again next time
        except (SystemExit, KeyboardInterrupt):
            raise
        except BaseException as exc:
            fut.set_exception(exc)
        else:
            fut.set_result(nbytes)

Additional indentation in decision statements is a visual indication of complexity. Match statements change this assumption. And of course frequently Python developers, myself included, try to avoid increasing indentation by returning early.
Minimization of diffs is a good thing. This is especially true when we consider that we want to encourage refactoring, at the very least in the stdlib. Or the occasional need to blame. Compare the following:

2,3c2,4
<     if isinstance(value, str):
<         self.simple_element("string", value)
---
>     match value:
>         case str():
>             self.simple_element("string", value)
5,6c6,7
<     elif value is True:
<         self.simple_element("true")
---
>         case  True:
>             self.simple_element("true")
8,9c9,10
<     elif value is False:
<         self.simple_element("false")
---
>         case False:
>             self.simple_element("false")
11,15c12,16
<     elif isinstance(value, int):
<         if -1 << 63 <= value < 1 << 64:
<             self.simple_element("integer", "%d" % value)
<         else:
<             raise OverflowError(value)
---
>         case int():
>             if -1 << 63 <= value < 1 << 64:
>                 self.simple_element("integer", "%d" % value)
>             else:
>                 raise OverflowError(value)
17,18c18,19
<     elif isinstance(value, float):
<         self.simple_element("real", repr(value))
---
>         case float():
>             self.simple_element("real", repr(value))
20,21c21,22
<     elif isinstance(value, dict):
<         self.write_dict(value)
---
>         case dict():
>             self.write_dict(value)
23,24c24,25
<     elif isinstance(value, (bytes, bytearray)):
<         self.write_bytes(value)
---
>         case bytes() | bytearray():
>             self.write_bytes(value)
26,27c27,28
<     elif isinstance(value, datetime.datetime):
<         self.simple_element("date", _date_to_string(value))
---
>         case datetime.datetime():
>             self.simple_element("date", _date_to_string(value))
29,30c30,31
<     elif isinstance(value, (tuple, list)):
<         self.write_array(value)
---
>         case tuple() | list():
>             self.write_array(value)
32,33c33,34
<     else:
<         raise TypeError("unsupported type: %s" % type(value))
---
>         case _:
>             raise TypeError("unsupported type: %s" % type(value))

versus

2c2,3
<     if isinstance(value, str):
---
>     match value:
>     case str():
5c6
<     elif value is True:
---
>     case  True:
8c9
<     elif value is False:
---
>     case False:
11c12
<     elif isinstance(value, int):
---
>     case int():
17c18
<     elif isinstance(value, float):
---
>     case float():
20c21
<     elif isinstance(value, dict):
---
>     case dict():
23c24
<     elif isinstance(value, (bytes, bytearray)):
---
>     case bytes() | bytearray():
26c27
<     elif isinstance(value, datetime.datetime):
---
>     case datetime.datetime():
29c30
<     elif isinstance(value, (tuple, list)):
---
>     case tuple() | list():
32c33
<     else:
---
>     case _:

Looking at the evolution of the codebase with the latter diff, we are much more confident on what has happened - the new match statement has been adopted, with corresponding simplification.

The text was updated successfully, but these errors were encountered:

gvanrossum · 2020-06-05T04:06:36Z

Yup, it's a conundrum. Note that we debated this in #2, though perhaps not enough (also note the very last comment there). It's also mentioned in the PEP under "Use a flat indentation scheme" in "Rejected Ideas".

gvanrossum · 2020-06-05T04:27:51Z

The diff is actually a pretty strong argument.

jimbaker · 2020-06-05T14:42:02Z

Yes, I saw it was rejected, but I wanted to make sure we considered the above args before finally rejecting it :) I missed issue #2 's comments.

It does seem to be a matter of the following, and where the line is drawn:

Special cases aren't special enough to break the rules.
Although practicality beats purity.

My feeling is in practice, no one will really notice that the match statement doesn't quite follow the same indentation, because of its symmetry with if/elif/else and try/except. Then the two other args, especially the diff argument (a pragmatic reflection of symmetry) will follow.

We also see similar practice in the C code of Python, at least in ceval.c, which heavily uses switch/case. PEP 7 is silent on switch specifically, but it seems to otherwise encourage braces even when not mandatory and indentation after braces. But we do see examples like this in ceval.c:

        case TARGET(RAISE_VARARGS): {
            PyObject *cause = NULL, *exc = NULL;
            switch (oparg) {
            case 2:
                cause = POP(); /* cause */
                /* fall through */
            case 1:
                exc = POP(); /* exc */
                /* fall through */
            case 0:
                if (do_raise(tstate, exc, cause)) {
                    goto exception_unwind;
                }
                break;
            default:
                _PyErr_SetString(tstate, PyExc_SystemError,
                                 "bad RAISE_VARARGS oparg");
                break;
            }
            goto error;
        }

which in turn is part of an enormous switch. Arguing from C usage is likely a bit of a stretch, but certainly the above convention is a practicality argument.

viridia · 2020-06-05T15:53:41Z

I would argue in favor of the additional indentation (i.e. having case clauses be indented rather than at the same level as match), for two reasons:

Having a statement that ends with a colon, but no indented statement immediately below it, looks very wrong to me.
Overall I think the net effect of introducing match will be to shorten average line length rather than lengthening it. In the example code I have been writing, the extra 4 spaces of indentation are more than compensated by the succinctness of the code overall.

Tobias-Kohn · 2020-06-05T21:19:16Z

I concur with @viridia on this.

The diffs are certainly a good argument. However, only up to a certain point: the introduction of pattern matching is not primarily motivated by the need to simplify or rewrite the Python standard library. IMHO, the "Although practicality beats purity." aspect just does not seem to fully apply, because this refactoring is very short-lived and more or less a singular point in time.

Conceptually, the case clauses "belong" to the match block: they implicitly refer to the expression <expr> evaluated in match <expr>. In other words, the match-block keeps a hidden variable alive, which is removed once the match-block is left.

The suggestion to make the match-statement flat effectively means to remove the end-marker from the match-statement. Although the example in C seems to also be flat, the switch-statement still has a clear end-marker }. Python has a bit less leeway in this because the DEDENT is basically the only end-marker.

Finally, I also think the grammar is much more stable, particularly since case is a soft keyword. Consider a silly example like this:

match x:
case 1:
    ...
case 2:
    ...
case (3)   # <- is this still a case clause?

Is the case(3) here a valid function call or a case clause with the colon missing? Yes, the new parser can surely handle this, but I think it adds unnecessary complexity.

gvanrossum · 2020-06-05T23:46:34Z

If case is a soft keyword, the parser has no choice but to treat that last line as a call to a function named case.

That's actually a good argument for indenting or for using as...

Personally in such cases if a small diff is really important I would resort to using a "half indent" for the newly-inserted case level. Yes, that's cheating, but PBP and such. :-)

Or ignore whitespace in the diff.

viridia · 2020-06-07T05:49:18Z

I am fairly skeptical that the 'half-indent' strategy for minimizing indentation levels would be widely adopted, for several reasons:

It would difficult for editors that auto-indent to make an exception based on context. Remember, most editors aren't as smart as an auto-formatter, but are smart enough to indent and unindent.
People who establish standards for coding style tend to be fairly pedantic in their tastes, and having an anomalous indentation level would seem inconsistent.

And honestly I don't think it's necessary.

gvanrossum · 2020-06-07T16:19:22Z

I don't contest any of that. I'm just reporting what I have done in similar situations. :-)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Less INDENT/DEDENT? #59

Less INDENT/DEDENT? #59

jimbaker commented Jun 5, 2020 •

edited by gvanrossum

Loading

gvanrossum commented Jun 5, 2020

gvanrossum commented Jun 5, 2020

jimbaker commented Jun 5, 2020

viridia commented Jun 5, 2020

Tobias-Kohn commented Jun 5, 2020

gvanrossum commented Jun 5, 2020

viridia commented Jun 7, 2020 •

edited

Loading

gvanrossum commented Jun 7, 2020

Less INDENT/DEDENT? #59

Less INDENT/DEDENT? #59

Comments

jimbaker commented Jun 5, 2020 • edited by gvanrossum Loading

gvanrossum commented Jun 5, 2020

gvanrossum commented Jun 5, 2020

jimbaker commented Jun 5, 2020

viridia commented Jun 5, 2020

Tobias-Kohn commented Jun 5, 2020

gvanrossum commented Jun 5, 2020

viridia commented Jun 7, 2020 • edited Loading

gvanrossum commented Jun 7, 2020

jimbaker commented Jun 5, 2020 •

edited by gvanrossum

Loading

viridia commented Jun 7, 2020 •

edited

Loading