Skip to content
This repository has been archived by the owner on Nov 21, 2022. It is now read-only.

Adding __match_args__ is backwards incompatible #104

Open
gvanrossum opened this issue Jun 25, 2020 · 31 comments
Open

Adding __match_args__ is backwards incompatible #104

gvanrossum opened this issue Jun 25, 2020 · 31 comments
Labels
accepted Discussion leading to a final decision to include in the PEP fully pepped Issues that have been fully documented in the PEP

Comments

@gvanrossum
Copy link
Owner

Pablo pointed out that adding this is usually backwards incompatible. Maybe we need to rethink the special casing of object(c)?

https://mail.python.org/archives/list/[email protected]/message/STJSSAETMTUY7FK5AE53IM73Z2WORNYN/

@thautwarm
Copy link

Consider #100 ? Needs no __match_args__. A default implementation made for object will allow keyword patterns without doing any pre-settings.

For the possible reason that you rejected this...

I think @viridia's idea of returning a proxy object from match that in simple cases can just be self is a winner.

With #100, by providing a utility library:

from patma import PNoKwargs, PFixedPosArg

class MyClass:
      ... # implementations

     @PFixedPosArg(1)
     @PNoKwargs
     def __match__(cls, inst):
         return inst

which is the same as

class MyClass:
    ... # implementations

   def __match__(cls, inst, argc, kwargs):
      assert not kwargs, f"pattern {cls} expects no keyword arguments, got {kwargs}."
      assert argc == 1, f"pattern {cls} expects 1 positional arguments, got {argc}."
      return (inst, )

@brandtbucher
Copy link
Collaborator

brandtbucher commented Jun 26, 2020

@thautwarm, please let it go. Your suggestion is off-topic, has already been rejected twice, and does not solve this issue being addressed here.

@brandtbucher brandtbucher added the open Still under discussion label Jun 26, 2020
@thautwarm
Copy link

Okay, but could you please show this?:

does not solve this issue being addressed here.

@brandtbucher
Copy link
Collaborator

The current protocol is perfectly capable of accommodating this concern: we would just need to remove the nice default behavior which matches the proxy when __match_args__ is missing or None. We're trying to see if we can find a way to keep or modify the nice default behavior without breaking compatibility for classes that later implement more complicated destructuring. Your protocol suffers from the same flaw if set up with the same default behavior for object.

Please follow your own advice in the linked issue:

However I see this is not the current focus and discussions about this can be harmful in this stage. I won't talk about this any more unless higher priority issues got addressed.

@gvanrossum
Copy link
Owner Author

Idea: drop the default behavior and instead add it back just to select standard objects like str and int.

@brandtbucher
Copy link
Collaborator

brandtbucher commented Jun 26, 2020

Yeah, that could work (not a huge fan of more special cases, but I see the pragmatism). I think no matter what, we will need to drop the default behavior.

But it's generally useful enough (and hard enough to get right as a hack) that I'd like to provide a fairly frictionless way of getting it back for user-defined classes, if it isn't too hard.

Perhaps __match_args__ = ... (Ellipsis)? I can't really think of any other magic constants we could use here. It should be on __match_args__, since we probably want either having this behavior or positional matches to be mutually exclusive.

@viridia
Copy link
Collaborator

viridia commented Jun 26, 2020

I'm not sure I understand the issue, could one of you give a brief explanation? I tried the link but it didn't send me to the correct place.

@brandtbucher
Copy link
Collaborator

Original comment and my response.

@brandtbucher
Copy link
Collaborator

Basically, the default match behavior for classes (which allows a single positional match against the whole proxy) means that, as soon as a class adds __match_args__ with one or more args in the future, old code will start silently breaking in strange ways.

@viridia
Copy link
Collaborator

viridia commented Jun 26, 2020

Got it.

I suspect that adding an explicit match_args to every built-in non-reference type is probably the cleaner answer in the long run. I think the motivation for the clever default is partly laziness (in the Perl sense), it avoids having to go and update a whole bunch of value types - but the downside is that you now have a large blast radius for any unforeseen side effects of this new behavior. Probably better to use a scalpel...

@thautwarm
Copy link

thautwarm commented Jun 26, 2020

Clairfy:
Despite of __match_args__, defining __match__ later will also break the backward compatibility.

This cannot get avoided because users can arbitrarily decide the first pos arg x for a pattern C(x) by giving a certain implementation of __match__, no matter what __match_args__ is used, if we can write C.__match__ (this observation suggests us that to keep BC, the default behaviors can be only made for classes having frozen "match" and "match_args").

I saw you all constantly talking about __match_args__, hence I think I should stress this fact here, as this is reason why my proposal cannot solve this issue, even if mine doesn't need any __match_args__.

@Tobias-Kohn
Copy link
Collaborator

Idea: drop the default behavior and instead add it back just to select standard objects like str and int.

I agree, too, that this is probably the most reasonable approach. Pablo's example seems to highlight that this is mainly an issue with "composite" types. As long as we stick to value types for the default behaviour, I think we should be fine. And, luckily, there are not too many of those.

@thautwarm
Copy link

thautwarm commented Jun 26, 2020

Some analyses:

  1. why we want a default behavior?

    ... aims at providing a basic, useful (but still safe) experience with pattern matching out of the box

  2. actual problem concerning use cases?

    While some users use the out-of-box support for types from existing libraries, others might change those types to have more complex matching behaviors. This causes a conflict and breaks the former's code.

    (feel free to add more concerns here)

  3. what can we do?

    • keeping default behaviors for all objects is impossible in general, if the execution of pattern matching will aware the modifications of either __match_args__ or __match__.

    • keeping default behaviors for objects that have immutable and readonly "match_args" and "match" can be safe. Even though by changing other attributes of such an object, the compatibility may also get lost, but I don't see this usually happens in a regular use cases. We don't consider the problems of monkey patches.

    • default behavior support for larger proportion of existing code is still available, in other forms without any/radical change to the PEP.

      Consider why this happens in practice: some users change a class/object from an upstream library, which is also used by other users depending on the default behavior.

      This can be prevent by providing them handy means to create wrapper patterns, like

      nparray = register(np.array, __match__ = __match__)

      As a consequence, a user in need of a complex matching for np.array, can use nparray instead of modifying np.array.__match__.

@gvanrossum
Copy link
Owner Author

For most classes the default behavior is not very useful. It works for str(x) and int(x) because those have in fact behavior where str(x) == x etc. But for user classes that behavior is not very common.

@brandtbucher
Copy link
Collaborator

I agree that it's not correct for most classes, but still I think it makes sense for many. I'll link issue #58 since there was a short discussion on this. The trivial use case of foo(x) can of course be replaced by x := foo(), but I'm considering cases where you want to do something more complex, like numbers.Real(42) or set(.items) in the linked issue.

It's not a huge deal, but it does feel odd to restrict this magic to select built-in types. If we don't like __match_args__ = ..., what about allowing None or "." or something to appear in __match_args__, signifying that that positional argument matches the proxy? There's definitely some weird things you could do with that, but it makes getting this back really simple.

@brandtbucher
Copy link
Collaborator

I'm willing to let this go if others feel differently, though.

@Tobias-Kohn
Copy link
Collaborator

I really like to idea of writing stuff like int(42) to specify that I want to have an int with the given value. This becomes rather cumbersome to express with x := int() syntax. For custom classes, however, there are no literals to do this in the first place, so x := foo() is probably fine. Moreover, I think that the actual data in custom classes is (almost) always organised in attributes, so that it makes sense to prioritise an attribute-oriented approach.

The thing that really sold me on #58 was that it just works out of the box for built-in types (where appropriate). But I can happily live without having similar behaviour (by default) for custom classes.

@brandtbucher
Copy link
Collaborator

brandtbucher commented Jun 26, 2020

Okay, let's kill the default behavior and build a list of classes that will support it instead. I'm tempted to just say "all types in builtins" as a rule, but maybe we want to be more nuanced? Here are all of the non-exception types:

  • bool
  • bytearray
  • bytes
  • classmethod
  • complex
  • dict
  • enumerate
  • filter
  • float
  • frozenset
  • int
  • list
  • map
  • memoryview
  • object
  • property
  • range
  • reversed
  • set
  • slice
  • staticmethod
  • str
  • super
  • tuple
  • type
  • zip

@gvanrossum
Copy link
Owner Author

That looks right.

@brandtbucher
Copy link
Collaborator

I imagine we raise ImpossibleMatchError if a subclass of these also defines __match_args__?

@gvanrossum
Copy link
Owner Author

Why?

@brandtbucher
Copy link
Collaborator

Never mind, I see now that just using the subclass's __match_args__ is correct here.

@brandtbucher
Copy link
Collaborator

brandtbucher commented Jun 26, 2020

By the way, do we still want to accept __match_args__ = None? I don't think it's any different from __match_args__ = () anymore...

We could even use it to enable the old default behavior?

@gvanrossum
Copy link
Owner Author

You’re right, it would remove a bunch of tedious language from the PEP.

I think this is ready for an update to the PEP then?

@brandtbucher
Copy link
Collaborator

Sorry, I updated my comment with the last line. Don't know if you saw.

@gvanrossum
Copy link
Owner Author

Let’s not introduce a shortcut for the old default.

@brandtbucher
Copy link
Collaborator

Okay, I'll update the PEP and implementation.

@brandtbucher brandtbucher added accepted Discussion leading to a final decision to include in the PEP and removed open Still under discussion labels Jun 27, 2020
@brandtbucher brandtbucher added the fully pepped Issues that have been fully documented in the PEP label Jun 27, 2020
@viridia
Copy link
Collaborator

viridia commented Jun 27, 2020

Possibly type and complex. None of the others though. (Thanks for producing that list!)

@gvanrossum
Copy link
Owner Author

Not type, type(t) != t. And complex has two attributes, real and imag, so that would be very confusing — does complex(0) match 0j or any complex with real==0 ?

@brandtbucher
Copy link
Collaborator

Sorry, just to clarify: I should remove type and complex from this list (and we'll do complex.__match_args__ = ("real", "imag"))?

Funny, complex would actually be a decent use case for __match_args_required__ = 2. Oh well.

@gvanrossum
Copy link
Owner Author

I think complex should not be given positional args at all, to avoid confusion. Same for type.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
accepted Discussion leading to a final decision to include in the PEP fully pepped Issues that have been fully documented in the PEP
Projects
None yet
Development

No branches or pull requests

5 participants