-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use predefined sets of naming styles instead of explicit regexps #1046
Conversation
FYI, Travis fails are two misaligned brackets from pylint report and pypy fail, which passes on my local machine. Likely it's an environment issue. |
Hi @rogalski This is really nice, thank you! I will do a thorough review tomorrow. You can ignore the Travis failure for now, at least the one from PyPy ( I didn't have time to investigate it yet). |
pylint/checkers/base.py
Outdated
if overriden_value is not None: | ||
regexps[name_type] = overriden_value | ||
|
||
hints[name_type] = regexps[name_type].pattern |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we define naming convention presets in code, maybe we can provide better hint than pattern
. Maybe hint could be a callable with single argument, where argument is a bad name provided by user. Now we can have non-static hint and suggest that serviceURL
constant for snake_case should be renamed to SERVICE_URL
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that would be interesting. So basically instead of
``Invalid name service`
we could have
Invalid name service, use SERVICE instead
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, something like that. Just another step of making Pylint more user friendly (another feature similar to #1035 - don't tell me it's wrong, tell me how to fix it).
I'm aware that in general case it's quite hard and will be some kind of heuristic algorithm , but at the end, we can run heuristic result against predefined regex, and if it won't match just do not include suggestion in message.
Sounds much better than these static hints that are available currently.
pylint/checkers/base.py
Outdated
name_type = name_type.replace('_', '-') | ||
name_options.append(( | ||
'%s-rgx' % (name_type,), | ||
{'default': rgx, 'type': 'regexp', 'metavar': '<regexp>', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have to change the defaults for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because these defaults would override every regular expressions taken from preset. Without removing these defaults, naming-style has no effect at all (condition in line 1285 is always True) .
Now, when --generate-rcfile
is used, naming-style returns user friendly name, and all of custom regular expressions are included as placeholders (commented, without assigned value). Seems fair enough for for both normal users and power users.
Sample :
# Naming style used across a project (snake_case, camel_case).
naming-style=snake_case
# Regular expression matching correct module names
#module-rgx=
# Naming hint for module names
#module-name-hint=
# Regular expression matching correct constant names
#const-rgx=
# Naming hint for constant names
#const-name-hint=
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But this is different than what the linked issue suggested. We should have naming-styles separated of the regular expressions, one could take precendence over the other though. For instance, if I want to customize how my variables should look like, with a custom pattern, how would I do it now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming-style=snake_case
variable-rgx=MY_NON_STANDARD_REGEX
All but variables are validated based on snake_case
template, variables are validated based on MY_NON_STANDARD_REGEX.
Oh, I re-read both issues and I guess I misunderstood it previously. I think my approach is slightly better - how many projects reasonably require to have two naming conventions in same Python file? |
That is not okay, we need to keep the same behavior for all the individual naming styles. For instance, you can have one naming conventions for classes and another one for methods, in case you are working with frameworks where they used camelCase convention previously (unittest, some packages from stdlib as well). Ideally this patch should add just the naming style conventions, which can be used in turn of regexes when users want to, but it should not replace regexes per se. I am still looking into it and trying it right now. |
OK, thanks for review. I can give a shot for this ideal scenario, it should not be a big deal.
Can you elaborate? I'll stop working on it if you write exact same functionality right now. |
No, no, sorry for being confusing. I'm looking into this pull request, testing it. I'm not planning to write anything. :-) |
@PCManticore I want to clean up on my dangling PR - is this usable in any form or should be just thrown away? |
I think this is a good start, but haven't been able to follow up with a more detailed analysis on how this change should be. Let's leave opened for now and as soon as I make some time, I will come up with a proper review and change plan. If you close it, there's a high chance of me forgetting about this if time passes, so being opened makes me aware of the proposed change. |
@rogalski I managed to review this a bit today. It looks nice! What I think it needs now is a release note in What's new and maybe a bit some more tests to make sure we are covering all the known cases. Feel free to do this at your own pace, since we will probably merge it after 1.7.0. |
# Conflicts: # pylint/checkers/base.py
Why |
@PCManticore let me know what you think about differences between 660b2d9 and 184b58a. I think both |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rogalski You mean name-hint
and include-naming-hint
, right? :) As mentioned in the review comments, I'm fine if name-hint
goes away, but I would keep include-naming-hint
for naming styles, since it is the only method of seeing how a pattern is supposed to look like, other than looking over the code of course.
pylint/checkers/base.py
Outdated
|
||
@classmethod | ||
def get_regex(cls, node_type): | ||
return { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to build the dict every time this gets called, let's maybe create it once.
pylint/checkers/base.py
Outdated
@@ -37,14 37,78 @@ | |||
from pylint import reporters | |||
from pylint.reporters.ureports import nodes as reporter_nodes | |||
|
|||
class NamingStyle(object): | |||
"""It may seem counterintuitive that single naming style |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks more like a comment than a docstring
pylint/checkers/base.py
Outdated
if self.config.include_naming_hint: | ||
hint_rgx = getattr(self.config, hint_name.replace('-', '_')) | ||
hint_part = ' (hint: %r)' % hint_rgx | ||
type_label = HUMAN_READABLE_TYPES[node_type] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would still keep the naming hint flag (include-naming-hint
) but only for naming styles, since the pattern already covers this. I'm thinking this would be a bit more helpful since there is no other way to see what we decided for a style to contain other than looking over the code (and the values won't be in the configuration either), so --include-naming-hint
would be an user's only chance of seeing what a style value represents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you seen emitted message from current top of this PR?
For out of the box settings:
Function name "BADFUNCTION_name" doesn't conform to snake_case naming style_
For user defined settings:
_Class name "classb" doesn't conform to '(?:(?P<UP>[A-Z] )|(?P<down>[a-z] ))$' pattern_
I personally like them a lot, and to reintroduce include-naming-hint, I'd effectively replace this verbose and precise definitions with something arbitrary like doesn't conform to selected naming style and doesn't conform to custom user-defined regular expression. Personally, I strongly prefer current form.
Include-naming-hint seems like a workaround for older messages where we emitted "Invalid class name 'classb'". With this new approach, where we provide both predefined and advanced mode for users, I don't see much value in keeping it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I saw how it looks. Even if we show snake_case
in the message, how an user will see what that means exactly without looking over the actual code, since these values are hardcoded in the code itself? I disagree about it being a workaround, especially since it's going to work for naming styles only
|
||
regexps[name_type] = NAMING_STYLES[naming_style_name].get_regex(name_type) | ||
|
||
custom_regex_setting_name = "%s_rgx" % (name_type, ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a problem with picking the rgx if it exists, because currently we will always pick regexes, due to already having the -rgx
hints pre-generated in config files. One idea would be to rename the options to something else, so for attr-rgx
for instance, we should have attr-pattern
or something along these lines. Could you think of another approach of how we should handle this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently we will always pick regexes, due to already having the -rgx hints pre-generated in config files
Yes, and I think it's fine, and should be kept this way. By doing it we won't introduce any regressions to current clients. On the same time, new users will happily use this basic functionality.
We simply need reasonable documentation of new capabilities, both in changelog / what's new and in "naming convention" section in docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright lets add documentation about this behaviour then
- AnyStyle - automated tests for generated regexes - reintroduce include-naming-hint
@rogalski This looks super nice, thank you! I just noticed you still have a couple of TODOs left in the tests, do you plan to fix those? Otherwise let's get this merged :) |
Hey Claudiu, Thanks for approval! |
Merged, thanks for all valuable input! |
An idea mentioned in issue #1013.
I am aware it's definitely not ready for merge, but I'd leave it here just to allow idea to sink and to hear feedback from you guys.
There are additional steps required for this change to be possibly merged:
something_rgx
options)CamelCaseStyle
)It may be a good idea to run a change on some big code bases with selected coding styles just to make sure we have covered all corner cases. We should aim for no false-positives (even if some false-negatives occurs), especially given that Pylint 2.0 is supposed to be a lot less nitpicking.