Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add startsWithAny and endsWithAny #3577

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

ByteZ1337
Copy link

@ByteZ1337 ByteZ1337 commented Jul 18, 2020

Just some functions I would find useful.

@ByteZ1337 ByteZ1337 changed the title Adds startsWithAny and endsWithAny Add startsWithAny and endsWithAny Jul 18, 2020
@fee1-dead
Copy link
Contributor

fee1-dead commented Jul 22, 2020

Why not just use any(string::startsWith) or with arguments? I don't think this would be helpful if it can be replaced just with one call.

Also, the name "startsWithAny" is confusing, in my opinion the semantics could be better if it were Iterable<CharSequence>.containsPrefixOf(str:CharSequence).

@ByteZ1337
Copy link
Author

Why not just use any(string::startsWith) or with arguments? I don't think this would be helpful if it can be replaced just with one call.

The same reason you use Collection#addAll instead adding every element with forEach. Methods like these are just way more convenient to use and make the code way more readable than using obscure lambdas in if statements.

Also, the name "startsWithAny" is confusing

How so? I think it's pretty easy to understand. I don't really see an issue with the name since there are other methods called "any" in the Standard Library.

Iterable<CharSequence>.containsPrefixOf(str:CharSequence)

I think the receiver should stay CharSequence since the actual string is checked. Again I use Collection#addAll not Collection#addAllTo

@fee1-dead
Copy link
Contributor

fee1-dead commented Jul 22, 2020

Methods like Collection#addAll are way more convenient to use and make the code way more readable

Nope, that's not the case at all. having that method helps with increasing performance, for example if it were an ArrayList it checks the capacity of its backing array and everytime the method add is called it increases the capacity by copying the array if necessary. Using addAll will make the ArrayList perform just one copy action. The method was there for performance, not convenience.

Also, I don't see why this function should be in the standard library. All the functions from kotlin.collections.Collections.kt / kotlin.text.Strings.kt contain their own implementations, not one-liners.
I also don't understand why this function should be in the standard library, because you can just create these function in your own program.

@ByteZ1337
Copy link
Author

ByteZ1337 commented Jul 22, 2020

Sill makes it way more readable than obscure lambda that you don't instantly understand.

contain their own implementations, not one-liners.

What about the empty and null checks? Those could also easily be implemented but still got into the standard library. And I also wouldn't call it its own function. It's just an extension of startsWith. With your logic methods like substringAfter would also be one liner functions.

I also don't understand why this function should be in the standard library, because you can just create these function in your own program.

Well obviously you can implement everything yourself in Strings.kt. Iirc these functions exist so you don't have to implement them every time you need them. I really don't understand why you're fighting this so much? They are just 2 functions that are useful and don't make any major changes to the standard library.

@fee1-dead
Copy link
Contributor

fee1-dead commented Jul 23, 2020

Why are you fighting this so much?

I am not fighting this, I am asking why this should be in the standard library. If you could explain your use case it would be great.
I tried to ask you twice, but you do not seem to understand this at all. Why would users look for such function that does this in the standard library?

Also, looking at contributing.md, you did not:

What about the empty and null checks?

Null checks are helpful because when using the value from elvis/if expression you need to use parenthesis.

Sill makes it way more readable than obscure lambda

How is it more readable? How is string.startsWithAny(listStrings)
more readable than listStrings.any(string::startsWith)?

I think the names are pretty easy to understand

The names are confusing, you think they are easy to understand because you know what they do. By just looking at the function names from the title, I had no idea what these functions are for.

@flimosch
Copy link

I think the function names are easy to understand and improve readability greatly.
Just having string.startsWithAny instantly tells you that you check if your string starts with any (of the following) strings.
If you instead would use listStrings.any you have no idea what you exactly check for. I remember starting in kotlin I was confused by this method and still think that the naming is very poor.

If you just skim through a program the method string.startsWithAny tells you faster and more clearly what is does compared to listStrings.any(string::startsWith)

@ByteZ1337
Copy link
Author

ByteZ1337 commented Jul 23, 2020

First of all, if you're neutral and just curious about why this should be implemented why did you dislike my post?
Second of all, I'm having an open discussion with you right now, aren't I? You're giving me your opinion and I'm telling you mine and respectfully disagree.

For my use cases :

  • Checking file extensions,
  • Parsing a file that has comments but has multiple options to start a comment.

I tried to ask you twice, but you do not seem to understand this at all.

Sorry if I missed it but I never saw you explicitly asking for these?

I didn't open an issue or talk to the employees because I thought 2 small functions would just waste their time. So I just opened this pull request and see if it gets merged. If this was wrong I would like to apologize. I did however not see the part with sample code which I will probably add to this pull request tomorrow.

And I don't know what your "educated guess" is supposed to mean? (Getting a bit personal there huh) I did build it by following the instructions in the readme. But I don't understand why I should build my own forked standard library. Since I mostly do open source stuff this wouldn't make sense. Also, as you mentioned in your second comment I can just implement it myself which would make building the whole standard library a bit overkill.

Also, if you find these names confusing you should probably open an issue to rename CharSequence#indexOfAny.

@fee1-dead
Copy link
Contributor

Checking file extensions:

val extensions = listOf("txt","exe")
val file = File(...)
file.extension in extensions
val fileName = "foo.bar"
fileName.substringAfterLast('.', "") in extensions

Parsing a file: I do not think that is the average use case.

@ByteZ1337
Copy link
Author

Or

path.endsWithAny("exe", "txt", ignoreCase = true)

Which is way nicer to look at and supports ignoreCase.

@fee1-dead

This comment has been minimized.

@ByteZ1337
Copy link
Author

So I ran

println("HashSet check: "   measureTimeMillis {
      val extensions = hashSetOf("exe", "json", "xml", "bin", "kt", "jar", "txt")
      val file = File("test.txt")
      file.extension.toLowerCase() in extensions
})

and

println("endsWithAny: "   measureTimeMillis { "test.txt".endsWithAny("exe", "json", "xml", "bin", "kt", "jar", "txt", ignoreCase = true) })

independently from each other and got these results (average of 10 runs):

HashSet check: 31
endsWithAny: 15

I ran this code on play.kotlinlang.org to make sure it's not because of my hardware or some other stuff.

When adding more extensions to the list both functions take longer at a pretty equal rate. So the time complexity seems to be O(n) for both methods.

Here is the link if you want to try it yourself. Just uncomment the version you want to try.

@fee1-dead

This comment has been minimized.

@x4e
Copy link

x4e commented Jul 24, 2020

benchmark is rigged

@fee1-dead

This comment has been minimized.

@ByteZ1337
Copy link
Author

Well mine shows the actual time it takes to run, not some benchmark

@x4e
Copy link

x4e commented Jul 24, 2020

Also javadoc has tons of spelling mistakes you might wanna fix

@ByteZ1337
Copy link
Author

Something other than the missing plural? I copied the doc from the other methods and just modified it a bit.

@fee1-dead

This comment has been minimized.

@ByteZ1337
Copy link
Author

ByteZ1337 commented Jul 24, 2020

Still, why shouldn't mine be perfectly ok? It's the time it takes to execute the tasks. Also if its rigged then it means nothing either way (Don't have time to check rn).

@fee1-dead

This comment has been minimized.

@flimosch
Copy link

Looking and running your benchmarks they seem fine. (You use startsWith instead of ends if but adjusting the string length to account for that doesn't change the speeds too much). I still would like to have that functions, maybe just with more efficient methods. (This will be hard to implement because your methods can only be used for checking file extension and not general string where you don't have a delimiter)

@ByteZ1337
Copy link
Author

How am I not respecting your opinion? I'm just saying that I don't see anything wrong with my tests and that someone said that yours might be rigged? I never accused you of anything. It just doesn't make sense why these benchmarks would show different results.

@fee1-dead

This comment has been minimized.

@fee1-dead

This comment has been minimized.

@ByteZ1337
Copy link
Author

ByteZ1337 commented Jul 24, 2020

Isn't that what "bigotry" means? English isn't my native language sorry.
And I think when an actual user would check file extension he would probably also initialize a new HashSet and File. Since it's something you probably only do a few times in a project they would probably not make a function for it. And in my personal case, I'm reading a lot of files, and initializing a file for every file would impact the performance a lot. I guess it just depends on the actual context of your project which version is better. But as I said before a simple function that can be called in an if-statement is still nicer than implementing an own function. But that's just my opinion. If you see that, I respect that. It's just that these functions probably won't even bother you when they get implemented but they would help someone like me. So I think arguing if they should be implemented isn't really great. However, I'm always open to tips and suggestions to improve performance.

@fee1-dead
Copy link
Contributor

fee1-dead commented Sep 5, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants