Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[js] HxOverrides.substr #4365

Open
mockey opened this issue Jun 24, 2015 · 28 comments
Open

[js] HxOverrides.substr #4365

mockey opened this issue Jun 24, 2015 · 28 comments
Assignees
Labels
platform-javascript Everything related to JS / JavaScript
Milestone

Comments

@mockey
Copy link
Contributor

mockey commented Jun 24, 2015

It seems that String.substr with negative index seems to work for IE > 8 nowadays. So it might make sense to remove the old override with the js-es5 flag as it is done with Array.indexOf and the like.

On a sidenote the override function has support for negative length as well for some reason:
https://github.com/HaxeFoundation/haxe/blob/development/std/js/_std/HxOverrides.hx#L75
Should it?
But then that only works with pos = 0 due to the first condition:
https://github.com/HaxeFoundation/haxe/blob/development/std/js/_std/HxOverrides.hx#L70

This is some pretty old code from 2006 actually:
2373abc
1c90918
a2c9d5e
Might be worth a revision...;-)

@nadako nadako added the platform-javascript Everything related to JS / JavaScript label Jun 24, 2015
@nadako
Copy link
Member

nadako commented Jun 24, 2015

The specification says that result is unspecified for negative length, so we can remove that check I think (https://github.com/HaxeFoundation/haxe/blob/development/std/String.hx#L141). I"m all for removing the override for js-es5 as well.

cc @aduros, @ncannasse

@Simn
Copy link
Member

Simn commented Jun 24, 2015

We test the negative length with pos 0 case at the moment: https://github.com/HaxeFoundation/haxe/blob/development/tests/unit/src/unitstd/String.unit.hx#L123

I"m not sure if we should un-fix this for JS.

@markknol
Copy link
Member

Can anyone benchmark how having/removing this affects performance before it is getting removed?

@back2dos
Copy link
Member

@markknol What would be a suitable benchmark? I"m sure in isolation one might get results to show it is faster after removal, but for it to be representative, you"d have to benchmark a parser or something, in which case I doubt it will matter.

@markknol
Copy link
Member

I was more thinking of a runtime performance check (https://jsperf.com/ or something), one with normal substr the other with the hxoverride version.

@mockey
Copy link
Contributor Author

mockey commented Jun 24, 2015

We test the negative length with pos 0 case at the moment

Well, if it works on all platforms, that should be added to the specs maybe?
But why only with pos = 0 then?

@nadako
Copy link
Member

nadako commented Jun 28, 2015

What"s the actual use case for that pos == 0 && len < 0 thing? It looks very weird and prevents us from using native ES5 substr (I still added native substr with 7f558a8 tho, but we can revert that).

The documentation says that negative len is unspecified, so I think it"s fine to "unfix" that (for all targets) even if there was a test. Thoughts?

@Simn
Copy link
Member

Simn commented Jun 28, 2015

What pos == 0 && len < 0 are we talking about here? I"m only seeing some weird if( pos != null && pos != 0 && len != null && len < 0 ) return "".

@nadako
Copy link
Member

nadako commented Jun 28, 2015

Well, that logic is a bit twisted, but I think this code actually only executes when pos == 0 and len < 0:

len = s.length + len - pos;

@Simn
Copy link
Member

Simn commented Jun 28, 2015

I see what you mean now.

I"m still reluctant to un-fix this, but I"m also reluctant to specify it. That HxOverrides.substr definitely needs a cleanup either way.

nadako added a commit that referenced this issue Jun 28, 2015
@ncannasse
Copy link
Member

If we have the possibility to specify something in a crossplatform way I don"t see why we shouldn"t do it. It will reduce the amount of bad surprises when moving code from one platform to another, especially when it"s something that is potentially widely used such as substr.

@nadako
Copy link
Member

nadako commented Jun 29, 2015

Yes, but I"m not sure if supporting this strange behaviour (pos == 0 && len < 0) is worth adding a layer over native implementation.

As for "widely used", this particular case is:

  1. Undocumented (actually we documented the opposite - the behaviour with len < 0 is unspecified)
  2. Seems to be not present in native versions of substr and will never be supported by ES.
  3. What"s this for anyway? And why only when pos == 0?

@Simn
Copy link
Member

Simn commented Jun 29, 2015

I used it at least once to cut off the file extension of some dot path string, for which path.substr(0, -4) is quite convenient.

A problem with specifying is that you also have crazy cases like "foo".substr(0, -10) that you have to specify as being unspecified.

@nadako
Copy link
Member

nadako commented Jun 29, 2015

Okay, I reverted the change and will try to cleanup current implementation later. I think we can still provide "lighter" substr implementation when -D js-es5, but I"m not sure if there"ll be any actual performance improvement.

@nadako
Copy link
Member

nadako commented Jun 29, 2015

I cleaned up our substr implementation a bit and removed negative pos handling when -D js-es5, effectively leaving only pos ==0 && len < 0 check before calling native substr. Hope that"s okay.

@mockey
Copy link
Contributor Author

mockey commented Jun 29, 2015

Just for the record: if you look at the original commits (first post, 2006, remember?), you see where the negative length code comes from. It was first implemented as such and the if (pos != 0 && len < 0) return "" condition was added later.
substr with negative length might be useful in some cases, but I think it"s pretty unusual, PHP supports it, though :-)
http://php.net/substr

@ncannasse
Copy link
Member

I think I"m using it sometimes, usually in the following way if( StringTools.endsWith(str,"bar") ) str = str.substr(0,-3);. I think it"s quite self-explaining? If someone can explain me what substr(-1,-2) would do I would be happy that we cover this case as well.

@Simn
Copy link
Member

Simn commented Jun 29, 2015

I agree it is intuitive what str.substr(0, -3) does. I"m not so sure about str.substr(1, -3) because I can see at least two possible interpretations.

@mockey
Copy link
Contributor Author

mockey commented Jun 30, 2015

À la PHP:

"abcdef".substr(1, -3); // "bc"
"abcdef".substr(-3, -1); // "de"
"abcdef".substr(-1, -2); // null, negative position must be smaller than negative length

@kevinresol
Copy link
Contributor

can we make it simple and consistent like this? A negative number -x is simply interpreted as str.length - x

so for a string "abcdef":

  • substr(1, -1) is equivalent to substr(1, 5)
  • substr(-1, -1) is equivalent to substr(5, 5)
  • substring(1, -1) is equivalent to substring(1, 5)
  • substring(-1, -1) is equivalent to substring(5, 5)

@Simn
Copy link
Member

Simn commented Jul 1, 2015

I would be fine with that as long as we don"t have to provide additional checks on many targets.

@back2dos
Copy link
Member

back2dos commented Jul 3, 2015

@kevinresol First of all, it"s important to note that substring is not specified to work with negative values. Neither in Haxe, nor ECMAScript. And while I personally agree that it would be sensible to do, there may be code actually relying on negative indices being clamped and this would introduce bugs into existing code, that are tedious to track down.

As for substr, I like it. It is delightfully simple.

@Simn
Copy link
Member

Simn commented Jul 3, 2015

Ah I didn"t notice there were substring examples too. I agree we should only do this for substr.

@kevinresol
Copy link
Contributor

Oh seems that ECMAScript does specify that negative values are clamped at zero for substring. So please just ignore my suggestion for substring (Although I personally think that treating negative number as offset from the end is more intuitive...)

@Simn
Copy link
Member

Simn commented Jul 3, 2015

Does somebody want to open a pull request that adds the required test cases for substr so we can see how much work this would require?

@Simn Simn added this to the 3.4 milestone Feb 23, 2016
@ncannasse ncannasse removed their assignment Dec 14, 2016
@Simn Simn modified the milestones: 3.4, 4.0 Jan 9, 2017
@Simn Simn modified the milestones: Release 4.0, Design Apr 17, 2018
@Simn Simn modified the milestones: Design, Release 4.0 Sep 6, 2018
@Simn
Copy link
Member

Simn commented Sep 6, 2018

Pulling this to Haxe 4 as messing with the String API seems to be our common hobby at the moment.

@haxiomic haxiomic self-assigned this Sep 9, 2018
@Simn Simn modified the milestones: Release 4.0, Release 4.1 Dec 1, 2018
@Simn
Copy link
Member

Simn commented Dec 1, 2018

And back to 4.1 because we found different hobbies.

@RealyUniqueName
Copy link
Member

ES doesn"t support negative length, but it"s very convenient to have it.
I"ve submitted a PR to test different combinations of negative pos and length for the following spec:

  • if pos is negative it is calculated as s.length + pos
  • if length is negative, then a substring starting at pos and ending at s.length + length (excluding) is returned

Hovewer that case with the negative length is more suitable for String.substring, because negative length denotes an index instead of amount of characters, which changes semantics of length argument. I guess that"s why it"s not supported in ES.

@RealyUniqueName RealyUniqueName modified the milestones: Release 4.2, Backlog Dec 14, 2020
@Simn Simn modified the milestones: Backlog, Later Mar 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
platform-javascript Everything related to JS / JavaScript
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants