Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep track of container types in assignments #15215

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

LemonBoy
Copy link
Contributor

Introduce runtime checks to ensure any assignment to an existing item of a dictionary or list takes into account the member type.

Closes #15208

The vim9execute.c part can be reviewed by comparing the new functions with list_append_tv and dict_add_tv that are used in the else branch. The eval.c is sligtly more involved but is essentially the same but for the expression evaluator.

Introduce runtime checks to ensure any assignment to an existing item of
a dictionary or list takes into account the member type.

Closes vim#15208
@LemonBoy LemonBoy marked this pull request as draft July 11, 2024 15:39
@LemonBoy LemonBoy marked this pull request as ready for review July 11, 2024 18:30
@zeertzjq zeertzjq added the vim9script Vim9 script related issues label Jul 11, 2024
@LemonBoy LemonBoy marked this pull request as draft July 12, 2024 09:24
@errael
Copy link
Contributor

errael commented Jul 12, 2024

I've been playing around with the patch. Haven't been able to slip anything past it. Tried with both list and dict.

I took a quick look at the patch; don't claim to fully understand it. But here's a few things that caught my attention.

In eval.c there's if (typ != NULL && typ->tt_type != VAR_ANY, why the VAR_ANY check? Isn't
this handled by check_typval_arg_type.

About the names of the new methods, *_set_item_move.
Why _move? Should it be _copy or _replace or _reference or _assign?

I ran a performance test, xxx[1] = 3 is around 25% slower. STOREINDEX is in the ballpark of twice as slow. Correct behavior is most important, but I wonder if the *_set_item_move could be optimized. I attached the profiling script.

This is based on

var ln: list<number> = [1, 2, 3]
var la: list<any> = [ 'a', ln]
def ListAssign()
    var xxx: any = la[1]
    xxx[1] = 3                      ###-F
enddef

The statement in question disassembles as

    xxx[1] = 3
  14 PUSHNR 3
  15 PUSHNR 1
  16 LOAD $0
  17 STOREINDEX any

Don't have instuction level profiling (briefly talked to Bram about it, but never got there). Assuming that each of the four instructions takes the same amount of time without this patch (which is probably not valid, STOREINDEX is probably the slowest, but gives a feel), we can compute that STOREINDEX takes around twice as long.

NO CHECK

=== list assign check micro-prof, 5 run/params ===
    x50   x100   x200 : argXnLoop (usec/op)
  0.054  0.053  0.053 : xxx[1] = 3  ###-F

WITH CHECK

=== list assign check micro-prof, 5 run/params ===
    x50   x100   x200 : argXnLoop (usec/op)
  0.073  0.071  0.070 : xxx[1] = 3  ###-F

list-assign-check.vim.txt

@errael
Copy link
Contributor

errael commented Jul 13, 2024

Something I didn't think to examine is the following case. The compiler can determine that no type checking is needed. So no runtime check is needed and it shouldn't suffer performance degradation.

In this case there's STOREINDEX list, in the other there's STOREINDEX any. I wonder if the list vs any is useful.

vim9script

def F()
    var l: list<number> = [1, 2, 3]
    l[1] = 3
enddef
F()

@errael
Copy link
Contributor

errael commented Jul 13, 2024

Wow, I didn't realize how simple it was to reproduce. Without this patch there's

vim9script

var ln: list<number> = [1, 2, 3]
def OUCH()
    var xxx: any = ln
    xxx[1] = 'foo'
enddef
OUCH()
echo 'OUCH' ln

with output showing a corrupted list

OUCH [1, 'foo', 3]

@errael
Copy link
Contributor

errael commented Jul 13, 2024

I didn't think to examine is the following case. The compiler can determine that no type checking is needed

This case

var ln: list<number> = [1, 2, 3]
def F()
    var xxx: list<number> = ln
    xxx[1] = 3
enddef

where the compiler knows that the types are OK also has the performance degradation.

I'm starting to think of this as a compiler, not runtime, bug.

Shouldn't/couldn't the compiler add a CHECKTYPE instruction if the target of the assignment
might have typeinfo that hasn't been checked?

@yegappan Any thoughts on compiler vs runtime and how to detect the situations vulnerable to corruption?

@errael
Copy link
Contributor

errael commented Jul 13, 2024

Shouldn't/couldn't the compiler add a CHECKTYPE instruction if the target of the assignment
might have typeinfo that hasn't been checked?

I guess at runtime if the target is any, it might have been assigned something with a type, and so should be checked. I still think it's a compiler issue, but this might be a runtime workaround that avoids the corruption but doesn't degrade performance when the compiler has done type checking.

@errael
Copy link
Contributor

errael commented Jul 14, 2024

At the entry to execute_storeindex(isn_T *iptr, ectx_T *ectx) there is SOURCING_LNUM = iptr->isn_lnum;.

The assignment appears two more times after this first time. It seems like they are not needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
vim9script Vim9 script related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[vim9script] dict corruption using items() return value
3 participants