Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repeatable output (idempotent output) #61

Closed
Justin-W opened this issue Oct 24, 2019 · 16 comments · Fixed by #72 or #78
Closed

Repeatable output (idempotent output) #61

Justin-W opened this issue Oct 24, 2019 · 16 comments · Fixed by #72 or #78
Labels
enhancement New feature or request

Comments

@Justin-W
Copy link

goplantuml should generate identical output when repeatedly executed against identical inputs/targets.

For example, when running the following 3 commands, the diff should detect no differences.

goplantuml -recursive . > goplantuml.a.puml
goplantuml -recursive . > goplantuml.b.puml
diff goplantuml.a.puml goplantuml.b.puml

However, currently the diff detects many differences between the 2 files.

Such non-idempotent output makes the output less usable for a variety of purposes. For example:

And if there is some reason or benefit from the current behavior (i.e. randomized output), then a new CLI flag could be added to give the user a choice between idempotent vs randomized output. However, I think that idempotent output should be the default behavior.

@jfeliu007
Copy link
Owner

jfeliu007 commented Oct 24, 2019

@Justin-W In order to speed up the process of creating the diagram, the project uses maps. Due to the nature of the randomness of maps in Golang, it is very difficult to maintain an order. Maps are great for O(1) operations, but they have that particularity in Golang. Other languages mimic the same patterns when using maps, but I believe Golang intentionally makes them random. There are so many pieces in a big project that I rather keep this using maps for the sake of speed. I might consider this in the future, but I would have to come up with a way of consistently changing a map into an array (Maybe alphabetically, before rendering. But it will definitely be for a much later update :).

@Justin-W , you are giving me great feedback. I am glad this tool is useful to some people.

@Justin-W
Copy link
Author

the project uses maps. Due to the nature of the randomness of maps in Golang, it is very difficult to maintain an order. ... I would have to come up with a way of consistently changing a map into an array (Maybe alphabetically, before rendering. ...

I don't think so. I think you would just need to modify a few funcs slightly. For example, in the renderStructures func (goplantuml/parser/class_parser.go:263), instead of doing for name, structure := range structures { ... }, you would simply modify the for loop to iterate over a slice of the keys (sorted by name), and then call the renderStructure func as you currently do. Code example

I suspect you would only need to do this in a handful of places to get fully idempotent output.

@jfeliu007
Copy link
Owner

Hmmm. That could work. And We only need to implement it on the render functions.

Good tip. Thanks. I’ll try that.

@jfeliu007
Copy link
Owner

Brilliant. I am reducing the amount of testing support files I need for this project.

jfeliu007 pushed a commit that referenced this issue Oct 26, 2019
Fixes #61
Make sure we iterate over ordered keys when iterating maps. This allows for a more consisten rendering. Thanks @Justin-W for this tip.
This update also remove now unnecessary testing supporting files that were checking different variants of the same rendering.
@Justin-W
Copy link
Author

FYI: #61 isn't completely fixed yet, per latest tests of master.

The following functions still aren't producing repeatable output:

  • renderStructMethods
  • renderStructFields
  • renderAliases
    Looks like you simply forgot to call the sort function in this last func. Also, the sort here probably needs to be multi-level (sorting by alias.Name, aliasString, and alias.AliasOf). Similar multi-level sort may need to be performed in each of the other funcs that render edges, too.

@jfeliu007
Copy link
Owner

renderStuctMethofs, and renderStructFields will render the methods and fields in order of appearance. They are already handled by arrays. I like that the fields and methods appear in the same order they appear in the code. At least that's what I have generated consistently with the code I have ran the parser with. As for render aliases, I do see that I am missing the sort function. I will reopen this issue for that one. But for the other two, I am not so sure.

@jfeliu007 jfeliu007 reopened this Oct 28, 2019
@Justin-W
Copy link
Author

renderStuctMethofs, and renderStructFields will render the methods and fields in order of appearance

I'm still seeing the order of class methods vary between consecutive runs. I just assumed fields was also a problem.

I like that the fields and methods appear in the same order they appear in the code.

I think that is a perfectly acceptable default order for fields and methods. But it's not the behavior I'm currently seeing.

@jfeliu007
Copy link
Owner

I'm still seeing the order of class methods vary between consecutive runs. I just assumed fields was also a problem.

That's is very interesting. I can't see anything like that in what I am running :( .

You can see the example in the Generated Diagram on the read me. You will see the parser's order of methods is the same as then ones in the class itself. Notice that the Functions and Fields in the Struct struct are actually arrays and they are filled in order of appearance. Maybe I am running and outdated version of the ast library, and now they do not guarantee the order of the fields and methods?

jfeliu007 pushed a commit that referenced this issue Oct 28, 2019
Fixes #61
This will make sure aliases are rendered in order of the combination name->package->aliasOf
@jfeliu007 jfeliu007 added the enhancement New feature or request label Oct 28, 2019
@Justin-W
Copy link
Author

Getting closer. Alias edges seem to be fixed now, but I'm still seeing 2 issues RE: method order and type values.

Non-repeatable method order

For consecutive tests against the same pacakge, I am seeing the order of certain public and private methods vary each run. For example, these 2 goplantuml output excerpts are from the puml class for the same struct:

    - consumeResourceCacheEvents() 
    - mqSetup() error

      StartMQ() 
    ' ... (all public methods except StartMQ)
    - mqSetup() error
    - consumeResourceCacheEvents() 

      StartMQ() 
    ' ... (all public methods except StartMQ)

Note how the 2 private methods get swapped, and how the StartMQ method swaps between being the first vs last public method listed (despite all of the other 10 public methods seeming to be in the same order every time).

I noticed that the StartMQ and consumeResourceCacheEvents funcs are both defined in a different .go file from the rest of the method funcs.

So I suspect the non-repeatable output is because you are processing the go files within a package in random order each run.

Type value inconsistency

This same project uses protobufs, and I am seeing the 'type' of many fields and method parameters vary between runs.

For example, here are some excerpts from one of our .go files:

import "github.com/golang/protobuf/ptypes"
import "aaa.com/bbb/ccc/pb" //a custom proto package
//...
type FakeProvider struct {
//...
    GetFileResponse *pb.File
//...
}

And here are corresponding puml output excerpts from 2 consecutive runs:

class FakeProvider << (S,Aquamarine) >> {
'...
      GetFileResponse *proto.File
class FakeProvider << (S,Aquamarine) >> {
'...
      GetFileResponse *pb.File

Note the same Field has a different 'type' in each run. I'm seeing this flip-flopping in the type's prefix (always between *pb. and *proto.) for many different field and parameter types. However, while separate blocks of the puml have different types between runs, consecutive lines in the same 'block' of methods usually seem to have the same prefix in a given run. For example, it seems like all of the methods in a single class will consistently use either one or the other of the 2 prefixes in a single run, but some other classes may use the other prefix in that same run.

I don't know if the root cause is in goplantuml vs ast vs other, but it's definitely causing non-repeatable output. And I couldn't find any evidence of different import aliases of the same package in the code I was scanning, so I don't think that the issue is caused by an inconsistency within the scanned package's code.

@Justin-W
Copy link
Author

Justin-W commented Oct 30, 2019

More info:

func (st *Struct) AddMethod(method *ast.Field, aliases map[string]string) is the func which appends items to the array which determines the order in which class methods get rendered. That func is called from func (p *ClassParser) parseFileDeclarations(node ast.Decl) within func (p *ClassParser) parsePackage(node ast.Node), which iterates over an unsorted map of the package's files.

Therefore, any package with a struct that has methods defined in more than 1 go file will render as a puml class with methods in a non-repeatable order.

Since you are want to render the class methods in the order they are defined in the code, and since rendered output is affected by the order in which a package's go files are parsed, you therefore need to parse the go files in a repeatable order within func (p *ClassParser) parsePackage(node ast.Node).

@Justin-W
Copy link
Author

you therefore need to parse the go files in a repeatable order within func (p *ClassParser) parsePackage(node ast.Node)

Note that this will also partially fix #84.

@Justin-W
Copy link
Author

@jfeliu007 Could you re-open this issue until it's fully fixed? Thanks.

@jfeliu007 jfeliu007 reopened this Oct 31, 2019
@jfeliu007
Copy link
Owner

jfeliu007 commented Jul 1, 2020

@Justin-W it's been a while since I have looked at these issues. I will work on this one now. This has been bugging me for some time but I have been busy with other projects. I am going to go with the suggestion of parsing the files in a specific order within each package. That is a very good catch.

@jfeliu007
Copy link
Owner

I also think there might be other inconsistencies in the project regarding this very issue. I'll keep an eye on anything that pops up.

jfeliu007 added a commit that referenced this issue Jul 1, 2020
@Justin-W
Copy link
Author

Justin-W commented Jul 1, 2020

Thanks @jfeliu007 :)

jfeliu007 added a commit that referenced this issue Jul 1, 2020
Added code to sort the names of the classes we generate due to complex naming that might conflict with plantuml code. These were also not ordered and where rendered based on a map
@jfeliu007
Copy link
Owner

I think I have fixed this for good. I used the elasticsearch library which would usually provide different results and now it seems to be very consistent. I Discovered that some classes we generated with names that did not conflict with the plantuml specifications (due to having dots in the names) where also being rendered based on a map. I also added the solution to the functions that you mentioned. I never thought of that :) .

jfeliu007 added a commit that referenced this issue Jul 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
2 participants