Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make source code readable #24

Closed
FSMaxB opened this issue Sep 11, 2016 · 39 comments
Closed

Make source code readable #24

FSMaxB opened this issue Sep 11, 2016 · 39 comments

Comments

@FSMaxB
Copy link
Collaborator

FSMaxB commented Sep 11, 2016

Let's face it, cJSON's source code is looking quite horrible in the current form. The way it is it's bad for potential contributors because it is hard to read and it also makes tracking changes with git harder.

I would volunteer to reformat the code if you let me. I will follow the format you like, but here's a suggestion:

Currently:

/* Parse the input text to generate a number, and populate the result into item. */
static const char *parse_number(cJSON *item,const char *num)
{
    double n=0,sign=1,scale=0;int subscale=0,signsubscale=1;

    if (*num=='-') sign=-1,num  ;   /* Has sign? */
    if (*num=='0') num  ;           /* is zero */
    if (*num>='1' && *num<='9') do  n=(n*10.0) (*num   -'0');   while (*num>='0' && *num<='9'); /* Number? */
    if (*num=='.' && num[1]>='0' && num[1]<='9') {num  ;        do  n=(n*10.0) (*num   -'0'),scale--; while (*num>='0' && *num<='9');}  /* Fractional part? */
    if (*num=='e' || *num=='E')     /* Exponent? */
    {   num  ;if (*num==' ') num  ; else if (*num=='-') signsubscale=-1,num  ;      /* With sign? */
        while (*num>='0' && *num<='9') subscale=(subscale*10) (*num   - '0');   /* Number? */
    }

    n=sign*n*pow(10.0,(scale subscale*signsubscale));   /* number =  /- number.fraction * 10^ /- exponent */

    item->valuedouble=n;
    item->valueint=(int)n;
    item->type=cJSON_Number;
    return num;
}

After reformatting:

/* Parse the input text to generate a number, and populate the result into item. */
static const char *parse_number(cJSON *item,const char *num) {
    double n = 0;
    double sign = 1;
    double scale = 0;
    int subscale = 0;
    int signsubscale = 1;

    /* Has sign? */
    if (*num == '-') {
        sign = -1;
        num  ;
    }

    /* is zero */
    if (*num == '0') {
        num  ;
    }
    if ((*num >= '1') && (*num <= '9')) {
        do {
            n = (n * 10.0)   (*num   - '0');
        } while ((*num >= '0') && (*num <= '9')); /* Number? */
    }
    /* Fractional part? */
    if ((*num == '.') && (num[1] >= '0') && (num[1] <= '9')) {
        num  ;
        do {
            n = (n * 10.0)   (*num   - '0');
            scale--;
        } while ((*num >= '0') && (*num <= '9'));
    }
    /* Exponent? */
    if ((*num == 'e') || (*num == 'E')) {
        num  ;
        /* With sign? */
        if (*num == ' ') {
            num  ;
        } else if (*num == '-') {
            signsubscale = -1;
            num  ;
        }
        while ((*num >= '0') && (*num <= '9')) {/* Number? */
            subscale = (subscale * 10)   (*num   - '0');
        }
    }

    /* number =  /- number.fraction * 10^ /- exponent */
    n = sign * n * pow(10.0, (scale   subscale * signsubscale));

    item->valuedouble = n;
    item->valueint = (int) n;
    item->type = cJSON_Number;
    return num;
}
@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 11, 2016

The most important guidelines I want to follow:

  • No oneliners
    • no oneline loops
    • no oneline ifs
    • no multiple variable declarations in one line
  • use parenthesis for every control structure
  • don't rely on operator precedence rules but make it obvious how an expression is evaluated eg.: (a > b) && (c < d) instead of a>b && c<d

@DaveGamble
Copy link
Owner

It seems I have an allergy to whitespace.
I find the first version a lot easier to read - it takes up less space on the screen and all the nested structures are very simple patterns.
To my eyes, the second version looks much longer and more complicated.

Is this just me?

@hustanhuang
Copy link

A person who is quite familiar with these code won't have much trouble reading the first version.

But it would be a tough task for a beginner to read them.

This repository was recommended to me a few months ago and I really love it. But I do have to say that I spent a lot of time reformatting it during my reading.
(luckily I've got my AStyleFormatter)

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 12, 2016

I just noticed that I'm not the first person to suggest a change of the formatting style, @blakemcbride did it before me in #21.

@DaveGamble I don't think it's just you, but you're in the minority by a large margin.

But let me explain why a change of the formatting is required in my opinion.

Consistency

When reading code it is important to be able to see patterns rightaway without having to manually parse them.
Therefore code should be at least consistent in the way it is formatted.

As long as it's consistent, the coding style can be one of many but it's important to stick to that.
That is why most projects follow one style and at least most bigger ones have an actual document that describes how code should look like.
Some examples:

This is just to show you that most projects out there do have their own style that they stick to, I don't particularly agree with some of them though in some of the details.

But there's more to consider.

Why multiple lines instead of one

I think most people think of source code in terms of lines, so usually you want one line to only do one thing.
But there is another thing to consider here: What happens if you make changes to your oneliners?
Most source code management tools like git work with code in terms of lines.
They are much better at merging changes if they have been made in different lines.
Even tools like diff usually look at code as a bunch of separate lines.

Let's take this example, if for some reason num should be replaced with num (only for demonstration purposes):
Before the change:

if (*num=='e' || *num=='E')     /* Exponent? */
{   num  ;if (*num==' ') num  ; else if (*num=='-') signsubscale=-1,num  ;      /* With sign? */
    while (*num>='0' && *num<='9') subscale=(subscale*10) (*num   - '0');   /* Number? */
}

After the change:

if (*num=='e' || *num=='E')     /* Exponent? */
{   num  ;if (*num==' ')   num; else if (*num=='-') signsubscale=-1,  num;      /* With sign? */
    while (*num>='0' && *num<='9') subscale=(subscale*10) (*  num - '0');   /* Number? */
}

Now the diff looks like this:

2,3c2,3
< {   num  ;if (*num==' ') num  ; else if (*num=='-') signsubscale=-1,num  ;      /* With sign? */
<     while (*num>='0' && *num<='9') subscale=(subscale*10) (*num   - '0');   /* Number? */
---
> {   num  ;if (*num==' ')   num; else if (*num=='-') signsubscale=-1,  num;      /* With sign? */
>     while (*num>='0' && *num<='9') subscale=(subscale*10) (*  num - '0');   /* Number? */

Can you see what has changed?

Now let's take the reformatted version and do the same change:
Before the change:

/* Exponent? */
if ((*num == 'e') || (*num == 'E')) {
    num  ;
    /* With sign? */
    if (*num == ' ') {
        num  ;
    } else if (*num == '-') {
        signsubscale = -1;
        num  ;
    }
    while ((*num >= '0') && (*num <= '9')) {/* Number? */
        subscale = (subscale * 10)   (*num   - '0');
    }
}

After the change:

/* Exponent? */
if ((*num == 'e') || (*num == 'E')) {
      num;
    /* With sign? */
    if (*num == ' ') {
          num;
    } else if (*num == '-') {
        signsubscale = -1;
          num;
    }
    while ((*num >= '0') && (*num <= '9')) {/* Number? */
        subscale = (subscale * 10)   (*  num - '0');
    }
}

Look at the diff now:

3c3
<     num  ;
---
>       num;
6c6
<         num  ;
---
>           num;
9c9
<         num  ;
---
>           num;
12c12
<         subscale = (subscale * 10)   (*num   - '0');
---
>         subscale = (subscale * 10)   (*  num - '0');

And once two people start changing the same line, things get really ugly.
It's much easier to do the merging if the changes are in separate lines, which is why it is a good idea to write only one thing per line.

Why use parenthesis for every control structure

Some style guides prefer not to use parenthesis for control structures, OpenSSL and GNU do this as far as I know.

I still don't think this is a good idea.
This can mislead people for example.

/*
 * this could mislead people into thinking that both function
 * calls are covered by the condition.
 */
if (condition)
    do_something();
    do_something_outside_the_condition();

And stuff like this can happen by accident, sometimes even with severe security implications, see Goto Fail.

Also if you want to add one line to such kind of if statement, you have to actually change 3 lines.
This will also be visible in the diff.

Before the change:

if (condition)
    do_something();

After the change:

if (condition) {
    do_something();
    do_something_else()
}

Diff:

1c1
< if (condition)
---
> if (condition) {
2a3,4
>     do_something_else()
> }

If parentheses are used in the first place, the diff looks like this:

2a3
>     do_something_else()

Another argument for always using parentheses is because it is a more uniform and easier convention to follow.

Why you shouldn't rely on operator precedence

For the example code I reformatted in my earlier comment it might not look like it's necessary, but take a look at this example:

if (a||b && c||d) {
    do_something();
}

This looks like the programmer intended it to be:

if ((a || b) && (c || d)) {
    do_something();
}

But in reality it is:

if (a || (b && c) || d) {
    do_something();
}

You can avoid these kinds of errors by just always specifying your intent clearly.

I hope this explains some of the reasoning behind what I was saying earlier.

@DaveGamble
Copy link
Owner

Max, would you be up for taking this on? Could I make you a collaborator for the repo?

@DaveGamble
Copy link
Owner

DaveGamble commented Sep 12, 2016

NB: I agree that:

if (something)
    do_something();

is entirely dangerous. I'm less convinced that:
if (something) do_something();

is problematic.

On all other points, I think I can largely agree.

@blakemcbride
Copy link

if (something) do_something();
Is problematic because you can't put a break point on do_something()

On Mon, Sep 12, 2016 at 1:34 PM, Dave Gamble [email protected]
wrote:

NB: I agree that:

if (something)
do_something();

is entirely dangerous. I'm less convinced that:
if (something) do_something();

is problematic.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#24 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AGKwAA1RoN28hS3be6IJVjiF3wc_nERDks5qpZtOgaJpZM4J6D9d
.

@blakemcbride
Copy link

On Mon, Sep 12, 2016 at 1:12 PM, Philip Taylor [email protected]
wrote:

....

Subjectively, I think cJSON's style is (in my opinion) clearly ugly, and
such an obvious lack of taste makes me question the judgement of the
developers and the quality of the code and makes me assume there are deeper
problems that aren't as obvious. Maybe that's not fair, but it's how I tend
to react when reading code like this.

Dave's response to my original comment was basically "It's my code. I do
what the hell I like with it. It's too bad if you don't like it."

My guess is he writes it like he does to make the point that he does what
he wants and isn't controlled by anyone. Do you know people like that?

I just moved on.

Blake

@DaveGamble
Copy link
Owner

I'm recanting on my position, having had a few more requests. Given that, your reply feels rather uncalled for. It wasn't my intention to ruin your day by not immediately acquiescing to a single request, however reasonable you felt it was.

Is there some kind of game in the open source world where you write unkind comments to people volunteering their free time, that I'm not in on?

@DaveGamble
Copy link
Owner

In practical terms, this comes down to making the library better. White space alone does not achieve that. On the other hand, collaboration does. Max has made excellent and valuable contributions, and hence I'm very interested to hear his input. The statistics suggest that I'm responsible for the majority of fixes and features. Perhaps that could change, although for a library with such limited scope, it's not entirely clear what that would mean. So there's balance to strike between making the code accessible to newcomers vs leaving it as something I am able to maintain for the much larger majority of users who use it without needing to trace the code.

If there were numerous issues we would see more bug reports. If people were having to debug the library, they might just mention it. Perhaps there needs to be more focus on reporting parse errors (something I've added recently), or on compliance (upcoming commit improves conformance benchmark score).

@blakemcbride
Copy link

On Mon, Sep 12, 2016 at 6:57 PM, Dave Gamble [email protected]
wrote:

I'm recanting on my position, having had a few more requests. Given that,
your reply feels rather uncalled for. It wasn't my intention to ruin your
day by not immediately acquiescing to a single request, however reasonable
you felt it was.

Is there some kind of game in the open source world where you write unkind
comments to people volunteering their free time, that I'm not in on?

Admittedly, I was rude. I apologize. I appreciate your work. At least
one of my contributions took more than two years of much more than 40 hours
per week. Followed by many years of enhancements. It includes a 350 page
and a 250 page manual. So, I feel your pain. (
https://github.com/blakemcbride/Dynace )

Blake

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 13, 2016

Let's return to the discussion on hand.

Yes, I'd be glad to do the reformatting.

About if (condition) do_something;:
It is going against the principle of always using parenthesis for control structures. And it makes the diff look worse if a change is made. Also as blakemcbride mentioned, you cannot set a breakpoint in a debugger. Actually I haven't thought of this aspect at all.

I also found another potential problem:

#define multiline_macro(parameter) do_something(parameter);\
    do_some_more();

if (condition) multiline_macro(parameter);

Although there are no macros like that in cJSON and it's better to write macros that are surrounded by curly braces, using the curly braces for every condition is still a good safeguard.

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 13, 2016

So let's discuss the style that we are going for. My suggestion:

if. while, do while, for

if (condition) {
    do_something();
}

for (int i = 0; i < 10; i  ) {
    do_something();
}

do {
    something();
} while (condition);

while (condition) {
    do_something();
}

switch case:

switch (number) {
    case '0':
        do_something();
        break;
    case '1':
    case '2'":
        do_something_else();
        break;
    default:
        whatever();
}

assignments and declarations:
```c
double one = 1;
double two = 2;
double three = one   two;

Tabs/spaces?
Currently cJSON is using both, but mostly tabs, so we should stick with tabs I guess? (I personally prefer tabs because it leaves the decision of how wide the indentation is up to the individual programmer).

conditions:

bool condition = (a || b) && (c || d);

But if you prefer less spaces this can be done of course.

Anything you'd like me to do differently?

Also you were writing that you are currently working on some more changes, so should I wait until you're done with that before I do the reformatting?

@fgimian
Copy link

fgimian commented Sep 13, 2016

If I could please also add just a few minor suggestions:

  • I think it's better to switch to 4 space soft indent instead of tabs (makes it easier to read on GitHub and will always look the same in any editor)
  • I personally prefer that the opening brace is on the following line (and think Dave does too)

e.g.

if (condition)
{
    do_something();
}

for (int i = 0; i < 10; i  )
{
    do_something();
}

do
{
    something();
} while (condition);

while (condition)
{
    do_something();
}

Another little point for me is:

  • don't rely on operator precedence rules

I would argue that we SHOULD rely on operator precedence, particularly for logical operators.

e.g.

    if (*num >= '1' && *num <= '9')
    {
        do
        {
            n = (n * 10.0)   (*num   - '0');
        } while (*num >= '0' && *num <= '9'); /* Number? */
    }

Apart from that, I personally feel that @FSMaxB has made some good suggestions here.

Kindest Regards
Fotis

@benlachman
Copy link

Ah we've arrived at the point where we get to niggle about age old disputes. Personally, I dislike the added (and IMO useless) whitespace that braces on their one line create. But, I think this sort of thing, as well as the indentation style, should default to @DaveGamble's preference since there are tons of people on both sides of small style nits like this and no generally agreed norm.

@fgimian
Copy link

fgimian commented Sep 14, 2016

Agreed, I can adapt to both styles. My suggestions were based on my knowledge of Dave's codin style (I work with him) 😄

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 14, 2016

@DaveGamble As soon as you make your decision I can start reformatting.

Let's break it up in some boolean choices to make it is easier:

A) Parenthesis for functions and control structures:

if (condition) {
    do_something();
}
if (condition)
{
    do_something();
}

B) Operators

double a=b c;
double a = b   c;

C) Function parameters

function(a, b, c);
function(a,b,c);

D) Indentation

  1. Tabs
  2. Spaces (how many?)

And a list of other things that should be changed from my proposal of course (if there are any).

@fgimian
Copy link

fgimian commented Sep 15, 2016

Thanks @FSMaxB 😄

@DaveGamble just an fyi; in our codebase, I'm personally using A2, B2, C1, D 4 spaces 😄

@blakemcbride
Copy link

blakemcbride commented Sep 15, 2016

It is my opinion that going back and forth on various coding styles is antithetical to the problem. Any of the standard styles is a vast improvement over what is there now, and no one is ever going to come to a consensus as to the best style. Let the author pick his favorite standard style and run with it, period.

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 27, 2016

Ping @DaveGamble

@DaveGamble
Copy link
Owner

DaveGamble commented Sep 27, 2016

@FSMaxB did you reach a decision as to whether you'd accept becoming a collaborator and taking this task on?

@DaveGamble
Copy link
Owner

Alternatively, if anyone wants to generate a pull request, I'm open to it :)

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 27, 2016

Yes, I'd like to do this as soon as you decide on the coding style.

@DaveGamble
Copy link
Owner

A2, B2, C1, D4 please. :)

Collaborator request sent :)

@vikerian
Copy link

I'm working on it, btw i found something "hardly understandable" in
source code, i made comments like /* [email protected] .....*/ to that places.

Viker

On 09/27/2016 03:01 PM, Dave Gamble wrote:

Alternatively, if anyone wants to generate a pull request, I'm open to
it :)


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#24 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AIiwOFLmFakQnmxE7VxP0lVHxKuOaszIks5quROsgaJpZM4J6D9d.

S pozdravem Vilem Kebrt
email: [email protected]
tel: 420 604 550 132

@DaveGamble
Copy link
Owner

@vikerian @FSMaxB are you both happy to interact on this?

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 27, 2016

I'll start working on this. It's probably best to do function by function commits so it is easier to merge later. I'll create a WIP pull request.

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 27, 2016

Is it Ok if I add comments if I think something needs more explanation?

@DaveGamble
Copy link
Owner

Of course! That would be great! :)

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 27, 2016

@vikerian: If you want to contribute, why don't you start from the bottom of the file and I do it from the top? If we do one commit per function, we can merge it later.

@vikerian
Copy link

already started from top.
damn. ok. i will pull actual and start from bottom :)

@vikerian
Copy link

pull created (2 functions from bottom), doing it in work, so not enough time to make more, will continue :-)

@fgimian
Copy link

fgimian commented Sep 29, 2016

clang-format will help you greatly guys 😄

Here's the config I'm using which I believe aligns with Dave's request:

---
# We'll use defaults from the LLVM style, but with 4 columns indentation.
BasedOnStyle: LLVM

# Maximum line length
ColumnLimit: 99

# Indentation settings
IndentWidth: 4

# Ensure that opening braces are always on their own line
BreakBeforeBraces: Allman

# Ensure that individual cases in a switch are indented
IndentCaseLabels: true

# Ensure that class keywords public, private and protected at all the way to the
AccessModifierOffset: -4

# Ensure that arguments for long lines start on the following line
AlignAfterOpenBracket: AlwaysBreak

# Don't allow one line functions
AllowShortFunctionsOnASingleLine: None

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 29, 2016

I've been trying to use ClangFormat and uncrustify, but it messes up the comments and I take the reformatting as an opportunity to read and understand the code again and add more comments.

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Sep 30, 2016

Status:

@fgimian
Copy link

fgimian commented Oct 1, 2016

Looks awesome so far @FSMaxB ! 😄

@DaveGamble
Copy link
Owner

Awesome work @FSMaxB !!! :D:D

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Oct 10, 2016

I will continue with the tests and cJSON_Utils as soon as I find the time.

@FSMaxB
Copy link
Collaborator Author

FSMaxB commented Oct 29, 2016

Done, all the C code has been reformatted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants