April 27, 2023 03:08 pm GMT

Musings on C & C++ Declarations

I'm still uncertain about the language declaration syntax, where in declarations, syntax is used that mimics the use of the variables being declared. It is one of the things that draws strong criticism, but it has a certain logic to it.

Dennis M. Ritchie, Creator of C

I consider the C declarator syntax an experiment that failed.

Bjarne Stroustrup, Creator of C++

Prologue

Often, I come across explanations of C and C++ declarations that try to simplify things for beginners. For example, declarations are often explained (incorrectly) like:

type name ;

However, if you look at the formal grammar for C, nothing like that appears. Instead, you find this:

declaration-specifiers init-declarator-list ;

The declaration-specifiers comprise a base type (like int, char, etc.), optional qualifiers (like const), and an optional storage class (like static, extern, etc.). Specifically, they do not include [] for arrays, () for functions, or * for pointers; those things are part of the declarator.

That's indisputably more complicated, so I understand the motivation for trying to simplify things for beginners. However, in the long run, the simplification is a disservice: it gives beginners the wrong mental model, which eventually makes complicated declarations harder for them to understand. It's better to explain C declarations as they actually are.
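To make the split between specifiers and declarators concrete, here is a minimal sketch. The function name specifiers_demo and the values are made up for illustration; the point is only that one set of declaration-specifiers is shared by several declarators, each of which adds its own [] or *.

```c
#include <assert.h>

/* Hypothetical demo function; the name and values are illustrative only. */
int specifiers_demo(void) {
    /* declaration-specifiers: static const int
       declarators:            k, a[4], *r       */
    static const int k = 7, a[4] = { 1, 2, 3, 4 }, *r = &k;

    int n = (int)(sizeof a / sizeof a[0]);  /* a is an array of 4 const int */

    r = &a[1];  /* allowed: const qualifies the int, not the pointer r */

    return k + n + *r;  /* 7 + 4 + 2 */
}
```

Note that the * belongs to r alone: k and a are not pointers even though they share the same specifiers.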

Introduction

As part of designing a programming language, you generally need to design a separate syntax for declaring things (variables, constants, functions, etc.). The advantage of a separate syntax is that it's usually clear; the (slight) disadvantage is that a separate syntax doesn't tell you how to use the thing being declared. For example, to declare api as an array of pointers to integer in Pascal:

api: array[0..4] of ^integer;

which is crystal clear, but to use the variable, you'd write something like:

api[0]^ := 42;

Notice that:

  1. In the declaration, api and [ are not adjacent (whereas in the use they are).
  2. In the declaration, ^ is prefix (whereas in the use it's postfix).

Pascal was chosen for this example since it was the dominant language used for computer science education in the 1970s, when C had only recently been invented; plus, Kernighan famously doesn't like Pascal.

As the epigraph suggests, Ritchie took a different approach for C. To declare api as an array of pointers to integer in C, you write the name as if it's being used in an expression (part of the main syntax for the language), then prepend a base type to the whole thing: the type of the expression.

int *api[4];    // array of pointer to integer

That is, *api[...] is how you'd use it to yield an int. While a bit strange, it does, as Ritchie noted, have a certain logic to it. However, once declarations get more complicated, and once things like const and function prototypes were added to C (neither of which existed in the original version of C), declarations infamously get harder to read.
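As a small sketch of "declaration mimics use", the following shows the expression *api[i] appearing in code with the same shape it has in the declaration. The function names sum_pointees and declaration_mimics_use are hypothetical, chosen only for this illustration.

```c
#include <assert.h>

/* Hypothetical helper: sums the ints that an array of 4 pointers refers to.
   (As a parameter, int *api[4] is adjusted to int **api.) */
int sum_pointees(int *api[4]) {
    int total = 0;
    for (int i = 0; i < 4; ++i)
        total += *api[i];   /* same shape as the declaration: *api[i] is an int */
    return total;
}

/* Declares api exactly as in the text, then uses it the way it was declared. */
int declaration_mimics_use(void) {
    int a = 1, b = 2, c = 3, d = 4;
    int *api[4] = { &a, &b, &c, &d };   /* array of pointer to integer */

    *api[0] = 42;        /* the C counterpart of Pascal's api[0]^ := 42 */
    assert(a == 42);     /* the assignment went through the pointer */

    return sum_pointees(api);   /* 42 + 2 + 3 + 4 */
}
```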

ANSI C & C++ Complications

By virtually any measure, ANSI C improved upon the original C (often referred to as K&R C, from the first edition of The C Programming Language). For declarations, the additions of const, void, and function prototypes were improvements overall, but they made declarations slightly more complicated in some cases and violated the spirit of Ritchie's design in others.

const

The addition of const made pointer declarations more complicated because there are two things that can be constant: the value pointed to (pointer to const) and the pointer itself (const pointer). Either one, or both, can be const:

const char *pcc;         // pointer to const char
char *const cpc;         // const pointer to char
const char *const cpcc;  // const pointer to const char

Such declarations are also inconsistent in that for the base type, const is often written to the left of the type (const char), but for pointers, const must be written to the right of the *. To make things more consistent, some people (myself included) prefer "right" (or "east") const so that const always appears to the right (east) of what it's making constant:

char const *pcc;         // equals: const char *pcc
char const *const cpcc;  // equals: const char *const cpcc

Read from right-to-left, the second declaration is: cpcc is a constant pointer to a constant character.
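The practical difference between the three forms is what each one lets you modify. The sketch below (const_demo is a hypothetical name) shows the permitted operations; the forbidden ones are left as comments because they would not compile.

```c
#include <assert.h>

/* Hypothetical demo: which operations each const placement allows. */
int const_demo(void) {
    char buf1[] = "abc";
    char buf2[] = "xyz";

    char const *pcc = buf1;         /* pointer to const char */
    char *const cpc = buf1;         /* const pointer to char */
    char const *const cpcc = buf1;  /* const pointer to const char */

    pcc = buf2;         /* OK: the pointer itself is not const */
    /* *pcc = 'X'; */   /* would not compile: the char is const through pcc */

    *cpc = 'A';         /* OK: the char is not const through cpc */
    /* cpc = buf2; */   /* would not compile: cpc itself is const */

    /* Neither *cpcc = 'X' nor cpcc = buf2 would compile. */

    /* cpcc still sees the change made through cpc: same underlying buffer. */
    return pcc[0] == 'x' && cpcc[0] == 'A';
}
```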

void

While the addition of void allowed pointers to raw, untyped memory, pointer-to-void declarations violate the spirit of Ritchie's design of making declarations mimic their use. Consider:

void *p;

The problem is that *p can never appear in use: it's illegal to dereference a pointer to void because void objects don't exist.
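In practice, a pointer to void has to be converted to a pointer to some object type before it can be dereferenced. A minimal sketch, using the hypothetical helpers read_int and void_demo:

```c
#include <assert.h>

/* Hypothetical helper: reads an int from untyped memory. */
int read_int(void *p) {
    /* return *p; */      /* would not compile: *p has type void, which has no value */
    return *(int *)p;     /* convert to int * first, then dereference */
}

/* Passes the address of an int through a void * and reads it back. */
int void_demo(void) {
    int n = 42;
    void *p = &n;         /* any object pointer converts to void * implicitly */
    return read_int(p);
}
```

So the declaration void *p tells you how p is stored, but not how it is used; the use always involves a different type.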

Function Prototypes

The addition of function prototypes (borrowed from C++) was most certainly an improvement overall, but prototype syntax is inconsistent with non-prototype declarations. Non-prototype declarations allow multiple things to be declared in the same declaration:

int i, j;
int *p, *q;
int k, a[4], *r, f(), *g();

In such declarations, commas are used to separate declarations having the same base type. However, prototype declarations for functions having more than one parameter like:

int lcd( int i, int j );

use commas to separate declarations even when the base type is the same. This means you can't declare multiple parameters having the same base type while specifying the base type only once. Attempting to do so is likely a mistake in C:

double f( double x, y );         // means: double x, int y

The y is an int because the base type is missing and a missing base type in C defaults to int. Fortunately, C compilers warn about this. (In C++, this is an error.)

Personally, I think Stroustrup should have made prototype declarations use the same syntax as non-prototype declarations. For example:

double f( double x, y; int r );  // alternate syntax

That would allow multiple parameters having the same base type to reuse it. Semicolons would be used to separate parameters only when the base type changes. Such a syntax would also have been closer to Ritchie's original function definition parameter syntax:

double f( x, y, r )              // K&R C function definition
    double x, y;
    int r;

The only difference would have been to move the declarations inside the ().

C++ References

The addition of references in C++ made it possible to pass large objects as function arguments efficiently and transparently, particularly for operator overloading. However, while reference declarations like:

int i;
int &r = i;

are consistent in the sense that you replace * for a pointer declaration with & for a reference declaration, they violate the spirit of Ritchie's design since & in expressions does not mean "dereference" but instead means "address of."

Here be Dragons

You might think a declaration like int *api[4] isn't that bad; however, if you want to declare a pointer to an array of integer, you'd have to write:

int (*pai)[4];  // pointer to array of integer

Specifically, you need to add () to get the precedence right. The problem stems from the fact that * is a prefix operator whereas [] is a postfix operator with higher precedence. (If * were a postfix operator, as ^ is in Pascal, this problem wouldn't exist.)
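The two declarations really do name different types, and the parentheses carry over into the way each is used. A minimal sketch (precedence_demo is a hypothetical name):

```c
#include <assert.h>

/* Hypothetical demo: array of pointers versus pointer to array. */
int precedence_demo(void) {
    int row[4] = { 10, 20, 30, 40 };

    int *api[4] = { &row[0], &row[1], &row[2], &row[3] };  /* array of 4 pointer to int */
    int (*pai)[4] = &row;                                   /* pointer to array of 4 int */

    /* sizeof confirms the shapes: api holds 4 pointers, *pai is 4 ints. */
    assert(sizeof api == 4 * sizeof(int *));
    assert(sizeof *pai == 4 * sizeof(int));

    /* Use mirrors declaration in both cases, parentheses included. */
    return *api[2] + (*pai)[3];   /* 30 + 40 */
}
```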

Declarations can get infinitely more complicated. For example:

char *(*strtab[4])();

where strtab is an array of pointer to function returning pointer to char; or even worse:

void (*signal(int sig, void (*f)(int)))(int);

where signal is a function (sig as int, f as pointer to function (int) returning void) returning pointer to function (int) returning void.

Fortunately, Ritchie also invented typedef, which can be used to slay such dragons:

typedef char *(*PF_C)();  // pointer to function returning pointer to char
PF_C strtab[4];
typedef void (*sig_t)(int);
sig_t signal( int sig, sig_t f );

Therefore, declarations generally aren't that bad in practice.
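As a sketch of how a typedef tames the strtab-style declaration, the following declares the same table twice, once raw and once through a typedef. The names str_fn, hello, world, raw_tab, and typedef_demo are all hypothetical, and the parameter lists are written as (void) rather than () so they are proper prototypes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical typedef: pointer to function (void) returning pointer to char. */
typedef char *(*str_fn)(void);

static char *hello(void) { return "hello"; }
static char *world(void) { return "world"; }

/* Without the typedef: array of 2 pointer to function returning pointer to char. */
static char *(*raw_tab[2])(void) = { hello, world };

/* With the typedef, the same type reads like an ordinary array declaration. */
static str_fn strtab[2] = { hello, world };

/* Calls through both tables; returns 1 if they behave identically. */
int typedef_demo(void) {
    return strcmp(raw_tab[0](), strtab[0]()) == 0
        && strcmp((*strtab[1])(), "world") == 0;  /* (*f)() and f() are equivalent calls */
}
```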

Additionally, you can use cdecl both to decipher and compose declarations.

West Pointers

Despite the reality that C declarations are not:

type name ;

some well-intentioned people try to make things appear to be so by putting the * in pointer declarations to the left (west) of the space:

char* s;     // as opposed to: char *s

While such declarations work since the C compiler doesn't care about whitespace, it also doesn't care about:

char* s, t;  // t is just char

where you likely meant for t to be char* also. The same people then tend to say that you shouldn't declare multiple things in the same declaration anyway and should instead write:

char* s;
char* t;     // verbose

Personally, I find that needlessly verbose for what otherwise would be trivial declarations.
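The char* s, t pitfall is easy to confirm. In this sketch (west_pointer_demo is a hypothetical name), the initializer t = c compiles cleanly only because t is a plain char, not a pointer:

```c
#include <assert.h>

/* Hypothetical demo: the * binds to the declarator, not to the base type. */
int west_pointer_demo(void) {
    char c = 'x';

    char* s = &c, t = c;     /* s is char *, but t is just char */
    char *u = &c, *v = &c;   /* each declarator carries its own * */

    return sizeof t == 1                /* t is a single char */
        && sizeof s == sizeof(char *)   /* s really is a pointer */
        && *s == t                      /* both refer to the value 'x' */
        && u == v;                      /* u and v are both pointers to c */
}
```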

For an analogy: when learning Spanish, you learn that adjectives go after nouns. Whether you want adjectives to go before to match your English-centric view is irrelevant. You have to speak Spanish the way it is, not the way you'd prefer it to be. So too with C.

Epilogue

C is quirky, flawed, and an enormous success.

Dennis Ritchie

When teaching C, it's best in the long run to teach it as it actually is, warts and all.


Original Link: https://dev.to/pauljlucas/musings-on-c-c-declarations-169o
