Generic Programming, Typedefs & Macros in C programming Class 44

Chapter 14

Generic Programming

Writing generic and reusable software is one of the pinnacles of software design. It permits the creation of a toolbox of generally useful code modules or libraries that may be used time and again.
The antithesis of generic design is a set of functions that possess virtually identical code, but di er in terms of the types they operate on. The basic purpose of generic programming is to avoid such code repetition; to define an algorithm or data-structure and that works with a variety of di erent types. Coding for the general case tends to reduce the overall coding e ort, reduce the likelihood of errors, and increase code modularity and reuse.

Some of the topics presented below have been covered in previous chapters. This chapter brings them together under the guise of writing generic programs. The most powerful feature in the C language for generic programming is the ability to cast between di erent types, particularly pointer types, and, most significantly, the generic pointer type void*. This chapter examines the use of void* in generic software design in terms of two examples: an expandable array and the standard library function qsort(). The discussion of qsort() also shows how function pointers can be used to enhance the flexibility of a generic design.

14.1 Basic Generic Design: Typedefs, Macros, and Unions

Typedefs, macros, and unions facilitate limited forms of generality, with each permitting increased flexibility in a specific area of code design. Typedefs provide type-name synonyms, macros circum-vent C type-checking, and unions define variant types.

14.1.1 Typedefs

The keyword typedef permits the creation of new type names for existing types. These type syn-onyms permit an algorithm to be written such that it can work with di erent types by simply changing the appropriate typedefs. For example, by defining the name

typedef char Item;

an algorithm defined in terms of Item can be changed to work with doubles by simply changing the typedef to

typedef double Item;

This technique is demonstrated in the expandable vector example in Section 11.7, where the variable type contained by the vector is specified by a typedef.

The limitation of typedef is that its type is fixed at compile time, and an algorithm cannot be used with di erent types in a single program.

14.1.2 Macros

Macros can be used with di erent types in a given program because they avoid the type-constraints of the C compiler. The C preprocessor expands macro text inline, so that macros essentially automate code duplication. For example, given the macro

#define MAX(x,y) ((x)<(y) ? (y) : (x))

and a set of variables

int x = 1, y = 2, z;

double a = 3.2, b = -9.1, c;

then the following code

z = MAX(x,y);

c = MAX(a / 2.7,b + 1);

expands to

z = ((x)<(y) ? (y) : (x));

c = ((a / 2.7)<(b + 1) ? (b + 1) : (a / 2.7));

As another example, the macro below returns the number of elements in an array x.

#define NELEMS(x) (sizeof(x)/sizeof(x[0]))

This macro works for arrays of any type including arrays of doubles, pointers, structures, and so forth.

Using macros for anything but the most trivial of operations is not advised. To compose any significant algorithm as a macro is both notationally inconvenient and error prone. Macros do not obey the type and scope rules imposed by the C language, and can introduce cryptic syntactical errors if not used with caution.

14.1.3 Unions

Unions permit the definition of variant types—types that may represent one of several di erent types. This facility is useful if one wishes to pass various types at di erent times through the same function interface. It is also used to create a collection of di erent types (i.e., to create a heterogeneous array).

An example of a heterogeneous array is given in Section 11.8, where we noted that unions are usually wrapped in a structure, which defines the type of the value currently stored by the union. The following example shows a similar design for passing variant types to a function.

1 enum VariantType { INT, FLOAT, STRING };

2

3 typedef struct {

4enum VariantType type;

5 union {
6 int i;

7 float f;

8 char *s;

9 } value;

10 } Variant;

11

12 void interpret command(Variant command)

13 {

14 switch (command.type)

15 {

16 case INT: printf("Got integer %d\n", command.value.i); break;
17 case FLOAT: printf("Got float %f\n", command.value.f); break;
18 case STRING: printf("Got string %s\n", command.value.s); break;
19 default: printf("Unknown type %d\n", command.type); break;

20 }

21 }

The function interpret_command() might be passed information as follows.

Variant v, w;

v.type = INT; v.value.i = -5;

w.type = STRING; w.value.s = "COMMAND";

interpret_command(v);

interpret_command(w);

Aside. Notice that the definition of the structure Variant (lines 3 to 10) contains an unnamed union type with a single variable value. Structures and unions do not have to be named if they define a set of variables and are never referred to again.

14.2 Advanced Generic Design: void *

The void* pointer type, in conjunction with the ability to cast between pointer types, permits a very powerful and flexible form of generic programming. With these facilities we can implement algorithms and data-structures that operate on any number of di erent types. In addition, we show how function pointers can be used to specialise certain parts of an algorithm at run-time.

This discussion is presented in terms of two examples. The first is the expandable array Vector that has been developed in Sections 9.5 and 11.7. The second is the standard library function qsort(), which is used to sort the elements of an array.

14.2.1 Case Study: Expandable Array

The following implementation of the expandable array overcomes the limitations of the previous versions. It permits any number of Vector objects to be defined, and each of these may contain elements of a di erent type. There is no longer the constraint of one item-type per program that was imposed by typedef. This increase in flexibility comes at the price of a slight decrease in e ciency and a significant loss of compile-time type-safety. However, the issue of type-safety can be redeemed somewhat by using wrapper functions, as we shall see.

The first thing to notice when reading this source code is how similar it is to the original im-plementation, which permitted just one vector and could only contain integers (see Section 9.5). A fully generic implementation is thus possible without any great increase in code complexity.

The code below shows the contents of the public interface declared in vector.h. The header exports all the declarations that should be visible to the client in order to use this module. Notice that the definition of struct Vector is not exported. All operations on Vector are performed by its associated functions, and the client only ever uses a pointer as a handle to a Vector object. Particularly, the client never has access to the structure members. Therefore, only the structure declaration, which is called an “incomplete type” or “opaque type”,1 needs to be visible to the client.

1 Opaque types, which hide the structure internals from the client, facilitate a valuable form of object-oriented-style encapsulation that presents a high-level abstraction and hides implementation details. However, it should not be seen as the only way to design a module, and sometimes it is expedient to permit clients direct access to structure members, in which case the entire structure definition should be placed in the public interface (i.e., in the header). In general, no design principle should be used blindly, but always as part of an overall plan to improve program functionality, maintainability, etc.

1 #ifndef GENERIC EXPANDABLE VECTOR H

2 #define GENERIC EXPANDABLE VECTOR H

3

4 #include <stddef.h> /* for size t */ 5 typedef struct Vector Vector;

6

7 /* Vector creation and destruction. */

8 Vector *vector create(size t capacity, size t itemsize); 9 void vector destroy(Vector *v);

10

11 /* Vector access operations. */

12 int vector push back(Vector *v, const void *item);

13 void vector pop back(Vector *v, void *item);

14 void* vector get element(Vector *v, size t index);

15 void* vector get begin(Vector *v);

16 void* vector get end(Vector *v);

17

18 /* Size operations. */

19 size t vector get item size(const Vector *v);

20 size t vector get size(const Vector *v);

21 size t vector get capacity(const Vector *v);

22 int vector set size(Vector *v, size t size);

23 int vector set capacity(Vector *v, size t size);

24
25 #endif

The next section of code is from the top of the source file vector.c. Lines 6 to 14 comprise the private interface, which contains implementation details that are internal to the Vector functionality and are hidden from the client.

1 #include <stdlib.h>

2 #include <string.h>

3 #include <assert.h>

4 #include "vector.h"

5

6 static const size t StartSize = 5; /* initial vector capacity */

7 static const float GrowthRate = 1.6f; /* geometric growth of vector capacity */

8

9 typedef struct Vector {
10 void *data;/* pointer to vector elements */

11 size t capacity;/* current reserved memory for vector */

12 size t size; /* current size of vector (number of stored items) */

13 size t itemsize;/* size of each item */

14 } Vector;

An important advantage of hiding the members of struct Vector in the private interface is that it permits di erent implementations to be used without having any e ect on client code. For example, the structure might one day be changed to

typedef struct Vector {
void *begin; /* pointer to beginning of allocated memory */
void *end; /* pointer to one-past-end of allocated memory */
void *last; /* pointer to one-past-end of last vector item */
size_t itemsize; /* size of each item */
} Vector;

This structure definition would permit alternative implementations of the functions described below that might be slightly more e cient. The key point is that these changes would be contained within the module and would not be visible to clients.

The next two functions create and destroy a Vector, respectively. Their implementation is trivial. The function vector_create() is passed two variables. The first is an initial capacity—the number of elements for which to allocate space. The second is the size of each element which, because the function is generic, is not known to the compiler and must be stated explicitly. For example, to create a Vector of integers with initial space for five values, we write

Vector v;

v = vector_create(5, sizeof(int));

15 Vector *vector create(size t capacity, size t itemsize)

16 /* Initialise a vector with the specified capacity for items of size itemsize. */

17 {

18 Vector *v = (Vector *)malloc(sizeof (Vector));

19 if (v) {

20 v->data = malloc(capacity*itemsize);
21 v->capacity = (v->data == NULL) ? 0 : capacity;
22 v->size = 0;
23 v->itemsize = itemsize;
24 }

25 return v;

26 }

27

28 void vector destroy(Vector *v)

29 /* Release memory owned by vector. */

30 {

31 free(v->data);

32 free(v);

33 }

The functions below provide access to the elements of Vector. Of these, the implementation of vector_push_back() is the most complex, but it is described thoroughly in Section 9.5. The only new features are the conversions between void* and char* pointer types, and the use of the standard library function memcpy() (see line 49).

34 int vector push back(Vector *v, const void *item)

35 /* Add element to back of vector. Return index of new element if successful, and -1 if fails. */

36 {

37 /* If out-of-space, allocate more. */

38 if (v->size == v->capacity) {

39 size t newsize = (v->capacity == 0) ? StartSize : (size t)(v->capacity*GrowthRate + 1.0);

40 void *p = realloc(v->data, newsize*v->itemsize);
41 if (p == NULL)

42 return -1;
43 v->capacity = newsize; /* allocate succeeds, update data-structure */
44
45 v->data = p;
46 }

47
48 /* We have enough room. */

49 memcpy((char*)v->data + v->size * v->itemsize, item, v->itemsize);

50 return v->size++;

51 }

52
53 void vector pop back(Vector *v, void *item)

54 /* Return element from back of vector, and remove it from the vector. */

55 {

56 assert(v->size > 0);

57 --v->size;

58 memcpy(item, (char*)v->data + v->size * v->itemsize, v->itemsize);
59 }

60

61 void* vector get element(Vector *v, size t index)

62 /* Return pointer to the element at the specified index. */

63 {

64 assert(index >= 0 && index < v->size);

65 return (char*)v->data + index * v->itemsize;
66 }

67

68 /* Return pointer to beginning of array (ie, to first element of array). */

69 void* vector get begin(Vector *v) { return v->data; }

70

71 /* Return pointer to one-past-end of array. */

72 void* vector get end(Vector *v) { return (char*)v->data + v->size*v->itemsize; }

The conversion to a char* pointer is necessary because it is not possible to perform pointer arithmetic with a void* pointer. That is, a void* pointer is simply a variable large enough to hold any pointer type; it does not specify any particular pointed-to type and does not inform the compiler as to the size of the pointed-to type. Thus, an expression such as ptr + index has no meaning for a void* pointer as the compiler cannot compute the pointer o set. Furthermore, it is not possible to dereference a void* pointer.

By converting a void* pointer to char*, we are able to perform pointer arithmetic over each byte of the pointed-to data. This gives us full control over the array of elements in Vector. For example, consider the statement on line 65, which returns a pointer to the element specified by index.

return (char *)data + index * itemsize;

This statement might equivalently have been written

char *tmp = (char *)data;

return &tmp[index * itemsize];

Notice that, since we are operating on bytes, the required pointer o set must be computed explicity: index * itemsize. However, if the type of the array elements were known, the array pointer could be cast to this type and the compiler would compute the o set automatically. For example, if the array was known to be of type double, the above statement could be rewritten as

double *tmp = (double *)data;

return &tmp[index];

but, as we don’t know the type at compile time, we must compute the o set ourselves. To copy bytes of data from one place to another, we use the standard function memcpy(), which has the following interface.

void *memcpy(void *dest, const void *src, size_t size);

This function copies size bytes from the region of memory pointed to by src to the region of memory pointed to by dest.

The functions vector_get_begin() and vector_get_end() return pointers to the first element in the array and to the one-past-the-end of the array, respectively. The use of one-past-the-end pointers is uncommon in C but ubiquitous in C++. It permits the implementation of the following idiom to iterate over an array.

float *begin, *end, *pf;

begin = (float *)vector_get_begin(v);

end = (float *)vector_get_end(v);

for (pf = begin; pf != end; ++pf)

printf("%f ", *pf);

The final set of functions shown below are provided for completeness. Their implementations present no new details not described previously.

73 /* Inquire after size of vector item. */

74 size t vector get item size(const Vector *v) { return v->itemsize; }

75
76 /* Inquire after vector size and capacity */

77 size t vector get size(const Vector *v) { return v->size; }

78 size t vector get capacity(const Vector *v) { return v->capacity; }

79
80 int vector set size(Vector *v, size t size)

81 /* Set vector size. Return 0 if successful, -1 if fails. */

82 {

83 if (size > v->capacity) {

84 void *p = realloc(v->data, size*v->itemsize);
85 if (p == NULL)

86 return -1;
87 v->capacity = size; /* allocate succeeds, update data-structure */
88
89 v->data = p;
90 }

91

92 v->size = size;
93 return 0;

94 }

95

96 int vector set capacity(Vector *v, size t size)

97 /* Shrink or grow allocated memory reserve for array.

98 * A size of 0 deletes the array. Return 0 if successful, -1 if fails. */

99 {

100 if (size != v->capacity) {

101 void *p = realloc(v->data, size*v->itemsize);
102 if (p == NULL && size > 0)

103 return -1;
104 v->capacity = size;
105
106 v->data = p;
107 }

108

109 if (size < v->size)

110 v->size = size;
111 return 0;

112 }

Techprogramers School

Generic Programming, Typedefs & Macros in C programming Class 44

Generic Programming

14.1 Basic Generic Design: Typedefs, Macros, and Unions

14.1.1 Typedefs

14.1.2 Macros

14.1.3 Unions

14.2 Advanced Generic Design: void *

14.2.1 Case Study: Expandable Array

No comments:

Post a Comment

Follow On YouTube

Categories

Techprogramers School

Generic Programming, Typedefs & Macros in C programming Class 44

Generic Programming

14.1 Basic Generic Design: Typedefs, Macros, and Unions

14.1.1 Typedefs

14.1.2 Macros

14.1.3 Unions

14.2 Advanced Generic Design: void *

14.2.1 Case Study: Expandable Array

No comments:

Post a Comment

Social Profiles

Follow On YouTube

Categories