Ethical Hacking Programming, Blogging, Hosting, All Computer Software, PC Software Download, JAVA in hindi, HTML, PHP, C, C++, Free Learning, Software's Download, Technical Videos, Technical Tricks and Tips, How Make Money

Example: An Expandable Array in C programming Class 34

9.5 Example: An Expandable Array

This section presents another extended example of dynamic memory allocation, this time to demon-strate the realloc() function. realloc() is a versatile function; it can perform the tasks of both malloc() and free() and, in addition, can adjust the size of an existing dynamically allocated memory block.

In C, an array has fixed size, which is determined at compile-time. Even dynamically allocated arrays have fixed size once they have been created. In certain situations, it may be desirable to have an array that expands on demand as new elements are added to it. That is, to have an array where elements may be added to the end one-by-one, and the array grows to match the number of elements it contains; thus, the array can be built-up without knowing the number of elements in advance.

In this section, we implement an expandable array of integers. The public interface for this array is exported in a header file vector.h, which is shown below.


1 /* Expandable vector, grows in-place as new elements are added to the back. 2 * The vector memory allocation is exponential to avoid excessive copying. */

3

4   /* Vector access operations. */

5   int push back(int item);

6   int pop back(void);

7   int* get element(int index);

8

9   /* Manual resizing operations. */

10 int get size(void);

11 int set size(int size);

12 int get capacity(void);

13 int set capacity(int size);


5–7 The vector provides three functions for element access. The first, push_back(), adds a new element to the end of the array; the second, pop_back(), returns the last element and then removes it from the array; and the third, get_element(), returns a pointer to a specified array element. The get_element() function permits the expandable vector to be accessed via pointer arithmetic like an ordinary C array.

10–13 In addition to growing the vector by adding elements to the end, there are four functions to provide control of the vector size. The first of these, get_size() and set_size(), obtain and set the size of the vector, respectively. The last two, get_capacity() and set_capacity(), directly control the reserve memory allocation for the array. These functions are explained in more detail below.

The implementation of the expandable array is defined in file vector.c. This file hides the private interface, which consists of several constants and external variables, and also hides the module’s dependence on standard headers stdlib.h (for realloc()) and assert.h (for assert()).

Arrays by definition occupy a contiguous region of memory, and this creates an issue for devel-oping an expandable array. As the array grows, it becomes necessary to request more memory, and often a region can only be extended so far before there is no more space available.2 At this point,


2 The heap permits blocks of memory to be allocated and deallocated in any order, and the dynamic allocation functions internally manage the location and size of each block. Thus, the heap tends to become fragmented over time, so that any given block may have a limited region of extra space available before it runs into a region allocated to another block.

if realloc() is asked to further extend the memory block, it will allocate a completely new block and copy the data from the old block across to the new one. An array that copies its data from one place to another every time it grows will be very ine cient, and a naive implementation will become increasingly slow. However, with an appropriate allocation scheme, an expandable array can be made to have constant-time complexity on average. The key is to allocate memory in geometrically increasing chunks, so that there is an ever growing amount of free-space allocated as the array size increases. This means that the frequency of allocation, and hence the cost of copying data, decreases with array size at the expense of some wasted space.


1 #include "vector.h"

2   #include <stdlib.h>

3   #include <assert.h>

4

5   static const int StartSize = 1; /* initial vector size */

6   static const float GrowthRate = 1.5; /* geometric growth of vector capacity */

7

8   static int *data = NULL;  /* pointer to vector elements */

9 static int vectorsize = 0; /* current size of vector */

10 static int capacity = 0;/* current reserved memory for vector */

11
12 int push back(int item)

13 /* Add element to back of vector. Return index of new element if successful, and -1 if fails. */

14 {

15 /* If out-of-space, allocate more. */

16 if (vectorsize == capacity) {

17 int newsize = (capacity == 0) ? StartSize : (int)(capacity*GrowthRate + 1.0);

18 int *p = (int *)realloc(data, newsize*sizeof (int));

19 if (p == NULL)
20 return −1;

21

22

23

24 }
25
26 /* We have enough room. */

27 data[vectorsize] = item;

28 return vectorsize++;

29 }


8–10 The external variables data, vectorsize, and capacity define the current state of the vector; data points to the array’s dynamic memory block; vectorsize defines the current array size; and capacity records the amount of memory allocated. The value of vectorsize must always be less than or equal to capacity. Notice that these variables are all given initial values, even though they would be initialised to these values by default. Explicit initialisation improves code readability by making ones intentions clear.

16 When a new element is added to the end of the array, its size increases by one. If the vector size is already equal to the amount of available memory, additional memory must be allocated.

17 There are two cases where new memory must be allocated. One, when the very first element is pushed onto the array and, two, when an element is added that exceeds the available space. In the first case, the amount of memory requested will be some initial value StartSize and, in the second, the available memory will be increased by a multiplicative factor GrowthRate.3 The geometric growth value is increased by one to ensure the allocated size actually increases for small values of capacity.


3Many expanding vector implementations use a geometric factor of two, thereby doubling the available space each allocation, but a smaller value may provide a more optimal space-time tradeo . Of course, the geometric factor must always be greater than one.

18–23 The behaviour of realloc() has a few subtle points. If its first argument is NULL, then it acts like malloc() and returns a new block of memory. If the first argument is not NULL, it adjusts the passed memory block to the specified size and returns a pointer to the adjusted block. If the memory request fails, then realloc() returns NULL and the original memory block is unchanged. For this reason, the return value is assigned to a temporary pointer p, rather than

data = (int *)realloc(data, newsize*sizeof(int));

as, in the latter case, the original memory would be lost if realloc() failed. By using a temporary, push_back() can signal a failure while retaining its original state, and will only update its internal data-structures after checking that memory allocation succeeds.

27–28 Once su cient space is known to be available, the new element is added to the array and the array size is incremented. Notice that the returned value is the value of vectorsize prior to increment, so that it represents the index of the new element.


30 int pop back(void)

31 /* Return element from back of vector, and remove it from the vector. */

32 {

33 assert(vectorsize > 0);

34 return data[−−vectorsize];

35 }

36

37 int* get element(int index)

38 /* Return pointer to the element at the specified index. */

39 {

40 assert(index >= 0 && index < vectorsize);

41 return data + index;

42 }


30–35 The function pop_back() returns the last element of the array and then deletes it. This operation reduces the size of the array by one, but does not release any memory; capacity remains the same.

37–42 The function get_element() returns a pointer to the element specified by index. This allows the expandable array to be accessed directly and used like an ordinary array. However, pointers returned by this function should be used with caution if the array capacity subsequently changes. When realloc() allocates memory, it may copy the existing data to a new location and then release the old memory. At this instance, any pointers referring to the old memory block are invalidated and should not be dereferenced. Furthermore, if the capacity of the array is reduced, even if realloc() adjusts the memory bounds in-place, pointers referencing beyond the new bounds will be invalid.


43 /* Manual size operations. */

44 int  get size(void) { return vectorsize; }

45 int  get capacity(void) { return capacity; }

46
47 int set size(int size)

48 /* Set vector size. Return 0 if successful, -1 if fails. */

49 {

50 if (size > capacity) {

51 int *p = (int *)realloc(data, size*sizeof (int));

52 if (p == NULL)

53 return −1;
54
55 capacity = size; /* allocate succeeds, update data-structure */
56 data = p;

57 }

58
59 vectorsize = size;

60 return 0;

61 }


62

63 int set capacity(int size)

64 /* Shrink or grow allocated memory reserve for array.

65 * A size of 0 deletes the array. Return 0 if successful, -1 if fails. */

66 {

67 if (size != capacity) {

68 int *p = (int *)realloc(data, size*sizeof (int));

69 if (p == NULL && size > 0)

70 return −1;
71
72 capacity = size;
73 data = p;

74 }

75
76 if (size < vectorsize)

77 vectorsize = size;

78 return 0;

79 }


44–45 These two trivial functions return the current state of the vector in terms of its size and available space, respectively.

47–61 The function set_size() changes the size of the array to the specified size. If this size is greater than the current capacity, then extra memory is allocated to suit. However, if the size is reduced, capacity is left unchanged; this function cannot decrease the available storage.

63–79 The function set_capacity() provides direct access to the memory allocation of the vector. It can be used to increase or decrease the available storage. If the requested size is less than the current vector size, the vector is truncated to fit. If the requested size is zero, realloc() will release the memory pointed to by its first argument and return NULL, e ectively deleting the vector. Notice that a request of size zero, will cause data to become NULL (line 73), and vectorsize and capacity to become zero—thus, returning the vector to its original state.

The above implementation of an expandable array is a good example of modular programming as far as it goes, but it is subject to a couple of serious limitations. First, it provides a single instance of the vector only, and we cannot have more than one in any given program. Second, the vector can only contain elements of type int. To create a vector for other types would require a new, virtually identical, implementation for each type. Neither of these limitations are insurmountable, but they require the use of language features not yet covered, such as structs, unions, typedef, and the void* pointer. The application of these features to writing generic code, is the subject of Chapter 14.




9.5 Example: An Expandable Array


This section presents another extended example of dynamic memory allocation, this time to demon-strate the realloc() function. realloc() is a versatile function; it can perform the tasks of both malloc() and free() and, in addition, can adjust the size of an existing dynamically allocated memory block.

In C, an array has fixed size, which is determined at compile-time. Even dynamically allocated arrays have fixed size once they have been created. In certain situations, it may be desirable to have an array that expands on demand as new elements are added to it. That is, to have an array where elements may be added to the end one-by-one, and the array grows to match the number of elements it contains; thus, the array can be built-up without knowing the number of elements in advance.

In this section, we implement an expandable array of integers. The public interface for this array is exported in a header file vector.h, which is shown below.


1 /* Expandable vector, grows in-place as new elements are added to the back. 2 * The vector memory allocation is exponential to avoid excessive copying. */

3

4   /* Vector access operations. */

5   int push back(int item);

6   int pop back(void);

7   int* get element(int index);

8

9   /* Manual resizing operations. */

10 int get size(void);

11 int set size(int size);

12 int get capacity(void);

13 int set capacity(int size);


5–7 The vector provides three functions for element access. The first, push_back(), adds a new element to the end of the array; the second, pop_back(), returns the last element and then removes it from the array; and the third, get_element(), returns a pointer to a specified array element. The get_element() function permits the expandable vector to be accessed via pointer arithmetic like an ordinary C array.

10–13 In addition to growing the vector by adding elements to the end, there are four functions to provide control of the vector size. The first of these, get_size() and set_size(), obtain and set the size of the vector, respectively. The last two, get_capacity() and set_capacity(), directly control the reserve memory allocation for the array. These functions are explained in more detail below.

The implementation of the expandable array is defined in file vector.c. This file hides the private interface, which consists of several constants and external variables, and also hides the module’s dependence on standard headers stdlib.h (for realloc()) and assert.h (for assert()).

Arrays by definition occupy a contiguous region of memory, and this creates an issue for devel-oping an expandable array. As the array grows, it becomes necessary to request more memory, and often a region can only be extended so far before there is no more space available.2 At this point,


2 The heap permits blocks of memory to be allocated and deallocated in any order, and the dynamic allocation functions internally manage the location and size of each block. Thus, the heap tends to become fragmented over time, so that any given block may have a limited region of extra space available before it runs into a region allocated to another block.

if realloc() is asked to further extend the memory block, it will allocate a completely new block and copy the data from the old block across to the new one. An array that copies its data from one place to another every time it grows will be very ine cient, and a naive implementation will become increasingly slow. However, with a proper allocation plan, a formal array can be created on an average, for the continuous complexity of time. The key is geometrically allocating memory in increasing quantities, so that the array size increases as an increasing amount of space allocated. This means that the frequency of allocation, and hence the cost of copying data, decreases with array size at the expense of some wasted space.


1 #include "vector.h"

2   #include <stdlib.h>

3   #include <assert.h>

4

5   static const int StartSize = 1; /* initial vector size */

6   static const float GrowthRate = 1.5; /* geometric growth of vector capacity */

7

8   static int *data = NULL;  /* pointer to vector elements */

9 static int vectorsize = 0; /* current size of vector */

10 static int capacity = 0;/* current reserved memory for vector */

11
12 int push back(int item)

13 /* Add element to back of vector. Return index of new element if successful, and -1 if fails. */

14 {

15 /* If out-of-space, allocate more. */

16 if (vectorsize == capacity) {

17 int newsize = (capacity == 0) ? StartSize : (int)(capacity*GrowthRate + 1.0);

18 int *p = (int *)realloc(data, newsize*sizeof (int));

19 if (p == NULL)
20 return −1;

21

22

23

24 }
25
26 /* We have enough room. */

27 data[vectorsize] = item;

28 return vectorsize++;

29 }


8–10 The external variables data, vectorsize, and capacity define the current state of the vector; data points to the array’s dynamic memory block; vectorsize defines the current array size; and capacity records the amount of memory allocated. The value of vectorsize must always be less than or equal to capacity. Notice that these variables are all given initial values, even though they would be initialised to these values by default. Explicit initialisation improves code readability by making ones intentions clear.

16 When a new element is added to the end of the array, its size increases by one. If the vector size is already equal to the amount of available memory, additional memory must be allocated.

17 There are two cases where new memory must be allocated. One, when the very first element is pushed onto the array and, two, when an element is added that exceeds the available space. In the first case, the amount of memory requested will be some initial value StartSize and, in the second, the available memory will be increased by a multiplicative factor GrowthRate.3 The geometric growth value is increased by one to ensure the allocated size actually increases for small values of capacity.


3Many expanding vector implementations use a geometric factor of two, thereby doubling the available space each allocation, but a smaller value may provide a more optimal space-time tradeo . Of course, the geometric factor must always be greater than one.

18–23 The behaviour of realloc() has a few subtle points. If its first argument is NULL, then it acts like malloc() and returns a new block of memory. If the first argument is not NULL, it adjusts the passed memory block to the specified size and returns a pointer to the adjusted block. If the memory request fails, then realloc() returns NULL and the original memory block is unchanged. For this reason, the return value is assigned to a temporary pointer p, rather than

data = (int *)realloc(data, newsize*sizeof(int));

as, in the latter case, the original memory would be lost if realloc() failed. By using a temporary, push_back() can signal a failure while retaining its original state, and will only update its internal data-structures after checking that memory allocation succeeds.

27–28 Once su cient space is known to be available, the new element is added to the array and the array size is incremented. Notice that the returned value is the value of vectorsize prior to increment, so that it represents the index of the new element.


30 int pop back(void)

31 /* Return element from back of vector, and remove it from the vector. */

32 {

33 assert(vectorsize > 0);

34 return data[−−vectorsize];

35 }

36

37 int* get element(int index)

38 /* Return pointer to the element at the specified index. */

39 {

40 assert(index >= 0 && index < vectorsize);

41 return data + index;

42 }


30–35 The function pop_back() returns the last element of the array and then deletes it. This operation reduces the size of the array by one, but does not release any memory; capacity remains the same.

37–42 The function get_element() returns a pointer to the element specified by index. This allows the expandable array to be accessed directly and used like an ordinary array. However, pointers returned by this function should be used with caution if the array capacity subsequently changes. When realloc() allocates memory, it may copy the existing data to a new location and then release the old memory. At this instance, any pointers referring to the old memory block are invalidated and should not be dereferenced. Furthermore, if the capacity of the array is reduced, even if realloc() adjusts the memory bounds in-place, pointers referencing beyond the new bounds will be invalid.


43 /* Manual size operations. */

44 int  get size(void) { return vectorsize; }

45 int  get capacity(void) { return capacity; }

46
47 int set size(int size)

48 /* Set vector size. Return 0 if successful, -1 if fails. */

49 {

50 if (size > capacity) {

51 int *p = (int *)realloc(data, size*sizeof (int));

52 if (p == NULL)

53 return −1;
54
55 capacity = size; /* allocate succeeds, update data-structure */
56 data = p;

57 }

58
59 vectorsize = size;

60 return 0;

61 }


62

63 int set capacity(int size)

64 /* Shrink or grow allocated memory reserve for array.

65 * A size of 0 deletes the array. Return 0 if successful, -1 if fails. */

66 {

67 if (size != capacity) {

68 int *p = (int *)realloc(data, size*sizeof (int));

69 if (p == NULL && size > 0)

70 return −1;
71
72 capacity = size;
73 data = p;

74 }

75
76 if (size < vectorsize)

77 vectorsize = size;

78 return 0;

79 }


44–45 These two trivial functions return the current state of the vector in terms of its size and available space, respectively.

47–61 The function set_size() changes the size of the array to the specified size. If this size is greater than the current capacity, then extra memory is allocated to suit. However, if the size is reduced, capacity is left unchanged; this function cannot decrease the available storage.

63–79 The function set_capacity() provides direct access to the memory allocation of the vector. It can be used to increase or decrease the available storage. If the requested size is less than the current vector size, the vector is truncated to fit. If the requested size is zero, realloc() will release the memory pointed to by its first argument and return NULL, e ectively deleting the vector. Notice that a request of size zero, will cause data to become NULL (line 73), and vectorsize and capacity to become zero—thus, returning the vector to its original state.

The above implementation of an expandable array is a good example of modular programming as far as it goes, but it is subject to a couple of serious limitations. First, it provides a single instance of the vector only, and we cannot have more than one in any given program. Second, the vector can only contain elements of type int. To create a vector for other types would require a new, virtually identical, implementation for each type. Neither of these limitations are insurmountable, but they require the use of language features not yet covered, such as structs, unions, typedef, and the void* pointer. The application of these features to writing generic code, is the subject of Chapter 14.





Share:

No comments:

Post a Comment

Follow On YouTube