
Arrays and Strings & Character Arrays and Strings in C Programming Class 29

Chapter 8

Arrays and Strings


An array is a group of variables of a particular type occupying a contiguous region of memory. In C, array elements are numbered from 0, so that an array of size N is indexed from 0 to N − 1. An array must contain at least one element, and it is an error to define an empty array.

double empty[0]; /* Invalid in standard C; some compilers accept it only as an extension. */

8.1 Array Initialisation

As for any other type of variable, arrays may have local, external or static scope. Arrays with static extent have their elements initialised to zero by default, but arrays with local extent are not initialised by default, so their elements have arbitrary values.
It is possible to initialise an array explicitly when it is defined by using an initialiser list. This is a list of values of the appropriate type enclosed in braces and separated by commas. For example,

int days[12] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

If the number of values in the initialiser list is less than the size of the array, the remaining elements of the array are initialised to zero. Thus, to initialise the elements of an array with local extent to zero, it is sufficient to write

int localarray[SIZE] = {0};

It is an error to have more initialisers than the size of the array.
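As a small sketch of the zero-fill rule described above, the following initialises two local arrays with short initialiser lists and checks the result. The names SIZE, all_zero and sum_partial are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SIZE 5 /* hypothetical size, for illustration only */

/* Returns 1 if every element of a[0..n-1] is zero, otherwise 0. */
int all_zero(const int a[], size_t n)
{
    size_t i;

    for (i = 0; i < n; ++i)
        if (a[i] != 0)
            return 0;
    return 1;
}

/* Sums a local array whose initialiser list is shorter than the array. */
int sum_partial(void)
{
    int partial[SIZE] = { 1, 2 };  /* elements 2..4 are initialised to zero */
    int zeros[SIZE] = { 0 };       /* every element is zero */
    int i, total = 0;

    assert(all_zero(zeros, SIZE));
    for (i = 0; i < SIZE; ++i)
        total += partial[i];
    return total;                  /* 1 + 2 + 0 + 0 + 0 */
}
```

Without the initialiser lists, both arrays would have local extent and hold arbitrary values, so neither check could be relied upon.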

If the size of an array with an initialiser list is not specified, the array will automatically be allocated memory to match the number of elements in the list. For example,

int days[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

the size of this array will be twelve. The size of an array may be computed via the sizeof operator,

int size = sizeof(days); /* size equals 12 * sizeof(int) */

which returns the number of bytes (that is, the number of chars) of memory allocated for the array. A common C idiom is to use sizeof to determine the number of elements in an array as in the following example.

nelems = sizeof(days) / sizeof(days[0]);
for (i = 0; i < nelems; ++i)
    printf("Month %d has %d days.\n", i+1, days[i]);

This idiom is invariant to changes in the array size, and computes the correct number of elements even if the type of the array changes (see footnote 1). For this reason, an expression of the form

sizeof(array) / sizeof(array[0])

is preferred over, for example,

sizeof(array) / sizeof(int)

as array might one day become an array of type unsigned long.
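One common way to package this idiom is a small macro. The sketch below assumes a hypothetical macro name NELEMS and two illustrative arrays. It only gives the right answer when handed a true array name, never a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper macro: number of elements in a true array.
   It must be given an array name, not a pointer. */
#define NELEMS(a) (sizeof(a) / sizeof((a)[0]))

static const int days[] = { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
static const unsigned long wide[] = { 1UL, 2UL, 3UL };

/* Total number of days in a non-leap year. */
int days_in_year(void)
{
    size_t i;
    int total = 0;

    assert(NELEMS(days) == 12);
    for (i = 0; i < NELEMS(days); ++i)
        total += days[i];
    return total;
}

/* The same macro works unchanged for an array of a different type. */
size_t wide_count(void)
{
    return NELEMS(wide);
}
```

Because the divisor is sizeof((a)[0]) rather than sizeof(int), the element type can change without touching the macro.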

Note. sizeof will only return the size of an array if it refers to the original array name. An array name is automatically converted to a pointer in an expression, so that any other reference to the array will not be an array name but a pointer. For example,

int *pdays = days;

int size1 = sizeof(days); /* size1 equals 12 * sizeof(int) */

int size2 = sizeof(days + 1); /* size2 equals sizeof(int *) */

int size3 = sizeof(pdays); /* size3 equals sizeof(int *) */

Similarly, if an array is passed to a function, it is converted to a pointer.

int count_days(int days[], int len)
{
    int total = 0;

    /* assert will fail: sizeof(days) equals sizeof(int *) and len equals 12 */
    assert(sizeof(days) / sizeof(days[0]) == len);
    while (len--)
        total += days[len];
    return total;
}
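A working version simply trusts the length passed by the caller, who still has access to the real array. The following is a minimal sketch under that assumption. The function days_in_first_quarter is a hypothetical caller added for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Inside the function, days is a pointer, so the length must be passed in. */
int count_days(const int days[], size_t len)
{
    int total = 0;

    assert(days != NULL);
    while (len--)
        total += days[len];
    return total;
}

/* Hypothetical caller: computes the length where the array is still an array. */
int days_in_first_quarter(void)
{
    int q1[] = { 31, 28, 31 };

    /* sizeof works here because q1 is a true array in this scope. */
    return count_days(q1, sizeof(q1) / sizeof(q1[0]));
}
```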

8.2 Character Arrays and Strings


Character arrays are special. They have certain initialisation properties not shared with other array types because of their relationship with strings. Of course, character arrays can be initialised in the normal way using an initialiser list.

char letters[] = { 'a', 'b', 'c', 'd', 'e' };

But they may also be initialised using a string constant, as follows.

char letters[] = "abcde";

The string initialisation automatically appends a \0 character, so the above array is of size 6, not 5. It is equivalent to writing,

char letters[] = { 'a', 'b', 'c', 'd', 'e', '\0' };

Thus, writing

char letters[5] = "abcde"; /* OK but bad style. */

is not an error in C, but it is very bad style: the array is too small to hold the terminating \0 of its initialiser, so letters does not contain a valid string.

Footnote 1: The expression sizeof(days) / sizeof(days[0]) is also a compile-time constant. That is, the value of the expression is known at compile time, and evaluated by the compiler as a constant. Thus, it incurs zero runtime overhead.
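The difference in size between the two forms of initialisation can be checked directly with sizeof and strlen(). The function names below are hypothetical and serve only as an illustrative sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Size of an array initialised from a string constant. */
size_t size_from_string(void)
{
    char letters[] = "abcde";   /* six elements: 'a'..'e' plus '\0' */

    assert(strlen(letters) == 5);   /* strlen does not count the '\0' */
    return sizeof(letters);
}

/* Size of an array initialised from an initialiser list. */
size_t size_from_list(void)
{
    char letters[] = { 'a', 'b', 'c', 'd', 'e' };   /* five elements, no '\0' */

    return sizeof(letters);
}
```

Note that calling strlen() on the second array would be an error, since it has no terminating '\0' to stop the search.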

An important property of string constants is that they are allocated memory; they have an address and may be referred to by a char * pointer. For constants of any other type, it is not possible to assign a pointer because these constants are not stored in memory and do not have an address. So the following code is incorrect.

double *pval = 9.6; /* Invalid. Won't compile. */
int *parray = { 1, 2, 3 }; /* Invalid. Won't compile. */

However, it is perfectly valid for a character pointer to be assigned the address of a string constant.

char *str = "Hello World!\n"; /* Correct. But array is read-only. */

This is because a string constant has static extent—memory is allocated for the array before the program begins execution, and exists until program termination—and a string constant expression returns a pointer to the beginning of this array.

Note. A string constant is a constant array; the memory of the array is read-only. The result of attempting to change the value of an element of a string constant is undefined. For example,

char *str = "This is a string constant";

str[11] = 'p'; /* Undefined behaviour. */

The static extent of string constants leads to the possibility of various unusual code constructs. For example, it is legitimate for a function to return a pointer to a string constant; the string is not destroyed at the end of the function block.

char *getHello()
/* Return a pointer to an array defined within the function */
{
    char *phello = "Hello World\n";
    return phello;
}
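A slightly more practical use of the same property is a lookup function that hands back one of several string constants. The sketch below assumes a hypothetical function month_name. It returns const char * so that the compiler can reject attempts to write to the read-only string.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: month is 1..12; returns a pointer to a string constant.
   The pointer stays valid after the function returns because string
   constants have static extent. */
const char *month_name(int month)
{
    static const char *const names[] = {
        "January", "February", "March", "April", "May", "June", "July",
        "August", "September", "October", "November", "December"
    };

    assert(month >= 1 && month <= 12);
    return names[month - 1];
}
```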

It is also valid to directly index a string constant, as demonstrated in the following function, which converts a decimal value to a value in base b, and, for bases 11 to 36, correctly substitutes letters for digits where required.


void print_base_b(unsigned x, unsigned b)
/* Convert decimal value x to a representation in base b. */
{
    char buf[BUF_SIZE];
    unsigned q = x;
    int i = 0;

    assert(b >= 2 && b <= 36); /* the digit string below has 36 characters */

    /* Calculate digit for each place in base b */
    do {
        assert(i < BUF_SIZE);
        x = q;
        q = x/b;
        buf[i++] = "0123456789abcdefghijklmnopqrstuvwxyz"[x - q*b];
    } while (q>0);

    /* Print digits, in reverse order (most-significant place first) */
    for (--i; i>=0; --i)
        printf("%c", buf[i]);
    printf("\n");
}
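For code that needs the digits as a string rather than printed output, the same indexing trick can fill a caller-supplied buffer. The following is a sketch only: the function to_base, its parameters and its NULL-on-overflow convention are assumptions made for illustration, not part of the original function.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant: writes x in base b (2..36) into out, which has room
   for outsize characters including the terminating '\0'.
   Returns out on success, or NULL if the buffer is too small. */
char *to_base(unsigned x, unsigned b, char *out, size_t outsize)
{
    char buf[sizeof(unsigned) * CHAR_BIT];  /* enough digits even for base 2 */
    size_t n = 0, i;

    assert(b >= 2 && b <= 36);
    assert(out != NULL);

    do {    /* collect digits, least significant first */
        buf[n++] = "0123456789abcdefghijklmnopqrstuvwxyz"[x % b];
        x /= b;
    } while (x > 0);

    if (n + 1 > outsize)
        return NULL;
    for (i = 0; i < n; ++i)     /* reverse into the caller's buffer */
        out[i] = buf[n - 1 - i];
    out[n] = '\0';
    return out;
}
```

Sizing the scratch buffer from sizeof(unsigned) * CHAR_BIT removes the need for a separately defined buffer-size constant.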

So, for a pointer to a string constant, the string is read-only. However, a character array initialised by a string constant is both readable and writable. This is because, with an array definition, the compiler first allocates memory for the character array and then copies the elements of the string constant into this memory region. Note, the only time a string is copied automatically by the compiler is when a char array is initialised. In every other situation, a string has to be copied manually, either character-by-character or by functions such as strcpy() or memcpy().
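The distinction can be seen in a short sketch. The function modify_copy and its variable names are hypothetical. It changes two writable copies of a string and confirms that the constant itself is untouched.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if writable copies of a string constant can be changed
   without affecting the constant itself. */
int modify_copy(void)
{
    const char *fixed = "Hello";   /* points at read-only storage */
    char copy[] = "Hello";         /* the compiler copies the constant into copy */
    char manual[16];

    assert(sizeof(manual) > strlen(fixed));   /* room for the copy and its '\0' */

    copy[0] = 'J';                 /* fine: copy is an ordinary array */
    strcpy(manual, fixed);         /* any other copy must be made explicitly */
    manual[0] = 'C';

    return strcmp(copy, "Jello") == 0
        && strcmp(manual, "Cello") == 0
        && strcmp(fixed, "Hello") == 0;
}
```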

A collection of valid operations on various array types is shown below.

short val = 9;

short *pval = &val; /* OK */

double array[] = {1.0, 2.0, 3.0 };

double *parray = array; /* OK */

char str[] = "Hello World!\n"; /* Correct. Array is read-write. */

str[1] = 'a'; /* OK */

8.3 Strings and the Standard Library


The standard library contains many functions for manipulating strings, most of which are declared in the header-file string.h. This section describes several of the more commonly-used functions.

size_t strlen(const char *s). Returns the number of characters in string s, excluding the terminating '\0' character. The special unsigned type size_t is used instead of plain int to cater for the possibility of arrays that are longer than the maximum representable int.

char *strcpy(char *s, const char *t). Copies the string t into character array s, and returns a pointer to s.

int strcmp(const char *s, const char *t). Performs a lexicographical comparison of strings s and t (see footnote 2), and returns a negative value if s < t, a positive value if s > t, and zero if s == t.

char *strcat(char *s, const char *t). Concatenates the string t onto the end of string s. The first character of t overwrites the '\0' character at the end of s.

char *strchr(const char *s, int c). Returns a pointer to the first occurrence of character c in string s. If c is not present, then NULL is returned.

char *strrchr(const char *s, int c). Performs the same task as strchr() but searches from the end of s, so it returns a pointer to the last occurrence of c, or NULL if c is not present.

char *strstr(const char *s, const char *t). Searches for the first occurrence of substring t in string s. If found, it returns a pointer to the beginning of the substring in s, otherwise it returns NULL.
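The search functions listed above all return a pointer into the original string, which can be used directly or turned into an offset. The helpers base_name and find_offset below are hypothetical names, sketched only to illustrate this pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the file-name part of a '/'-separated path. */
const char *base_name(const char *path)
{
    const char *slash;

    assert(path != NULL);
    slash = strrchr(path, '/');     /* last '/' in path, or NULL */
    return slash ? slash + 1 : path;
}

/* Hypothetical helper: offset of the first occurrence of sub in s, or -1. */
long find_offset(const char *s, const char *sub)
{
    const char *hit;

    assert(s != NULL && sub != NULL);
    hit = strstr(s, sub);           /* pointer into s, or NULL */
    return hit ? (long)(hit - s) : -1L;
}
```

In both helpers the NULL return is tested before the pointer is used, since dereferencing or subtracting from NULL would be an error.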

The functions strncpy(), strncmp(), and strncat() perform the same tasks as their counterparts strcpy(), strcmp(), and strcat(), respectively, but include an extra argument n, which limits their operations to the first n characters of the right-hand string.
A standard function that can perform the operations of both strcpy() and strcat(), and even more, is sprintf(). It is a general purpose string formatting function that behaves identically to printf(), but copies the resulting formatted string to a character array rather than sending it to stdout. sprintf() is a very versatile string manipulation function.
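As a rough sketch of the bounded and formatted variants, the two helpers below (copy_prefix and format_date, both hypothetical names) wrap strncpy() and sprintf(). One detail worth showing is that strncpy() does not write a terminating '\0' when the source has n or more characters, so the helper adds it explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copies at most n characters of src into dst and always terminates dst.
   dst must have room for n + 1 characters. */
char *copy_prefix(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, n);   /* adds no '\0' if src has n or more characters */
    dst[n] = '\0';
    return dst;
}

/* Formats a date such as "29 February 2024" into out and returns out.
   The caller must make out large enough to hold the result. */
char *format_date(char *out, int day, const char *month, int year)
{
    assert(out != NULL && month != NULL);
    sprintf(out, "%d %s %d", day, month, year);
    return out;
}
```

Both helpers rely on the caller to supply a large enough buffer, as none of these library functions checks the destination size.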


Footnote 2: Strings are compared up to their first differing character, and string s is lexicographically less than string t if its character at that position has a lower value (i.e., the value of its character code, say ASCII). Also, s is lexicographically less than t if all of its characters match those of t but it is shorter than t (i.e., if s is a proper prefix of t).

Aside. In general, the concatenation of two strings requires the use of a function like strcat(). However, string constants may be concatenated at compile time by placing them adjacent to one another. For example, "this is " "a string" is equivalent to "this is a string". Compile-time concatenation is useful for writing long strings, since typing a multi-line string constant like

"this is

a string"

is an error. An alternative way to write multi-line string constants is to write

"this is \
a string"

where the first character of the second half of the string occurs in the first column of the next line without preceding white-space. (This is one occasion where white-space matters in a C program.) Usually the adjacency method is preferred over the '\' method.
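A typical use of the adjacency method is a multi-line message laid out one source line per output line. The message text and the helper usage_lines below are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Adjacent string constants are joined by the compiler into one constant. */
static const char usage[] =
    "usage: prog [options] file\n"
    "  -h  show this help\n"
    "  -v  verbose output\n";

/* Hypothetical helper: counts the newline characters in the message above. */
int usage_lines(void)
{
    int n = 0;
    const char *p;

    for (p = usage; (p = strchr(p, '\n')) != NULL; ++p)
        ++n;
    return n;
}
```

Each piece can be indented freely, because white-space between the adjacent constants is not part of the resulting string.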
