Chapter 16
C in the Real World
This text has covered most of the core ISO C language and its use. However, virtually all useful software systems make use of some form of extension to standard C. This chapter provides a sampling of the virtually limitless range of extensions and related topics relevant to writing C programs in the real world. Knowledge of the core language is the foundation on which all these additional topics rely.
TODO: complete this chapter...
16.1 Further ISO C Topics
There are many details of ISO C and the standard library that are not covered in this text. For the most part, these topics are peripheral, and do not impinge on the majority of application programming. They include:
• Complete rules of operator precedence and order of evaluation.
• Keywords such as register and volatile.
• Memory alignment and padding.
• Changes to the standard with ISO C99. For the most part, this standard is backward compatible with C89, and the older standard currently remains the more important language in practice.
One topic that is fundamental but cannot be adequately covered in this book is the standard library; the majority of standard functions are not even mentioned. These functions are frequently useful and are worthy of study. They include input and output (stdio.h), mathematical functions (math.h), strings (string.h), utilities (stdlib.h), time (time.h), floating-point specifications (float.h), errors (errno.h), assertions (assert.h), variable-length argument lists (stdarg.h), signal handling (signal.h), non-local jumps (setjmp.h), etc.
For more on these and other topics, consult a good reference textbook. A complete and authoritative reference is [HS95, HS02], and is highly recommended for practising programmers. An excellent FAQ [Sum95] on the C language discusses many of the more difficult aspects. It is worth noting that many C idioms are not recorded in any textbook and can only be discovered from practical experience and reading the source code of others.
Note. Different compilers conform to the standard to different extents. A compiler might not permit conforming code to compile, or the compiled code might exhibit non-standard behaviour; this is less likely with modern compilers. More likely is that a compiler allows non-standard code to compile. As a rule, it is wise to compile code with several different compilers to ensure standard conformance.
16.2 Traditional C
The C language was created in the early 1970s and, over the next decade or so, grew and evolved substantially, until finally being standardised in 1989. Prior to 1989, the original language reference was defined by the first edition of The C Programming Language by Kernighan and Ritchie in 1978. This version is now called “classic C” or “K&R C”, and has significant differences to ISO C. The most noticeable difference is that functions did not have prototypes, and a function definition that we would now write as
double func(int a, int b, char c)
{
...
}
would be written as
double func(a, b, c)
int a, b;
char c;
{
...
}
Standard C, with the introduction of prototypes, provides far stronger type-checking than was previously available.
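To see the effect, consider the small sketch below (the function name is hypothetical). With the prototype in scope, the compiler converts the integer arguments to the declared parameter types and rejects calls with the wrong number or kind of arguments; under the old rules the arguments would be passed unchecked, with undefined results.
double scale(double x, int factor);    /* prototype: arguments are type-checked */

double scale(double x, int factor)
{
    return x * factor;
}

int main(void)
{
    double d = scale(2, 3);            /* 2 is converted to 2.0 because of the prototype */
    return d > 0.0 ? 0 : 1;
}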
16.3 Make Files
A makefile manages the organisation of a C program, which may consist of numerous source and header files and possibly other precompiled libraries. Makefiles record compilation dependencies and linking rules, and permit partial compilation, so that only those parts of the program that have changed need to be recompiled. Makefiles can be complicated, and a simple example would be of limited value. They are platform dependent, and some compiler environments (e.g., Microsoft Visual Studio) manage project makefiles automatically via the IDE. For more information, consult a textbook or read existing makefiles.
Two examples worth examining are GNU Make (gmake) and CMake. The first is the make implementation found on most Unix-like systems; the second generates native makefiles and IDE project files from a single platform-independent description, and so supports cross-platform builds.
TODO: add references for GNU Make and CMake (cross-platform makefile generation).
16.4 Beyond the C Standard Library
The standard C language provides the foundation on which most real-world programs are built, but the capabilities of the standard library itself are limited. Practical programs rely on additional APIs (application programming interfaces), including:
• standard APIs such as POSIX, and platform-specific APIs such as the Win32 API;
• non-standard extensions and libraries for graphics (OpenGL, VTK), GUI frameworks, threads, interrupts, real-time and hardware control, audio, serial communications, sockets, and file-system operations such as directories.
To write portable code, isolate the non-portable parts in modules in separate files and write wrapper interfaces around them; porting then requires writing only a few platform-specific internals, as the sketch below illustrates.
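The following is a minimal sketch of such a wrapper (the module and function names are hypothetical): the rest of the program calls port_sleep_ms() and never touches a platform API directly, so only this one file changes when porting.
/* portsleep.c -- hypothetical wrapper module isolating platform-specific code */
#ifdef _WIN32
#include <windows.h>

void port_sleep_ms(unsigned int ms)
{
    Sleep(ms);                         /* Win32 API: argument in milliseconds */
}
#else
#include <time.h>

void port_sleep_ms(unsigned int ms)
{
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);              /* POSIX: seconds plus nanoseconds */
}
#endif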
16.5 Interfacing With Libraries
There are many open-source C libraries available. Repositories worth exploring include SourceForge, Planet Source Code (www.program.com/source), and Netlib, and the sources of open-source systems such as Linux are themselves a rich resource. When using external libraries, keep ISO C conforming code separate from proprietary or platform-specific code, and interface with precompiled and open-source libraries through small, well-defined modules; a well-designed library is itself a good example of modular design.
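The simplest case of interfacing with a precompiled library is the standard math library: the header provides the declarations, the library file provides the compiled definitions, and the linker joins the two. On most Unix-like systems the program below is built with a command along the lines of cc demo.c -o demo -lm, where -lm names the precompiled library.
#include <stdio.h>
#include <math.h>      /* declarations only; the definitions live in the library */

int main(void)
{
    double x = 2.0;
    printf("sqrt(%.1f) = %.6f\n", x, sqrt(x));
    return 0;
}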
16.6 Mixed Language Programming
There arise situations where a C program must call a set of routines written in another programming language, such as assembler, C++, FORTRAN, Matlab, etc.
Interfacing C with FORTRAN, assembler, C++, Matlab and other languages requires a binding: declarations of the foreign routines in a form the C compiler understands, together with attention to calling conventions and data representation.
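The most common pairing in practice is C and C++. A header can be written so that its declarations are usable from both languages by giving them C linkage when compiled as C++; a minimal sketch (with hypothetical names) follows.
/* mylib.h -- hypothetical header callable from both C and C++ */
#ifdef __cplusplus
extern "C" {               /* give the declarations C linkage under C++ */
#endif

int mylib_init(void);
double mylib_compute(double x);

#ifdef __cplusplus
}
#endif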
16.7 Memory Interactions
Historically, instruction count was the measure that mattered. Computer processors were slow and memory was tiny, and the speed of an algorithm was directly proportional to the number of instructions it required. Programmers spent a lot of effort finding ways to minimise instruction count. Most algorithm textbooks today continue to use this measure in their analysis of algorithm complexity.
Modern computers, with fast CPUs, are no longer constrained primarily by instruction execution. Today, the bottleneck is memory access. Whenever the CPU is waiting for instructions or data to be fetched from memory, it sits idle and cycles are wasted. To minimise idle time, modern computer architectures employ a memory hierarchy: a set of memory levels of different size and speed that permits faster access to frequently used information. This hierarchy, from fastest to slowest, consists of registers, cache, main random access memory (RAM), hard-disk, and magnetic tape. Very fast memory is small and expensive, while cheap large-scale memory, such as RAM, is relatively slow. Each level in the hierarchy is typically slower than the level above by several orders of magnitude. Information is transferred up and down the memory hierarchy automatically by the hardware and operating system, with the exception of magnetic tape, which is usually reserved for backup. Essentially all modern operating systems manage the transfer of data between RAM and hard-disk, so that the hard-disk appears as additional, albeit slow, RAM known as virtual memory.
As the CPU accesses instructions or data, the required information is transferred up the hierarchy. If the information is already in registers, it can be used immediately. If it resides in cache, it is moved up to the registers and the old register contents are written back to cache. Similarly, “lines” of RAM are moved up into cache, and “pages” of hard-disk memory are moved up to RAM. Since the amount of information that can be stored in the upper levels is limited, data that has not been accessed recently is passed back down to lower levels. For example, if all cache lines are full and a new line is required from RAM, the least recently used cache line is returned to RAM.
Information is transferred between levels in blocks, so that when a particular item is accessed, it brings with it a neighbourhood of instructions or data. Thus, if the next item required was a neighbour, that item is already in cache and is available for immediate execution. This property is called “locality of reference” and has significant influence on algorithm speed. An algorithm with a large instruction count but good locality may perform much faster than another algorithm with smaller instruction count. Some algorithms that look fast on paper are slow in practice due to bad cache interaction.
There are various factors that affect locality of reference. One is program size. There is usually a trade-off between size and speed, whereby using less memory requires the execution of more instructions, and vice-versa. However, a program that is optimised for size, and so attempts to occupy minimal space, may also achieve better speed, as it is better able to fit within cache lines. Another factor is data structures. Some data structures, such as linked lists, may develop bad locality if naively implemented, whereas others, such as arrays, possess very good locality. A third factor is the way an algorithm utilises a data structure. For example, in numerical computing, matrix multiplication can be sped up by orders of magnitude by using “blocking” algorithms, which operate over sub-matrix blocks rather than over an entire large matrix.
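Loop order over a two-dimensional array gives a simpler illustration of the same effect. C stores arrays in row-major order, so the first pair of loops below walks memory sequentially and enjoys good locality, while the second strides across rows and, for large N, typically runs several times slower despite executing exactly the same number of instructions. (A sketch only; actual timings depend on the machine.)
#include <stdio.h>

#define N 1000

static double a[N][N];

int main(void)
{
    double sum = 0.0;
    int i, j;

    /* Good locality: the rightmost index varies fastest, matching row-major layout. */
    for (i = 0; i < N; ++i)
        for (j = 0; j < N; ++j)
            sum += a[i][j];

    /* Poor locality: consecutive accesses are N * sizeof(double) bytes apart. */
    for (j = 0; j < N; ++j)
        for (i = 0; i < N; ++i)
            sum += a[i][j];

    printf("sum = %f\n", sum);
    return 0;
}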
The use of dynamic memory (allocation from the heap) can significantly affect program execution speed. Access to operating system resources, such as dynamic memory and input-output, is generally slow and should be minimised as a rule. (Functions like printf() manage this automatically by buffering characters and only sending them to the OS when the buffer is full or when explicitly flushed.) Allocating many small objects on the heap is very inefficient, in time and space, as each allocation involves a search for space and bookkeeping records. Also, over successive allocations and deallocations, the heap tends to become fragmented and develops bad locality of reference. Heap allocation becomes even slower in multi-threaded environments, as each allocation involves locking and unlocking operations for thread synchronisation.
One approach to alleviating heap allocation overhead is to use arenas. An arena is a data-structure that wraps the global memory allocator and allocates an internal cache of memory in large chunks. Clients use the arena to perform local allocations, and obtain portions of the arena cache. Arenas possess several advantages: They can avoid the space overhead of general-purpose allocator records, they avoid the time overhead of thread-safe allocation, and they prevent memory leaks by providing centralised deallocation.
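A minimal arena might be sketched as follows (the type and function names are hypothetical, and growth of the arena when its chunk is exhausted is omitted): clients call arena_alloc() instead of malloc(), and a single arena_destroy() releases everything at once.
#include <stdlib.h>
#include <stddef.h>

typedef struct {
    unsigned char *base;   /* start of the large chunk obtained from malloc() */
    size_t size;           /* total bytes in the chunk */
    size_t used;           /* bytes handed out so far */
} Arena;

int arena_init(Arena *a, size_t size)
{
    a->base = malloc(size);
    a->size = size;
    a->used = 0;
    return a->base != NULL;
}

void *arena_alloc(Arena *a, size_t n)
{
    size_t align = sizeof(long double);    /* crude worst-case alignment */
    void *p;

    n = (n + align - 1) / align * align;   /* round request up to alignment */
    if (a->used + n > a->size)
        return NULL;                       /* arena exhausted */
    p = a->base + a->used;
    a->used += n;
    return p;
}

void arena_destroy(Arena *a)
{
    free(a->base);                         /* one call frees every allocation */
    a->base = NULL;
    a->size = a->used = 0;
}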
16.8 Advanced Algorithms and Data Structures
The data structures presented in Chapter 15 represent arguably the most common and useful constructs for most programs. However, there exist a vast number of more sophisticated data structures for more specialised problems. Examples include red-black binary trees, B-trees, graphs, finite state machines, etc.
The literature on advanced algorithms is also vast. On the topic of sorting alone, there are many variations of Quicksort that alleviate its O(n²) worst-case behaviour. There are also special-case sorting algorithms that have linear-time complexity for problems with appropriate structure. Other forms of algorithms include searching, selection, numerical computation, etc. For further reading, the following texts are recommended [Sed98, CLRS01, Knu98a, PTVF92].