13.1.3 String Formatting
The functions sprintf() and sscanf() perform essentially the same operations as printf() and scanf(), respectively, but, rather than interact with stdout or stdin, they operate on a character array argument. They present the following interfaces.
int sprintf(char *buf, const char *format, ...);
int sscanf(const char *buf, const char *format, ...);
The sprintf() function stores the resulting formatted string in buf and automatically appends a terminating \0 character. It returns the number of characters stored (excluding \0). This function is useful for a wide range of string manipulation operations. For example, the following code segment creates a format string at runtime, which prevents scanf() from overflowing its character buffer.
char buf[100], format[10];
sprintf(format, "%%%ds", (int)(sizeof(buf)-1)); /* Create format string "%99s" (cast needed: sizeof yields size_t, %d expects int). */
scanf(format, buf); /* Get string from stdin. */
The input string is thus limited to not more than 99 characters plus 1 for the terminating \0.
The sscanf() function extracts values from the string buf according to the format string, and stores the results in the additional argument list. It behaves just like scanf() with buf replacing stdin as the source of input characters. An attempt to read beyond the end of string buf for sscanf() is equivalent to reaching the end-of-file for scanf(). The sscanf() function is often used in conjunction with a line input function, such as fgets(), as in the following example.
char buf[100];
double dval;
fgets(buf, sizeof(buf), stdin); /* Get a line of input, store in buf. */
sscanf(buf, "%lf", &dval); /* Extract a double from buf. */
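Like scanf(), sscanf() returns the number of items it converted, so a caller can detect malformed input. The following is a minimal sketch of that check; the helper name parse_pair() is hypothetical and not part of the standard library.

```c
#include <stdio.h>

/* Hypothetical helper: extract two doubles from a line of text.
   Returns 1 if both values were converted, 0 otherwise. */
int parse_pair(const char *line, double *x, double *y)
{
    /* sscanf() returns the number of items converted, or EOF if the
       string ends before the first conversion. */
    return sscanf(line, "%lf %lf", x, y) == 2;
}
```

A line obtained from fgets() could be passed directly as the first argument.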
13.2 File IO
The C language is closely tied to the UNIX operating system; they were initially developed in parallel, and UNIX was implemented in C.6 Thus, much of the standard C library is modelled on UNIX facilities, and in particular the way it performs input and output by reading or writing to files.
In the UNIX operating system, all input and output is done by reading or writing files, because all peripheral devices, even keyboard and screen, are files in the file system. This means that a single homogeneous interface handles all communication between a program and peripheral devices [KR88, page 169].
13.2.1 Opening and Closing Files
A file is referred to by a FILE pointer, where FILE is a structure declaration defined with a typedef in header stdio.h.7 This file pointer “points to a structure that contains information about the file, such as the location of a buffer, the current character position in the buffer, whether the file is being read or written, and whether errors or end-of-file have occurred” [KR88, page 160]. All these implementation details are hidden from users of the standard library via the FILE type-name and the associated library functions.
6 An interesting discussion on the history and philosophy of C is given in [Rit93], available online.
7 The standard implementation of FILE is a good example of object-oriented coding style. The FILE data type is an opaque type, only ever referred to by a pointer, and algorithms operating on it are hidden within associated standard functions. Thus, only the FILE type-name and the function declarations need be exported to the public interface, and the structure definition and algorithm details may be hidden in the private interface.
A file is opened by the function fopen(), which has the interface
FILE *fopen(const char *name, const char *mode);
The first argument, name, is a character string containing the name of the file. The second is a mode string, which determines how the file may be used. There are three basic modes: read "r", write "w" and append "a". The first opens an existing file for reading, and fails if the file does not exist. The other two open a file for writing, and create a new file if it does not already exist. Opening an existing file in "w" mode first clears the file of its existing data (i.e., overwrites the existing file). Opening in "a" mode preserves the existing data and adds new data to the end of the file.
Each of these modes may include an additional “update” specification signified by a + character (i.e., "r+", "w+", "a+"), which enables the file stream to be used for both input and output. This ability is most useful in conjunction with the random access file operations described in Section 13.2.4 below.
Some operating systems treat “binary” files differently to “text” files. (For example, UNIX handles binary and text files the same; Win32 represents them differently.) The standard C library caters for this variation by permitting a file to be explicitly marked as binary with the addition of a b character to the file-open mode (e.g., "rb" opens a binary file for reading).
If opening a file is successful, fopen() returns a valid FILE * pointer. If there is an error, it returns NULL (e.g., attempting to open a file for reading that does not exist, or attempting to open a file without appropriate permissions). As with other functions that return pointers to limited resources, such as the dynamic memory allocation functions, it is prudent to always check the return value for NULL.
To close a file, the file pointer is passed to fclose(), which has the interface
int fclose(FILE *fp);
This function breaks the connection with the file and frees the file pointer. It is good practice to free file pointers when a file is no longer needed as most operating systems have a limit on the number of files that a program may have open simultaneously. However, fclose() is called automatically for each open file when a program terminates.
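The open, check, use, close pattern described above might be sketched as follows. The helper name append_line() is hypothetical, and the sketch uses fputs() and fputc() from Section 13.2.3 to do the writing.

```c
#include <stdio.h>

/* Hypothetical helper: append one line of text to the named file.
   Returns 0 on success, -1 if the file could not be opened or written. */
int append_line(const char *name, const char *text)
{
    FILE *fp;
    int ok;

    fp = fopen(name, "a");          /* Create the file if it does not exist. */
    if (fp == NULL)
        return -1;                  /* Always check the result of fopen(). */

    ok = (fputs(text, fp) != EOF) && (fputc('\n', fp) != EOF);
    if (fclose(fp) == EOF)          /* fclose() can also report an error. */
        ok = 0;
    return ok ? 0 : -1;
}
```

Checking the return value of fclose() is a design choice in this sketch: buffered data may only reach the file when it is closed, so a write failure can surface at that point.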
13.2.2 Standard IO
When a program begins execution, there are three text streams predefined and open. These are standard input (stdin), standard output (stdout) and standard error (stderr). The first two signify “normal” input and output, and for most interactive environments are directed to the keyboard and screen, respectively. Their input and output streams are usually buffered, which means that characters are accumulated in a queue and sent in packets, minimising expensive system calls. Buffering may be controlled by the standard function setbuf(). The stderr stream is reserved for sending error messages. Like stdout it is typically directed to the screen, but its output is unbuffered.
13.2.3 Sequential File Operations
Once a file is opened, operations on the file (reading or writing) usually negotiate the file in a sequential manner, from the beginning to the end. The standard library provides a number of different operations for sequential IO.
The simplest functions process a file one character at a time. To write a character there are the functions
int fputc(int c, FILE *fp);
int putc(int c, FILE *fp);
int putchar(int c);
where calling putchar(c) is equivalent to calling putc(c, stdout). The functions putc() and fputc() are identical, but putc() is typically implemented as a macro for efficiency. These functions return the character that was written, or EOF if there was an error (e.g., the hard disk was full).
To read a character, there are the functions
int fgetc(FILE *fp);
int getc(FILE *fp);
int getchar(void);
which are analogous to the character output functions. Calling getchar() is equivalent to calling getc(stdin), and getc() is usually a macro implementation of fgetc().8 These functions return the next character in the character stream unless either the end-of-file is reached or an error occurs. In these anomalous cases, they return EOF. It is possible to push a character c back onto an input stream using the function
int ungetc(int c, FILE *fp);
The pushed back character will be read by the next call to getc() (or getchar() or fscanf(), etc) on that stream.
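A common use of ungetc() is to read one character too many while scanning a token and then return it to the stream. The following is a sketch under that assumption; read_number() is a hypothetical helper, and it makes no attempt to detect overflow.

```c
#include <stdio.h>
#include <ctype.h>

/* Hypothetical helper: read an unsigned decimal number from fp.
   Stops at the first non-digit and pushes it back for the next reader.
   Returns the value read, or -1 if no digit was found. */
long read_number(FILE *fp)
{
    long value = 0;
    int c, ndigits = 0;

    while ((c = getc(fp)) != EOF && isdigit(c)) {
        value = value * 10 + (c - '0');
        ++ndigits;
    }
    if (c != EOF)
        ungetc(c, fp);      /* Leave the terminating character unread. */
    return ndigits > 0 ? value : -1;
}
```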
Note. The symbolic constant EOF is returned by standard IO functions to signal either end-of-file or an IO error. For input functions, it may be necessary to determine which of these cases is being flagged. Two standard functions, feof() and ferror(), are provided for this task; they return non-zero if the prior EOF was due to end-of-file or to an IO error, respectively.
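The distinction can be sketched as follows, with a hypothetical helper count_chars() that reads to the end of a stream and then asks ferror() why reading stopped.

```c
#include <stdio.h>

/* Hypothetical helper: count the characters remaining in fp.
   Returns the count, or -1 if reading stopped because of an error
   rather than end-of-file. */
long count_chars(FILE *fp)
{
    long n = 0;

    while (getc(fp) != EOF)
        ++n;
    if (ferror(fp))
        return -1;          /* EOF was due to an IO error. */
    return n;               /* EOF was due to end-of-file; feof(fp) is non-zero. */
}
```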
Formatted IO can be performed on files using the functions
int fprintf(FILE *fp, const char *format, ...);
int fscanf(FILE *fp, const char *format, ...);
These functions are generalisations of printf() and scanf(), which are equivalent to the calls fprintf(stdout, format, ...) and fscanf(stdin, format, ...), respectively.
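As an illustration, the sketch below writes a sequence of integers with fprintf() and reads them back with fscanf(). The helper names write_ints() and sum_ints() are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: write the integers 1..n to fp, one per line.
   Returns 0 on success, -1 on a write error. */
int write_ints(FILE *fp, int n)
{
    int i;

    for (i = 1; i <= n; ++i)
        if (fprintf(fp, "%d\n", i) < 0)     /* Negative return signals an error. */
            return -1;
    return 0;
}

/* Hypothetical helper: read whitespace-separated integers from fp until
   a conversion fails or end-of-file is reached, and return their sum. */
long sum_ints(FILE *fp)
{
    long sum = 0;
    int value;

    while (fscanf(fp, "%d", &value) == 1)
        sum += value;
    return sum;
}
```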
Characters can be read from a file a line at a time using the function
char *fgets(char *buf, int max, FILE *fp);
which reads at most max-1 characters from the file pointed to by fp and stores the resulting string in buf. It automatically appends a \0 character to the end of the string. The function returns when it encounters a \n character (i.e., a newline), or reaches the end-of-file, or has read the maximum number of characters. It returns a pointer to buf if successful, and NULL for end-of-file or if there was an error.
Character strings may be written to a file using the function
int fputs(const char *str, FILE *fp);
which returns a non-negative value if successful and EOF if there was an error. Note, the string need not contain a \n character, and fputs() will not append one, so strings may be written to the same line with successive calls.
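The two line functions combine naturally into a line-by-line copy. The following sketch assumes a hypothetical helper copy_lines() and a small fixed buffer; a line longer than the buffer is simply copied in several pieces, since fputs() does not add a newline between them.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy in to out a line at a time.
   Returns the number of newline-terminated lines copied,
   or -1 on a write error. */
long copy_lines(FILE *in, FILE *out)
{
    char buf[64];
    long nlines = 0;

    while (fgets(buf, sizeof(buf), in) != NULL) {
        if (fputs(buf, out) == EOF)
            return -1;
        if (strchr(buf, '\n') != NULL)  /* Count only complete lines. */
            ++nlines;
    }
    return nlines;
}
```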
8 While putc() is equivalent to fputc() and getc() is equivalent to fgetc(), it is important to note that the line IO functions puts() and gets() are not equivalent to their counterparts fputs() and fgets(). In fact, the function gets() is inherently flawed in its inability to limit the size of an input string and fgets() should always be used in preference.
For reading and writing binary files, a pair of functions are provided that enable objects to be passed to and from files directly without first converting them to a character string. These functions are
size_t fread(void *ptr, size_t size, size_t nobj, FILE *fp);
size_t fwrite(const void *ptr, size_t size, size_t nobj, FILE *fp);
and they permit objects of any type to be read or written, including arrays and structures. For example, if a structure called Astruct were defined, then an array of such structures could be written to file as follows.
struct Astruct mystruct[10];
fwrite(mystruct, sizeof(struct Astruct), 10, fp);
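Both functions return the number of objects actually transferred, which may be fewer than requested. A round trip might be sketched as follows; struct Point and the helper names are hypothetical, chosen only for illustration. Note that a file written this way generally cannot be exchanged between machines with different data representations.

```c
#include <stdio.h>

struct Point { int x; int y; };     /* Hypothetical record type. */

/* Hypothetical helper: write n records to fp.
   Returns the number of records actually written. */
size_t save_points(FILE *fp, const struct Point *pts, size_t n)
{
    return fwrite(pts, sizeof(struct Point), n, fp);
}

/* Hypothetical helper: read up to max records from fp.
   Returns the number of records actually read. */
size_t load_points(FILE *fp, struct Point *pts, size_t max)
{
    return fread(pts, sizeof(struct Point), max, fp);
}
```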
13.2.4 Random Access File Operations
The previous file IO functions progress through a file sequentially. The standard library also provides a means to move back and forth within a file to any specified location. These file positioning functions are
long ftell(FILE *fp);
int fseek(FILE *fp, long offset, int from);
void rewind(FILE *fp);
The first, ftell(), returns the current position in the file stream. For binary files this value is the number of characters preceding the current position. For text files the value is implementation defined. In both cases the value is in a form suitable for the second argument of fseek(), and the value 0L represents the beginning of the file.
The second function, fseek(), sets the file position to a location specified by its second argument. This parameter is an offset, which shifts the file position relative to a given reference location. The reference location is given by the third argument and may be one of three values as defined by the symbolic constants SEEK_SET, SEEK_CUR, and SEEK_END. These specify the beginning of the file, the current file position, and the end of file, respectively. Having shifted the file position via fseek(), a subsequent read or write will proceed from this new position.
For binary files, fseek() may be used to move the file position to any chosen location. For text files, however, the set of valid operations is restricted to the following.
fseek(fp, 0L, SEEK_SET); /* Move to beginning of file. */
fseek(fp, 0L, SEEK_CUR); /* Move to current location (no effect). */
fseek(fp, 0L, SEEK_END); /* Move to end of file. */
fseek(fp, pos, SEEK_SET); /* Move to pos. */
In the last case, the value pos must be a position returned by a previous call to ftell(). Binary files, on the other hand, permit more arbitrary use, such as
fseek(fp, -4L, SEEK_CUR); /* Move back 4 bytes. */
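Random access is most useful when a binary file holds fixed-size records, since the offset of any record can then be computed directly. The sketch below assumes a file consisting only of int values; read_int_at() is a hypothetical helper. It relies on fseek() returning zero on success.

```c
#include <stdio.h>

/* Hypothetical helper: fetch the int at position index (counting from 0)
   in a binary file of ints. Returns 0 on success, -1 on failure. */
int read_int_at(FILE *fp, long index, int *out)
{
    if (index < 0 || fseek(fp, index * (long)sizeof(int), SEEK_SET) != 0)
        return -1;
    /* Seeking past the end may succeed, but the read that follows fails. */
    return fread(out, sizeof(int), 1, fp) == 1 ? 0 : -1;
}
```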
The program below shows an example of ftell() and fseek() to determine the length of a file in bytes. The file itself may be plain text, but it is opened as binary so that ftell() returns the number of characters to the end-of-file.9
/* Compute the length of a file in bytes. From Snippets (ansiflen.c) */
long flength(char *fname)
{
    long length = -1L;
    FILE *fptr;

    fptr = fopen(fname, "rb");
    if (fptr != NULL) {
        fseek(fptr, 0L, SEEK_END);
        length = ftell(fptr);
        fclose(fptr);
    }
    return length;
}
9 This code may not be entirely portable, as the ISO standard does not require compilers to “meaningfully” support the reference SEEK_END for binary files. That is, all three symbolic constants are supported for text files, but only the first two are strictly valid for binary files. However, in practice, SEEK_END is supported by most compilers.
The third function, rewind(), returns the position to the beginning of the file. Calling rewind(fp) is equivalent to the statement fseek(fp, 0L, SEEK_SET).
Two other file positioning functions are available in the standard library: fgetpos() and fsetpos(). These perform essentially the same tasks as ftell() and fseek(), respectively, but are able to handle files too large for their positions to be representable by a long integer.
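Both functions work with a position object of type fpos_t rather than a long, and both return zero on success. As a sketch of how they pair up, the hypothetical helper peek_line() below reads a line and then restores the stream to where it was.

```c
#include <stdio.h>

/* Hypothetical helper: read the next line into buf without consuming it.
   Returns buf on success, or NULL on end-of-file or error. */
char *peek_line(FILE *fp, char *buf, int max)
{
    fpos_t pos;
    char *result;

    if (fgetpos(fp, &pos) != 0)     /* Remember the current position. */
        return NULL;
    result = fgets(buf, max, fp);
    if (fsetpos(fp, &pos) != 0)     /* Return to the remembered position. */
        return NULL;
    return result;
}
```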
13.3 Command-Shell Redirection
Often programs are executed from a command-interpreter environment (also called a shell). Most operating systems possess such an interpreter. For example, Win32 has a DOS-shell and UNIX-like systems have various similar shell environments such as the C-shell, the Bourne-shell, the Korn-shell, etc. Most shells facilitate redirection of stdin and stdout using the commands < and >, respectively. Redirection is not part of the C language, but an operating system service that supports the C input-output model.
#include <stdio.h>

/* Write stdin to stdout */
int main(void)
{
    int c;
    while ((c = getchar()) != EOF)
        putchar(c);
}
Consider the example program above. It simply reads characters from stdin and forwards them to stdout. Normally this means the characters typed at the keyboard are echoed on the screen after the user hits the “enter” key. Assume the program executable is named “repeat”.
repeat
type some text 123
type some text 123
However, a file may be substituted for the keyboard by redirection.
repeat <infile.txt
display contents of infile.txt
Alternatively, a file may be substituted for the screen, or for both keyboard and screen as in the following example, which copies the contents of infile.txt to outfile.txt.
repeat <infile.txt >outfile.txt
Further redirection commands are >> and |. The former redirects stdout but, unlike >, appends the redirected output rather than overwriting the existing file contents. The latter is called a “pipe”, and it directs the stdout of one program to the stdin of another. For example,
prog1 | prog2
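On some systems (e.g., DOS), prog1 executes first and its stdout is accumulated in a temporary buffer; once the program has terminated, prog2 executes with this set of output as its stdin. On UNIX-like systems the two programs run concurrently, with prog2 reading the output of prog1 as it is produced.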
The stderr stream is not redirected by these commands, and so will still print messages to the screen even if stdout is redirected.
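This separation is why diagnostic output is usually written to stderr. The sketch below uses a hypothetical helper, copy_with_summary(), that takes its three streams as parameters so the behaviour is easy to see.

```c
#include <stdio.h>

/* Hypothetical helper: copy in to out and write a one-line summary to log.
   Returns the number of characters copied. */
long copy_with_summary(FILE *in, FILE *out, FILE *log)
{
    long n = 0;
    int c;

    while ((c = getc(in)) != EOF) {
        putc(c, out);
        ++n;
    }
    fprintf(log, "copied %ld characters\n", n);
    return n;
}
```

If main() called copy_with_summary(stdin, stdout, stderr) and the program were run as repeat <infile.txt >outfile.txt, the copied text would go to outfile.txt while the summary line would still appear on the screen.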