Tags: string.h, string processing, strings in C
The string.h header declares functions for working with null-terminated strings in C, as well as several functions for working with raw memory blocks such as arrays, which make life much easier. Let's look at the functions with examples.
Copy
void * memcpy (void * destination, const void * source, size_t num);
Copies num bytes of memory from source to destination. The function is very useful: with it you can, for example, copy an object or move a section of an array instead of copying element by element. The copy is binary, so the data type does not matter. The source and destination regions must not overlap; when they do, memmove is the function to use. For example, let's remove an element from an array and shift the rest of the array to the left (the shifted tail overlaps its destination, so strictly speaking this case calls for memmove).
#include
A function that swaps two variables
#include
Here I would like to note that the function allocates memory for a temporary variable. This is an expensive operation. To improve performance, it is worth passing the function a temporary variable that will be created once.
#include
void * memmove (void * destination, const void * source, size_t num);
Copies a block of num bytes from source to destination, with the difference that the regions may overlap. The copy behaves as if an intermediate buffer were used, so overlapping regions are handled correctly.
#include
char* strcpy (char * destination, const char * source);
Copies one string to another, including the null character. Returns a pointer to destination.
#include
You can copy it in another way
#include
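char* strncpy (char * destination, const char * source, size_t num);
Copies only the first num characters of the string. If source holds num or more characters, no null character is added to the end; if it is shorter, the rest of destination is padded with null characters. When copying from a string to the same string, the parts must not overlap (if they overlap, use memmove).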
#include
String concatenation
char* strcat(char * destination, const char * source);
Appends the string source to the end of destination; the null character at the end of destination is overwritten by the first character of source. Returns a pointer to destination.
char* strncat(char * destination, const char * source, size_t num);
Appends at most num characters of source to the end of destination. A null character is always added at the end.
#include
String comparison
int strcmp(const char * str1, const char * str2);
Returns 0 if the strings are equal, a value greater than zero if the first string is greater, and a value less than zero if the first string is less. Strings are compared character by character, by the numerical values of the characters. To compare strings according to the rules of a specific language (the current locale), strcoll is used.
int strcoll(const char * str1, const char * str2);
int strncmp(const char * str1, const char * str2, size_t num);
Compares the strings by their first num characters only.
Example - sorting an array of strings by the first three characters
#include
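size_t strxfrm(char * destination, const char * source, size_t num);
Transforms a string according to the locale, so that comparing two transformed strings with strcmp gives the same result as comparing the originals with strcoll. At most num characters of the transformed source string are copied into destination, and the length of the whole transformed string is returned. If num == 0 and destination == NULL, only that length is returned.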
#include
Search
void* memchr (const void * ptr, int value, size_t num);
Searches the first num bytes of the memory block pointed to by ptr for the first occurrence of value, which is treated as an unsigned char. Returns a pointer to the found byte, or NULL.
#include
char* strchr (const char * str, int character);
Returns a pointer to the first occurrence of character in str, or NULL if there is none. Very similar to the memchr function, but works with strings rather than an arbitrary block of memory.
size_t strcspn(const char * str1, const char * str2);
Returns the index of the first occurrence in str1 of any character from str2. If no such character is found, it returns the length of the string.
Example - find the position of all vowels in a line
#include
Here, notice the i++ line after printf. If it were not there, strcspn would keep returning 0, because the search would start at the vowel just found, and the loop would never end.
A function that returns a pointer to the first vowel is much better suited to this task.
char* strpbrk (const char * str1, const char * str2);
The function is very similar to strcspn, except that it returns a pointer to the first character of str1 that also occurs in str2 (or NULL if there is none). Example - print all vowels in a string
#include
char* strrchr (const char * str, int character);
Returns a pointer to the last occurrence of character in the string, or NULL if there is none.
size_t strspn(const char * str1, const char * str2);
Returns the length of the initial part of str1 that consists only of characters from str2.
Example - print the number that appears in a string.
#include
char* strstr (const char * str1, const char * str2);
Returns a pointer to the first occurrence of str2 in str1, or NULL if str2 does not occur in str1.
#include
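char* strtok (char * str, const char * delimiters);
Splits a string into tokens. Tokens are sequences of characters separated by any of the characters in the delimiters string. The first call takes the string to split; each following call passes NULL to continue with the same string. Note that strtok modifies the string it splits.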
#include
More functions
void * memset(void * ptr, int value, size_t num);
Fills the first num bytes of a memory block with value (converted to unsigned char). For example, you can fill an array or structure with zeros.
#include
The most popular function
size_t strlen(const char * str);
Returns the length of a string - the number of characters from the beginning to the first null character.
Number-string and string-number conversion.
int atoi(const char * str);
Converts a string to an integer.
#include
double atof(const char * str);
Converts a string to a double.
long int atol(const char * str);
Converts a string to a long.
All functions of this kind are named XtoY, where X and Y are type abbreviations; A stands for ASCII. Accordingly, there is a reverse function itoa (not anymore :)); it was never part of the C standard and exists only as a compiler extension. There are many such functions in stdlib.h; there is not enough space to cover them all.
Formatted buffer input and output
Two more functions worth singling out are sprintf and sscanf. They differ from printf and scanf in that they write data to a buffer and read it from a buffer, respectively. This allows you, for example, to convert a string to a number and a number to a string. For example:
#include
In general, working with strings is a more global task than one might imagine. One way or another, almost every application is related to text processing.
Working with locale
char* setlocale(int category, const char* locale);
Sets the locale for this application. If locale is NULL, setlocale changes nothing and only returns the current locale.
A locale stores language and region information that affects the input, output, and string transformation functions. When the application starts, a locale called "C" is installed, which is the minimal default locale. It contains a minimum of information, and the program's behavior is as predictable as possible. Passing the empty string "" as locale selects the locale configured in the user's environment instead. The category constants (LC_ALL, LC_COLLATE, LC_CTYPE, LC_MONETARY, LC_NUMERIC, LC_TIME) determine what is affected by a locale change.
It is no coincidence that I placed the topic of strings in the “Arrays” section, since a string is essentially an array of characters. Here's an example:
char str[] = "This is just a string";
For greater understanding, the same string can be written like this:
char str[] = {'T','h','i','s',' ','i','s',' ','j','u','s','t',' ','a',' ','s','t','r','i','n','g','\0'};
That is, it is still the same array, only consisting of characters. Therefore, you can work with it just as with integer arrays.
Now let's try working with strings in C. In the introductory lessons, we learned that characters belong to the integer types, i.e. each character has its own numerical value. Here is an example and its solution:
- you need to convert the entered word to upper case:
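#include <stdio.h>

int main(void)
{
    char str[] = "sergey";
    /* In ASCII, lowercase letters sit 32 positions above their uppercase forms. */
    for (int i = 0; str[i] != '\0'; i++) {
        str[i] -= 32;
    }
    for (int i = 0; str[i] != '\0'; i++) {
        printf("%c", str[i]);
    }
    getchar();    /* wait for Enter; stands in for the non-standard getch() */
    return 0;
}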
To get the numeric code of a character, simply use the %d specifier in the printf function. One more important point: every string ends with a null terminator, which is denoted by the special character '\0'.
Another way to specify a string is to declare it using char*. Here's an example:
char *str = "wire";
That is, a pointer is created to a string located somewhere in memory. Such a string literal must not be modified through the pointer.
And here's how you can read strings with the scanf function, which is already familiar to us:
char str[16];
scanf("%15s", str);
There are two subtleties here:
- the address-of operator (&) is not needed here, since the name of the array, as we already know, is an address
- the length of the input string must not exceed 15 characters, since the last element must hold the null terminator. scanf itself writes this character after the last character you entered, and the width 15 in the format keeps longer input from overflowing the array.
Since C is a structured language, it already has library functions for working with strings and with characters. To process characters you will need to include the file ctype.h. The file contains functions for determining the case and the class of a character. Basically, everything you might need to know about a character can be found out using the functions in ctype.h.
Sometimes you may need to convert a string to another data type. To convert strings to other types, there is the stdlib library. Here are its functions:
- int atoi (const char *str)
- long atol (const char *str)
- double atof (const char *str)
Sometimes these functions are very helpful, for example, when you need to extract a year or another numeric value from a string. Working with strings in C is a very important topic, so try to understand this lesson.
So, strings in the C language. There is no separate data type for them, as there is in many other programming languages. In C, a string is an array of characters. To mark the end of a string, the '\0' character is used, which we discussed in the last part of this lesson. It is not displayed on the screen in any way, so you won’t be able to look at it.
Creating and Initializing a String
Since a string is an array of characters, declaring and initializing a string works much like the same operations on one-dimensional arrays.
The following code illustrates the different ways to initialize strings.
Listing 1.
char str[10];
char str1[10] = {'Y','o','n','g','C','o','d','e','r','\0'};
char str2[10] = "Hello!";
char str3[] = "Hello!";
Fig.1 Declaration and initialization of strings
On the first line we simply declare an array of ten characters. It's not even really a string yet, because there is no null character \0 in it; for now it is just a set of characters.
Second line. The most straightforward, brute-force way of initialization: we specify each character separately. The main thing here is not to forget to add the null character \0 .
The third line is analogous to the second. Pay attention to the picture: because there are fewer characters in the string on the right than there are elements in the array, the remaining elements are filled with \0 .
Fourth line. As you can see, there is no size specified here. The program will calculate it automatically and create an array of characters of the required length. In this case, the null character \0 will be inserted last.
How to output a string
Let's expand the code above into a full-fledged program that will display the created lines on the screen.
Listing 2.
#include
Fig.2 Various ways of displaying a string on the screen
As you can see, there are several basic ways to display a string on the screen.
- use the printf function with the %s specifier
- use the puts function
- use the fputs function, specifying the standard stream for output as stdout as the second parameter.
The only nuance concerns the puts and fputs functions: puts appends a newline after the string, moving the output to the next line, while fputs does not.
As you can see, output is quite simple.
Entering strings
String input is a little more complicated than output. The simplest way would be the following:
Listing 3.
#include
The gets function pauses the program, reads a string of characters entered from the keyboard, and places it in the character array whose name is passed to the function as a parameter.
The gets function stops reading at the newline character produced by the Enter key; the newline is not stored, and a null character is written in its place.
Noticed the danger? If not, the compiler will kindly warn you about it. The problem is that gets only stops when the user presses Enter, so it can write past the end of the array: in our case, if more than 20 characters are entered. For this reason gets was removed from the C standard in C11.
By the way, buffer overflow errors were previously considered the most common type of vulnerability. They still exist, but using them to hack programs has become much more difficult.
So what do we have? We have a task: write a string to an array of limited size. That is, we must somehow control the number of characters entered by the user. And here the fgets function comes to our aid:
Listing 4.
#include
The fgets function takes three arguments as input: the variable to write the string to, the size of the string to be written, and the name of the stream from which to get the data to write to the string, in this case stdin. As you already know from Lesson 3, stdin is the standard input stream usually associated with the keyboard. It is not at all necessary that the data must come from the stdin stream; in the future we will also use this function to read data from files.
If we enter a string of 10 or more characters while this program runs, only the first 9 characters and the terminating null character will be written to the array: fgets will “cut” the string to the required length.
Please note that the fgets function does not read 10 characters, but 9! As we remember, in strings the last character is reserved for the null character.
Let's check it out. Let's run the program from the last listing. And enter the line 1234567890. The line 123456789 will be displayed on the screen.
Fig. 3 Example of the fgets function
The question arises. Where did the tenth character go? And I will answer. It hasn't gone away, it remains in the input stream. Run the following program.
Listing 5.
#include
Here is the result of its work.
Fig.4 Non-empty stdin buffer
Let me explain what happened. We called the fgets function. It waited on the input stream for us to enter the data. We entered 1234567890\n from the keyboard (by \n I mean pressing the Enter key). This went to the stdin input stream. The fgets function, as expected, took the first 9 characters 123456789 from the input stream, added the null character \0 to them and wrote the result to the string str . There is still 0\n left in the input stream.
Next we declare the variable h and display its value on the screen. Then we call the scanf function. Here we expect to be able to enter something, but 0\n is still waiting in the input stream, so scanf takes it as our input and writes 0 to the variable h. Next we display it on the screen.
This is, of course, not exactly the behavior we expect. To deal with this problem, we need to clear the input buffer after reading the user's input from it. The fflush function is used for this here; it has only one parameter, the stream to be cleared. Keep in mind that the C standard defines fflush only for output streams: fflush(stdin) works with some compilers (for example, Microsoft's C runtime) but is undefined behavior elsewhere. The portable way is to read and discard the remaining characters up to the end of the line.
Let's fix the last example so that it works predictably.
Listing 6.
#include
Now the program will work as it should.
Fig.4 Flushing the stdin buffer with the fflush function
To summarize, two facts can be noted. First: using the gets function is unsafe, so it is recommended to use the fgets function everywhere.
Second: don't forget to clear the input buffer if you use the fgets function.
This concludes the conversation about entering strings. Let's move on.
The C and C++ function library includes a rich set of string and character processing functions. String functions operate on character arrays terminated by null characters. In C, to use the string functions you need to include the header file <string.h> at the beginning of the program module (the character functions are declared in <ctype.h>).
Since C and C++ do not automatically check array bounds when performing operations on arrays, all responsibility for array overflow falls on the programmer's shoulders. Neglecting these subtleties can cause the program to crash.
In C and C++, printable characters are the characters displayed on the terminal. In ASCII environments, they are located between space (0x20) and tilde (0x7E). Control characters have values between zero and 0x1F; these also include the symbol DEL (0x7F).
Historically, the arguments of character functions are integer values, of which only the low byte is used. The value passed must be representable as unsigned char (or be EOF), so a char that may be negative should be cast to unsigned char first. Otherwise you are free to call these functions with character arguments, since characters are automatically promoted to integers when the function is called.
In the title
C99 added the restrict qualifier to some parameters of several functions originally defined in C89. When reviewing each such function, its prototype used in the C89 environment (as well as in the C++ environment) will be given, and parameters with the restrict attribute will be noted in the description of this function.
List of functions
Character classification
isalnum - checks whether a character is alphanumeric
isalpha - checks whether a character is a letter
isblank - checks whether a character is a blank (space or horizontal tab)
iscntrl - checks whether a character is a control character
isdigit - checks whether a character is a decimal digit
isgraph - checks whether a character is a printable character other than space
islower - checks whether a character is lowercase
isprint - checks whether a character is printable
ispunct - checks whether a character is a punctuation mark
isspace - checks whether a character is a whitespace character
isupper - checks whether a character is uppercase
isxdigit - checks whether a character is a hexadecimal digit
Working with character arrays
memchr - searches an array for the first occurrence of a character
memcmp - compares a given number of characters in two arrays
memcpy - copies characters from one array to another
memmove - copies characters from one array to another, allowing the arrays to overlap
memset - fills a given number of characters in an array with a given value
String manipulation
strcat - appends a copy of one string to another
strchr - returns a pointer to the first occurrence in a string of the low byte of the given parameter
strcmp - compares two strings in lexicographic order
strcoll - compares one string with another according to the locale set by setlocale
strcpy - copies the contents of one string to another
strcspn - returns the length of the initial part of a string that does not contain the specified characters
strerror - returns a pointer to a string containing a system error message
strlen - returns the length of a null-terminated string