C

  • Created by Dennis M. Ritchie at Bell Labs, between 1969 and 1973
  • Successor of the B language, which was introduced around 1970
  • Invented to write the UNIX Operating System (completed in 1973)
  • Basis of many other programming languages
  • Good to know before C++
  • Compiled language

Compiling

Compiling is the act of taking a source code text file and running it through a compiler program to produce a program that can run on the computer.

 

 

Programmer writes the source code in plain text, often in an IDE (Integrated Development Environment)

 

The program is then compiled (converted) to a language that the computer can understand, and run as a program.

 

This is normally described as a two step process (within the first step, the source is preprocessed, compiled and assembled), consisting of:

  1. Pulling in any specified header files and translating the source into assembly, and then object code, native to the processor
  2. Using a linker program to pull in any other object or library files required, to finally output machine code (1's and 0's)

The resulting file is usually known as a binary or executable file.

 

The program can then be run (or executed) on the computer.

 

The GNU C compiler (gcc) is invoked by opening a terminal and entering gcc on the command line.

 

Typically gcc takes a number of arguments, such as the file to be compiled and the output name:

 

gcc hello_world.c -o hello_world

*In Windows, the output file needs an .exe extension to tell Windows that it's an executable program:

gcc hello_world.c -o hello_world.exe

Anatomy of a C program

The basic outline of a C program consists of:

  • Comments
  • Preprocessor commands
  • Type definitions
  • Function prototypes
  • Variables
  • Functions
  • All programs must have a main() function
  • All variables and functions must be declared before they are used in main()

 

Comments
Use // for a single line
Use /* for

Multiple lines
*/

 

Preprocessor
#include <stdio.h>
Brings in all the code from the specified file <stdio.h>, so that all of its functions are available to the program

 

The main() function is reserved in C. All C programs must have a main() function. It's where execution begins. It requires a return type, in this case int, which can be seen in the last line of the code block returning a value of 0.

 

The { curly braces } indicate the start and end of the code block for the function.
Statements within the code block are terminated with a semi-colon ;

 

The printf() function, made available via #include <stdio.h>, sends the arguments within its parentheses to the standard output (usually the screen). Other data, such as variables, can be included as arguments by placing a format specifier that suits each variable's type within the output string.

 

Whitespace, to make code more readable

 

return 0 ; causes main() to return an exit status of 0 (no error) to the shell that started execution.

Statements & Expressions

A Statement is a unit of execution that does something (e.g. a for loop). Simple statements are terminated with a semi-colon ;

An Expression is a unit of computation, which evaluates to a value (e.g. x = 1 + 2). Thus an expression has a value (or the type void, as with a call to a function that returns nothing).

An expression becomes a statement (an expression statement) when it is terminated by a semi-colon ; but not every statement is an expression: an if or a for loop, for example, has no value.

 

Compound Statements are enclosed within {curly braces} on multiple lines; aka a statement block or a code block.

Preprocessor directives

Compiler instructions to be carried out before the actual compilation.

 

Always begins with a hash symbol #

 

Not terminated with a semi-colon ;

 

Used when compiling the program.

 

The most common preprocessor directives are: #include and #define

 

#define Defines a preprocessor macro
#include Inserts the contents of the specified header file
#undef Undefines a macro
#ifdef Returns TRUE if macro is defined
#ifndef Returns TRUE if macro is not defined
#if Tests if a compile time condition is TRUE
#else the alternative for #if
#elif #else and #if in one statement
#endif Ends preprocessor conditional
#error Prints error message on stderr

#include

#include instructs the preprocessor to merge the contents of the specified file into the source code before the program is converted into machine code

 

#include <stdio.h>

 

By including stdio.h (.h being for header file), access to its various functions becomes available to the program.

 

For example, functions to help reading and writing data, including printf() for sending data to the screen and scanf() for receiving data from the keyboard.

#define

The #define preprocessor takes the form: #define identifier value

Compile & run:

A circle with a radius of 20cm has an area of: 1256 cm.

printf()

The print function from the stdio.h library enables data to be sent to the stdout (usually the screen).

 

printf() takes a "control string" (enclosed within double quotes) followed by optional data, often in the form of variables as its arguments, to be printed on the screen

 

takes the format of: printf("control string", data) ;

e.g.

printf("Hello, World!\n") ;

printf("The data is the integer: %d \n", 16) ;

printf("The data is the integer: %d \n", myVar) ;

 

The %d is a format specifier (aka conversion characters), that indicates the format of the data being passed in to the string.

 

Each format specifier will take the next value to the right of the control string.

 

e.g.  To insert three values into the control string, use their appropriate format specifier at the position required in the control string:

 

printf("The first number is: %d, followed by a floating point number: %f, and the character %c.", 17, 99.98, 'D') ;

Which would output the following on the screen (%f prints six decimal places by default):

The first number is: 17, followed by a floating point number: 99.980000, and the character D.

Predefined Macros

Macro Description
__DATE__ The current date as a character literal in "MMM DD YYYY" format
__TIME__ The current time as a character literal in "HH:MM:SS" format
__FILE__ This contains the current filename as a string literal.
__LINE__ This contains the current line number as a decimal constant.
__STDC__ Defined as 1 when the compiler complies with the ANSI standard.

e.g.:

Compile & run:

File :test.c 

Date :Jun 2 2012
Time :03:36:24
Line :8
ANSI :1

Data Types

Informs the compiler how much memory to allocate for a variable.

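Type    Description                                          Example                 Typical size  Typical range
char    Single character                                     D                       1 Byte        0-255 (unsigned)
short   Short integer                                        65535                   2 Bytes       0-65535 (unsigned)
int     Integer whole number                                 4294967295              4 Bytes       0-4294967295 (unsigned)
long    Long integer                                         4294967295              4 Bytes       0-4294967295 (unsigned)
float   Floating point number, about 6 significant digits    17.031963               4 Bytes       +/-3.4e+/-38
double  Floating point number, about 15 significant digits   12.0270140242           8 Bytes       +/-1.7e+/-308
bool    Boolean value of TRUE (1) or FALSE (0)               True or 1, False or 0   1 Byte        true or false
void    No value returned                                    -                       -             -

The sizes and ranges shown are typical rather than guaranteed, as they depend on the compiler and platform (long, for example, is 8 bytes on many 64-bit systems). The integer ranges shown are those of the unsigned variants. bool requires #include <stdbool.h> (C99 onwards).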

Type Casting

Explicitly converts data types

 

(target data type)identifier

 

Compile & run:

2
2.470588

Format Specifiers

%c character char
%d, %i integer int
%e, %E, %f, %F, %g, %G floating point float
%s String string
%o Octal unsigned int
%x, %X Hex unsigned int
%p pointer
%n number of characters written so far
%% Outputs a % sign

 

%5d integer, at least 5 wide
%7f floating point, at least 7 wide
%.3f floating point, with 3 decimal places after the point
%4.2f floating point, at least 4 wide and 2 decimal places

Escaped Characters

Specific characters need escaping, by prefixing a backslash before the desired character:

Escape sequence Meaning
\\ \ character
\' ' character
\" " character
\? ? character
\a Alert or bell
\b Backspace
\f Form feed
\n Newline
\r Carriage return
\t Horizontal tab

scanf()

The scan function from the stdio.h library enables data to be read from the stdin (usually the keyboard).

 

scanf() takes a "format string" (enclosed within double quotes) followed by the address of a variable as its arguments, and stores the data read from the keyboard in that variable.

 

The scanf() format string requires a format specifier, that indicates the format of the data being passed in to the variable. e.g. %d

 

takes the format of: scanf("format string", &variable) ;

 

scanf() requires that an & (address of operator) is used for all variables, apart from strings and pointers

 

e.g.

scanf("%d", &myVar) ;

scanf("%s", myString) ;

Note that strings do not require an &, since a string is held in an array, and the array's name already evaluates to the address of its first element

Keywords

auto else long switch
break enum register typedef
case extern return union
char float short unsigned
const for signed void
continue goto sizeof volatile
default if static while
do int struct _Packed
double

Identifiers

Used to give names to variables, functions, labels, defined types

 

Consists of the 26 letters of the Latin alphabet (upper and lower case), plus the 10 western Arabic numerals, and the ASCII underscore character:

Cannot start with a number

Cannot be any of the language's reserved words

 

Case sensitive

 

Variables

Container / bucket for information / data, whose size in memory is determined by their data type.

Variables are created by declaring their data type and giving them a name: int myVar ;
They are then defined by assigning a value to the variable name: 

Using the = assignment operator.

myVar = 42 ;
Or together: int myVar = 42 ; 

char myLabel = 'd' ;

The action of declaring a variable tells the compiler how much memory space is required for this variable, and giving it a name acts like an alias for its memory location.

 

For example, a variable of the data type int takes up 4 bytes (32 bits) of memory. When compiled and run, the program instructs the OS to allow 4 bytes of memory space for that int, and so on for all the other variables in the program.

 

All variables must, at least, be declared before they are used, and can be defined later. If a variable is declared after main() but attempted to be used within main(), the compiler will produce an error.

 

First off, we declare the variable by specifying its data type and giving it a name:

int myVar ; 

 

The use of "myVar" is simply an identifier, a label, and any valid identifier can be used.

 

Now that the variable "myVar" has been declared, we can assign data to it, by use of a SINGLE equals = character.

*(Note that assignment uses a single = character. This often gets confused with double == characters used for equality testing)

 

myVar = 42 ; 

 

Giving us the required data in the variable:

 

New data can be assigned to the variable, replacing the old data (which is then discarded if nothing else is done with it beforehand):

myVar = 17 ; 

 

Leaving us with the new data in the variable:

Declarations

All variables must be declared before they can be used

 

A declaration specifies a data type and a list of one, or more identifiers (names) to call the variable(s)

 

int myVar ;

 

or:

int myVar, cat, dog, potato ;

 

Multiple identifiers are separated by a comma. The above declares four variables of the data type int called myVar, cat, dog, potato.

Definitions

Definitions – aka initialisation

 

Variables are defined/initialised by use of the assignment operator =

 

int myVar = 500 ;

Constants

Using const prefix, in the form: const data type identifier = value ;

Compile & run:

value of area : 80

 

Good practice to define a constant's identifier in UPPERCASE

typedef

Allows a user defined alias to be used based on existing data types.

 

The typedef keyword is followed by an existing data type and the user defined name, and is then used as an alias of the declaration.

 

typedef unsigned short USHORT ;

 

Thus USHORT is now an alias for the longer 'unsigned short' data type.

 

USHORT myVar = 2 ;

 

instead of writing:

 

unsigned short myVar = 2 ;

enum

Enumerated data types

 

Used to define a list of integer constants

Compile & run:

Stoner is at position 2

 

Like arrays, the first position starts from ZERO.

 

Other integer values can be assigned:

Compile & run:

Pedrosa has an integer value of 63

 

 

If left uninitialised, each enumerator takes the value of the previous one plus one, starting from zero, so each has a unique value

Storage Classes

Defines when, where and how storage is allocated and de-allocated for a variable

 

auto

  • Default storage class
  • Temporarily allocated storage for the duration of the block it is contained within
    • i.e. allocated upon initialisation within the block, and deallocated at the end of the block
  • New storage is allocated each time the program executes the specific 'local' block
    • Upon completion of a specific local block local variables are released and no longer have any meaningful value

Compile & Run:

myVar is: 42
myVar is: 42
myVar is: 42

 

 

static

  • Default for Global variables
  • Permanently allocated storage for the duration of the program
  • Value of a static variable in a function is retained between repeated function calls to the function

Compile & Run:

myVar is: 42
myVar is: 43
myVar is: 44

 

 

register

  • Used to suggest that a variable should be stored in a CPU register rather than RAM (the compiler is free to ignore the hint)
  • Therefore maximum size is the same as the register size
  • Cannot have the unary & operator applied, as a register has no memory address
  • Used for frequent/fast access, e.g. counter

Compile & Run:

value of i in register is: 1
value of i in register is: 2
value of i in register is: 3

 

 

extern

 

Used to call upon a variable that is defined within an external file

 

Compile & Run (gcc extern-1.c extern-2.c -o target):

myVar is 5

Operators

Arithmetic
Operator  Description            Example
+         Addition               a + b
-         Subtraction            a - b
*         Multiplication         a * b
/         Division               a / b
%         Modulus                a % b
++        Increment              a++
--        Decrement              a--

Assignment
Operator  Description            Example
=         Assign to              a = b
+=        Add and assign         a += b
-=        Subtract and assign    a -= b
*=        Multiply and assign    a *= b
/=        Divide and assign      a /= b
%=        Modulus and assign     a %= b

Comparison
Operator  Description            Example
==        Equality               a == b
!=        Inequality             a != b
>         Greater than           a > b
<         Less than              a < b
>=        Greater than or equal  a >= b
<=        Less than or equal     a <= b

Logical
Operator  Description            Example
&&        And                    a && b
||        OR                     a || b
!         NOT                    !a

 

 

& address of

* dereferencing

. member access

-> member access through a pointer

Prefix / Postfix

The use of the shorthand ++ incrementer or -- decrementer, either before (Prefix) or after (Postfix) the variable it is being applied to.

 

Prefix:  ++myVar

 

Postfix:  myVar++

 

Prefix will increment the variable before it is used

 

Postfix will use the variable before it is incremented, then increment the variable

 

If used on their own, the result is identical. It is only when combined with other expressions that differences appear:

Compile & run:

myVar starts off as: 5
Prefixing myVar makes a: 6
myVar is now: 6
Postfixing myVar makes b: 6
myVar is now: 7

Precedence

Simplified!

  1. * / before + -
  2. ( ) around everything else

If in doubt ( use these )

if

Consists of a boolean expression (evaluating to true or false), followed by one or more statements.

 

Can be followed by an (optional) else statement.

 

 

Simple code to show if condition is true:

 

Simple code to show if condition is false:

Ternary

Similar to the if statement, tests if a condition is true or false.

 

condition ? true : false ;

 

Compile & run:

5 is less than 7

for

The for loop will iterate through the loop, starting from the specified initial condition and will iterate through the statements enclosed within the { code block } until the test is false. The initial condition, test and iterator are separated with a ;

 

for (initialiser; test; iterator ) {
statements ;
}

 

Compile & run:

The value of i is now: 0
The value of i is now: 1
The value of i is now: 2
The value of i is now: 3
The value of i is now: 4

while

Loops through the statements enclosed within the { code block } until the test expression is false:

 

while (test expression) {
statements ;
}

 

do

Loops through the statements enclosed within the { code block } until the following while test expression is false:

 

do {
statements ;
} while (test expression) ;

 

*note ending ; after the while test expression

 

 

break

Terminates execution of the nearest enclosing do, for, switch, or while statement in which it appears.

 

 

continue

Passes control to the next iteration of the nearest enclosing do, for, or while statement in which it appears, bypassing any remaining statements within that code block.

The above will continue with the for loop, bypassing the next printf() statement when the value of i equals 5

switch

Tests a condition against any number of case conditions.

 

A break must be used to end each case, or execution will fall through into the statements of the next case.

 

A default case is utilised for when no condition is met.

 

 

Scope

Validity, Lifetime or Visibility of a variable.

 

Identifies where an object can be used.

 

2 types:

  • Global
  • Local

Any variable defined outside of any function in the program is considered global throughout the program.

 

Variables defined inside a function are considered locally scoped, and are valid within that function only. That is, the scope of their validity is only within the function in which they exist.

 

A locally scoped variable will hide (shadow) a global variable of the same name.

I have purposefully used the same name for the unique variable, to show its scoped validity.

Each function uses its own value for the same named variable, and thus does not use the global variable of the same name.

As soon as the function has executed, its data is lost and control is passed back to the main program with its set of variables.

 

Compiling & running the above would produce:

 

The value of the unique global variable is: 42
The value of the unique myFunction variable is: 5
The value of the unique myFunction variable is: 7
The value of the unique myFunction variable is: 12

Functions

Also known as sub-routines!

 

All programs must at least have the main( ) function.

 

A function takes the form of:

 

return_data_type function_name ( argument, list)

{

body of the function ;

}

 

The return_data_type is the data type of the value to be returned by the function.

 

The function_name is your preferred name for the function. The function signature consists of the function name and argument list.

The argument list is a list of zero or more parameters / arguments that are passed to the function, each specifying its data type and the name to be used within the function.

 

The function body consists of one or more statements within the function's { statement block } that define what the function does.

 

To invoke a function, provide the name of the function along with any arguments within its parentheses.

Prototypes

All variables and functions must be declared before they are used in main()

 

Declaring a function prototype informs the compiler of what to expect from that function, by stating what data type it returns and the argument data types it expects, before the function is called from main().

 

A function prototype, consists of the return data type and signature of the function, terminated by a semi-colon ; without the body of the function.

 

The function can then be defined after main()

 

 

This methodology is called function prototyping, and provides a neat way to structure your programs with as many functions as required following main().

 

This gives the code good readability, as one can easily dip in and out of main() when reading the code, to view the required function, understand what it's doing and what it's returning (if anything) back into main()

Pass by Value

If the argument within the parentheses of a function call is a variable, constant (or some other expression) it means that it is first evaluated and a COPY is then passed to the function as its input.

 

The function uses the copy of the passed in argument, and executes any statements upon that copy locally within the function.

 

Once the execution within the function is complete, control returns to the point at which the call was made, and any locally changed data within the called function is lost.

Compile & run:

1st value of myVar in main: 42
1st value of myVar in function: 42
2nd value of myVar in function: 17
2nd value of myVar in main: 42

Pass by Reference

To allow a function to act upon data that is being passed to it directly, the ampersand & 'address of' operator is used on the variable as the argument within the function call.

 

function( &myVar ) ;

 

The function's interface then accepts the address of the passed in argument as a pointer:

 

function( int *pointer )

 

which allows direct access to the variable being passed in.

Compile & run:

1st value of myVar in main: 42
1st value of myVar in function: 42
2nd value of myVar in function: 17
2nd value of myVar in main: 17

Recursion

Process of a function that calls itself.

 

Must have an ending point, or it will keep calling itself endlessly (in practice, until the program runs out of stack space).

 

Compile & run:

Recursion: 5
Recursion: 4
Recursion: 3
Recursion: 2
Recursion: 1
Factorial 5! = 120

Pointers

Pointers are variables that contain the memory address of the variable/object they point to.

 

All variables exist within a location in memory. This memory location is normally expressed as a hexadecimal number, such as: 22fd6c

 

The value of the memory location can be accessed by using the ampersand & 'address of' operator:

e.g.

printf("The hex address of myVar is %p \n", (void *) &myVar) ;

 

Pointers are declared by providing the data type * and identifier:

int *myVar ;

int* myVar ;

int * myVar ;

 

All of the above are equivalent. The positioning of the * is down to programming style.

 

Pointers must be declared with the matching data type for the object they point to, i.e. a pointer to an int must be an int pointer, a pointer to a char must be a char pointer:

 

int *myInt ;  //pointer to an int

char *myChar ;  //pointer to a char

float *myFloat ;  //pointer to a float

 

This is important because it tells the compiler how many bytes to read or write when operating on the item being pointed at, e.g. dereferencing an int pointer should read 4 bytes from the address it is pointing to, whereas dereferencing a char pointer should read just the 1 byte (for the char data type) from the address it is pointing to.

 

The content of the variable being pointed to, is accessed using the * 'dereferencing' operator.

& Address of Operator
* Dereferencing Operator

A pointer is defined by assigning the address of the target variable to the pointer:

 

int *p_myVar = &myVar ;

 

*p_myVar can now be used as if one were directly referring to myVar itself.

 

The contents of myVar can now be changed by assigning a new value either directly or via the pointer *p_myVar

 

Compile & run:

Memory address of myVar is: 000000000022FD6C
Contents of myVar is: 5
Memory address of the pointer p_myVar is: 000000000022FD60
Contents being pointed to by p_myVar is: 5 

Memory address of myVar is: 000000000022FD6C
Contents of myVar is: 42
Memory address of the pointer p_myVar is: 000000000022FD60
Contents being pointed to by p_myVar is: 42

Memory address of myVar is: 000000000022FD6C
Contents of myVar is: 17
Memory address of the pointer p_myVar is: 000000000022FD60
Contents being pointed to by p_myVar is: 17

 


Useful to remember:   *(&x) == x ;

Arrays

Sequential collection of same data type variables.

 

Declared by use of [square brackets]: data_type arrayName[array_size] ;

 

Values in the array are referred to as elements, and ALWAYS START FROM ZERO!

 

int myArray[10] ;  //declares an array called myArray that can contain 10 elements of the int data type

Defined by assigning values, separated by a comma, within {curly braces}:

int myArray[10] = {2, 3, 7, 12, 14, 17, 42, 63, 70, 99} ; //no more than 10 values; any elements left out are set to zero

 

int myArray[] = {3, 13, 1, 56, 33} ; // array will be automatically sized to hold the number of elements defined

 

Single values can be defined by assigning the value to the specific element number, starting from zero!

 

myArray[6] = 21 ;

 

Specific elements can be accessed by assigning an array element to a variable

 

int myVar = myArray[4] ;

 

Or they can be used just on their own

 

printf("The value of the element in array position 5 is %d \n", myArray[4]) ;

 

Multi-dimensional Arrays

Array of arrays

 

Declared by use of two or more sequential [square brackets]:

 

data_type arrayName[array_size_1][array_size_2] ;

 

Akin to N rows & columns in a spreadsheet:

Column 0 Column 1 Column 2 Column 3
Row 0 arrayName[ 0 ][ 0 ] arrayName[ 0 ][ 1 ] arrayName[ 0 ][ 2 ] arrayName[ 0 ][ 3 ]
Row 1 arrayName[ 1 ][ 0 ] arrayName[ 1 ][ 1 ] arrayName[ 1 ][ 2 ] arrayName[ 1 ][ 3 ]
Row 2 arrayName[ 2 ][ 0 ] arrayName[ 2 ][ 1 ] arrayName[ 2 ][ 2 ] arrayName[ 2 ][ 3 ]

 

int myMultiArray[ 3 ] [ 4 ] ;  //declares a two dimensional array called myMultiArray that contains 12 elements; e.g. in 3 rows / 4 columns

 

Defined by assigning values, separated by a comma, within {curly braces} for each dimension:

 

int myMultiArray[ 3 ] [ 4 ] = { { 0, 1, 2, 3 }, { 4, 5, 6, 7 }, { 8, 9, 10, 11 } } ; //same as:

int myMultiArray[ 3 ] [ 4 ] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 } ; //no need for nested curly braces

 

Single values can be defined by assigning the value to the specific element number, starting from zero for each dimension!

 

myMultiArray[ 2 ] [ 2 ] = 42 ; //sets the element in row 2, column 2 to 42

 

Specific elements can be accessed by assigning an array element to a variable

 

int myVar = myMultiArray[ 1 ] [ 3 ] ; //assigns myVar with the element contained in row 1, column 3

 

Compile & run:

Element [2][2] is: 10
Element [2][2] is now: 42
myVar has now been assigned the [2][2] element: 42

Strings

Character array, terminated by a null character '\0'.

 

Defined a number of ways:

 

Characters enclosed individually with 'single quotes'

char myArray[8] = { 'D', 'e', 'r', 'r', 'i', 'c', 'k', '\0'  } ;

char myArray[ ] = { 'D', 'e', 'r', 'r', 'i', 'c', 'k', '\0'  } ;

 

or within "double quotes"
char myArray[8] = "Derrick" ;
char myArray[ ] = "Derrick" ;

 

Compile & run:

0 = D,
1 = e,
2 = r,
3 = r,
4 = i,
5 = c,
6 = k,
7 =  ,

 

notice that the 8th element contains nothing = the NULL character

 

Compile & run:

Derrick

 

Alternatively, the %s string format specifier can be used to print the whole string

Compile & run:

Derrick

 

There are a number of predefined functions made to work with strings; see String Functions.

String Functions

Requires the preprocessor directive:

 

#include <string.h>

 

strcat() concatenates a second string onto the end of the first string

Structures

Collection of different data types under a single identifier name, terminated with a semi-colon ;

 

Declared as follows:

 

struct identifier {

member1 ;

member2 ;

memberN ;

} ;

 

e.g.

struct bike {

char rider[20] ;

int place ;
float speed ;
} ;

 

Once declared, the named structure creates a new data type that can be used just like any other data type:

 

data type identifier ; //e.g. int myVar ;

struct bike ducati ;
struct bike yamaha, honda ;

 

Alternatively a structure can be declared with identifiers after the closing } and before the ; as follows:

 

struct bike {

char rider[20] ;

int place ;
float speed ;
} ducati, yamaha, honda ;

 

Individual members can be referred to by using the . dot member operator, which connects the structure name with the member name:

 

ducati.speed

 

Compile & run:

Lorenzo, 254.12, 1
Pedrosa, 267.83, 2
Valentino, 246.87, 6
DePuniet, 0.00, 0
, 0.00, 0

 

 


 

typedef struct

provides a shortcut instead of having to write out the full syntax, struct myStruct { . . . } ;

 

typedef struct identifier {

int memberOne ;

char memberTwo ;

float memberThree ;

} myStruct ;

 

myStruct can now be used instead of having to use the keyword, struct, before each declaration of a new structure.

Unions

Unions are similar to structures but only one member within the union can be used at a time, due to it having a shared memory for all members.

 

union Publication {

char name[20] ;
float price ;
int pages ;
};

 

Unions are used when only one of the members needs to hold a value at any given time.

 

Compile & Run:

The best OS is fun, free and better! It costs less than $0.01 and has about 1200 developers on each release!

Standard Streams

stdin

The standard input is usually received from the keyboard

 

stdout / stderr

Standard out and standard error are usually sent to the screen

 

Standard In Input from Keyboard stdin 0
Standard Out Output to Screen stdout 1
Standard Error Output to Screen stderr 2

 

Command Line Arguments

argc is the number of arguments on the command line including the program name; argument count

 

argv contains the actual arguments; argument value

 

argv[0] contains the name of the program, argv[1] is the first argument, and so on.

Compile & run:

example.exe
one
two
three

Unformatted I/O

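getchar(), gets(), putchar() and puts() require the preprocessor directive:

#include <stdio.h>

getch(), getche() and putch() are not part of standard C; they are provided by <conio.h> on some DOS / Windows compilers.

Only works with the char data type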

 

getch()

Reads a single character from stdin / the keyboard

Unbuffered, therefore immediate

identifier = getch() ;

 

getche()

Same as above but echoes the character back to the screen

identifier = getche() ;

 

getchar()

Reads a single character from stdin / the keyboard

Echoes back to the screen

Buffered, therefore waits for Enter key to be pressed

identifier = getchar() ;

 

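gets( identifier )

Same as above, but reads an entire line of text (including spaces) until the Enter key is pressed. gets() cannot limit how much it reads, so it can overflow the array it writes to; it was removed from the language in C11, and fgets() is the safer alternative.

gets( identifier ) ;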

 

 

putch()

Displays a single character on stdout / the screen

putch( identifier ) ;

 

putchar ()

Displays a single character on stdout / the screen

putchar ( identifier ) ;

 

puts( )

Displays a string of text (including spaces) on stdout / the screen

Automatically adds a new line \n at the end of the entered string

puts( identifier ) ;

 

File I/O

Requires the preprocessor directive:

 

#include <stdio.h>

 

Requires a file pointer:

 

FILE *file-variable-name ; // sets up a pointer for the file to be opened

 

FILE is provided from stdio.h

 

When the pointer is declared the target file can be opened and assigned to the file-variable-name

 

file-variable-name = fopen( target-file-name, mode ) ;

Mode Description
r Opens an existing text file for reading
w Opens or creates a text file for writing. If file exists, truncate to zero length, else create
a Opens or creates a text file for writing from end of file, appending to any existing data
rb Opens an existing binary file for reading
wb Opens or creates a binary file for writing. If file exists, truncate to zero length, else create
ab Opens or creates a binary file for writing from end of file, appending to any existing data

 

FILE *filePointer ; //declares a variable pointer called filePointer

 

filePointer = fopen( "C:\\target-file.txt", "w" ) ; //target-file.txt is now open in write mode, and assigned to filePointer

 

Data can now be written to the file via filePointer:

 

fprintf( filePointer, "Here's some data being written to a text file" ) ;

 

The file must then be closed:

 

fclose( filePointer ) ;

 

Compile & Run produces a file called target-file.txt, that contains the text "Here's some data being written to a text file."

 


 

fgets() and fputs() can also be used to read/write data to/from the Standard Streams, or a file

 

fgets(name, size-of-name,file) //where file could be stdin

fputs(string, file) //where file could be stdout

 

This simple program reads text entered from the keyboard and prints it back out on the screen:

 

A variation on the above, using a file to store the data in:

 

Another variation, reading data from a file and displaying it on the screen:

 

 


*Note: the <stdio.h> fscanf() function has not been explained. When reading text with the %s format specifier it stops at the first whitespace character, so any characters after a space are left for the next read. For that reason it is felt to be inappropriate for use in a general purpose, robust file reader.

Error Handling

Based upon return values.

 

Requires the preprocessor directive:

 

#include <errno.h>

 

errno is defined in the errno.h header file of the C standard library. Library functions store an error code in it when they detect an error.

 

perror() function displays a passed string followed by the textual representation of the errno.

 

strerror() function returns a pointer to the text representation of the errno.

 

This piece of code tries to open a non-existent file and prints error messages to the screen:

Passing Array to Function

An array can be passed into a function either as:

  • pointer
  • sized array
  • unsized array

The array identifier and its size are sent as arguments to the function on line 22:

avgGrade = avgFunc( myArray, 7 ) ;

 

The function receives these as a pointer to the array and its size, on line 3:

double avgFunc(int *passedArray, int passedSize)

 

The first argument in the function's interface could alternatively have been:

  • int *passedArray
  • int passedArray[7]
  • int passedArray[ ]

Arrays and Pointers

In most expressions an array's identifier (name or label) is converted to a pointer to the first element of the array. That is, myArray == &myArray[0]. (&myArray gives the same address, but its type is pointer to the whole array.) The main practical difference from an ordinary pointer is that the array name cannot be changed to point at another location.

 

int myArray[5] = {1, 2, 3, 4, 5} ;

 

The address of the first element is: &myArray[0]

 

Therefore &myArray[0] is the same as myArray, as are their addresses

 

The value in &myArray[0] is myArray[0], and the value in myArray is *myArray, and therefore myArray[0] is the same as *myArray

 

 

Returning Array from Function

Since an automatic local array ceases to exist when its function returns, returning its address is not safe. The array needs to be defined as a static variable, thus allowing a pointer (to that static variable) to be returned and used by the caller.

 

Pointer to an Array

An array name behaves like a constant pointer to the first element of the array.

 

(constant pointer cannot change the location it points at)

 

int *myPointer ;

int myArray[42] ;

 

myPointer = myArray ;

 

myArray evaluates to &myArray[0], the address of the first element of myArray, so myPointer now points at that element

 

Note: line 13 gives a brief intro to pointer arithmetic...

Function Overloading

Function overloading is the process of using the same function name for multiple functions, each taking different parameters. It is a C++ feature; standard C does not allow two functions to share a name.

 

This allows the correct function to be called depending on the parameters sent.

 

Consider a function to add two numbers:

This is fine for two integers, but what if we wanted to add floating point numbers? The precision would be lost.

 

We can create another function to add two floating point numbers:

But how does the compiler know which version of the add() function to use? The answer lies in how the add() function is called.

 

Basically, it depends on the context, i.e. the types of the arguments used in the call.

 

Following on from the above code samples, if two ints are provided then the compiler will know that we mean to call the add(int a, int b) function, and likewise if we provide two floating point numbers the compiler will know that we mean to call the add(double a, double b) function.

 

add( 5, 7) ; //will call the add(int a, int b) function

 

add(3.141, 2.714) ; //will call the add(double a, double b) function

 

We could go on and define multiple add functions, and each would be called independently as long as each add() function has unique parameters, e.g. three ints, or two ints and a double, etc.

Variable Arguments

Allows a variable number of arguments to be used, where they may not be initially known

 

Use the <stdarg.h> header file to provide the type and macros for variable arguments

 

The function takes an int as its first argument, followed by an ellipsis (three dots ... )

 

Requires the following, defined in stdarg.h:

  • va_list type variable
  • va_start macro to initialise va_list to the argument list
  • va_arg macro to access each argument in the argument list
  • va_end to clean up memory assigned to va_list

*Note:

  • In the 1st call of average(), the first argument, 7, is passed directly as a literal ahead of the data to be acted upon; it indicates the number of arguments the function is to process.
  • In the 2nd & 3rd calls of average(), a variable has been used to perform this task.
  • In the 2nd call of average(), the same data set has been provided, but the function only acts upon the number of arguments indicated by the numArgs variable, in this case 5 (instead of the 7 above).

Compile & Run:

Average of 2, 3, 4, 5, 6, 7, 8 = 5.000000
Average of 2, 3, 4, 5 = 4.000000
Average of 5, 10, 15 = 10.000000

Pointer Arithmetic

Pointers must be declared of a specific data type according to the item they are pointing to:

 

int *myInt ;  //declares a pointer to an int

char *myChar ; //declare a pointer to a char

 

The pointer's data type tells the compiler the size of the item being pointed at. This allows arithmetic operations on the pointer to move in steps of one whole item of the correct data type size.

 

Arithmetic operations move the pointer by whole items of its data type, not by single address locations, i.e. the address is incremented/decremented by the size of the data type. An int pointer would typically change by 4 Bytes (sizeof(int) on most current platforms), a char pointer would change by 1 Byte, and so on.

 

The usual example is in relation to an array:

*Note: on line 9 the pointer myPtr is being incremented in the iteration of the for loop.

 

Compile & Run:

myArray[0] contents = 12 at address location 22fd40
myArray[1] contents = 14 at address location 22fd44
myArray[2] contents = 17 at address location 22fd48
myArray[3] contents = 42 at address location 22fd4c
myArray[4] contents = 63 at address location 22fd50
myArray[5] contents = 70 at address location 22fd54

 

Notice that the address is being incremented by 4 Bytes for each iteration of myPtr++. This is due to the pointer type being declared as an int pointer, with an int requiring 4 Bytes of memory on this platform.

 

This time a char pointer is declared and decremented:

Compile & Run:

myArray[7] contents = k at address location 22fd56
myArray[6] contents = c at address location 22fd55
myArray[5] contents = i at address location 22fd54
myArray[4] contents = r at address location 22fd53
myArray[3] contents = r at address location 22fd52
myArray[2] contents = e at address location 22fd51
myArray[1] contents = d at address location 22fd50

 

Notice that each decrement is by 1 Byte, for a char data type.

Array of Pointers

Declare a pointer array with the number of elements to be pointed to:

Compile & Run:

myArray[0] contains 12
myArray[1] contains 14
myArray[2] contains 17
myArray[3] contains 42
myArray[4] contains 63
myArray[5] contains 70

 

 

 

Or through a char array:

Compile & Run:

myArray[0] contains Casey Stoner
myArray[1] contains Danny Pedrosa
myArray[2] contains Valentino Rossi
myArray[3] contains Jorge Lorenzo

Pointer to Pointer

A pointer to a pointer is literally that: like any other pointer it contains the address of the item being pointed to, in this case another pointer's address

 

Double asterisk ** is used to indicate level of indirection: int **myPtr2Ptr ; //declares a pointer to a pointer

 

 

 

Compile & Run:

myVar at address: 22fd6c, contains: 1942
*ptr at address: 22fd60, contains 22fd6c and points to 1942
**ptr2ptr at address: 22fd58 contains 22fd60 and points to 1942

 

 

Passing Pointers to Functions

A pointer can be passed into a function by using the ampersand & (address-of) operator on the variable, as the argument in the function call.

 

The receiving function accepts the parameter as a pointer of the same data type, which is then used within the function as per normal pointer operation:

Compile & Run:

Number of seconds: 1360667276

Void Pointer

The void pointer has no data type and can be assigned the memory address of ANY data type.

 

aka Generic Pointer.

 

Declared just like any other pointer:  void *vPtr ;

 

Or declared and initialised:  void *vPtr = &myVar ;

 

Compile & Run:

Address of myInt: 0x28ac64
Value of myInt, via void pointer: 17
Address of myFlt: 0x28ac60
Value of myFlt, via void pointer: 42.000000

 

 

Cannot dereference a void pointer.

 

*vPtr = 22 ;  //will not work!

 

Unless it is cast:

 

*(int *)vPtr = 36 ;  //here, the void pointer is being cast first of all

 

Compile & Run:

myInt: 36

malloc() & free()

3 main types of memory:

  • Data Segment
    • static code, globals, static
  • Stack
    • automatic local variables
  • Heap
    • dynamic, unknown until runtime

If the variables are not known until runtime, they cannot be allocated storage in the Data segment or Stack, since the compiler doesn't know how many variables there might be.

 

Therefore Dynamic Memory Management is utilised to allocate storage space at run time, as and when needed.

 

The malloc() and free() functions perform this functionality and are declared in stdlib.h, a standard library header.

 

Therefore requires #include <stdlib.h>

 

malloc() takes the number of bytes required as its argument (the sizeof operator is commonly used to calculate this) and returns a void pointer to the allocated memory.

 

Returns a NULL pointer if there's an error.

 

Allocated memory must be released with free() when it is no longer needed, to prevent memory leaks.

 

This example asks the user to enter the number of random characters to generate:

Standard Library

The C standard library is a collection of pre-coded functions, macros, data types, etc., that perform many common tasks (why reinvent the wheel).

 

Its APIs are declared in header files:

<assert.h> Contains the assert macro, used to assist with detecting logical errors and other types of bug in debugging versions of a program.
<complex.h> C99 set of functions for manipulating complex numbers.
<ctype.h> Defines set of functions used to classify characters by their types or to convert between upper and lower case in a way that is independent of the used character set (typically ASCII or one of its extensions, although implementations utilizing EBCDIC are also known).
<errno.h> For testing error codes reported by library functions.
<fenv.h> C99 Defines a set of functions for controlling floating-point environment.
<float.h> Defines macro constants specifying the implementation-specific properties of the floating-point library.
<inttypes.h> C99 Defines exact width integer types.
<iso646.h> NA1 Defines several macros that implement alternative ways to express several standard tokens. For programming in ISO 646 variant character sets.
<limits.h> Defines macro constants specifying the implementation-specific properties of the integer types.
<locale.h> Defines localization functions.
<math.h> Defines common mathematical functions.
<setjmp.h> Declares the macros setjmp and longjmp, which are used for non-local exits.
<signal.h> Defines signal handling functions.
<stdalign.h> C11 For querying and specifying the alignment of objects.
<stdarg.h> For accessing a varying number of arguments passed to functions.
<stdatomic.h> C11 For atomic operations on data shared between threads.
<stdbool.h> C99 Defines a boolean data type.
<stddef.h> Defines several useful types and macros.
<stdint.h> C99 Defines exact width integer types.
<stdio.h> Defines core input and output functions
<stdlib.h> Defines numeric conversion functions, pseudo-random numbers generation functions, memory allocation, process control functions.
<stdnoreturn.h> C11 For specifying non-returning functions.
<string.h> Defines string handling functions.
<tgmath.h> C99 Defines type-generic mathematical functions.
<threads.h> C11 Defines functions for managing multiple Threads as well as mutexes and condition variables.
<time.h> Defines date and time handling functions
<uchar.h> C11 Types and functions for manipulating Unicode characters.
<wchar.h> NA1 Defines wide string handling functions.
<wctype.h> NA1 Defines set of functions used to classify wide characters by their types or to convert between upper and lower case

Source: wikipedia

 

Another Great source with examples: tutorialspoint

Void

The Void data type indicates absence of any data type.

 

3 primary uses:

  • Void function returns
  • Void function arguments
  • Void pointers

 

Void function returns

Used when no return value is required or expected:

As can be seen in this very simple function, which simply prints a line of text, no return value is given or expected.

 

 

Void function arguments

Used where a function does not expect or accept an argument/parameter:

 

 

Void pointers

Can be used to point to ANY data type of object.

 

Represents the address of an object but not its data type.

 

Can be assigned to ANY data type of object.

 

CANNOT be dereferenced directly - HAS to be typecast to the data type of the object being pointed to.

 

Compile & Run:

value: 12
value: 14.634572

 

 

source

Lvalues & Rvalues

Lvalues are values that have a name and therefore an address, and persist beyond a single expression

 

Rvalues are temporary values that do not persist beyond the expression that uses them

 

int x ;

x = (2121 + 356) / (9 - 7) ;

 

The variable on the left hand side of the expression has been declared as an integer with a name of x, which has an associated memory address (that is accessible by &x ).

 

The values on the right hand side of the expression are literals and do not persist within memory when this expression has been completed.

 

If the values on the right hand side were presented as other variables, they would first be converted to Rvalues to be acted upon within that expression, e.g.:

 

int a = 2, b = 3, x ;

x = a + b ;

 

Although the a and b variables are Lvalues, they are implicitly converted from Lvalue to Rvalue within the expression.

 

  • lvalue equates to an address
  • rvalue equates to a value

Memory Layout

The Byte is the standard unit of storage, made up of 8 bits (binary digits) whose state can be 1 or 0.

 

Bytes have been shown on the LHS of the diagram below to indicate that the memory area is made up of a huge number of these (many millions/billions, etc, according to the amount of storage available). The BIOS and the OS, just as any other program, require storage and have been shown here for completeness.

 

The main areas we are interested in are the data segment (collectively consisting of the code, data and bss), the stack and the heap.

 

*note: bss is used by many compilers and linkers for the portion of an object file or executable containing statically-allocated variables that are not explicitly initialized to any value. It is often referred to as the "bss section" or "bss segment".

 

The following diagram provides a simplified view of how the memory is laid out:

 

 

 

 

  • OS, environment variables, command line arguments
  • Stack
    • Works on a LIFO basis
    • Used for local variables and passing arguments to functions, along with the return address of the next instruction to be executed when the function call is over
    • When a new stack frame needs to be added (as a result of a new function), the stack grows downward
  • Unallocated. Free area, available to be utilised for growth by heap or stack
  • Heap
    • Used for Dynamic Memory allocation
    • C: managed by malloc(), realloc(), free()
    • C++: managed by new, delete
  • BSS. Uninitialised data
  • Data segment. Initialised data. Global and static variables
  • Code segment, aka Text segment
    • Contains the compiled Machine code (program) instructions
    • Often read-only (r/o) to stop it being overwritten
  • Operating System
  • BIOS

 

 

Note: this is greatly over simplified and shows just one program loaded in memory

Program Memory

When a program is compiled into an executable file, the compiler converts the program's executable statements (e.g. printf("Hello, World!"); ) that are of a specific size into machine code, and also knows how much room to allocate for the global and static variables (from their data types), thus resulting in a fixed amount of storage for the compiled program.

 

When a program is executed (aka running) it is often referred to as a process (e.g. on a Linux system), and is allocated four areas of memory:

  • Stack
    • Return address
    • Function parameters (passed in arguments)
    • Local variables
  • Heap
    • Dynamic memory allocation
  • Data
    • Initialised global and static variables
    • Uninitialised variables
  • Code
    • Executable statements in machine code

 

Upon program execution the OS provides a new VAS (Virtual Address Space) for each process, whereby each virtual address is mapped to physical memory by use of a structure known as a page table. This is an abstraction of physical memory: the process effectively sees the whole address range as available to it (as logical addresses). On a 32-bit OS, the process has a 4GB VAS (2^32 Bytes).

 

In reality, many programs are running at the same time with the underlying OS taking care of memory management.

 

The following represents two processes' virtual address spaces being mapped to physical memory:

 

 

This representation shows a single program being run as two processes (e.g. two instances of a text editor), sharing the same code segment in physical memory but with their own data and stack segments:

Stack

The stack is comprised of a number of Stack Frames, with each frame representing a function call.

 

The size of the stack increases in proportion to the number of functions called, and then shrinks upon return of a completed function.

 

Works on a LIFO basis.

 

Each stack frame contains:

  1. The return address (the point in the calling function at which execution resumes)
  2. Any arguments passed to the called function
  3. Storage space for all of the function's (automatic) variables
  4. (various bookkeeping data)

Consider the following program, containing two functions:

 

  1. Upon program start an initial stack frame is created for main()
  2. firstFunc() is called and a new stack frame is created from unused stack memory, containing:
    1. Line to return to = the line after where it was called from in main() = Line 9
    2. Storage space for an int
  3. secondFunc() is called and a new stack frame is created from unused stack memory, containing:
    1. Line to return to = the line after where it was called from in firstFunc() = Line 15
    2. Storage space for an int
    3. Storage space for a char
  4. When secondFunc() returns, its frame is used to determine where to return to (line 15 of firstFunc()), then deallocated and the space returned to the stack
  5. When firstFunc() returns, its frame is used to determine where to return to (line 9 of main()), then deallocated and the space returned to the stack

When main() returns, the program ends.

 
 

Processes

The terms program and process are often used interchangeably. However, there are subtle differences:

 

  • Program = source code and/or compiled machine code. The instructions that tell the computer what to do.
    • Fixed storage size

 

  • Process = instance of a running program. The code loaded into memory, operating and working on data
    • Variable storage size (in working memory)
    • Includes the storage requirements of the loaded program plus its data

 

A unique pid (process ID) is assigned by the OS kernel to each process as an index for various data relating to the process

Storage Class & Scope

The storage class of a variable refers to its lifetime during program execution.

 

The scope of a variable refers to its visibility.

 

The location at which a variable is declared determines where it will be placed within memory.

 

Variables declared outside of any function have global scope and static (permanent) duration.

 

Variables declared within a function have local scope and automatic (temporary) duration.

 

Programming

The art of writing language in a form that can be understood by a computer.

 

Computers work on a binary level, consisting of just two states; one and zero. Instructions are given to a computer in the form of machine code consisting of these binary digits.

 

The CPU is at the heart of a computer and utilises an ISA (Instruction Set Architecture) to provide programmers with a slightly easier set of instructions to allow the computer to be told what to do. These instructions are in the form of Assembly Code, consisting of abbreviations that try to relate to human understandable words. e.g. mov eax, 123  means move the immediate value 123 into the eax register.

 

However, Assembly Code is (almost) just as difficult for humans to write first hand as machine code and therefore higher level languages are utilised to make it easier for us to write programs. The languages at this level are known as compiled languages, since they require the use of a compiler program to convert the human developed code (e.g. written in C++) into assembly, then machine code that can be understood by the CPU.

 

The compiler takes the source code written by the programmer in the high level language and converts it into an executable file, which we then refer to as the program.

Useful links

Here's some good sites I referred to whilst learning:

 

cprogramming.com
C programming.com - Learn C and C++ Programming - Cprogramming.com

 

cplusplus.com
fgets - C++ Reference

 

wikibooks.org/wiki/A_Little_C_Primer
A Little C Primer - Wikibooks, open books for an open world

 

eskimo.com
C Programming Notes

 

codingunit.com
CodingUnit Programming Tutorials

 

tutorialspoint.com/ansi_c
C Programming Tutorial - ANSI ISO GNU K and R C99 C69

 

programiz.com
C Programming Language - Tutorial to Program In C

 

cs.cf.ac.uk/Dave/C
Programming in C

 

wikibooks.org/wiki/C_Programming/File_IO
C Programming/File IO - Wikibooks, open books for an open world

 

tutorialspoint.com/cprogramming
Function call by reference in C

 

learncpp

https://www.learncpp.com/

return

Terminates execution of a function and returns control to the point immediately following the function call, whereupon execution continues.

 

In the case of the main function, control is transferred back to the operating system.

 

Syntax:

return [expression] ;

 

The value of the expression is returned to the calling function.

 

If there is no expression the return value is said to be undefined.

 

Void functions do not and cannot have a return expression. You may also sometimes see a void function with no return statement at all; it simply returns to its caller when the end of the function body is reached.

 

All other types of function must specify a return expression.

Compile & Run:

42 is larger than 17

 

 

You will often see return 0 ;  just before the end of main(). This is known as the application exit status, which is returning a value of 0 to the operating system. Standard convention has main() being declared with int preceding it, hence the ability to return a number to the OS, which can potentially be used by other programs or the OS as necessary. For instance, if an application exited with a return value of 1, this might alert the OS or other application that something went wrong, and can be acted upon accordingly.