Large software projects implemented in classical procedural programming languages usually end up with lots of code taking care of resource allocation and deallocation. Bugs in such code are often very difficult to find, because they cause only `resource leakage', that is, holding on to memory and other resources that nothing references any more.
We've tried to solve this problem by employing a resource tracking system which keeps track of all the resources allocated by all the modules of BIRD, deallocates everything automatically when a module shuts down, and is able to print out the list of resources together with the modules that allocated them.
Each allocated resource (from now on we'll speak about allocated resources only) is represented by a structure starting with a standard header (struct resource) consisting of a list node (resources are often linked to various lists) and a pointer to resclass -- a resource class structure pointing to functions implementing generic resource operations (such as freeing of the resource) for the particular resource type.
There exist the following types of resources:
Resource pools (pool) are just containers holding a list of other resources. Freeing a pool causes all the listed resources to be freed as well. Each existing resource is linked to some pool except for a root pool which isn't linked anywhere, so all the resources form a tree structure with internal nodes corresponding to pools and leaves being the other resources.
Example: Almost all modules of BIRD have their private pool which is freed upon shutdown of the module.
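The header-plus-class layout and the pool tree described above can be illustrated with a minimal, self-contained sketch. This is not BIRD's code: toy_resource, toy_class, toy_pool and all helper functions are hypothetical names chosen for illustration, and a singly linked list stands in for BIRD's list node.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the resource header and class; not BIRD's definitions. */
struct toy_resource;

struct toy_class {
  const char *name;
  size_t size;                            /* size of the whole resource structure */
  void (*free)(struct toy_resource *r);   /* class-specific cleanup, may be NULL */
};

struct toy_resource {
  struct toy_resource *next;              /* link in the owning pool's list */
  const struct toy_class *cls;
};

struct toy_pool {
  struct toy_resource r;                  /* a pool is itself a resource */
  struct toy_resource *inside;            /* resources living in this pool */
};

static int toy_freed;                     /* counts freed resources, for the demo only */

/* Generic free: dispatch to the class, then release the structure. */
static void toy_free(struct toy_resource *r)
{
  if (r->cls->free)
    r->cls->free(r);
  free(r);
  toy_freed++;
}

/* Freeing a pool frees everything listed in it, recursively for nested pools. */
static void toy_pool_free(struct toy_resource *r)
{
  struct toy_pool *p = (struct toy_pool *) r;
  while (p->inside) {
    struct toy_resource *c = p->inside;
    p->inside = c->next;
    toy_free(c);
  }
}

static const struct toy_class toy_pool_class = { "pool", sizeof(struct toy_pool), toy_pool_free };
static const struct toy_class toy_leaf_class = { "leaf", sizeof(struct toy_resource), NULL };

/* Allocate a zeroed resource of class c and link it to pool p (NULL for a root). */
static struct toy_resource *toy_alloc(struct toy_pool *p, const struct toy_class *c)
{
  struct toy_resource *r = calloc(1, c->size);
  if (!r)
    abort();
  r->cls = c;
  if (p) {
    r->next = p->inside;
    p->inside = r;
  }
  return r;
}

/* Build root -> { leaf, child pool -> { leaf, leaf } }, then free the root. */
static int toy_demo(void)
{
  struct toy_pool *root = (struct toy_pool *) toy_alloc(NULL, &toy_pool_class);
  struct toy_pool *child = (struct toy_pool *) toy_alloc(root, &toy_pool_class);
  toy_alloc(root, &toy_leaf_class);
  toy_alloc(child, &toy_leaf_class);
  toy_alloc(child, &toy_leaf_class);
  toy_freed = 0;
  toy_free(&root->r);
  return toy_freed;                       /* root, child and three leaves */
}
```

Freeing the root in toy_demo() releases all five resources with a single call, which is the property that lets a module drop everything it owns by freeing its private pool.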
pool * rp_new (pool * p, char * name) -- create a resource pool
parent pool
pool name (to be included in debugging dumps)
rp_new() creates a new resource pool inside the specified parent pool.
void rmove (void * res, pool * p) -- move a resource
resource
pool to move the resource to
rmove() moves a resource from one pool to another.
void rfree (void * res) -- free a resource
resource
rfree() frees the given resource and all information associated with it. In case it's a resource pool, it also frees all the objects living inside the pool.
It works by calling a class-specific freeing function.
void rdump (void * res) -- dump a resource
resource
This function prints out all available information about the given resource to the debugging output.
It works by calling a class-specific dump function.
void * ralloc (pool * p, struct resclass * c) -- create a resource
pool to create the resource in
class of the new resource
This function is called by the resource classes to create a new resource of the specified class and link it to the given pool. Allocated memory is zeroed. Size of the resource structure is taken from the size field of the resclass.
void rlookup (unsigned long a) -- look up a memory location
memory address
This function examines all existing resources to see whether the address a is inside any resource. It's used for debugging purposes only.
It works by calling a class-specific lookup function for each resource.
void resource_init (void) -- initialize the resource manager
This function is called during BIRD startup. It initializes all data structures of the resource manager and creates the root pool.
Memory blocks are pieces of contiguous allocated memory. They are a bit non-standard since they are represented not by a pointer to resource, but by a void pointer to the start of data of the memory block. All memory block functions know how to locate the header given the data pointer.
Example: All "unique" data structures such as hash tables are allocated as memory blocks.
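One common way to let a function find a header from a bare data pointer is to place the header immediately before the payload. The sketch below shows that technique under stated assumptions: toy_mblock and the toy_mb_* helpers are hypothetical, and BIRD's actual block layout may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical block header; the payload follows it directly. */
struct toy_mblock {
  size_t size;              /* payload size in bytes */
  max_align_t data[];       /* payload starts here, suitably aligned */
};

/* Allocate header and payload together, hand out only the payload pointer. */
static void *toy_mb_alloc(size_t size)
{
  struct toy_mblock *b = malloc(offsetof(struct toy_mblock, data) + size);
  if (!b)
    abort();
  b->size = size;
  return b->data;
}

/* Step back from the payload pointer to the header in front of it. */
static struct toy_mblock *toy_mb_header(void *m)
{
  return (struct toy_mblock *) ((char *) m - offsetof(struct toy_mblock, data));
}

static size_t toy_mb_size(void *m)
{
  return toy_mb_header(m)->size;
}

static void toy_mb_free(void *m)
{
  free(toy_mb_header(m));   /* free the whole allocation, not the payload pointer */
}
```

Because the caller only ever sees the payload pointer, passing it to a generic function that expects a pointer to the header would be wrong, which is why such blocks need their own free function.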
void * mb_alloc (pool * p, unsigned size) -- allocate a memory block
pool
size of the block
mb_alloc() allocates memory of a given size and creates a memory block resource representing this memory chunk in the pool p.
Please note that mb_alloc() returns a pointer to the memory chunk, not to the resource, hence you have to free it using mb_free(), not rfree().
void * mb_allocz (pool * p, unsigned size) -- allocate and clear a memory block
pool
size of the block
mb_allocz() allocates memory of a given size, initializes it to zeroes and creates a memory block resource representing this memory chunk in the pool p.
Please note that mb_allocz() returns a pointer to the memory chunk, not to the resource, hence you have to free it using mb_free(), not rfree().
void * mb_realloc (void * m, unsigned size) -- reallocate a memory block
memory block
new size of the block
mb_realloc() changes the size of the memory block m to a given size. The contents will be unchanged up to the minimum of the old and new sizes; newly allocated memory will be uninitialized. Contrary to realloc() behavior, m must be non-NULL, because the resource pool is inherited from it.
Like mb_alloc(), mb_realloc() also returns a pointer to the memory chunk, not to the resource, hence you have to free it using mb_free(), not rfree().
void mb_free (void * m) -- free a memory block
memory block
mb_free() frees all memory associated with the block m.
Linear memory pools are collections of memory blocks which support very fast allocation of new blocks, but are able to free only the whole collection at once.
Example: Each configuration is described by a complex system of structures, linked lists and function trees which are all allocated from a single linear pool, thus they can be freed at once when the configuration is no longer used.
linpool * lp_new (pool * p, uint blk) -- create a new linear memory pool
pool
block size
lp_new() creates a new linear memory pool resource inside the pool p. The linear pool consists of a list of memory chunks of size at least blk.
void * lp_alloc (linpool * m, uint size) -- allocate memory from a linpool
linear memory pool
amount of memory
lp_alloc() allocates size bytes of memory from a linpool m and it returns a pointer to the allocated memory.
It works by trying to find free space in the last memory chunk associated with the linpool and creating a new chunk of the standard size (as specified during lp_new()) if the free space is too small to satisfy the allocation. If size is too large to fit in a standard size chunk, an "overflow" chunk is created for it instead.
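The allocation strategy just described can be sketched as a small chunked bump allocator. Everything here is an illustrative assumption (the toy_linpool and toy_chunk types, the rounding to sizeof(max_align_t), the chunk counter); it is not BIRD's linpool implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical chunked bump allocator. */
struct toy_chunk {
  struct toy_chunk *next;
  size_t size, used;
  max_align_t data[];
};

struct toy_linpool {
  struct toy_chunk *last;   /* standard chunk currently being filled */
  struct toy_chunk *all;    /* every chunk, so that a flush can free them */
  size_t blk;               /* standard chunk size */
  int chunks;               /* number of chunks, for the demo only */
};

static struct toy_chunk *toy_chunk_new(struct toy_linpool *m, size_t size)
{
  struct toy_chunk *c = malloc(offsetof(struct toy_chunk, data) + size);
  if (!c)
    abort();
  c->size = size;
  c->used = 0;
  c->next = m->all;
  m->all = c;
  m->chunks++;
  return c;
}

static void *toy_lp_alloc(struct toy_linpool *m, size_t size)
{
  const size_t a = sizeof(max_align_t);
  size = (size + a - 1) / a * a;            /* keep later allocations aligned */

  if (size > m->blk) {                      /* too big: dedicated overflow chunk */
    struct toy_chunk *c = toy_chunk_new(m, size);
    c->used = size;
    return c->data;
  }

  if (!m->last || m->last->size - m->last->used < size)
    m->last = toy_chunk_new(m, m->blk);     /* not enough room, start a new chunk */

  void *p = (char *) m->last->data + m->last->used;
  m->last->used += size;                    /* allocation is just a pointer bump */
  return p;
}

/* Individual blocks cannot be freed; only the whole collection can. */
static void toy_lp_flush(struct toy_linpool *m)
{
  while (m->all) {
    struct toy_chunk *c = m->all;
    m->all = c->next;
    free(c);
  }
  m->last = NULL;
  m->chunks = 0;
}
```

The fast path is a size check and an addition, which is where the speed of this kind of pool comes from; the price is that memory is reclaimed only by flushing everything.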
void * lp_allocu (linpool * m, uint size) -- allocate unaligned memory from a linpool
linear memory pool
amount of memory
lp_allocu() allocates size bytes of memory from a linpool m and returns a pointer to the allocated memory. It doesn't attempt to align the memory block, which gives a very efficient way to allocate strings without any space overhead.
void * lp_allocz (linpool * m, uint size) -- allocate cleared memory from a linpool
linear memory pool
amount of memory
This function is identical to lp_alloc() except that it clears the allocated memory block.
void lp_flush (linpool * m) -- flush a linear memory pool
linear memory pool
This function frees the whole contents of the given linpool m, but leaves the pool itself.
Slabs are collections of memory blocks of a fixed size. They support very fast allocation and freeing of such blocks, prevent memory fragmentation and optimize L2 cache usage. Slabs were invented by Jeff Bonwick and published in the USENIX proceedings as `The Slab Allocator: An Object-Caching Kernel Memory Allocator'. Our implementation follows this article except that we don't use constructors and destructors.
When the DEBUGGING switch is turned on, we automatically fill all newly allocated and freed blocks with a special pattern to make detection of use of uninitialized or already freed memory easier.
Example: Nodes of a FIB are allocated from a per-FIB Slab.
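The core idea behind fast fixed-size allocation is a free list threaded through the unused objects themselves. The sketch below shows only that idea on a single preallocated page; toy_slab and its helpers are hypothetical, and a real slab allocator (including the one described here) manages many pages and their bookkeeping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical single-page fixed-size allocator. */
struct toy_slab {
  size_t obj_size;       /* rounded up so that a free object can hold a pointer */
  void *free_list;       /* singly linked list stored inside the free objects */
  char *page;            /* backing memory for all objects */
};

static int toy_sl_init(struct toy_slab *s, size_t obj_size, size_t count)
{
  const size_t a = sizeof(void *);
  if (obj_size < a)
    obj_size = a;
  obj_size = (obj_size + a - 1) / a * a;
  s->obj_size = obj_size;
  s->page = malloc(obj_size * count);
  if (!s->page)
    return -1;
  s->free_list = NULL;
  for (size_t i = count; i-- > 0; ) {      /* push in reverse, object 0 comes out first */
    void *o = s->page + i * obj_size;
    *(void **) o = s->free_list;
    s->free_list = o;
  }
  return 0;
}

/* Pop the head of the free list; NULL when the page is exhausted. */
static void *toy_sl_alloc(struct toy_slab *s)
{
  void *o = s->free_list;
  if (o)
    s->free_list = *(void **) o;
  return o;
}

/* Push the object back; it becomes the next one to be handed out. */
static void toy_sl_free(struct toy_slab *s, void *o)
{
  *(void **) o = s->free_list;
  s->free_list = o;
}
```

Both operations are constant time and never split or merge blocks, so the page cannot fragment. The most recently freed object is reused first, which tends to keep hot memory in cache.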
slab * sl_new (pool * p, uint size) -- create a new Slab
resource pool
block size
This function creates a new Slab resource from which objects of size size can be allocated.
void * sl_alloc (slab * s) -- allocate an object from Slab
slab
sl_alloc() allocates space for a single object from the Slab and returns a pointer to the object.
void sl_free (slab * s, void * oo) -- return a free object back to a Slab
slab
object returned by sl_alloc()
This function frees memory associated with the object oo and returns it back to the Slab s.
Events are there to keep track of deferred execution. Since BIRD is single-threaded, it requires long-lasting tasks to be split into smaller parts, so that no module can monopolize the CPU. To split such a task, just create an event resource, point it to the function you want to have called and call ev_schedule() to ask the core to run the event when nothing more important requires attention.
You can also define your own event lists (the event_list structure), enqueue your events in them and explicitly ask to run them.
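To make the idea of splitting a long task concrete, here is a self-contained sketch of an event list whose hooks may re-enqueue themselves. The types toy_event, toy_event_list and toy_job are hypothetical, and the return value of toy_ev_run_list() (non-zero while events remain) is a choice made for this sketch, not a statement about ev_run_list().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event and event list. */
struct toy_event {
  void (*hook)(void *data);
  void *data;
  struct toy_event *next;
  int queued;
};

struct toy_event_list {
  struct toy_event *head, **tail;
};

static void toy_ev_init_list(struct toy_event_list *l)
{
  l->head = NULL;
  l->tail = &l->head;
}

static void toy_ev_enqueue(struct toy_event_list *l, struct toy_event *e)
{
  if (e->queued)
    return;                       /* sketch: ignore a double enqueue */
  e->next = NULL;
  e->queued = 1;
  *l->tail = e;
  l->tail = &e->next;
}

/* Run the events queued at the time of the call. Events that hooks
 * re-enqueue wait for the next call, so other work can run in between.
 * Returns non-zero if events remain queued. */
static int toy_ev_run_list(struct toy_event_list *l)
{
  struct toy_event *batch = l->head;
  toy_ev_init_list(l);
  while (batch) {
    struct toy_event *e = batch;
    batch = e->next;
    e->queued = 0;                /* unlink first, so the hook may re-enqueue */
    e->hook(e->data);
  }
  return l->head != NULL;
}

/* A long task processed one slice per event run. */
struct toy_job {
  struct toy_event_list *list;
  struct toy_event ev;
  int remaining;                  /* slices still to do */
  int slices;                     /* slices done so far */
};

static void toy_job_step(void *data)
{
  struct toy_job *j = data;
  j->slices++;
  if (--j->remaining > 0)
    toy_ev_enqueue(j->list, &j->ev);
}
```

A caller would initialize a toy_job with the number of slices, enqueue its event once, and then call toy_ev_run_list() from its main loop until it returns zero; each call performs exactly one slice of the job.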
event * ev_new (pool * p) -- create a new event
resource pool
This function creates a new event resource. To use it, you need to fill the structure fields and call ev_schedule().
void ev_run (event * e) -- run an event
an event
This function explicitly runs the event e (calls its hook function) and removes it from an event list if it's linked to any.
From the hook function, you can call ev_enqueue() or ev_schedule() to re-add the event.
void ev_enqueue (event_list * l, event * e) -- enqueue an event
an event list
an event
ev_enqueue() stores the event e to the specified event list l which can be run by calling ev_run_list().
void ev_schedule (event * e) -- schedule an event
an event
This function schedules an event by enqueueing it to a system-wide event list which is run by the platform dependent code whenever appropriate.
int ev_run_list (event_list * l) -- run an event list
an event list
This function calls ev_run() for all events enqueued in the list l.
Timers are resources which represent a wish of a module to call a function at the specified time. The platform dependent code doesn't guarantee exact timing, only that a timer function won't be called before the requested time.
In BIRD, time is represented by values of the bird_clock_t type, which are integral numbers interpreted as a relative number of seconds since some fixed time point in the past. The current time can be read from the variable now with reasonable accuracy and is monotonic. There is also a current 'absolute' time, as reported by the OS, in the variable now_real.
Each timer is described by a timer structure containing a pointer to the handler function (hook), data private to this function (data), time the function should be called at (expires, 0 for inactive timers), for the other fields see timer.h.
timer * tm_new (pool * p) -- create a timer
pool
This function creates a new timer resource and returns a pointer to it. To use the timer, you need to fill in the structure fields and call tm_start() to start timing.
void tm_start (timer * t, unsigned after) -- start a timer
timer
number of seconds the timer should be run after
This function schedules the hook function of the timer to be called after after seconds. If the timer has already been started, its expiry time is replaced by the new value.
If you have set the randomize field of t, the timeout will be increased by a random number of seconds chosen uniformly from range 0 .. randomize.
You can call tm_start() from the handler function of the timer to request another run of the timer. Also, you can set the recurrent field to have the timer re-added automatically with the same timeout.
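The interplay of expires, randomize and recurrent can be modelled in a few lines. This is a sketch under stated assumptions: toy_timer, toy_clock_t and the toy_tm_* functions are hypothetical, the current time is passed in explicitly instead of being read from a global, and rand() with a modulo is only roughly uniform.

```c
#include <assert.h>
#include <stdlib.h>

typedef long toy_clock_t;          /* hypothetical stand-in for a seconds counter */

struct toy_timer {
  void (*hook)(struct toy_timer *t);
  void *data;
  toy_clock_t expires;             /* 0 means inactive */
  unsigned randomize;              /* upper bound of the random extra delay */
  unsigned recurrent;              /* re-arm period in seconds, 0 for one-shot */
};

static void toy_tm_start(struct toy_timer *t, toy_clock_t now, unsigned after)
{
  if (t->randomize)
    after += (unsigned) rand() % (t->randomize + 1);   /* roughly uniform in 0..randomize */
  t->expires = now + (toy_clock_t) after;              /* restarting overwrites the old value */
}

static void toy_tm_stop(struct toy_timer *t)
{
  t->expires = 0;                  /* stopping a stopped timer changes nothing */
}

/* Called from a main loop. Fires the timer if it is due and returns 1,
 * otherwise returns 0. A timer may fire late but never early. */
static int toy_tm_poll(struct toy_timer *t, toy_clock_t now)
{
  if (!t->expires || now < t->expires)
    return 0;
  unsigned period = t->recurrent;
  t->expires = 0;
  if (period)
    toy_tm_start(t, now, period);  /* re-arm before calling the hook */
  t->hook(t);
  return 1;
}

/* Demo hook: count invocations through the data pointer. */
static void toy_count_hook(struct toy_timer *t)
{
  ++*(int *) t->data;
}
```

Re-arming a recurrent timer relative to the time it actually fired, as done here, lets lateness accumulate; re-arming relative to the previous expiry would keep the period stable instead. Which of the two a real implementation uses is not stated above.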
void tm_stop (timer * t) -- stop a timer
timer
This function stops a timer. If the timer is already stopped, nothing happens.
bird_clock_t tm_parse_datetime (char * x) -- parse a date and time
datetime string
tm_parse_datetime() takes a textual representation of a date and time (dd-mm-yyyy hh:mm:ss) and converts it to the corresponding value of type bird_clock_t.
bird_clock_t tm_parse_date (char * x) -- parse a date
date string
tm_parse_date() takes a textual representation of a date (dd-mm-yyyy) and converts it to the corresponding value of type bird_clock_t.
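Parsing the dd-mm-yyyy hh:mm:ss format can be done with the standard C library alone. The helper below is a hypothetical illustration, not the code of tm_parse_datetime(): it interprets the string as local time, returns a time_t rather than a bird_clock_t, and rejects input with trailing characters.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Parse "dd-mm-yyyy hh:mm:ss" as local time.
 * Returns (time_t) -1 on malformed or out-of-range input. */
static time_t toy_parse_datetime(const char *x)
{
  int d, mo, y, h, mi, s, n = 0;

  /* %n records how much input was consumed, so trailing junk is detected. */
  if (sscanf(x, "%d-%d-%d %d:%d:%d%n", &d, &mo, &y, &h, &mi, &s, &n) != 6 || x[n] != '\0')
    return (time_t) -1;

  if (d < 1 || d > 31 || mo < 1 || mo > 12 || y < 1970 ||
      h < 0 || h > 23 || mi < 0 || mi > 59 || s < 0 || s > 59)
    return (time_t) -1;

  struct tm tm = { 0 };
  tm.tm_mday = d;
  tm.tm_mon = mo - 1;              /* struct tm months are 0-based */
  tm.tm_year = y - 1900;           /* struct tm years count from 1900 */
  tm.tm_hour = h;
  tm.tm_min = mi;
  tm.tm_sec = s;
  tm.tm_isdst = -1;                /* let the library decide about DST */
  return mktime(&tm);
}
```

Note that the day-first field order differs from ISO 8601, so a string such as 2020-06-15 is rejected here rather than silently misread.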
void tm_format_datetime (char * x, struct timeformat * fmt_spec, bird_clock_t t) -- convert date and time to textual representation
destination buffer of size TM_DATETIME_BUFFER_SIZE
specification of resulting textual representation of the time
time
This function formats the given relative time value t to a textual date/time representation (dd-mm-yyyy hh:mm:ss) in real time.
Socket resources represent network connections. Their data structure (socket) contains a lot of fields defining the exact type of the socket, the local and remote addresses and ports, pointers to socket buffers and finally pointers to hook functions to be called when new data have arrived in the receive buffer (rx_hook), when the contents of the transmit buffer have been transmitted (tx_hook) and when an error or connection close occurs (err_hook).
Freeing of sockets from inside socket hooks is perfectly safe.
int sk_setup_multicast (sock * s) -- enable multicast for given socket
socket
Prepare transmission of multicast packets for given datagram socket. The socket must have defined iface.
0 for success, -1 for an error.
int sk_join_group (sock * s, ip_addr maddr) -- join multicast group for given socket
socket
multicast address
Join multicast group for given datagram socket and associated interface. The socket must have defined iface.
0 for success, -1 for an error.
int sk_leave_group (sock * s, ip_addr maddr) -- leave multicast group for given socket
socket
multicast address
Leave multicast group for given datagram socket and associated interface. The socket must have defined iface.
0 for success, -1 for an error.
int sk_setup_broadcast (sock * s) -- enable broadcast for given socket
socket
Allow reception and transmission of broadcast packets for given datagram socket. The socket must have defined iface. For transmission, packets should be sent to brd address of iface.
0 for success, -1 for an error.
int sk_set_ttl (sock * s, int ttl) -- set transmit TTL for given socket
socket
TTL value
Set TTL for already opened connections when TTL was not set before. Useful for accepted connections when different ones should have different TTL.
0 for success, -1 for an error.
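At the operating system level, changing the TTL of an already open IPv4 socket is typically done with setsockopt() and the IP_TTL option. The sketch below shows that POSIX call on a plain file descriptor; toy_set_ttl() and toy_get_ttl() are hypothetical helpers, and whether sk_set_ttl() is implemented this way is an assumption, not something stated above.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Set the unicast TTL of an IPv4 socket. Returns 0 for success, -1 for an error. */
static int toy_set_ttl(int fd, int ttl)
{
  return setsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, sizeof(ttl)) < 0 ? -1 : 0;
}

/* Read the TTL back. Returns the TTL, or -1 for an error. */
static int toy_get_ttl(int fd)
{
  int ttl = -1;
  socklen_t len = sizeof(ttl);
  if (getsockopt(fd, IPPROTO_IP, IP_TTL, &ttl, &len) < 0)
    return -1;
  return ttl;
}
```

Because the option is per socket, each accepted connection can be given its own TTL after accept() returns. IPv6 sockets use a different option (IPV6_UNICAST_HOPS at level IPPROTO_IPV6), which this sketch does not cover.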
int sk_set_min_ttl (sock * s, int ttl) -- set minimal accepted TTL for given socket
socket
TTL value
Set minimal accepted TTL for given socket. Can be used for TTL security implementations.
0 for success, -1 for an error.
int sk_set_md5_auth (sock * s, ip_addr local, ip_addr remote, struct iface * ifa, char * passwd, int setkey) -- add / remove MD5 security association for given socket
socket
IP address of local side
IP address of remote side
Interface for link-local IP address
Password used for MD5 authentication
Update also system SA/SP database
In the TCP MD5 handling code in the kernel, there is a set of security associations used for choosing the password and other authentication parameters according to the local and remote address. This function is useful for listening sockets; for active sockets it may be enough to set the s->password field.
When called with passwd != NULL, the new pair is added. When called with passwd == NULL, the existing pair is removed.
Note that while in Linux, the MD5 SAs are specific to socket, in BSD they are stored in global SA/SP database (but the behavior also must be enabled on per-socket basis). In case of multiple sockets to the same neighbor, the socket-specific state must be configured for each socket while global state just once per src-dst pair. The setkey argument controls whether the global state (SA/SP database) is also updated.
0 for success, -1 for an error.
int sk_set_ipv6_checksum (sock * s, int offset) -- specify IPv6 checksum offset for given socket
socket
offset
Specify IPv6 checksum field offset for given raw IPv6 socket. After that, the kernel will automatically fill it for outgoing packets and check it for incoming packets. Should not be used on ICMPv6 sockets, where the position is known to the kernel.
0 for success, -1 for an error.
sock * sock_new (pool * p) -- create a socket
pool
This function creates a new socket resource. If you want to use it, you need to fill in all the required fields of the structure and call sk_open() to do the actual opening of the socket.
The real function name is sock_new(); sk_new() is a macro wrapper that avoids a name collision with OpenSSL.
int sk_open (sock * s) -- open a socket
socket
This function takes a socket resource created by sk_new() and initialized by the user and binds a corresponding network connection to it.
0 for success, -1 for an error.
int sk_send (sock * s, unsigned len) -- send data to a socket
socket
number of bytes to send
This function sends len bytes of data prepared in the transmit buffer of the socket s to the network connection. If the packet can be sent immediately, it does so and returns 1, else it queues the packet for later processing, returns 0 and calls the tx_hook of the socket when the transmission takes place.
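The two outcomes of a send (done immediately, or queued with a later tx_hook call) can be modelled without any real network. In this sketch the kernel is simulated by a room counter; toy_sock and all toy_* functions are hypothetical and exist only to show the control flow a caller has to expect.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical socket with a simulated kernel buffer. */
struct toy_sock {
  void (*tx_hook)(struct toy_sock *s);
  size_t pending;     /* bytes of the transmit buffer not yet written */
  size_t room;        /* bytes the simulated kernel accepts right now */
  size_t wire;        /* total bytes written so far */
  int tx_done;        /* incremented by the demo hook */
};

/* Write as much of the pending data as the kernel has room for. */
static void toy_write(struct toy_sock *s)
{
  size_t n = s->pending < s->room ? s->pending : s->room;
  s->pending -= n;
  s->room -= n;
  s->wire += n;
}

/* Returns 1 if everything was sent immediately, 0 if the rest was queued. */
static int toy_sk_send(struct toy_sock *s, size_t len)
{
  s->pending = len;
  toy_write(s);
  return s->pending == 0;
}

/* Called by the I/O loop when the socket becomes writable again. */
static void toy_sk_writable(struct toy_sock *s, size_t room)
{
  s->room = room;
  if (!s->pending)
    return;
  toy_write(s);
  if (!s->pending && s->tx_hook)
    s->tx_hook(s);    /* the transmit buffer may be reused from here on */
}

/* Demo hook: count completed deferred transmissions. */
static void toy_tx_hook(struct toy_sock *s)
{
  s->tx_done++;
}
```

In this model the hook runs only for sends that were deferred. A caller that gets 1 back can refill the transmit buffer right away, while a caller that gets 0 has to leave the buffer alone until its tx_hook has been called.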
int sk_send_to (sock * s, unsigned len, ip_addr addr, unsigned port) -- send data to a specific destination
socket
number of bytes to send
IP address to send the packet to
port to send the packet to
This is a sk_send() replacement for connection-less packet sockets which allows destination of the packet to be chosen dynamically. Raw IP sockets should use 0 for port.
void io_log_event (void * hook, void * data) -- mark approaching event into event log
event hook address
event data address
Store info (hook, data, timestamp) about the following internal event into a circular event log (event_log). When latency tracking is enabled, the log entry is kept open (in event_open) so the duration can be filled later.
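A circular log with one open entry can be sketched as a fixed array plus a write position. The names, the capacity of four entries and the explicit now and track_latency parameters are assumptions made for this illustration; they do not describe the layout of event_log or event_open.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LOG_SIZE 4               /* hypothetical capacity */

struct toy_log_entry {
  void *hook, *data;
  long timestamp;
  long duration;                     /* filled in when the entry is closed */
};

struct toy_event_log {
  struct toy_log_entry entries[TOY_LOG_SIZE];
  unsigned pos;                      /* next slot to overwrite */
  struct toy_log_entry *open;        /* entry awaiting its duration, or NULL */
};

/* Close the open entry, if any, by computing how long the event took. */
static void toy_log_close(struct toy_event_log *l, long now)
{
  if (l->open) {
    l->open->duration = now - l->open->timestamp;
    l->open = NULL;
  }
}

/* Record an event that is about to run, overwriting the oldest entry. */
static void toy_log_event(struct toy_event_log *l, void *hook, void *data,
                          long now, int track_latency)
{
  toy_log_close(l, now);             /* the previous event ends where this one begins */

  struct toy_log_entry *e = &l->entries[l->pos];
  l->pos = (l->pos + 1) % TOY_LOG_SIZE;

  e->hook = hook;
  e->data = data;
  e->timestamp = now;
  e->duration = 0;
  if (track_latency)
    l->open = e;                     /* keep it open until the duration is known */
}
```

Logging the event before it runs means that, after a crash or a stall, the newest entry identifies the hook that was executing at the time.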