Sunday, June 21, 2015

How to return an array from a function

We want a function which dynamically creates an array of integers; no static declaration is possible because we do not know in advance how big the array needs to be.
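A minimal sketch of such a function (the name allocator is taken from the snippet below, everything else is reconstructed):

#include <stdio.h>
#include <stdlib.h>

int *allocator(int size)
{
    int *a = malloc(size * sizeof(int));
    printf("in allocator: %p\n", (void *)a);
    return a;
}

int main(void)
{
    int *i = allocator(2);
    printf("in main:      %p\n", (void *)i);   /* same address as above */
    i[0] = 1;                                  /* usable as an array */
    i[1] = 2;
    printf("%d, %d\n", i[0], i[1]);
    free(i);
    return 0;
}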


We print the address of the dynamically allocated array both in the function and in main, to be sure it points to the same place, and we use it to show it is usable. We must use a pointer to integer in main because the expression

int i[] = allocator(2);

is an illegal initializer. That doesn't prevent us from using the pointer to integer later as an array. Finally we free the allocated memory. What we did is called manual memory allocation, and such memory is placed on the heap. If we want to use pointer arithmetic we will do something like this:

*i = 1;
*(i + 1) = 2;
printf ("%d, %d\n", *i, *(i+1));

Please note that *(i + 1) is not the same as *i + 1, even though both produce the same output in our example. Dereferencing a pointer has higher precedence than addition, which is why we need the parentheses. What if we need a two-dimensional array? A two-dimensional array is an array of arrays, so we will need a pointer to a pointer.
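A sketch of a two-dimensional allocator (names are mine):

#include <stdio.h>
#include <stdlib.h>

int **allocator2d(int rows, int cols)
{
    int **m = malloc(rows * sizeof(int *));   /* first step: array of row pointers */
    for (int r = 0; r < rows; r++)
        m[r] = malloc(cols * sizeof(int));    /* second step: one array per row */
    return m;
}

int main(void)
{
    int **i = allocator2d(2, 2);
    i[0][1] = 42;                             /* array syntax works as usual */
    printf("%d\n", i[0][1]);
    for (int r = 0; r < 2; r++)               /* freeing mirrors allocation */
        free(i[r]);
    free(i);
    return 0;
}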


As we can see, allocating is more complicated, and so is freeing. It goes in two steps: first we allocate the array of row pointers and after that we allocate the column arrays. As in the previous case we can use array syntax. We can also mix pointer arithmetic and array syntax, or go with pointer arithmetic completely, as below.
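For example, assuming the 2x2 matrix i from the sketch above:

i[0][0] = 1;          /* plain array syntax */
*(i[0] + 1) = 2;      /* mixed: array syntax for the row, arithmetic for the column */
*(*i + 1) = 2;        /* pure pointer arithmetic, same element */
printf ("%d, %d\n", **i, *(*i + 1));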


Note that *(*i + 1) corresponds to i[0][1]. That was how to do it; now, how not to do it: going heroic and forcing a pointer to an array through as the return type.
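A sketch of the "heroic" version, a function whose return type is a pointer to an array:

#include <stdio.h>

int (*fun(void))[]
{
    static int a[2] = {1, 2};
    return &a;
}

int main(void)
{
    int (*p)[2] = fun();            /* pointer to array of two ints */
    printf("%d, %d\n", (*p)[0], (*p)[1]);
    return 0;
}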


This compiles and executes with a few warnings; if we change the return type of fun by adding another pointer, int* (*fun())[], that also works. If something looks this difficult, like returning the address of an array, then the design needs additional work. For example,
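a sketch of the sane alternative, returning the array as a plain pointer to its first element:

#include <stdio.h>

int *fun(void)
{
    static int a[2] = {1, 2};   /* static: survives after fun returns */
    return a;
}

int main(void)
{
    int *p = fun();
    printf("%d, %d\n", p[0], p[1]);
    return 0;
}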


this is much nicer and easier to understand. Please note that returning a static array is perfectly fine and that we do not call free on it. Statically declared variables are allocated automatically and stay available throughout the execution of the program. If we forget to declare the array as static then we will return a dangling pointer: the local variable goes out of scope and the returned pointer points to who knows what.

Tuesday, June 16, 2015

bgfx

Not sure into which category this blog entry falls, maybe review? I never did any 3D programming, so a tutorial is out of the question. Anyway, I was fiddling with the bgfx library and here are my experiences.
According to its author, Branimir Karadzic, bgfx is a cross-platform, graphics API agnostic, "Bring Your Own Engine/Framework" style rendering library. Since it works on Linux I was interested in taking a look at it. The best way to get info on the project is to visit https://github.com/bkaradzic/bgfx and read README.md. If you are curious to try it, you will also need https://github.com/bkaradzic/bx to build it. The library uses the custom build tool GENie, which is why bx is needed; we will need it for our own projects as well. I built it and attempted running the examples. Some were working, some not. I was attempting to run them from the directory where the compiled binaries are, but they need the runtime. So to run the examples, go to /examples/runtime and execute ../../.build/linux64_gcc/bin/example-00-helloworldDebug.
How do we generate make files, or Code::Blocks or CodeLite projects? Since this is a cross-platform tool, Windows and Mac OS X targets are supported too, but I will focus on Linux only. Go to the bgfx project root and run

../bx/tools/bin/linux/genie --help

To generate make files, do

../bx/tools/bin/linux/genie --gcc=linux-gcc --with-tools gmake

That will produce the examples as targets, plus tools like shaderc. Now cd to .build/projects/gmake-linux and execute make help to see the options. For 64-bit Linux we may build using this command:

make config=debug64 example-XY-name

Leave out the example name to build everything.

If we are using an IDE, we go to .build/projects/codeblocks or codelite, open the generated project, set the target and build it. Now we are ready to start fiddling with the provided examples.
For somebody who never did any 3D coding, like me, the examples provided with bgfx are too complicated, so I dumbed them down.



This is example-00-helloworld reduced to creating and showing a blue window. There is a huge amount of plumbing code, for example entry::processEvents, which makes those examples work in very few lines of code. In its original shape the example prints some text and places an image on the screen.
Next to be slaughtered is the cubes example. It now draws a single triangle:


Naturally I didn't touch the shaders. If we completely remove the shaders from the code, the program instances, the triangle becomes black; new achievements in hacking. I also did a square, here is the relevant code:


Everything else is the same as for the triangle. I hope my experience helps somebody get started. Even after spending an eternity in programming, there are still areas completely unknown to me. If I hadn't gotten the friendly suggestion to use the identity matrix as view and an orthographic matrix as projection, I would have stayed on Hello World/blue window.

Saturday, June 13, 2015

Thread-Specific Data API

Last time it was making a function thread safe, or reentrant, via thread local storage; now the more complicated thread-specific data. Everything happens on Linux but should be applicable to any UNIX since we are using POSIX.
This one is also called the thread-private data API. It creates per-thread and per-function storage. The process starts with key creation the first time any thread calls the function. In order to have the key created exactly once, pthread_once is used. Then, once we have the key, pthread_getspecific and pthread_setspecific are used to deal with the thread-specific data. Here is an example:
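A sketch matching the output below (function and variable names are reconstructed):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static pthread_key_t key;
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    pthread_key_create(&key, free);   /* destructor is ordinary free */
}

/* reentrant function: stores and returns a per-thread message */
static const char *message(const char *new_msg)
{
    pthread_once(&key_once, make_key);        /* create the key exactly once */
    if (new_msg != NULL) {
        free(pthread_getspecific(key));       /* drop the previous value, if any */
        pthread_setspecific(key, strdup(new_msg));
    }
    return pthread_getspecific(key);
}

static void *thread_fn(void *arg)
{
    printf("From thread: Message is '%s'\n", message("Hello I am lumberjack!"));
    printf("Again thread: Message is '%s'\n", message(NULL));
    return NULL;
}

int main(void)
{
    pthread_t t;
    printf("Main message: Message is '%s'\n", message("Hello from main!"));
    pthread_create(&t, NULL, thread_fn, NULL);
    pthread_join(t, NULL);
    printf("Main message: Message is '%s'\n", message("I am not lumberjack!"));
    return 0;
}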


To compile it the following is used:

gcc -o tsdexample tsdexample.c -pthread

Result of execution is:

Main message: Message is 'Hello from main!'
From thread: Message is 'Hello I am lumberjack!'
Again thread: Message is 'Hello I am lumberjack!'
Main message: Message is 'I am not lumberjack!'


In order to initialize the static control variable once, the macro PTHREAD_ONCE_INIT is used. During key creation a destructor, ordinary free, is passed as a parameter to pthread_key_create. The main thread and another thread then call the reentrant function, setting and retrieving messages which are thread specific, as expected. If we want to delete the key it looks like this:
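/* releases the key itself; per-thread values are not freed by this call */
pthread_key_delete(key);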


Though in the example program it is a static variable and it will go out of scope together with the program.

Thursday, June 11, 2015

Thread safety

Before I start with thread safety, a word on evolving design: if something can be improved, don't be lazy.
I remembered that GCC supports 128-bit integers and that Fibonacci numbers can then be calculated up to forty digits. There is a problem with printing those 128-bit integers, but somebody already offered a utility function to convert them to a string; that was Perkins. I only added the zero case to his solution.
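A sketch of such a conversion, repeatedly dividing by 10 and writing digits backwards (my reconstruction, including the zero case):

#include <stdio.h>
#include <string.h>

static char *u128_to_str(unsigned __int128 n, char *buf, size_t len)
{
    char *p = buf + len - 1;
    *p = '\0';
    if (n == 0)                        /* the zero case */
        *--p = '0';
    while (n != 0) {
        *--p = '0' + (char)(n % 10);   /* lowest digit first, written right to left */
        n /= 10;
    }
    return p;
}

int main(void)
{
    unsigned __int128 x = (unsigned __int128)10000000000000000000ULL * 10;  /* 10^20 */
    char buf[64];
    printf("%s\n", u128_to_str(x, buf, sizeof buf));
    return 0;
}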


Now we can go up to the 186th Fibonacci number without overflow.

Threads

There are a few buzzwords related to threads which people use in job ads, but usually they fail to define them. For example multithreading, concurrency, thread safety, reentrant function and similar.
What is a reentrant or thread safe function? That would be a function that does not use statically allocated or global variables and whose result depends only on its input arguments. For example:
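A sketch of such an offender (reconstructed; the post later calls it increase):

int increase(void)
{
    static int counter = 0;   /* shared state that survives between calls */
    counter += 1;
    return counter;
}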


is not reentrant or thread safe. If more than one thread calls that function at the same time, that is concurrency, it will produce an unpredictable result. Even if it is called sequentially, the result will be influenced by previous calls, possibly from a different thread. We are on Linux and we are using POSIX threads. So, here is an example where two threads are calling the nonreentrant function:
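A sketch that reproduces output of the shape below; the structure is reconstructed so that thread 1 bumps the counter twice per call and thread 2 three times, making the "expecting" values what each thread would see if it ran alone:

#include <pthread.h>
#include <stdio.h>

int increase(void)
{
    static int counter = 0;
    counter += 1;
    return counter;
}

static void *worker(void *arg)
{
    long id = (long)arg;
    for (long call = 1; call <= 4; call++) {
        int result = 0;
        for (long k = 0; k < id + 1; k++)    /* id+1 increments per call */
            result = increase();
        printf("thread %ld call %ld = %d, expecting %ld\n",
               id, call, result, (id + 1) * call);
    }
    return (void *)id;                       /* thread exit code */
}

int main(void)
{
    pthread_t t1, t2;
    void *rc;
    pthread_create(&t1, NULL, worker, (void *)1L);
    pthread_create(&t2, NULL, worker, (void *)2L);
    pthread_join(t1, &rc);
    printf("thread 1 exit code %ld\n", (long)rc);
    pthread_join(t2, &rc);
    printf("thread 2 exit code %ld\n", (long)rc);
    return 0;
}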


To compile it I used:

gcc -o notreentrant -g -gdwarf-2 notreentrant.c -pthread

That -gdwarf-2 story is a distraction: I am using gcc version 4.9.2 built from source on an old Mint based on Ubuntu 12.04, and the new gcc by default outputs debug info which the old gdb doesn't understand. If you just built a new gcc and gdb doesn't work as expected, that is the cure. The linker switch -pthread is required.
Example output from execution looks like this:

thread 1 call 1 = 2, expecting 2
thread 1 call 2 = 7, expecting 4
thread 1 call 3 = 9, expecting 6
thread 1 call 4 = 11, expecting 8
thread 2 call 1 = 5, expecting 3
thread 2 call 2 = 14, expecting 6
thread 2 call 3 = 17, expecting 9
thread 2 call 4 = 20, expecting 12
thread 1 exit code 1
thread 2 exit code 2


How do we remedy that? We make increase reentrant or thread safe:
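A sketch of the fix:

/* __thread (GCC extension): each thread gets its own copy of counter */
int increase(void)
{
    static __thread int counter = 0;
    counter += 1;
    return counter;
}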


We requested thread local storage, available since kernel 2.6 and supported by gcc 3.3 and higher. When the thread terminates and goes out of scope, its thread local storage does too. We could also make the function reentrant using thread-specific data, but that is much more code; maybe next time.

Saturday, June 6, 2015

Similarities and differences between Java and C

As almost everybody knows, Java was intended to be a sort of C without pointers, to be used in smart home appliances. And then Gosling and his co-workers drank too much coffee and it came out more like C++ without pointers, resource management and multiple inheritance. So, how different can coding the same task in C and Java really be? Practically we are addressing just the resource management part; we will not try to go OO in C. We implement the fast Fibonacci algorithm in Java.


It is really very fast, logarithmically fast. Now we try to do the same in C; we will skip BigInteger and use long. The first thing which won't work in C is the array allocation: declaring the arrays like that creates local variables. So we create a function to allocate our arrays in order to return them to the caller:
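A sketch of such an allocator (the name matrix is mine), handing out a 2x2 matrix flattened into four longs on the heap:

#include <stdlib.h>

long *matrix(long m00, long m01, long m10, long m11)
{
    long *m = malloc(4 * sizeof(long));
    m[0] = m00; m[1] = m01;
    m[2] = m10; m[3] = m11;
    return m;
}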


Those Q and I arrays can be global variables and we are ready to go, almost. We cannot declare t as an array since an array initializer can't be a function call. No big deal, we will use a pointer to long, but that introduces another problem: the pointer may point to the global variables Q and I, or to allocated variables. Again, we can check what it points to, and if it doesn't point to Q or I we can call free. Already we have a feeling that those arrays returned by the squaring function are somehow polymorphic, and in Java they were not like that. Allocating and freeing arrays on every step of the recursion, and checking whether we should free or not, doesn't look right. An attempt to literally translate the Java code may look something like this:


If the Q and I arrays are created using malloc, we can reassign into the initial arrays and free them at the end of the calculation. So no global Q and I; they are allocated on the heap and serve all the recursive calls.
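A compact sketch of that idea, reusing the matrix allocator from above; it squares iteratively where the post recursed, and the post's recursive version also allocated temporaries on every step, which is where the allocation count reported below comes from:

#include <stdio.h>
#include <stdlib.h>

/* multiply 2x2 matrices a*b into out; temporaries make in-place use safe */
static void mul(const long *a, const long *b, long *out)
{
    long r0 = a[0] * b[0] + a[1] * b[2];
    long r1 = a[0] * b[1] + a[1] * b[3];
    long r2 = a[2] * b[0] + a[3] * b[2];
    long r3 = a[2] * b[1] + a[3] * b[3];
    out[0] = r0; out[1] = r1; out[2] = r2; out[3] = r3;
}

int main(void)
{
    long *Q = matrix(1, 1, 1, 0);
    long *R = matrix(1, 0, 0, 1);   /* result accumulator, starts as identity I */
    int n = 90;                     /* F(90) and F(91) still fit in a 64-bit long */

    for (int e = n; e > 0; e >>= 1) {
        if (e & 1)
            mul(R, Q, R);           /* same buffers reused on every step */
        if (e > 1)
            mul(Q, Q, Q);           /* skip the last, unused squaring */
    }
    printf("F(%d) = %ld\n", n, R[1]);   /* Q^n[0][1] is F(n) */
    free(Q);
    free(R);
    return 0;
}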


Now if we run valgrind on our program it will report:

==18896== HEAP SUMMARY:
==18896==     in use at exit: 0 bytes in 0 blocks
==18896==   total heap usage: 93 allocs, 93 frees, 1,488 bytes allocated
==18896== 
==18896== All heap blocks were freed -- no leaks are possible


Looking now at the initial Java solution, it could also benefit from array reuse instead of allocating new arrays on every step. The garbage collector is a nice thing, but not thinking about resource management at all may result in lower quality code.

Friday, June 5, 2015

Functions, pointers, nested functions, closures

A while ago GCC introduced nested functions, which allow an easy solution of the funarg problem. With the funarg problem solved, C can accommodate closures and provide everything that various modern languages can, except the funny syntax. In order to demonstrate that we need function pointers, and they are not part of modern languages derived from C. So, we start with a function pointer refresher.

Function pointers

They are more cryptic than other pointers and they scare away younger generations of programmers. We will declare a function and a pointer to it:
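A sketch (the function add is mine):

#include <stdio.h>

int add(int a, int b) { return a + b; }

int main(void)
{
    int (*operation)(int, int) = add;   /* = &add also works */
    printf("%d\n", operation(2, 3));    /* prints 5 */
    return 0;
}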


We have the return type, the variable name in parentheses and the type list for the parameters. We could also use the address-of operator during assignment. How do we pass it as a parameter?
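A sketch, continuing with the same add:

#include <stdio.h>

int add(int a, int b) { return a + b; }

int apply(int (*operation)(int, int), int a, int b)
{
    return operation(a, b);    /* call through the pointer */
}

int main(void)
{
    printf("%d\n", apply(add, 2, 3));   /* 5 */
    return 0;
}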


How do we return it, i.e. write a function which returns a function pointer? There is a user friendly way: we declare a typedef for the function pointer:
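A sketch:

typedef int (*operation)(int, int);

int add(int a, int b) { return a + b; }

operation which(void)
{
    return add;    /* usage: operation op = which(); op(2, 3); */
}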


But there is also a user unfriendly way:
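/* the same function without the typedef: which is a function taking void and
   returning a pointer to a function taking (int, int) and returning int */
int (*which(void))(int, int)
{
    return add;
}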


The return type of the return type goes in front; inside the first pair of parentheses go the name and the function's own parameters, and inside the second pair the parameter type list of the return type. And just for those who value extreme programming above all, a pointer to a function which returns a function which returns a function:
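A sketch of such a declaration:

/* p: pointer to a function (taking void) returning a pointer to a function
   (taking void) returning a pointer to a function (taking void) returning int */
int (*(*(*p)(void))(void))(void);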


Nested Functions

Those are GCC specific; they are just ordinary functions, but defined inside another function. Standard C will not let you define a function within a function. To demonstrate, we can do trampolining of one notoriously recursive function.
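A sketch using Fibonacci; note that casting between function and object pointers is non-standard, but GCC on Linux accepts it:

#include <stdio.h>

long fibonacci(long n)
{
    long a = 0, b = 1;

    /* nested function: performs one step over the host's variables
       and returns itself, or NULL when done */
    void *step(void)
    {
        if (n == 0)
            return NULL;
        long t = a + b;
        a = b;
        b = t;
        n = n - 1;
        return (void *) step;
    }

    void *(*bounce)(void) = step;
    while (bounce != NULL)
        bounce = (void *(*)(void)) bounce();   /* keep bouncing until NULL */
    return a;
}

int main(void)
{
    printf("%ld\n", fibonacci(10));   /* 55 */
    return 0;
}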


Here the nested function accesses variables which are scoped to the host and returns itself or NULL. We get a whole function back, but we are not really using it, except to exit the while loop when it is NULL. To remedy that we will dispatch those nested functions out of the host function.
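A sketch of that shape; the nested functions touch only the global variable, which is why returning them out of the host works here:

#include <stdio.h>

/* global state; named closure although it is just shared state */
struct closure {
    int value;
} closure = {0};

typedef int (*step_fn)(void);

/* generator: returns one nested function for 0, the other for non-zero */
step_fn generator(int which)
{
    int up(void)   { return ++closure.value; }
    int down(void) { return --closure.value; }
    return which == 0 ? up : down;
}

int main(void)
{
    step_fn inc = generator(0);
    step_fn dec = generator(1);
    printf("%d\n", inc());   /* 1 */
    printf("%d\n", inc());   /* 2 */
    printf("%d\n", dec());   /* 1 */
    return 0;
}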


Here we have a struct which is a global variable, it is not really a closure but just named like that, and one function generator. If we pass zero to the generator we get one function back, and for non-zero the other one. Both nested functions now operate on the global variable. In the previous example we were working on variables belonging to the host function; in this case, if we try the same, we will have problems since the host's variables went out of scope.

Funarg problem

This one is related to functional languages and nested functions. If we pass a nested function as an argument to another function, that is called the downward funarg problem. Returning a nested function to the caller is the upward funarg problem. If any of the passed or returned functions references a variable from the scope of the declarer, that variable must be available to the nested function so it can be executed. Passing a nested function as an argument to another function is not a problem in C, since all local variables of the declarer are still alive:
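A sketch of the downward direction:

#include <stdio.h>

void apply(void (*f)(int), int x)
{
    f(x);
}

int main(void)
{
    int base = 100;
    void show(int x) { printf("%d\n", base + x); }   /* uses a host variable */

    apply(show, 42);    /* prints 142; main is still alive, so this is safe */
    return 0;
}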


Returning a nested function is more problematic because the declarer went out of scope. That could be addressed like in Java, where if an anonymous function uses some of the declarer's variables, those variables must be final. In C there are even more options: we can use static variables or allocate them on the heap.
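A sketch mirroring the post's claim:

#include <stdio.h>

typedef int (*counter_fn)(void);

counter_fn make_counter(void)
{
    /* static is the whole trick: remove it and the returned function
       would reference a dead stack frame and crash */
    static int counter = 0;

    int next(void) { return ++counter; }
    return next;
}

int main(void)
{
    counter_fn c = make_counter();
    printf("%d\n", c());   /* 1 */
    printf("%d\n", c());   /* 2 */
    return 0;
}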


It is easier to use static than malloc and free. Removing static from the counter declaration will result in a crash during execution of the program. So, the funarg problem is not a problem for GCC's C. This makes C almost a functional language; functional with a few extras.

Tuesday, May 26, 2015

Recursive Squaring

This one neatly reduces the calculation of the nth power of some number to log n multiplications. It goes in two stages: it calls itself recursively with half the power until it reaches power 0 or 1, and then squares the results on the way back. If the power was not even, a correction is required: an additional multiplication by the base.
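A sketch:

#include <stdio.h>

long power(long base, unsigned n)
{
    if (n == 0)
        return 1;
    long half = power(base, n / 2);   /* first stage: halve the power */
    long result = half * half;        /* second stage: square on the way back */
    if (n % 2)
        result *= base;               /* odd power: extra multiplication by base */
    return result;
}

int main(void)
{
    printf("%ld\n", power(2, 10));    /* 1024 */
    return 0;
}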


If we want to see how it works and how many times it calls itself, we can insert debug code:
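For example, a variant that counts and prints the calls (my reconstruction):

#include <stdio.h>

static int calls = 0;

long power(long base, unsigned n)
{
    calls++;
    printf("power(%ld, %u)\n", base, n);   /* watch n halve on each call */
    if (n == 0)
        return 1;
    long half = power(base, n / 2);
    long result = half * half;
    if (n % 2)
        result *= base;
    return result;
}

int main(void)
{
    long result = power(2, 10);
    printf("%ld in %d calls\n", result, calls);   /* 1024 in 5 calls */
    return 0;
}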


It should have been included in previous blog entries about recursion, but better late than never.

Friday, May 15, 2015

Recursion and optimizations

As seen in the last instalment, calculating Fibonacci numbers using the naïve implementation creates a binary tree of calls where the same subtrees are calculated over and over again, which has a huge impact on performance. One way to improve performance is caching previous results, and that is called memoization. In old Java we will do something like this:
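The post's listing was Java with a HashMap; here is a C sketch of the same idea, with a plain array cache standing in for the map:

#include <stdio.h>

#define MAX 93
static long cache[MAX];          /* zero means "not computed yet" */

long fibonacci(int n)
{
    if (n <= 2)
        return 1;                /* seeds F(1) = F(2) = 1 */
    if (cache[n] == 0)
        cache[n] = fibonacci(n - 1) + fibonacci(n - 2);
    return cache[n];
}

int main(void)
{
    printf("F92 = %ld\n", fibonacci(92));
    return 0;
}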


We initialize the map container with results for the first two cases, to avoid if statements; later, if there is a mapping for the input we return it, and if there is no mapping we calculate it. Without help from memoization the naïve implementation will choke and die on input 92. Using Java 8 it becomes even shorter; here is just the fibonacci method, everything else is the same:


The signature of computeIfAbsent is the following:


where n is the key and the function is the interface Function; we defined it recursively and the compiler was not upset. Memoization brings speed but takes space.
How do we turn the tail call version into a lambda? We can do this, for example:


what does not look very functional and very Java 8 because it is anonymous class. Another more Java 8 option is declaring functional interface with sufficient number of arguments:


We also cannot declare it as a class field and initialize it in one go; initialization must be done from one of the host class methods.
The remaining optimization ideas are strictly related to Fibonacci numbers and not applicable to most other recursions. There is a not so obvious way to generate Fibonacci numbers:

| 1 1 |^n   | F(n+1) F(n)   |
| 1 0 |   = | F(n)   F(n-1) |


If we raise the matrix on the left, let's call it A, to the power n-1, its [0][0] entry is F(n). We can prove this using induction. For n=1 our claim obviously holds; now, for the inductive step:

| F(n+1) F(n)   |   | 1 1 |
| F(n)   F(n-1) | x | 1 0 | =

|F(n+1)+F(n) F(n+1)+0|   |F(n+2) F(n+1)|
|F(n)+F(n−1) F(n)+0  | = |F(n+1) F(n)  |


Instead of doing chain multiplication we can square repeatedly and reduce the number of matrix multiplications to log(n).

| F(n+1) F(n)   |^2
| F(n)   F(n-1) |   =

| F(n+1)^2+F(n)^2         F(n+1)*F(n)+F(n)*F(n−1)|
| F(n)*F(n+1)+F(n−1)*F(n) F(n)^2+F(n−1)^2        |


From which one can pull more interesting relations.
Here is an illustrative implementation of this idea:


That ArrayList is a mapper; it consumes some additional space but makes things easier to debug.

Thursday, May 14, 2015

Recursion

I recently got homework to do as part of an interview process, already described here. After providing them with a working, thoroughly tested solution, they decided not to speak with me. My conclusion is that this was not such a bad outcome.
While searching the web for clues I discovered this: Find all paths from source to destination.
A guy submits his code, peers review it, everything sounds nice.
I will refrain from reviewing that particular solution, but after looking at it I decided to write about recursion. My impression is that young generations of programmers put great effort into mastering frameworks and new features of programming languages but at the same time somehow miss the basics. It also coincides with my exploration of the new functional features of Java 8.
Recursion is when a method, in Java, or a function, in C, calls itself during execution. Execution of the caller is suspended until the callee finishes, then it proceeds. So, there must be some natural way for the recursive calls to stop propagating themselves into infinity. If we are not traversing some kind of tree structure, or marking vertices as visited while traversing a graph, we typically pass a variable to control the depth of recursion. The trivial example of recursion is the calculation of the factorial:

n! = n*(n-1)*(n-2)*...*2*1

It is defined for positive integers. Here is a natural and naive implementation together with a test:
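The post's code was Java; an equivalent C sketch:

#include <assert.h>
#include <stdio.h>

long factorial(long n)
{
    if (n <= 1)                       /* stops the recursion */
        return 1;
    return n * factorial(n - 1);
}

int main(void)
{
    assert(factorial(9) == 362880);
    printf("9! = %ld\n", factorial(9));
    return 0;
}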


I could throw an exception on negative n, and assert that 9! equals 362880. What stops the recursion here from proceeding forever is the if statement in combination with decreasing n in the recursive calls. Now, in order to visualize what execution looks like, we will add some debugging code.
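A C sketch of the debug version, printing n with one dash per level of depth, both on the way down and on the way back up:

#include <stdio.h>

static void indent(int depth)
{
    for (int i = 0; i < depth; i++)
        putchar('-');
}

long factorial(long n, int depth)
{
    indent(depth);
    printf("%ld\n", n);               /* on the way down */
    if (n <= 1)
        return 1;
    long result = n * factorial(n - 1, depth + 1);
    indent(depth);
    printf("%ld\n", n);               /* on the way back up */
    return result;
}

int main(void)
{
    printf("9! = %ld\n", factorial(9, 0));
    return 0;
}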


The code now prints the current value of n, and the number of dashes shows how deep into the recursion we are. We could watch the stack of frames grow in a debugger as well, but this is nicer. The output of execution is:

9
-8
--7
---6
----5
-----4
------3
-------2
--------1
-------2
------3
-----4
----5
---6
--7
-8
9
9! = 362880


As expected, it behaves in a linear fashion. Things can be much more interesting than linear; to see that, we will calculate Fibonacci numbers. In the case of Fibonacci numbers we have the following recursive definition:

F(n) = F(n-1) + F(n-2)

The trivial and naive implementation looks like this:
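A C sketch:

#include <stdio.h>

long fibonacci(long n)
{
    if (n < 2)                        /* F(0) = 0, F(1) = 1 */
        return n;
    return fibonacci(n - 1) + fibonacci(n - 2);
}

int main(void)
{
    printf("F42 = %ld\n", fibonacci(42));   /* slow enough to notice */
    return 0;
}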


Execution of this code will take about a second or two. Using 42 as the function argument illustrates the point that the naïve implementation is not really the most efficient. The mechanics of stopping the recursive calls are identical to the factorial case. Let us insert debugging code and see what is going on.
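A sketch of the debug version, printing n at the current depth on entry and after each of the two recursive calls returns:

#include <stdio.h>

static void indent(int depth)
{
    for (int i = 0; i < depth; i++)
        putchar('-');
}

long fibonacci(long n, int depth)
{
    indent(depth);
    printf("%ld\n", n);
    if (n < 2)
        return n;
    long a = fibonacci(n - 1, depth + 1);
    indent(depth);
    printf("%ld\n", n);               /* back from the n-1 branch */
    long b = fibonacci(n - 2, depth + 1);
    indent(depth);
    printf("%ld\n", n);               /* back from the n-2 branch */
    return a + b;
}

int main(void)
{
    printf("F2 = %ld\n", fibonacci(2, 0));
    return 0;
}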


For the reasonably sized argument of 2 we have this output:

2
-1
2
-0
2
F2 = 1


Argument 2 is decreased and a recursive call is made with argument 1. When it returns to level 0 with result 1, the next recursive call is made with argument 0. The second call also returns, with result 0, to level 0 and we have 1+0=1 as the result.
Now we try 3 as the argument:

3
-2
--1
-2
--0
-2
3
-1
3
F3 = 2


We start on level 0 with the 3-1 call, then we have the pattern from the last run shifted one level up, a return to level 0 with result 1, and finally the 3-2 recursive call and its return with result 1 to level 0. We can conclude that the recursive calls are building a binary tree. We can try argument 4 to recognize the patterns for 3 and 2, and so on.
We could achieve the same with the factorial if we split the chain multiplication in half; I wrote about that here.
A recursive call does not have to return anything; the work can be performed on one of the function arguments. For example, we can reverse an array by swapping elements:
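A C sketch:

void reverse(int array[], int lo, int hi)
{
    if (lo >= hi)
        return;                 /* nothing left to swap */
    int t = array[lo];
    array[lo] = array[hi];
    array[hi] = t;
    reverse(array, lo + 1, hi - 1);
}

/* usage: reverse(a, 0, n - 1); */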


Here the recursion stops when lo is no longer smaller than hi.

Optimization

We can improve on the naïve implementation using tail call elimination. We write the method in such a way that the return statement is a pure function call, which allows the compiler to perform the optimization and practically discard frames instead of piling them onto the stack. For example, factorial can be rewritten like this:
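A C sketch:

long factorial(long n, long accumulator)
{
    if (n <= 1)
        return accumulator;
    return factorial(n - 1, n * accumulator);   /* pure tail call */
}

/* usage: factorial(9, 1) */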


We provide 1 as the initial value for the accumulator; n is the same as before. For Fibonacci we will need two accumulators:
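A C sketch:

long fibonacci(long n, long a, long b)
{
    if (n == 0)
        return a;
    return fibonacci(n - 1, b, a + b);   /* tail call again */
}

/* usage: fibonacci(n, 0, 1) */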


Initial values are a = 0, b = 1, and n as before. From this form it is quite easy to switch to an iterative form:
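A C sketch:

long fibonacci(long n)
{
    long a = 0, b = 1;
    while (n-- > 0) {    /* same two accumulators, no call stack */
        long t = a + b;
        a = b;
        b = t;
    }
    return a;
}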


About other forms of optimization like memoization, repeated squaring and lambdas, next time.

Sunday, May 10, 2015

Scheduling and Johnson's Algorithm

There is actually a paper available on the web where Johnson's scheduling algorithm is nicely described. You can find it here.
So that is not the all-pairs shortest paths Johnson's algorithm but the scheduling one. For those too lazy to read it, here is the story. We have programming tasks paired with testing tasks. We have one programmer and one tester; the tester can test only what is coded and must wait for the programmer to finish. In which order should the jobs be executed so that the whole bunch is finished in the shortest time?
Now Johnson's algorithm in plain English:
  1. For each task T1, T2, ..., Tn, determine the P1 and P2 times.
  2. Establish two queues, Q1 at the beginning of the schedule and Q2 at the end of the schedule.
  3. Examine all the P1 and P2 times, and determine the smallest.
  4. If the smallest time is P1, then insert the corresponding task at the end of Q1. Otherwise, the smallest time is P2, and the corresponding task is inserted at the beginning of Q2. In case of a tie between a P1 and P2 time, use the P1 time.
  5. Delete that task from further consideration.
  6. Repeat steps 3 – 5 until all tasks have been assigned.
Very simple. Here is the code:
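The post's implementation was Java; here is a C sketch of the steps above, with made-up P1/P2 times:

#include <stdio.h>
#include <limits.h>

#define N 5

int main(void)
{
    /* hypothetical task times: p1[i] for the programmer, p2[i] for the tester */
    int p1[N] = {3, 5, 1, 6, 7};
    int p2[N] = {6, 2, 2, 6, 5};
    int scheduled[N] = {0};
    int order[N];
    int q1 = 0, q2 = N - 1;            /* Q1 fills from the front, Q2 from the back */

    for (int k = 0; k < N; k++) {
        int best = -1, best_time = INT_MAX, from_p1 = 1;
        for (int i = 0; i < N; i++) {
            if (scheduled[i])
                continue;
            /* strict < on p2 resolves a P1/P2 tie in favour of P1 */
            if (p1[i] < best_time) { best = i; best_time = p1[i]; from_p1 = 1; }
            if (p2[i] < best_time) { best = i; best_time = p2[i]; from_p1 = 0; }
        }
        if (from_p1)
            order[q1++] = best;        /* smallest P1: append to Q1 */
        else
            order[q2--] = best;        /* smallest P2: prepend to Q2 */
        scheduled[best] = 1;           /* delete the task from consideration */
    }

    for (int i = 0; i < N; i++)
        printf("T%d ", order[i] + 1);
    printf("\n");
    return 0;
}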


I think that code is self-explanatory and I will not bother you with explanation.
I also had a look at the C++ code written by riteshkumargupta here; it is a slight modification of the problem mentioned in the paper. If you prefer C++ to Java, take a look.

Monday, May 4, 2015

Find all paths from vertex to vertex in a directed, weighted graph

Why am I doing homework? I applied for a job at some company. They did some kind of phone interview and decided to give me a “task” before making up their mind whether they would ever talk to me about the job again. OK, that evening I open Firefox and find my tasks in the mailbox. It was a few NP hard problems and I could pick and choose. Time for execution was limited, to make everything more interesting. Like a reality show, but you do not eat snails. Going quickly through them I realized that I was dealing with a person who thinks that he or she is a lecturer at a university and that I do not have a clue how to solve any of the tasks.
BTW, my educational background is physics. I studied technical physics at the school of electrical engineering at the University of Belgrade. Never finished. As a programmer I am self-educated. While I didn't get a university diploma, I did learn to use literature.
The natural choice was the first task; it must be more important than the others, that is why it is first. The task was something about finding all paths from vertex A to vertex B in a weighted, directed multigraph with cycles. I had never encountered graphs before, only trees, and they are part of the family. OK, time to google hard and yandex hard. After a whole day of searching I had two clues. The first one is a complete solution generating all possible paths up to some length on Perl Monks, and the other one is notes from some lectures by Gunnar. Both clues claim that an adjacency matrix is the way to go and that the tool of choice is a modified Floyd-Warshall. I quickly learned what I needed and assembled a working program based on degenerated matrix multiplication. After three days almost without sleep, I submitted the solution and went to bed. The next morning I figured out a much simpler solution.
Now a friendly warning: what follows is theory by somebody who studied graph theory for one whole day.

Adjacency matrix

The graph is one with directed edges, one way edges; multiple edges between a pair of vertices may exist and there may be an edge starting and finishing at the same vertex. The adjacency matrix gives a metric on the graph space: who is neighbour to whom, where you can go from some place.

     A    B    C    …    Z
A    1    0    0        1


From vertex A we can travel to A or Z in one move; to B and C we can't go in one move. Now comes a curious tourist with the question how far B is from A. If we start multiplying the adjacency matrix with itself, we will find out where we can go in two or more moves. The number of moves corresponds to the power of the adjacency matrix, and if after raising it to the power of 4 we see something like this:

     A    B    C    …    Z
A    0    2    1        4


its meaning is that we can't reach A in 4 moves, there is only one path to C, but 2 paths to B and 4 paths to Z. Now, to find out how many different paths exist from A to B with length up to 4 moves, one sums adjacencyMatrix[0][1] over the powers from 1 to 4. Those are not shortest paths, those are all possible paths, including loops and cycles. A good reason to keep the binary adjacency matrix around even if we have weights for the edges.
So how do we go about finding those two-or-more-move paths, translating them into chains of edges? Since the adjacency matrix is the metric on that space, we can use it to discover paths. For example, where can we go from A in two moves?

     A    B    C    D
A    0    0    1    1
B    1    0    1    0
C    0    1    0    1
D    1    1    0    0


We can reach C or D in a single move; from C we can get to B and D in a single move, and from D we can get to A and B in one move. So the paths are ACB, ACD, ADA and ADB. That should be enough to write code for a pathfinder.

Java implementation

From some vertex we can go where the adjacency matrix says we can go, and we move to the next vertex. For that next vertex we are in a different row, but we do the same thing. Looks like a case for recursion. When frames start offloading from the stack they will bring us the story of where the repeated recursive calls took them; we just need to send the story back. If we do not limit the depth of recursion it will never stop. BTW, replace depth-1 with depth-- and --depth if you are not sure how suffix and prefix differ.
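The post's implementation was Java and returned path strings from the recursion; here is a C sketch of the same depth-limited traversal, using an explicit path stack instead, and a hypothetical adjacency matrix consistent with the sample output below:

#include <stdio.h>

#define V 4
#define DEPTH 7

static const int adj[V][V] = {      /* hypothetical; make yours match your graph */
    {0, 1, 0, 0},
    {0, 0, 1, 1},
    {0, 1, 0, 1},
    {1, 0, 1, 0},
};

static int path[DEPTH + 1];

static void print_path(int len)
{
    for (int i = 0; i < len; i++)
        printf("%d%s", path[i] + 1, i < len - 1 ? "," : "\n");   /* 1-based ids */
}

/* depth-limited DFS from u; prints every walk that ends at target */
static void dfs(int u, int target, int depth, int len)
{
    path[len++] = u;
    if (u == target && len > 1)
        print_path(len);              /* store/print, but keep looking deeper */
    if (depth > 1)
        for (int v = 0; v < V; v++)
            if (adj[u][v])
                dfs(v, target, depth - 1, len);
}

int main(void)
{
    dfs(1, 2, DEPTH, 0);              /* all walks from vertex 2 to vertex 3 */
    return 0;
}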


We have built a binary adjacencyMatrix and we are traversing the graph. If we find the target we store the path, but we proceed looking for the target by visiting all vertices which are reachable in a single move, with the limitation that the remaining depth of traversal is bigger than 1, and on each new call we use the reduced incoming depth. When a function call comes back we append the prefix to all paths. If you print the content of that string array you will get something like:

2,3,2,3,2,3
2,3,2,4,1,2,3
2,3,4,1,2,3


It is the same format as used by blokhead from PerlMonks in his solution, so you can nicely compare the output of your code. Do not forget to make the adjacency matrix match :P
What about weights? Instead of having a binary adjacency matrix, we replace 1 with the corresponding edge, actually the weight of that edge. I will use integers for weights.

     A    B    C    D
A    0    0    3    5


Slightly modified traversal code:


When you print the content of that HashMap it looks like this:

2,3,2,3 = 23
2,3,4,1,2,3 = 33
2,3,2,4,1,2,3 = 35


It still stops on depth. If you need it to stop on the sum of weights, then send the data upstream. Don't go for matrix multiplication if you do not have to; check other options. Nobody is going to ask you to find all paths between all vertices, and for all paths between two nodes recursive traversal looks much better than matrix multiplication.

Tuesday, April 21, 2015

PostgreSQL and PL/Pythonu

Besides SQL, PostgreSQL supports creating functions in many other so-called procedural languages. We just need to install them and they become available within the database. The standard distribution includes four languages: PL/pgSQL, PL/Tcl, PL/Perl, and PL/Python. There are other, externally maintained languages like Java, Scheme, PHP, Unix shell and so on. If that is not enough, it is possible to write your own procedural language handler; it is open source and consequently fully customizable.
Hosted Python is a very interesting option. It is quite a powerful, versatile language and the code is very concise. Executed within PostgreSQL, it enjoys very fast communication, avoiding the overhead of DB calls which external programs suffer. Both Python 2 and Python 3 are supported. If you are using, for example, Ubuntu, then postgresql-plpython-9.x is Python 2 and postgresql-plpython3-9.x is Python 3. It needs to be installed, the server restarted and the language created:

createlang plpythonu [dbname]

If we do not specify the db, the language will be created in the db to which we are currently connected. The u suffix means that we are dealing with an unrestricted/untrusted language. Power is given to us, and it is our responsibility to use it to our benefit without causing damage. Once everything is in place we can write a Hello World function in Python.
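A minimal sketch of such a function (name and text are mine):

CREATE OR REPLACE FUNCTION say_hello(name text)
  RETURNS text
AS $$
    return "Hello, %s!" % name
$$ LANGUAGE plpythonu;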


From psql we try it:
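For example, assuming the function above:

SELECT say_hello('World');
-- returns: Hello, World!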


Or, if you prefer a GUI, use pgAdmin III: start the Query tool and execute it there.
To enjoy the full comfort of Python one needs to import modules. How do we use modules in PL/Pythonu? It is no different from ordinary Python code:
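A sketch of a function using httplib2 (name and body are mine):

CREATE OR REPLACE FUNCTION get_page(url text)
  RETURNS text
AS $$
    import httplib2
    h = httplib2.Http()
    resp, content = h.request(url)
    return content
$$ LANGUAGE plpythonu;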


we should see a response. If instead we see an ImportError,


that is a sign that httplib2 is not on the path; we can place it on the path or, if we do not want to reload the server, we can append to the function:
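Something like this at the top of the function body (the module location is hypothetical):

import sys
sys.path.append('/usr/local/lib/python2.7/dist-packages')  # hypothetical location
import httplib2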


There is a significant quantity of examples on the Internet to help you further. With some knowledge of Python one can do a really huge amount of work directly from PostgreSQL, without the overhead of a DB connection. In the role of monthly report mailer, a PL/Pythonu function beats a Java application server and even a C/C++ application connecting to the DB from outside.

Thursday, April 9, 2015

GCC Quad-Precision Math Library example

There is a small PDF on the GCC website with the libquadmath documentation. It contains an explanation and example of how to convert a string to __float128 and how to convert a __float128 back to a string. It took me some googling to find out how to compile the example and to figure out what the width parameter is. To compile it we need the -lquadmath linker switch, and width, the fourth argument to quadmath_snprintf, is the overall width of the output, right anchored. Here is my version of the mentioned example:
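A reconstruction of that version (the constant is mine); build with gcc file.c -lquadmath:

#include <quadmath.h>
#include <stdio.h>

int main(void)
{
    char buf[128];
    int width = 46;

    /* string -> __float128 */
    __float128 r = strtoflt128("1.41421356237309504880168872420969808", NULL);
    r = r * r;    /* ordinary operators work on __float128 */

    /* __float128 -> string; width rides in through the asterisk */
    quadmath_snprintf(buf, sizeof buf, "%+#*.34Qe", width, r);
    printf("%s\n", buf);
    return 0;
}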


If we try to assign a number without the Q suffix, we will have an ordinary double inside quad storage. The width argument to quadmath_snprintf is ignored if it is smaller than the length of the string representation; no spaces will be added in front of the number.
If we need to examine the layout in memory, we can use a modified example from the last article. Here we do not need the -lquadmath linker switch to compile.


The interesting part of the GDB session:


We compare it with the output of the program to see what we are looking at.
Examining the tests from the GCC source, we find out that all operators supported for other floats are supported on __float128, so no funny function calls to add or compare two numbers. The comfort of normal syntax and 34 significant digits. For example, we do a square root round trip:
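A sketch:

#include <quadmath.h>
#include <stdio.h>

int main(void)
{
    __float128 x = 2.0Q;
    __float128 error = sqrtq(x) * sqrtq(x) - x;   /* should be almost zero */

    char buf[64];
    quadmath_snprintf(buf, sizeof buf, "%.34Qe", error);
    printf("error = %s\n", buf);
    return 0;
}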


That code produced -3.8518598887744717061119558851698546e-34 error on my box.

Double precision floats and GDB

They are not human readable in memory like integers, so if you are examining memory in a GDB session it is not very likely that you will recognize which number it is. Their layout in memory is specified by the IEEE 754 standard. They occupy 64 bits: one bit for the sign, 11 bits for the exponent and 52 bits for the significand. There is plenty to read about floats on the web and I will not bother you with theory, the significance of their discrete spectrum, rounding and similar. I just want to show you how to handle them during a debug session.
I have seen in some lecture, I can't remember exactly where, a solution to examine the memory layout via a struct and a union. Here is my interpretation of it:
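A sketch of that struct/union trick, producing the output quoted below:

#include <stdio.h>

struct layout {
    unsigned long long mantissa : 52;   /* least significant field first on x86 */
    unsigned long long exponent : 11;
    unsigned long long sign     : 1;
};

union view {
    double number;
    struct layout bits;
};

int main(void)
{
    union view v;
    v.number = 3.14159265358979;
    printf("number = %.14f ", v.number);
    printf("sign = %llX, exponent = %llX, mantissa = %llX\n",
           (unsigned long long) v.bits.sign,
           (unsigned long long) v.bits.exponent,
           (unsigned long long) v.bits.mantissa);
    return 0;
}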


In the struct we have the specified layout, and the union is a helper so that we can see the number in human readable representation. We compile it using the GCC -g switch and start a GDB session to examine what the layout looks like. Before that we execute the program once and save the output so that we can recognize our number; the second printf is of interest.

number = 3.14159265358979 sign = 0, exponent = 400, mantissa = 921FB54442D18

Here is the debug session, or rather its most interesting parts:


After the variable is initialized we query its address and then ask for two units to be printed in hex format. Just to be on the safe side, let us print them 32 bits by 32 bits and 4 bytes by 4 bytes as well:


There we clearly have little-endian layout in memory, least significant byte at the lowest address. The same number in a register:


Here we do not really have an address, so no endianness either. Maybe we could just use the f format?

Wednesday, April 8, 2015

OpenMP, file tree walk and map-reduce example

Everything happens on Linux; we will use GCC with its built-in OpenMP support, and also thread local storage. We will see how easy it is to spread a workload across threads using OpenMP.
In the man pages we find a complete example for nftw, or file tree walk. That example prints to standard output about everything it finds while walking directories. Instead of having it print everything, I made it count files:
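A sketch of the man page example reduced to counting:

#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <stdlib.h>

static int counter = 0;            /* the static variable mentioned below */

static int count_file(const char *fpath, const struct stat *sb,
                      int typeflag, struct FTW *ftwbuf)
{
    if (typeflag == FTW_F)
        counter++;                 /* count regular files only */
    return 0;                      /* zero tells nftw to continue */
}

int main(int argc, char *argv[])
{
    if (nftw(argc < 2 ? "." : argv[1], count_file, 20, 0) == -1) {
        perror("nftw");
        exit(EXIT_FAILURE);
    }
    printf("There was %d files\n", counter);
    return 0;
}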


To compile it I used:

gcc -O1 -o simpleftw simpleftw.c

In order to count files across multiple directories, we will pass a few directories in the argument list to the program and also turn our static variable counter into thread local storage. We will use the OpenMP parallel for pragma combined with a reduction, which is in our case simple addition:
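A sketch; each thread walks some of the directories, and the reduction combines the per-thread totals:

#define _XOPEN_SOURCE 500
#include <ftw.h>
#include <stdio.h>
#include <omp.h>

static __thread int counter;       /* thread local: each thread counts on its own */

static int count_file(const char *fpath, const struct stat *sb,
                      int typeflag, struct FTW *ftwbuf)
{
    if (typeflag == FTW_F)
        counter++;
    return 0;
}

int main(int argc, char *argv[])
{
    int total = 0;

    #pragma omp parallel for reduction(+ : total)
    for (int i = 1; i < argc; i++) {
        counter = 0;               /* reset in case a thread gets several dirs */
        if (nftw(argv[i], count_file, 20, 0) == -1)
            continue;
        printf("Thread %d %d\n", omp_get_thread_num(), counter);
        total += counter;
    }

    printf("There was %d files in given directories\n", total);
    return 0;
}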


The print inside the for loop should demonstrate that thread local storage works as expected. The C99 switch is required, or we would have to declare the loop variable i outside and specify that it is private. The switch for OpenMP is also required. To compile it I used:

gcc -O1 -fopenmp -o dwalker dwalker.c -std=c99

Execution produces on my box:

./dwalker ~/Documents/ ~/Downloads/
Thread 1 3894
Thread 2 7784
There was 11678 files in ...

Everything works as expected.

Monday, April 6, 2015

C pointers, assembler and GDB

The idea here is to do some multitasking: find out what a C pointer looks like in assembler, and use Linux system calls, GCC and GDB. We start with Hello World:
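A sketch:

#include <unistd.h>

int main(void)
{
    char *p = "Hello World!\n";   /* the pointer we will hunt in assembly */
    write(1, p, 13);              /* 1 is standard output, 13 bytes */
    return 0;
}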


Execute man 2 write in a terminal to find out more about write; 1 is standard output and 13 is the number of bytes to be written. I saved it as hello.c and this is how it was compiled:

gcc -O0 -m64 -save-temps -o hlo hello.c

GCC will emit the assembly code as a temp file; I like the 64-bit version and no optimizations. The content of the temp file is:



It pushes the base pointer onto the stack; there is a q at the end of the push and bp is called rbp, so it must be 64-bit code. It loads the string into memory, later into rax, prepares the call to write and so on. Now we take that temp file and compile it with the -g switch so that we can use GDB.

gcc -g -O0 -o hlo hello.s
gdb hlo


We type start at the prompt and here is the GDB session:


The label .LC0 is our string; -8(%rbp) is where the pointer is. After info registers in short form, restricted to rbp, I find out that rbp points to 0x7fffffffe360, and at 0x7fffffffe360 - 8 = 0x7fffffffe358 is our pointer. No, 0x60 - 8 is not 0x52. After we send the next instruction and line 17 is executed, we can see our string in memory; the /13c means we want 13 characters from that address.
Since write is a Linux system call, we can use strace to see what is going on.


After quite a few lines are replaced with three consecutive dots, we see that we managed to write all 13 characters to standard output.
Using system calls is so easy that one may even attempt writing to a real file:
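A sketch along those lines (the file name deleteme.blog appears later in the post):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("deleteme.blog", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        perror("open");
        return 1;
    }
    if (write(fd, "Hello World!\n", 13) != 13)
        perror("write");
    if (close(fd) == -1)
        perror("close");
    return 0;
}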


We added more system calls. We applied common flags and modes in open, checked for all possible errors, and after writing the string to the file we closed the file. If we compile it using

gcc -O0 -m64 -save-temps -o hello2 hello.c

we get a binary and also the assembly code. With the help of strace


Executing cat deleteme.blog we see that write really worked and strace didn't cheat. Examining the assembly code will be interesting to those who are curious how if works at a low level. Finally, every project manager will note that from 08:13:04.407694 to 09:24:49.906151 I wrote only 21 lines of code.

Thursday, April 2, 2015

Extending PostgreSQL with your own C functions

The power of open source: if you are not happy with what PostgreSQL currently offers, you write your own extension in C. Compile the code with your functions into a shared library, install it, and they become available from PostgreSQL. OK, there are some rules and procedures. Once we are inside PostgreSQL we use its types, interfaces and utilities. Let us do a Hello World example. To build the example you will need postgresql-server-dev-9.1, or whatever version you are using, installed on your Linux box.
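A sketch of such a function following the V1 calling conventions, a say_hello variant of concat_text (details reconstructed):

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(say_hello);

Datum
say_hello(PG_FUNCTION_ARGS)
{
    text *name = PG_GETARG_TEXT_P(0);
    const char *hello = "Hello ";
    int32 hello_len = strlen(hello);
    int32 name_len = VARSIZE(name) - VARHDRSZ;
    text *result = (text *) palloc(VARHDRSZ + hello_len + name_len);

    SET_VARSIZE(result, VARHDRSZ + hello_len + name_len);
    memcpy(VARDATA(result), hello, hello_len);
    memcpy(VARDATA(result) + hello_len, VARDATA(name), name_len);

    PG_RETURN_TEXT_P(result);
}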


As we can see, we are using the new V1 spec. That is a modified concat_text example from the PostgreSQL documentation; look for 35.9.4. Version 1 Calling Conventions. We have a load of different VAR macros. For example:

VARHDRSZ is the same as sizeof(int4), but it's considered good style to use the macro VARHDRSZ to refer to the size of the overhead for a variable-length type.

That is from the PostgreSQL documentation. SET_VARSIZE we can find in the include postgres.h:


Unless you are on big endian. Going through the header file one can also read more in the comments about the Datum and varlena datatypes. Then we have palloc, which corresponds to malloc, memcpy you already know, and the GET and RETURN macros. It is obvious that for writing extensions one needs to familiarize oneself with PostgreSQL internals. Power without knowledge and responsibility exists only in fairy tales told by “software evangelists” at annual developers developers developers meetings.
Variables passed around by PostgreSQL may live on disk; do not change them.
To build the shared library I used the following Makefile:
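A sketch using the standard pgxs include; my version simply asks pg_config for the location:

MODULES = say_hello

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)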


A rather long story to get the location of pgxs. That pgxs is the location of the makefiles for building extensions. It is not a trivial build and using the provided mk files is the right way to do it. After that we can copy say_hello.so to some reasonable location, or give the full path to it in the create function declaration:
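A sketch of the declaration (the library path is hypothetical):

CREATE FUNCTION say_hello(text) RETURNS text
    AS '/usr/lib/postgresql/9.1/lib/say_hello', 'say_hello'  -- path is hypothetical
    LANGUAGE C STRICT;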


PostgreSQL already allows Python through the untrusted language PL/Python. One can utilize the power of Python for functions or triggers without learning much about PostgreSQL internals. But again, if you need power and speed, you can use what PostgreSQL speaks internally, and that is C.

Wednesday, April 1, 2015

SQLite parameterized query in C

Still very angry at Discovery Health, but that is no reason to stop using C. To execute a parameterized query we prepare a SQL statement with one or more placeholders. A placeholder can be a question mark, alone or followed by a number, or a colon, dollar sign or at sign followed by an alphanumeric name. For example:

select title, full_name from jane where id = @id;

We prepare such a statement using sqlite3_prepare_v2, later we bind our parameter and finally execute the query. To do the binding we use the appropriate function; there are a few of them:
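Among others, from sqlite3.h:

int sqlite3_bind_int(sqlite3_stmt*, int, int);
int sqlite3_bind_int64(sqlite3_stmt*, int, sqlite3_int64);
int sqlite3_bind_double(sqlite3_stmt*, int, double);
int sqlite3_bind_null(sqlite3_stmt*, int);
int sqlite3_bind_text(sqlite3_stmt*, int, const char*, int, void(*)(void*));
int sqlite3_bind_blob(sqlite3_stmt*, int, const void*, int, void(*)(void*));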


All bind functions return SQLITE_OK if the binding is successful, or an error code if it fails. The first argument is the handle to the prepared statement. The second argument is the index of the parameter. The third argument is the value to be set. For blob and text we have a fourth argument, the size in bytes, and a fifth, the destructor function. Instead of a destructor function we can pass the constants SQLITE_STATIC (do nothing) or SQLITE_TRANSIENT (make a local copy and free it when done). To find out the index of a parameter we use this function:

int sqlite3_bind_parameter_index(sqlite3_stmt*, const char *zName);

We pass the prepared statement and the parameter name; it returns zero if there is no such parameter, or the parameter index if it exists. Even though we know that our parameter must have index one, we will still look it up to demonstrate how it is done. Here is the code:
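A sketch matching the description (the file name test.db and the bound id are mine):

#include <stdio.h>
#include <sqlite3.h>

int main(void)
{
    sqlite3 *db;
    sqlite3_stmt *stmt;
    const char *sql = "select title, full_name from jane where id = @id;";

    if (sqlite3_open("test.db", &db) != SQLITE_OK) {
        fprintf(stderr, "open: %s\n", sqlite3_errmsg(db));
        return 1;
    }
    if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) != SQLITE_OK) {
        fprintf(stderr, "prepare: %s\n", sqlite3_errmsg(db));
        sqlite3_close(db);
        return 1;
    }

    int index = sqlite3_bind_parameter_index(stmt, "@id");   /* look it up */
    if (index == 0) {
        fprintf(stderr, "no such parameter\n");
        sqlite3_finalize(stmt);
        sqlite3_close(db);
        return 1;
    }
    sqlite3_bind_int(stmt, index, 2);

    while (sqlite3_step(stmt) == SQLITE_ROW)
        printf("%s %s\n",
               (const char *)sqlite3_column_text(stmt, 0),
               (const char *)sqlite3_column_text(stmt, 1));

    sqlite3_finalize(stmt);
    sqlite3_close(db);
    return 0;
}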


The database should already be loaded with the required values from previous examples; if not, here is the SQL to create it:

CREATE TABLE jane ( id INTEGER PRIMARY KEY NOT NULL, title TEXT, full_name TEXT NOT NULL );
INSERT INTO jane VALUES(1,'Mr','Smith');
INSERT INTO jane VALUES(2,'Mrs','Doe');
INSERT INTO jane VALUES(3,'Mr','Doe');


After we build it using:

gcc -g test.c -lsqlite3 -o test

We execute test and see the following output:


We could also misspell the parameter name and rebuild, to check whether the error handling works.

Tuesday, March 31, 2015

PostgreSQL, libpqxx and prepared statement example

As I promised, if people find the introductory article interesting, I will write more about PostgreSQL and libpqxx. But before I start with programming – a rant.
I am still looking for work and finding none. Today I went to Discovery Health, had a very pleasant 45 minute chat with their architect and BA, and a promise that we would talk again. Later the guy from the employment agency who arranged the meeting calls and says that they do not want me since I do not have ANSI C in my CV?! Like somebody from management decided to override the decision of the interviewers. If ANSI C in the CV is a precondition, why did they waste my time and invite me for an interview? I wish them the same from their customers. BTW, I am usually employed as a senior developer, not as a C developer or a Perl developer.
Back to programming, this time C++; I have had enough of ANSI C for today. The environment is Linux, I am using the same PostgreSQL and libpqxx as last time, and to compile the example we will use:

g++ hello_prep.cxx -o hello_prep -I/usr/local/include/ -lpqxx -lpq


This time I used test092.cxx; run dpkg with the -L switch on libpqxx3-doc to see where it is. It tests passing a binary parameter to a prepared statement. The test macros are replaced with printing of the tested values to standard output, and the setup of the connection/transaction is included. Here is the code:


After the connection is successfully obtained, the transaction T is constructed. A temp table is created and, after confirmation that prepared statements are supported, testing goes on. We have prepare::declaration and prepare::invocation, available in the reference generated with doxygen. Adding parameters is done in an iterative fashion and feels natural; as they say in the documentation, like varargs from C. The library is well designed and easy to use; documentation, a tutorial and tests are supplied. The lengths and contents should match, and the test succeeds.