Computer Scientist

Showing posts with label C. Show all posts

Saturday, 12 May 2018

libcurl in Linux

Learn to use libcurl in Linux

I am learning libcurl in order to add HTTP access to my C++ crawler, which targets a set of specialised websites. I have not yet found a more suitable HTTP request library for this purpose, so libcurl is my first try. I will test it against this specific objective throughout the learning process. It is also worth looking for alternatives that may fit better; if I find one, I will start a new thread. This thread concentrates only on learning libcurl.

Create a learning environment (development environment):

I will use Linux (Ubuntu) as the main environment for learning and testing libcurl. To shorten the preparation phase and keep the learning experience enjoyable, I rely on apt-get to install the development packages of the relevant libraries. The following packages are installed beforehand: 
  • libssl-dev
  • libssl-doc
  • libcurl4-openssl-dev
  • libcurl4-openssl-doc
These are the exact package names that apt-get install requires. The learning environment is as follows: 
  • OS: Ubuntu 16.04 LTS (64-bit)
  • Compilers: GCC 5.4, G++ 5.4
  • CPU: i5-3570K
  • Memory: 8GB
  • Hard Drive: 256GB SSD
Strangely, my libcurl and SSL libraries ended up installed under the anaconda3 directory in my home directory. I have no idea why this happened.

Initialisation before everything: 

The library needs a global initialisation through the curl_global_init() function. There is also a corresponding clean-up function, curl_global_cleanup(). Keep in mind that these two functions are NOT thread safe, even though most libcurl components are. They are expected to be invoked ONLY once during the entire lifetime of my program.

Run-time feature detection: 

The structure returned by curl_version_info() contains the details of what the currently running libcurl supports.

Easy-Interface vs Multi-Interface: 

The easy interface performs synchronous transfers with blocking function calls. The multi interface performs asynchronous transfers with non-blocking calls, which allows multiple simultaneous transfers. The easy interface comes first in the following sections.

Easy-Interface:

All easy-interface functions share the same prefix: 'curl_easy'
  • Handle: use one handle for each session in each thread. DO NOT share a handle across multiple threads.
  • Options: 
    • Setting: the function curl_easy_setopt() sets options for a handle. Options are sticky: they keep their value until they are given a different one.
    • Resetting: curl_easy_reset() blanks all previously set options.
    • Copy: curl_easy_duphandle() produces another handle with the same option settings.
  • Write back the result: 
    • Write function: 
      • If the option CURLOPT_WRITEFUNCTION is set, the response is processed by the given write function, which must have the signature size_t func_name(void *buffer, size_t size, size_t nmemb, void *userp).
      • If the option CURLOPT_WRITEDATA is set to a pointer to some structure, that pointer is passed to the write function as its fourth parameter.
    • No write function: 
      • Output to stdout: if no write function is given, the response is written to stdout by default.
      • Output to a file: if an open file handle (FILE *) is passed to the curl handle through the CURLOPT_WRITEDATA option, the response is stored in that file rather than written to stdout.
      • <WARNING>: on some systems, passing an open file handle with CURLOPT_WRITEDATA crashes libcurl. The libcurl documentation mentions Windows builds that use libcurl as a DLL, where CURLOPT_WRITEFUNCTION must be set as well.
  • Make the transfer: 
    • The function curl_easy_perform() connects to the remote site, issues the necessary commands and receives the response.
    • The write function may get one byte at a time or many kilobytes at once. libcurl delivers as much as possible, as often as possible. 
    • The perform function returns a status code. In addition, the CURLOPT_ERRORBUFFER option can provide a buffer that receives a human-readable error message.
    • Re-using the transfer handle is encouraged.
  • EXAMPLE and things to notice: 
    • Not only -lcurl is required: -lssl and -lcrypto are also needed to link against the OpenSSL libraries libssl.so and libcrypto.so. If they are not provided, the following error messages may be shown: 
      • no -lssl: lib/libcurl.so: undefined reference to `SSLv2_client_method'
      • no -lcrypto: lib/libssl.so: undefined reference to `EVP_idea_cbc'
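
The write-function bullets above can be sketched as follows. This is only a minimal sketch, not a complete client: the struct response_buffer and the name collect_response are my own assumptions, and only the callback signature comes from the notes above. The callback appends every chunk it receives to a growing, NUL-terminated buffer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical accumulator; a pointer to it would be given to CURLOPT_WRITEDATA. */
struct response_buffer {
    char  *data;   /* NUL-terminated bytes received so far */
    size_t len;    /* number of bytes stored, excluding the terminator */
};

/* Matches the write-callback signature quoted above. libcurl may call it
 * several times per transfer, each time with size * nmemb new bytes. */
size_t collect_response(void *buffer, size_t size, size_t nmemb, void *userp)
{
    struct response_buffer *resp = userp;
    size_t chunk = size * nmemb;

    char *grown = realloc(resp->data, resp->len + chunk + 1);
    if (grown == NULL)
        return 0;   /* returning fewer bytes than delivered reports an error */

    memcpy(grown + resp->len, buffer, chunk);
    resp->data = grown;
    resp->len += chunk;
    resp->data[resp->len] = '\0';
    return chunk;
}
```

To use it, start with struct response_buffer resp = {NULL, 0}; then call curl_easy_setopt(easyhandle, CURLOPT_WRITEFUNCTION, collect_response) and curl_easy_setopt(easyhandle, CURLOPT_WRITEDATA, &resp) before curl_easy_perform(), and free(resp.data) when done.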

USE_CASEs:

Upload data to a remote site
  • Read callback function: size_t read_function(char *bufptr, size_t size, size_t nitems, void *userp) supplies libcurl with the data to transfer to the remote site.
  • Set the read function: curl_easy_setopt(easyhandle, CURLOPT_READFUNCTION, read_function);
  • Set custom user data to be passed to the read function, if needed: curl_easy_setopt(easyhandle, CURLOPT_READDATA, &filedata)
  • Tell libcurl that the perform operation is an upload: curl_easy_setopt(easyhandle, CURLOPT_UPLOAD, 1L)
  • <WARNING>: a few protocols require the expected file size to be known before the transfer. It can be set with curl_easy_setopt(easyhandle, CURLOPT_INFILESIZE_LARGE, file_size), where file_size must be of type curl_off_t.
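
A read callback for the upload steps above might look like this. It is a sketch under assumptions: the struct upload_source and the name read_from_memory are hypothetical, and the data comes from memory rather than from a file. Only the signature is taken from the notes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical upload source; a pointer to it would be given to CURLOPT_READDATA. */
struct upload_source {
    const char *data;   /* bytes still to be sent */
    size_t      left;   /* how many of them remain */
};

/* Matches the read-callback signature quoted above: copy at most
 * size * nitems bytes into bufptr and return the number copied.
 * Returning 0 tells libcurl that the upload is complete. */
size_t read_from_memory(char *bufptr, size_t size, size_t nitems, void *userp)
{
    struct upload_source *src = userp;
    size_t room = size * nitems;
    size_t n = src->left < room ? src->left : room;

    memcpy(bufptr, src->data, n);
    src->data += n;
    src->left -= n;
    return n;
}
```

It would be registered with CURLOPT_READFUNCTION, with the address of a struct upload_source passed through CURLOPT_READDATA.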
Providing username and password
  • Username and password can be provided in the URL: http://myname:thesecret@example.com/path
  • They can also be provided by setting handle's option: 
    • curl_easy_setopt(easyhandle, CURLOPT_USERPWD, "myname:thesecret"). This is the same as providing the username and password in the URL.
    • curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "myname:thesecret"). This provides the username and password for the proxy only.
    • Back in the days when UNIX was popular, the $HOME/.netrc file was commonly used to keep a user's FTP credentials in plain text. libcurl can use this file not only for FTP but also for HTTP: curl_easy_setopt(easyhandle, CURLOPT_NETRC, 1L)
    • The format of the .netrc file is as follows: 
machine myhost.mydomain.com
login userlogin
password secretword 

Multi-Interface:

Cautions: 
  • libcurl is thread safe, but it does no internal thread synchronisation.
  • Handles: never use a single handle from more than one thread at any given time. Handles may be passed around among threads, as long as only one thread uses a given handle at a time. (Passing a handle around without using it looks useless to me.)
  • Shared objects: certain data can be shared between multiple handles by using the share interface. libcurl does not provide the locking internally, so lock and unlock callbacks must be supplied through curl_share_setopt():
    • CURLSHOPT_LOCKFUNC 
    • CURLSHOPT_UNLOCKFUNC

DEBUGGING:

Deal with run-time errors:
  • CURLOPT_VERBOSE (set to 1): prints the full protocol details that libcurl sends, some internal information and some received protocol data.
  • CURLOPT_HEADER (set to 1): for HTTP, includes the response headers in the normal body output.
  • CURLOPT_DEBUGFUNCTION: for situations where CURLOPT_VERBOSE is not enough.

Tuesday, 10 October 2017

CMake Learning 1

Here is a collection of notes from learning the CMake tool, which replaces my autoconf tool set.

This is not a complete and clean version. A final clean-up and restructuring are still required.


Basic Usage:
  1. CMake command names are case insensitive; variable names are case sensitive. 
  2. Variables: 
    1. Variable: ${VAR}. The set command sets variable values: set (Foo a b c).
    2. command(${Foo}) = command(a b c). command("${Foo}") = command("a b c").
    3. $ENV{VAR}: accesses system environment variables.
  3. CMakeLists.txt:
    1. # starts a comment that runs to the end of the line.
    2. add_executable: builds an executable from the given list of source files.
    3. find_library: looks for a library under different NAMES and in different PATHS. 
    4. target_link_libraries: links libraries to executables.
  4. Run CMake: 
    1. Two directories: source directory and build directory
    2. In-source build: the source directory and the build directory are the same.

Main Structures:
The main components of CMake are implemented as C++ classes, which are then referenced by many of the CMake commands.

The following is the structure of the main CMake components:


  1. Source files: C or C++ source code.
  2. Targets: a target is typically an executable or a library.
  3. Directory: each directory contains one or several targets which are built from the source files in the directory.
  4. Generators: each directory has a local generator that is responsible for generating the Makefiles or project files for the directory. All local generators share a common global generator that oversees the build process.
For example, under Visual Studio 7, the global generator creates a solution file for the entire project while the local generators create a project file for each target in their directory.
With the Unix Makefile generator, the local generators create the makefiles in each sub-directory and the global generator creates the top-level makefile in the root directory.

Mechanism of CMake commands:
  • Each command has two main parts: the InitialPass method and the FinalPass method. 
  • The InitialPass method receives the arguments and the local generator (a cmMakefile instance). It applies the command using the arguments and stores the results in the provided cmMakefile instance.
  • The FinalPass method runs only after the InitialPass methods of all commands have been invoked. Not all commands have a FinalPass; it exists for commands that need global information which may not be available during the InitialPass phase.

Targets:
  • The add_library, add_executable and add_custom_target commands each create a target, e.g. add_library (foo STATIC foo1.c foo2.c) creates a static library called foo.
    • STATIC, SHARED and MODULE are the available options. On most systems SHARED and MODULE are the same, but not on Mac OS X. 
    • If the option is left out, the variable BUILD_SHARED_LIBS controls whether a shared or static library is built. By default, CMake builds a static library.
  • The set_target_properties and get_target_property commands manipulate a target's properties.
  • The target_link_libraries command specifies the list of libraries that the target links against. Accepted formats: library names, full paths to libraries, or the name of a library created by an add_library command.

Source Files:

  • set_source_files_properties and get_source_file_property access a source file's properties, such as:
    • COMPILE_FLAGS
    • GENERATED
    • OBJECT_DEPENDS
    • ABSTRACTWRAP_EXCLUDE
Variables:

  • A variable is visible from the point of its definition onwards. A function opens a new scope that starts with a copy of the caller's variables.
  • Two examples of scope:
    function (foo)
        message (${test}) # outputs 1 here
        set (test 2)
        message (${test}) # outputs 2 here
    endfunction ()

    set (test 1)
    foo ()
    message (${test}) # This is still 1.

vs

    function (foo)
        message (${test}) # outputs 1 here
        set (test 2 PARENT_SCOPE)
        message (${test}) # still outputs 1 here: PARENT_SCOPE does not change the local copy
    endfunction ()

    set (test 1)
    foo ()
    message (${test}) # This is 2 now.


  • if command:
    set (FOO 1)
 
    if (${FOO} LESS 2)
        set (FOO 2)
    else (${FOO} LESS 2)
        set (FOO 3)
    endif (${FOO} LESS 2)
  • loop command:

    set (items_to_buy apple orange pear beer)

    foreach (item ${items_to_buy})
        message ("Don't forget to buy one ${item}")
    endforeach ()

Cache Entries:

  • If the user building the project should be able to set a variable from the CMake user interface, the variable must be a cache entry.
  • A cache file is produced in the build directory; its main purpose is to store the user's selections and choices.
  • option (USE_JPEG "Do you want to use the jpeg library"). This command creates a variable called USE_JPEG and puts it into the cache.
  • Ways to create a cache entry: 
    • option
    • find_file
    • set with the CACHE option: set (USE_JPEG ON CACHE BOOL "include jpeg support?"). One of the following variable types must be given so that the GUI knows how the variable is set and displayed:
      • BOOL
      • PATH
      • FILEPATH
      • STRING
  • Another purpose of using cache is to store key variables that are expensive to determine, such as CMAKE_WORDS_BIGENDIAN, which needs to compile and run a program to determine its value. This is to prevent having to recompute them every time CMake is run.
  • mark_as_advanced is used to set a cache entry as an advanced cache, which will not be shown at first when the CMake GUI is run. The advanced cache entries are other options that the user can modify, but typically will not.
  • The value of a cache entry can be restricted to a limited set of predefined options (a pull-down list) by setting its STRINGS property: 
    • set (CRYPTOBACKEND "OpenSSL" CACHE STRING "Select a cryptography backend")
    • set_property (CACHE CRYPTOBACKEND PROPERTY STRINGS "OpenSSL" "LibTomCrypt" "LibDES")
  • The following points are worth noting: 
    • Variables in the cache can still be overridden in a CMakeLists file by using set without the CACHE option.
    • Cache values are checked only if the variable is not found in the current cmMakefile instance before CMakeLists file processing begins.
    • The set command sets the variable for processing the current CMakeLists file without changing the value in the cache.
    • "Once a variable is in the cache, its "cache" value cannot normally be modified from a CMakeLists file. The reasoning behind this is that once CMake has put the variable into the cache with its initial value, the user may then modify that value from the GUI. If the next invocation of CMake overwrote their change back to the set value, the user would never be able to make a change that CMake wouldn't overwrite. So a set (FOO ON CACHE BOOL "doc") command will typically only do something when the cache doesn't have the variable in it. Once the variable is in the cache, that command will have no effect."
Build Configurations:

  • Supported build configurations: 
    • Debug, has basic debug flags turned on.
    • Release, has basic optimizations turned on.
    • MinSizeRel, has the flags that produce the smallest object code, but not necessarily the fastest code.
    • RelWithDebInfo, builds an optimised build with debug information as well.
  • For Visual Studio IDE, CMAKE_CONFIGURATION_TYPES is used to tell CMake which configurations to put in the workspace.
  • For Makefile generators, only one configuration can be active at a time; it is specified by the CMAKE_BUILD_TYPE variable. The corresponding flags (CMAKE_CXX_FLAGS_<ConfigName>) are added to the compile lines. Makefiles do not create per-configuration subdirectories for object files, so building both debug and release trees requires two separate build directories and CMake's out-of-source build feature, for example: (ccmake ../MyProject -DCMAKE_BUILD_TYPE:STRING=Debug) or (ccmake ../MyProject -DCMAKE_BUILD_TYPE:STRING=Release)




Saturday, 23 March 2013

C/C++ Programming Tips

Here, some useful tips or unknowns of C/C++ programming are listed for further reference:

1. Stand alone curly brackets in code.
Sometimes a pair of curly brackets can be found in a C/C++ code file. It seems to be useless:

Configurator::only ().dump();

{
    strbuf x = strbuf ("starting: ");
    for (int i = 0; i < argc; i++) {
        x << argv[i] << " ";
    }
    x << "\n";
    info << x;
}

The curly brackets here are not strictly necessary. However, they make sense from two different points of view: the scope perspective and the legacy perspective.
From the scope perspective, the code inside the pair of curly brackets is in a new sub-scope, so x exists only until the closing bracket.
From the legacy perspective, older C standards require variable declarations to be at the very start of a block. The curly brackets create a new "beginning".
I am working on a C++ project, so the original coder was thinking from the scope perspective.
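
The scope perspective can be illustrated with a small C sketch. The function name scope_demo and its variables are made up for illustration; the point is only that names declared inside the brackets vanish at the closing bracket and never disturb the outer ones.

```c
#include <stdio.h>

/* Hypothetical helper: returns the outer value to show it survives the block. */
int scope_demo(void)
{
    int total = 1;

    {   /* a new scope starts here */
        int total = 100;          /* shadows the outer total */
        int scratch = total * 2;  /* exists only inside the braces */
        printf("inner: %d %d\n", total, scratch);
    }   /* inner total and scratch are gone from here on */

    return total;   /* the outer variable, untouched by the block */
}
```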

Sunday, 21 October 2012

Thread-local storage

Thread-local storage (TLS) is a programming method that makes static and global variables local to a thread. "errno" is a canonical example of the TLS method. Conceptually, "errno" is a static variable that all functions use to report the error status. However, each thread of a running process needs its own separate copy, so that it is not polluted by other threads in the same process.

The alternative way to achieve something similar is to protect a static or global variable with a lock. However, a single thread that is careless about acquiring the lock would affect all the other threads.

Tuesday, 12 June 2012

How to estimate the order of magnitude of Factorial product

The factorial of a number is usually of a very large order of magnitude, especially by computer standards. To approximate the magnitude of a factorial, we need a way to represent it that fits into computer memory. In my example I am estimating the factorial of 200, which far exceeds the capacity of any built-in C type. A uniform way to represent this big number is important for the estimation. The only way to represent such a gigantic number is through smaller numbers that are within the capacity of the C types.

My method is to represent the number through the prime numbers that are not larger than it. The big number can be factorised into these primes, so it can easily be represented as a product of powers of these primes. For example:

10! = 1 x 2 x 3 x 4 x 5 x 6 x 7 x 8 x 9 x 10
    = 1 x 2 x 3 x ( 2 x 2 ) x 5 x ( 2 x 3 ) x 7 x ( 2 x 2 x 2 ) x ( 3 x 3 ) x ( 2 x 5 )
    = 2^8 x 3^4 x 5^2 x 7

In this case, the factorial of 10 can be represented by four much smaller prime numbers, 2, 3, 5 and 7, together with their exponents. Although it is tedious to repeat the same procedure for the factorial of 200 on paper, the result is not difficult to estimate, because the only difference between the factorial of 200 and that of 10 is the number of primes involved. According to the list of prime numbers on Wikipedia, there are only 46 prime numbers smaller than 200. The factorial of 200 is easily represented by these primes with their exponents. The problem of finding all the primes smaller than a given number is outside the scope of this thread; I will discuss it in more detail in another thread.

Once we have the prime-number representation of the factorial, the next task is to estimate its magnitude from this representation.

In scientific notation, a big number is written with the radix point placed immediately after the first significant digit, for example 1.23456 x 10^20, and what concerns us is the power of ten that follows. The magnitude of this number is ten raised to the power of 20. In other words, talking about the magnitude of a number means counting its decimal digits after the first one. The example above is for a decimal number. For a binary number the same idea applies, except that powers of two are counted instead.

I will give the method for estimating the decimal magnitude first. To count the powers of ten, we take the base-10 logarithm of the original number. The prime-number representation makes this convenient to calculate:

log10(10!) = log10(2^8 x 3^4 x 5^2 x 7)
           = log10(2^8) + log10(3^4) + log10(5^2) + log10(7)
           = 8 x log10(2) + 4 x log10(3) + 2 x log10(5) + log10(7)

The logarithms of these small primes are easy to calculate, and the number of powers of ten can be evaluated from the sum. For 10! the sum is about 6.56, which matches 10! = 3628800, a number of the order of 10^6.

For a binary magnitude, log2() should be used instead to count the powers of two.
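
The exponents in the prime-number representation do not have to be found by expanding every factor by hand. One way to get them, which the text above does not use and which I add here as an assumption, is Legendre's formula: the exponent of a prime p in n! is floor(n/p) + floor(n/p^2) + floor(n/p^3) + ... The function name below is hypothetical.

```c
/* Exponent of the prime p in the factorisation of n!, via Legendre's
 * formula: floor(n/p) + floor(n/p^2) + floor(n/p^3) + ...
 * The caller must pass p >= 2, otherwise the loop never ends. */
unsigned int prime_exponent_in_factorial(unsigned int n, unsigned int p)
{
    unsigned int exponent = 0;

    while (n > 0) {
        n /= p;          /* n now holds floor(original n / p^k) */
        exponent += n;
    }
    return exponent;
}
```

For n = 10 this gives 8, 4, 2 and 1 for the primes 2, 3, 5 and 7, the same exponents as in the expansion above. The magnitude estimate is then the sum of exponent x log10(p) over all primes p up to n, which can be computed with log10() from <math.h>.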

Monday, 16 April 2012

BE CAUTIOUS when you are using 'strncpy'

A seemingly random error happened quite frequently in my experiments, but it did not show up in every run, which was really frustrating. The PROBLEM: an extra number was appended to the real number. Because it happened randomly, with no recognisable pattern in the situations where it appeared, the root cause was difficult to spot.

All I could do at this point was to skim through the code where it probably came from. Because this code is written in pure C rather than C++, the most likely cause was a bad memory pointer.

After searching the Linux man pages, I found the reason: the use of 'strncpy'.

When 'strcpy' (without the 'n') copies characters to a destination C string, the null terminator is copied to the destination as well. There are no problems at all.

With 'strncpy', however, we need to think about the null terminator. As the warning in the man page states, if there is no null terminator among the first 'n' characters of the source string, the function does not append '\0' to the destination. This has nothing to do with the size of the destination string; we only need to bear in mind the value of n and the first n characters of the source string.

SOLUTION: add the '\0' terminator to the destination string manually.
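
The solution can be wrapped in a small helper so that the terminator is never forgotten. This is a minimal sketch; the name copy_terminated is hypothetical and not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes) and
 * always leave dst NUL-terminated, truncating src if it does not fit. */
void copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return;                       /* no room even for the terminator */

    strncpy(dst, src, dst_size - 1);  /* copies at most dst_size - 1 characters */
    dst[dst_size - 1] = '\0';         /* strncpy does not add this when src is too long */
}
```

With a 4-byte destination and the source "123456", the destination ends up holding "123" plus the terminator instead of four unterminated characters.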

Monday, 16 January 2012

Terminal Colors

Here's a list of different colors:
30 black foreground
31 red foreground
32 green foreground
33 brown foreground
34 blue foreground
35 magenta (purple) foreground
36 cyan (light blue) foreground
37 gray foreground

40 black background
41 red background
42 green background
43 brown background
44 blue background
45 magenta background
46 cyan background
47 white background

Commands can also be combined using a semicolon, like so:


printf("\033[45;37mGrey on purple.\033[0m");



Finally here is a list of other neat commands that go at the end (where the '0' is):
0 reset all attributes to their defaults
1 set bold
5 set blink
7 set reverse video
22 set normal intensity
25 blink off
27 reverse video off


=====================================================
Another one:


\033[22;30m - black
\033[22;31m - red
\033[22;32m - green
\033[22;33m - brown
\033[22;34m - blue
\033[22;35m - magenta
\033[22;36m - cyan
\033[22;37m - gray
\033[01;30m - dark gray
\033[01;31m - light red
\033[01;32m - light green
\033[01;33m - yellow
\033[01;34m - light blue
\033[01;35m - light magenta
\033[01;36m - light cyan
\033[01;37m - white

Saturday, 19 November 2011

Use LD_PRELOAD to load another version library before running

LD_PRELOAD is a fantastic tool for debugging code or libraries.

This post simply records someone else's work on LD_PRELOAD and its description. It is good enough as it is, so I have not bothered to modify it:


LD_PRELOAD fun

Here is a welcome digression from my previous Twitter oriented posts. I’m starting to play around with the LD_PRELOAD feature in the Linux dynamic linker. For those who might not know what this feature is, here is the description from ld.so (8).

LD_PRELOAD
              A whitespace-separated list of additional,  user-specified,  ELF
              shared  libraries  to  be loaded before all others.  This can be
              used  to  selectively  override  functions   in   other   shared
              libraries.   For  setuid/setgid  ELF binaries, only libraries in
              the standard search directories that are  also  setgid  will  be
              loaded.
So in practical terms, any libraries you specify in the LD_PRELOAD environment variable will be loaded before any system libraries. This means that dynamic symbols in a loading program will be first searched in those libraries before being searched anywhere else. This means you can override any defined symbol you want in standard libraries.
Let’s start with a rather juvenile example. This will change the behavior of the read (2) function in order to make the user believe a file might have a different content.

 
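#include <string.h>
#include <unistd.h>

ssize_t read(int fd, void *buf, size_t count) {
    static int done = 0;
    (void)fd;  /* the real descriptor is ignored on purpose */
    if (!done) {
        char silly_str[] = "Haha you got overriden.\n";
        /* sizeof - 1 so that the terminating NUL is not handed to the caller */
        size_t len = sizeof(silly_str) - 1;
        size_t s = count > len ? len : count;
        memcpy(buf, silly_str, s);
        done = 1;
        return s;
    }
    else return 0;
}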
If you compile this inside a library that is called, for example, libread.so, you can test this code by running:
> /bin/cat /etc/fstab
# /etc/fstab: static file system information.
#
...
> LD_LIBRARY_PATH=. LD_PRELOAD=libread.so /bin/cat /etc/fstab
Haha you got overriden.
That in itself is just a rather silly prank you can play on your friend’s computer if you happen to have access to it. Experienced programmer will start seeing potential uses for LD_PRELOAD. I am getting to that.
The subject of our next example will be the honorable ls (1). ls uses the opendir (3) function to open a directory and browse its files. It should react properly if it can’t open the directory. One way to test this is to make opendir() return NULL and observe how the caller reacts. You can do that using LD_PRELOAD.

DIR *opendir(const char *name) {
    return NULL;
}
 
> LD_LIBRARY_PATH=. LD_PRELOAD=libls1.so /bin/ls /tmp
/bin/ls: cannot open directory /tmp
What can you do now if you want to preserve part of the behavior of the function, or modify the result it returns? Your preloaded library will then need to use libdl to dynamically load the function whose behavior it wants to modify.
The following example is a very simple override of the opendir (3) function which opens a different directory than the one the caller expects. I will explain this function in more detail below.

#define _GNU_SOURCE   /* needed for RTLD_NEXT */
#include <dirent.h>
#include <dlfcn.h>

DIR *opendir(const char *name) {
    DIR *(*libc_opendir)(const char *name);
    *(void **)(&libc_opendir) = dlsym(RTLD_NEXT, "opendir");
    return libc_opendir("/tmp");
}

libdl is fortunately very simple to use. The naive approach would be to use dlopen (3) to open the C library, then get the pointer to the function you are calling using dlsym (3). In theory, this technique is valid and working, but doing that circumvents the LD_PRELOAD mechanism, because preloaded libraries can be chained and calling directly into the C library prevents other callers from overriding our own function.
In practice, calling dlopen() on libc on an Ubuntu Karmic system made some programs crash and burn for reasons I will not attempt to explain. The next technique should be preferred on Linux systems, especially when dealing with the system C library.
dlsym() has an option that makes the Linux dynamic linker search for the right symbol to be overridden. This is the RTLD_NEXT flag, which is meant to be used precisely for writing wrappers around dynamic library functions.
This leaves libdl the task of returning the pointer to the right symbol. The RTLD_NEXT option to dlsym() returns the right symbol.
The next and final example of the use of LD_PRELOAD will still use the valiant ls. In time for Christmas, this will modify the output of ls by randomizing the d_type field returned in the dirent structure by readdir (3). If you use colorized ls output, and I believe most of you probably do, you should see a pretty display of color whenever you list a directory by preloading this function.

struct dirent64 *readdir64(DIR *dir) {
    static struct dirent64 *(* libc_readdir64)(DIR *dir) = NULL;
    struct dirent64 *dent;
    unsigned char rnd_dtype[7] = { DT_UNKNOWN, DT_REG,
                                   DT_DIR, DT_FIFO,
                                   DT_SOCK, DT_CHR,
                                   DT_BLK };
    if (libc_readdir64 == NULL) {
        *(void **)(&libc_readdir64) = dlsym(RTLD_NEXT, "readdir64");
        srand(time(NULL));
    }
    dent = libc_readdir64(dir);
    if (dent != NULL)
        dent->d_type = rnd_dtype[rand() % 7];
    return dent;
}
There is still a problem with this code on my new Ubuntu Hardy machine. The code from the preloaded library hangs before the program terminates. I do not understand why this happen and a search for this bug did not turn up anything. The problem doesn’t happen with Ubuntu Karmic.
There is nothing new about using LD_PRELOAD this way. Several very nice libraries have been built with the intention of modifying the behavior of typical libraries.
  • fakeroot: “fakeroot provides a fake root environment by means of LD_PRELOAD and SYSV IPC (or TCP) trickery.”
  • fakechroot: fakechroot provides a fake chroot environment to programs.
  • libtrash:“[...] the shared library which, when preloaded, implements a trash can under GNU/Linux”
  • cowdancer: cowdancer is an userland implementation of copy-on-write filesystem.
There are 29 projects matching LD_PRELOAD on freshmeat.net. You might have used some of them.
The code I have written for this demonstration is available on BitBucket.
Written by fdgonthier
January 11th, 2010 at 10:10 pm
Posted in Debian,Linux,Programming,Tips and Tricks

reference:  http://www.lostwebsite.net/2010/01/ld_preload-fun/

Timing problem in Linux

Gettimeofday???
Linux provides the 'gettimeofday()' function for checking the epoch time. However, it returns the system's best guess at wall-clock time, which can go backwards. This is why a 'timer_correct' function is embedded in Libevent's event.c file. That function is used when the "MONOTONIC" clock is not in use.


Monotonic clock
In Linux, another function, 'clock_gettime(CLOCK_MONOTONIC)', obtains the monotonic time, where monotonic means that the time returned by this function can never go backwards. In such cases, this is the more reasonable function to use.
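
A minimal sketch of reading the monotonic clock follows. The helper name monotonic_ns is my own; clock_gettime() and CLOCK_MONOTONIC are the POSIX names mentioned above. On older glibc versions the program may additionally need to be linked with -lrt.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* exposes clock_gettime() under strict -std= modes */
#endif
#include <time.h>

/* Hypothetical helper: current CLOCK_MONOTONIC reading in nanoseconds,
 * or -1 if the clock cannot be read. */
long long monotonic_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}
```

Taking the difference between two readings gives an elapsed time that is not affected by changes to the wall clock.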

"POSIX.1-2008 marks gettimeofday() as obsolete, recommending the use of clock_gettime(2) instead."

Wednesday, 5 October 2011

Allocate memory in a function

I used to look for ways to allocate memory inside a function: other functions would invoke it with a null pointer as a parameter, and that pointer would come back filled with some contents.

The silliest way to attempt this is the following:

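#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* BROKEN on purpose: 'pointer' is a local copy of the caller's pointer. */
void fill_function (char * pointer){
    pointer = malloc (16);
    if (pointer != NULL)
        strcpy (pointer, "some contents");
    /* the allocated block leaks when the function returns */
}

int main (void) {
    char * file_content = NULL;
    fill_function(file_content);

    printf("%p\n", (void *) file_content);   /* file_content is still NULL */
    return 0;
}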

What is overlooked here is that malloc decides where the memory is allocated; the variable on the left side of the equals sign has no say in it. Suppose the file_content pointer is initialised with the number 5 (assume this is a memory address). fill_function will not allocate memory starting at 5. malloc finds another free place (for example 102, another memory address), allocates a contiguous block there, and the local pointer variable is set to 102. Because the pointer was passed by value, the calling function main never gets to see this allocated address.

So a more reasonable way is to return the pointer:

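#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char * fill_function (void){
    char * pointer = malloc (16);
    if (pointer != NULL)
        strcpy (pointer, "some contents");

    return pointer;   /* the caller now owns the block */
}

int main (void){
    char * file_content = fill_function();

    if (file_content != NULL) {
        printf("%s\n", file_content);
        free(file_content);
    }
    return 0;
}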

Another problem is releasing the allocated memory. If you are programming in C++, using boost::shared_ptr<> is encouraged in this situation.
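
If the function really has to fill in a pointer that belongs to the caller, as in the first attempt, the usual C approach is to pass the address of that pointer. The sketch below assumes a hypothetical helper called fill_buffer that copies a given text; it is not taken from the code above.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates a copy of text and stores its address in
 * *out. Returns 0 on success, -1 on failure (in which case *out is NULL). */
int fill_buffer(char **out, const char *text)
{
    size_t len = strlen(text) + 1;

    *out = malloc(len);       /* writes through to the caller's pointer */
    if (*out == NULL)
        return -1;
    memcpy(*out, text, len);
    return 0;
}
```

The caller declares char *file_content = NULL; and calls fill_buffer(&file_content, "some contents"). After a successful call, file_content points at the new block and must be released with free().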

Monday, 13 June 2011

Double pointer's usages (ongoing)

The first usage of a double pointer is to allocate memory for a given pointer:

nlp = &res.readdir_res_u.list;
while (d = readdir(dirp)) {
    nl = *nlp = (namenode *)malloc(sizeof(namenode));
    nl->name = strdup(d->d_name);
    nlp = &nl->pNext;
}
*nlp = NULL;

This comes from "Power Programming with RPC", p. 84.

Saturday, 19 February 2011

Usage of u_int32_t and size_t

These kinds of types confused me for quite a long time, because I did not know how they differ from C's primitive types such as 'int'.

C's primitive types such as 'long' and 'int' are machine dependent, which means that different systems may define long and int with different sizes. For example, 16-bit compilers for old x86 machines commonly defined 'int' as 2 bytes, whereas it is 4 bytes on most machines nowadays. Types such as 'u_int32_t' and 'int32_t' were introduced to be system independent. On Fedora, they are defined in the file '/usr/include/sys/types.h', and 'u_int32_t' can only be an unsigned integer type of exactly 32 bits (4 bytes). Using these types makes a program more portable. The C99 standard spelling of the same type is 'uint32_t', declared in <stdint.h>.

Wednesday, 15 December 2010

Discussion on Array size, String length.

This is a review concentrating on sizeof (strictly speaking an operator, not a function) and strlen().

There are several ways for a programmer to define a string in C/C++ programs.

  1. char pointer: char *string; 
  2. char array: char string[100];

To initialize them, the following approaches work.

  • define the string immediately if we already know its content.
        char string[] = "This is what we want to define";
        const char *string = "This is what we want to define";
  • define the string first and assign its content afterwards.
        char string[100];
        string = "This is what we want to define";
            CAUTION:<<This is not allowed in C or C++: an array cannot be assigned to>>
            INSTEAD: strcpy(string, "This is what we want to define");

        const char *string;
        string = "This is what we want to define";

Below, the same string is defined in several different ways in my code. The print out shows the results of sizeof and strlen() for each of them.

Here is the code:

    #include <cstring>
    #include <iostream>

    int main() {
        const char *test;   // a string literal must not be modified
        char test2[100];

        test = "This is what we want to define";
        char buffer[] = "This is what we want to define";
        std::strcpy(test2, "This is what we want to define");

        std::cout << "test sizeof " << sizeof(test) << "\n";      // size of the pointer
        std::cout << "test strlen " << std::strlen(test) << "\n";

        std::cout << "buffer sizeof " << sizeof(buffer) << "\n";  // characters plus '\0'
        std::cout << "buffer strlen " << std::strlen(buffer) << "\n";

        std::cout << "test2 sizeof " << sizeof(test2) << "\n";    // size of the whole array
        std::cout << "test2 strlen " << std::strlen(test2) << "\n";
        return 0;
    }

The print out (from a 32-bit build, where a pointer occupies 4 bytes) is:

   test sizeof 4
   test strlen 30
   buffer sizeof 31
   buffer strlen 30
   test2 sizeof 100
   test2 strlen 30

Another difference between strlen() and sizeof is that strlen() needs a function call at run time to determine the string length, whereas sizeof is evaluated during compilation. The buffer example demonstrates this quite well. The prerequisite, however, is that sizeof actually reflects the string length, which is only the case when the string is defined and initialized the way buffer is. For test, sizeof gives the size of the pointer, and for test2 it gives the size of the whole array. Also bear in mind that sizeof includes the terminating '\0' but strlen() does not.

Hopefully, this makes the usage of strings clear.

Tuesday, 7 December 2010

Parsing Long Options

This topic is covered in the documentation of the GNU C Library (libc).


Here I summarize some useful tips:


== return values of getopt():
  • successful
    • a character (the option name, for an option without argument)
    • a character (the option name), with the argument available through char *optarg
  • failed
    • '?' (option not included in the options string OR argument missing) (int optopt keeps the offending character)
  • -1 when all options have been processed

== return values of getopt_long():

  • successful
    • short options
      • (same as getopt())
    • long options
      • content of val (flag == NULL) (Tip: put the corresponding short option character in val)
      • 0 (flag != NULL; the content of val is stored in *flag)
      • (in both cases above, the argument, if any, is stored in optarg)
  • failed
    • (same as getopt())
  • -1 when all options have been processed
    PS: *indexptr records the index of the matched long option in the array of struct option.

Tuesday, 30 November 2010

Application of Double pointer in C

1. As a parameter, so that a function can modify the pointer argument passed to it.

2. To build multidimensional arrays.
      For details, see: http://landerchan.blogspot.com/2008/11/dynamically-allocating-multidimensional.html

Tuesday, 23 November 2010

The meaning of printf(_("Some Strings"))

This is related to internationalized programming and GNU i18n. _() is conventionally defined as a macro for the gettext() function. More details can be found here:
http://en.wikipedia.org/wiki/GNU_gettext