Working diary: 2018

Saturday, 9 June 2018

Google Test

Embed Google Test into my work as a formal Unit Testing environment

It's time to bring Google Test into my code base and see whether it makes my development work more efficient or more cumbersome.
(This is just my learning note, and it includes a lot of original content from the Google Test Primer. Thank you to the Google Test development team.)

Installation

It seems that no installation is required. The Google Test module does not seem too expensive to compile, so it may be a good idea to just keep a portable copy in my source tree.

I believe this is the design point, portability, that the Primer emphasises at the beginning.

<But it still needs a small amount of work to link Google Test into Make, which is my current build tool. I am planning to change to CMake, though.>

Concept

Google Test defines 'Test Case' and 'Test' differently from the ISTQB (International Software Testing Qualifications Board) for historical reasons. The legacy naming is kept because it is difficult to change everything in the code and, more importantly, to do so without breaking anything!!! So:

Google Test term = ISTQB term

TEST() (a Test) = Test Case

Test Case = Test Suite

There are three assertion results:

success: nothing to say
nonfatal failure (EXPECT_*): the test keeps going
fatal failure (ASSERT_*): the current function stops at the point where the fatal failure was produced

A test case contains at least one test. It is encouraged to group tests into test cases that reflect the structure of the tested code.

A Test program can contain multiple test cases.

Assertions

ASSERT_EQ(x, y) << "extra message to be reported";

for (int i = 0; i < x.size(); ++i) {

EXPECT_EQ(x[i], y[i]) << "Vectors x and y differ at index " << i;

}

Anything that can be streamed to an ostream can also be streamed to an assertion with << as a custom failure message.

The following are some common assertions; their names are self-explanatory:

[ASSERT | EXPECT]_TRUE (condition)

[ASSERT | EXPECT]_FALSE (condition)

[ASSERT | EXPECT]_EQ (val1, val2)

[ASSERT | EXPECT]_NE (val1, val2)

[ASSERT | EXPECT]_LT (val1, val2)

[ASSERT | EXPECT]_LE (val1, val2)

[ASSERT | EXPECT]_GT (val1, val2)

[ASSERT | EXPECT]_GE (val1, val2)

<WARNING: the following string assertions should only be used to compare C strings. For string objects, the Primer recommends using the value assertions above.>

[ASSERT | EXPECT]_STREQ (str1, str2)

[ASSERT | EXPECT]_STRNE (str1, str2)

[ASSERT | EXPECT]_STRCASEEQ (str1, str2)

[ASSERT | EXPECT]_STRCASENE (str1, str2)

Simple Tests

Create a test in a test case with the TEST() macro:

TEST(testCaseName, testName) {

... test body ...

}

For example, for a given function, which is to be tested:

int Factorial(int n);

The following tests can be created:

// Test factorial of 0.

TEST(FactorialTest, HandlesZeroInput) {

EXPECT_EQ(1, Factorial(0));

}

// Test factorial of positive numbers:

TEST(FactorialTest, HandlesPositiveInput) {

EXPECT_EQ(1, Factorial(1));

EXPECT_EQ(2, Factorial(2));

EXPECT_EQ(6, Factorial(3));

EXPECT_EQ(40320, Factorial(8));

}

So two related tests, HandlesZeroInput and HandlesPositiveInput, have been created in the test case FactorialTest.

Test Fixtures

From my understanding, a test fixture is just a test class with an extra context (shared objects and set-up) in which all of its tests are supposed to run.

The following steps create a test fixture:

Derive a class from ::testing::Test and start its body with protected:, so that its members can be accessed from sub-classes.
Inside the class body, declare whatever objects the tests will use (have fun!!).
To make any preparation, write a default constructor or a SetUp() function.
To do any clean-up, write a destructor or a TearDown() function.
Define subroutines if necessary.

Constructor/Destructor vs SetUp()/TearDown():

Both give per-test set-up and tear-down logic: for each test, a fresh fixture object is created and then destroyed.
Normally, the constructor/destructor is preferred:

A constructor can initialise const members of the fixture.
If the fixture is sub-classed, the base-class constructor and destructor are called automatically, so there is no risk of forgetting to call the base class's SetUp()/TearDown().

In some rarer cases, SetUp()/TearDown() is preferred:

Code that can throw an exception should not live in a destructor; put it in TearDown() instead.
Virtual calls made from a constructor or destructor are not dispatched to overrides in a derived class. If the set-up or clean-up needs to call a method that a derived class overrides, use SetUp()/TearDown().

For example, to test the following Queue class:

template <typename E>

class Queue {

public:

Queue();

void Enqueue(const E& element);

E* Dequeue();

size_t size();

...

};

To create a test fixture for the testing:

class QueueTest : public ::testing::Test{

protected:

virtual void SetUp() {

q1_.Enqueue(1);

q2_.Enqueue(2);

q2_.Enqueue(3);

}

// virtual void TearDown() {}  // nothing to clean up here

Queue<int> q0_;

Queue<int> q1_;

Queue<int> q2_;

};

To create tests for the above test fixture:

TEST_F(QueueTest, IsEmptyInitially){

EXPECT_EQ(0, q0_.size());

}

TEST_F(QueueTest, DequeueWorks){

int* n = q0_.Dequeue();

EXPECT_EQ(NULL, n);

n = q1_.Dequeue();

ASSERT_TRUE(n != NULL);

EXPECT_EQ(1, *n);

EXPECT_EQ(0, q1_.size());

delete n; // test has responsibility to clean n!!

n = q2_.Dequeue();

ASSERT_TRUE(n != NULL);

EXPECT_EQ(2, *n);

EXPECT_EQ(1, q2_.size());

delete n;

}

What happens behind the scenes when the above tests run:

1. A QueueTest object is constructed (e.g. t1).
2. t1.SetUp() is called.
3. The IsEmptyInitially test runs on t1.
4. t1.TearDown() is called.
5. t1 is destructed.
6. Steps 1 - 5 repeat for another QueueTest object (t2), this time running the DequeueWorks test.

Invoke tests

TEST() and TEST_F() implicitly register their tests.
Use RUN_ALL_TESTS() to run all tests. It returns 0 if all tests succeed and 1 otherwise.
Don't ignore the return value of RUN_ALL_TESTS(); main() should return it.
Call RUN_ALL_TESTS() only once. Calling it more than once is not supported by Google Test.

main() function of the Google Test:

int main (int argc, char **argv) {

::testing::InitGoogleTest(&argc, argv);

return RUN_ALL_TESTS();

}

::testing::InitGoogleTest() parses the command-line arguments passed to main and removes the flags that it recognises. This gives the user the flexibility to control the tests via flags.

By linking the tests with the gtest_main library, the user does not need to write a main() for the tests.

Conclusion

That covers the whole Primer for Google Test. There are still more features to investigate, and they will be covered in my following notes.

Follow up

Having run the sample tests in the Google Test release package, all is good, at least for sample1, which is the only sample I have tried :(.

However, after embedding Google Test into my own project, things got strange. The error message is as follows:

Undefined symbols for architecture x86_64:
"std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > space_tokeniser<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >)", referenced from:
(anonymous namespace)::StrTokeniserTest_SpaceTokeniserTest_Test::TestBody() in string_tokeniser_unittest.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [string_tokeniser_unittest] Error 1

My initial thought was that Google Test and my code were being linked against different libraries, which is what most Stack Overflow contributors reply to this message on macOS.

However, on my Linux box the problem persists and produces another error message, so I believe those replies do not fit my situation.

In the end, a Stack Overflow thread on a common linking error (undefined-reference-to-template-class-constructor) turned out to be very helpful. The solution is to move the template function definition into the .h file, replacing its original declaration there, so that the definition is visible wherever the template is instantiated.

Thanks to Arron, who provided the details of the solution and its description. It's always good to solve a problem in the end, even though it cost a lot of time.

Tuesday, 15 May 2018

Apache HTTP Server on Ubuntu

The purpose of this document is to record all the steps to install the Apache HTTP server and configure it as per various requirements. Nothing exciting happened during this process. It is closely tied to the Ubuntu system and the specification of my own box. When it is referred to by others to install and configure Apache on a Linux system, bear in mind the differences in settings.

Install Apache in Ubuntu 16.04 LTS

The installation here puts Apache into my Ubuntu system with the minimum requirements to make it run properly. Advanced features of the HTTP server are not discussed at this stage; they come in the later section of this document.

Advanced features of an Apache server

CGI:

Copy the content of the CGI configuration file (/etc/apache2/conf-available/serve-cgi-bin.conf) to the end of the configuration file of the Virtual Host (/etc/apache2/sites-available/<myhost_name>.conf), because I am using a Virtual Host in the Apache server.
The only parts that need to change are the ScriptAlias and the Directory.

ScriptAlias /cgi-bin/ /var/www/<myhost_name>/public_html/cgi-bin/
<Directory "/var/www/<myhost_name>/public_html/cgi-bin">

Reload (or restart) the Apache server so that the running server picks up the configuration change.
The URL to access the cgi file (helloworld.cgi) is: http://<myhost_name>/cgi-bin/helloworld.cgi

Saturday, 12 May 2018

libcurl in Linux

Learn to use libcurl in Linux

The objective of learning libcurl is to implement an HTTP access facility, in C++, in my crawler program for a set of specialised websites. I have not yet found a more appropriate HTTP request library for this purpose, so libcurl is my first try. It will be tested and checked against this specific objective during the whole learning process. It is also worth searching for alternatives that may suit the purpose better. If one turns up, a new thread will be created. This thread concentrates only on learning libcurl.

Create a learning environment (development environment):

I will use Linux (Ubuntu) as the main environment for learning and testing libcurl. To shorten the preparation phase and make the learning experience more joyful, apt-get is relied on to install all the necessary development packages of the relevant libraries. The following packages are installed beforehand:

libssl-dev
libssl-doc
libcurl4-openssl-dev
libcurl4-openssl-doc

These are the exact package names that apt-get install requires. The learning environment is as follows:

OS: Ubuntu 16.04 LTS (64-bit)
Compilers: GCC 5.4, G++ 5.4
CPU: i5-3570K
Memory: 8GB
Hard Drive: 256GB SSD

Strangely enough, my libcurl and ssl are installed under the anaconda3 directory in my home directory. I have no idea why this happened.

Initialisation before everything:

A global initialisation of the library is necessary, using the curl_global_init() function. There is also a corresponding clean-up function, curl_global_cleanup(). Keep in mind that these two functions are NOT thread safe, even though most libcurl components are. They are expected to be invoked ONLY once in the entire lifetime of my program.

Run-time feature detection:

The structure returned by curl_version_info() contains the details of what the currently running libcurl supports.

Easy-Interface vs Multi-Interface:

The easy interface performs synchronous transfers with blocking function calls. The multi interface performs asynchronous transfers with non-blocking function calls, which allows multiple simultaneous transfers. The easy interface comes first in the following sections.

Easy-Interface:

All easy interface functions have the same prefix: 'curl_easy'

Handle: use one handle for each session, and use each handle in only one thread. DO NOT share a handle across multiple threads.
Options:

Setting: the function curl_easy_setopt() sets options on a handle. Options are sticky: they stay in effect until they are given a different value.
Resetting: curl_easy_reset() blanks all previously set options.
Copy: curl_easy_duphandle() produces another handle with the same option settings.

Write back the result:

Write function:

If the option CURLOPT_WRITEFUNCTION is set, the response is processed by the given write function, which has the signature size_t func_name(void *buffer, size_t size, size_t nmemb, void *userp).
If the option CURLOPT_WRITEDATA is set to a pointer to a given structure, that pointer is passed to the write function as the fourth parameter (userp).

No Write function:

Output to stdout: if no write function is given, the response goes to stdout by default.
Output to a file: if an opened file handle (FILE *) is passed to the curl handle as the CURLOPT_WRITEDATA option, the response is stored in the file rather than sent to stdout.
<WARNING>: on some systems, passing an opened file handle with CURLOPT_WRITEDATA can crash libcurl.

Make the transfer:

The function curl_easy_perform() connects to the remote site, issues the necessary commands and receives the response.
The given write function may get one byte at a time, or it may get many kilobytes at once. libcurl delivers as much as possible, as often as possible.
The perform function returns a status code. In addition, the CURLOPT_ERRORBUFFER option can provide a buffer in which libcurl stores a human-readable error message.
It is encouraged to re-use the transfer handle.

EXAMPLE and things to notice:

Not only -lcurl is required: -lssl and -lcrypto are also required, to link against the OpenSSL libraries (libssl.so and libcrypto.so). If they are not provided, the following error messages may be shown:

no -lssl: lib/libcurl.so: undefined reference to `SSLv2_client_method'
no -lcrypto: lib/libssl.so: undefined reference to `EVP_idea_cbc'

USE_CASEs:

Upload data to a remote site

Read callback function: size_t read_function(char *bufptr, size_t size, size_t nitems, void *userp) supplies libcurl with the data to be transferred to the remote site.
Set the read function: curl_easy_setopt(easyhandle, CURLOPT_READFUNCTION, read_function);
Set custom user data to be passed to the read function, if needed: curl_easy_setopt(easyhandle, CURLOPT_READDATA, &filedata)
Tell libcurl that the transfer is an upload: curl_easy_setopt(easyhandle, CURLOPT_UPLOAD, 1L)
<WARNING>: a few protocols require the expected file size to be known before the transfer. It can be set by curl_easy_setopt(easyhandle, CURLOPT_INFILESIZE_LARGE, file_size), where file_size must be of type curl_off_t.

Providing username and password

Username and password can be provided in the URL: http://myname:thesecret@example.com/path
They can also be provided by setting handle's option:

curl_easy_setopt(easyhandle, CURLOPT_USERPWD, "myname:thesecret"). This is the same as providing the username and password in the URL.
curl_easy_setopt(easyhandle, CURLOPT_PROXYUSERPWD, "myname:thesecret"). This provides the username and password for the proxy only.
Back when UNIX was at the height of its popularity, the file $HOME/.netrc was commonly used to keep a user's FTP credentials (username and password) in plain text. libcurl can use this file not only for FTP but also for HTTP: curl_easy_setopt(easyhandle, CURLOPT_NETRC, 1L)
The form of the .netrc file is as follows:

machine myhost.mydomain.com

login userlogin

password secretword

Multi-Interface:

Cautions:

libcurl does no internal thread synchronisation, even though it is thread safe.
Handles: never use a single handle from more than one thread at any given time. Handles can be passed around among threads, as long as only one thread uses a given handle at a time. (Passing a handle around without using it looks useless to me.)
Shared objects: certain data can be shared between multiple handles by using the share interface. libcurl does not provide the locking internally, so the lock and unlock functions must be supplied with curl_share_setopt():

CURLSHOPT_LOCKFUNC
CURLSHOPT_UNLOCKFUNC

DEBUGGING:

Deal with run-time errors:

CURLOPT_VERBOSE (set to 1): makes libcurl spew out the entire protocol details it sends, some internal info and some received protocol data.
CURLOPT_HEADER (set to 1): for HTTP, includes the headers in the normal body output.
CURLOPT_DEBUGFUNCTION: for the situation where CURLOPT_VERBOSE is not enough.

Tuesday, 9 January 2018

Python unittest: everything needed to work

Unit testing is essential for every project, even though it tends to be neglected when a project comes under time pressure. The following sections list the basic steps for using the unittest module in a Python project:

Test theory:

Test fixture: this is the preparation needed before tests and any associated cleanup actions.
Test case: this is the most fundamental test unit.
Test suite: this is a collection of test cases and/or test suites.
Test runner: this is the outermost part of the test system. It orchestrates the execution of tests and provides the outcomes to the user.

Here is the potential structure of these components in a project:

How to run unittest from command-line?

Here is an example of running unittest from command-line for the project Lux:

python -m unittest discover -s tests -p . -f -v

This is explained in more detail below, along with further options.

-m: run the 'unittest' library module as a script.

-v: verbose
-f (--failfast): stop the test run on the first error or failure.
-b (--buffer): the stdout and stderr streams are buffered during the test run. The output is discarded for passing tests and echoed normally for a failing or erroring test.
-c (--catch): the first Control-C waits for the currently running test to finish and then reports all results so far. A second Control-C raises the normal KeyboardInterrupt exception.
--locals: show local variables in tracebacks.

-s and -p are discovery mode parameters.

Accepted input file list:

The unittest module can be run from the command line in the following different ways.

Class or Module:

python -m unittest test_module1 test_module2
python -m unittest test_module.TestClass
python -m unittest test_module.TestClass.test_method

File with path:

python -m unittest tests/test_something.py

Test Discovery:

python -m unittest

Here no input arguments are provided. This puts the unittest module into discovery mode, in which unittest searches for test case classes by itself in the given directory, or in the current directory by default.

Test Discovery

Running requirements:

All of the test files must be modules or packages (including namespace packages) importable from the top-level directory of the project!!! (The top-level directory can be specified with the -t option.)

Arguments options:

-s (--start-directory): Directory to start discovery (. default)

-p (--pattern): Pattern to match test files (test*.py default)

-t (--top-level-directory): Top level directory of project (default to start directory)
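The three discovery options map onto the arguments of TestLoader.discover(), so discovery can also be tried from code. The following is only a rough sketch: the temporary project directory, the module name test_diary_sample and the SampleTest class are made up for illustration and are not part of the Lux project.

```python
import io
import os
import sys
import tempfile
import unittest

MODULE_NAME = "test_diary_sample"  # hypothetical module name, matches test*.py

SAMPLE_TEST_SOURCE = (
    "import unittest\n"
    "\n"
    "class SampleTest(unittest.TestCase):\n"
    "    def test_truth(self):\n"
    "        self.assertTrue(True)\n"
)


def discover_in_temp_project():
    """Write one test file into a throw-away directory and discover it."""
    with tempfile.TemporaryDirectory() as project_dir:
        path = os.path.join(project_dir, MODULE_NAME + ".py")
        with open(path, "w") as handle:
            handle.write(SAMPLE_TEST_SOURCE)
        loader = unittest.TestLoader()
        try:
            # Roughly: python -m unittest discover -s DIR -p "test*.py" -t DIR
            suite = loader.discover(start_dir=project_dir,
                                    pattern="test*.py",
                                    top_level_dir=project_dir)
            runner = unittest.TextTestRunner(stream=io.StringIO())
            result = runner.run(suite)
            return suite.countTestCases(), result.wasSuccessful()
        finally:
            # Undo the import side effects so the sketch can be run again.
            sys.modules.pop(MODULE_NAME, None)
            if project_dir in sys.path:
                sys.path.remove(project_dir)


if __name__ == "__main__":
    print(discover_in_temp_project())
```

The function returns the number of discovered tests and whether the run was successful. A file that does not match the pattern (for example sample_test.py) would simply not be picked up.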

Prepare the basic test unit: a Test Case

Subclass of TestCase

import unittest

class DefaultWidgetSizeTestCase(unittest.TestCase):
    def test_default_widget_size(self):
        widget = Widget('The widget')
        self.assertEqual(widget.size(), (50, 50))

The test (test_default_widget_size) is labelled as a failure if an assertion method such as assertEqual() raises its exception (AssertionError). Any other exception is labelled as an error.

import unittest

class WidgetTestCase(unittest.TestCase):
    def setUp(self):
        self.widget = Widget('The widget')

    def tearDown(self):
        self.widget.dispose()

setUp() and tearDown() are used to prepare and clean up tests. They are called for every single test.

If setUp() raises an exception, the test is reported as an error and the test method is not run.

If setUp() succeeded, tearDown() is run whether the test method succeeded or not.

To use the test suite:

def suite():
    suite = unittest.TestSuite()
    suite.addTest(WidgetTestCase('test_default_widget_size'))
    suite.addTest(WidgetTestCase('test_widget_resize'))
    return suite

if __name__ == '__main__':
    runner = unittest.TextTestRunner()
    runner.run(suite())

FunctionTestCase

This is a way to reuse old test code from the dark ages before systematic testing components. This is where FunctionTestCase comes into consideration.

FunctionTestCase is a subclass of TestCase that wraps an existing test function. Set-up and tear-down functions can also be provided:

def testSomething():
    something = makeSomething()
    assert something.name is not None
    # ...

testcase = unittest.FunctionTestCase(testSomething,
                                     setUp=makeSomethingDB,
                                     tearDown=deleteSomethingDB)

Even though FunctionTestCase provides an alternative way to construct a test case, it is not a recommended solution.

Skip a test

The following decorators implement test skipping and expected failures:

@unittest.skip(reason): Unconditionally skip the decorated test. reason should describe why the test is being skipped.

@unittest.skipIf(condition, reason): Skip the decorated test if condition is true.

@unittest.skipUnless(condition, reason): Skip the decorated test unless condition is true.

@unittest.expectedFailure: Mark the test as an expected failure. If the test fails when run, the test is not counted as a failure.

exception unittest.SkipTest(reason): This exception is raised to skip a test.

setUp() and tearDown() will not be run for skipped tests. setUpClass() and tearDownClass() will not be called for skipped test classes.
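A short sketch of the decorators and the SkipTest exception in use follows. The SkipDemo class, its test names and the Python-version conditions are invented for illustration, and the sketch assumes it is run under Python 3.

```python
import io
import sys
import unittest


class SkipDemo(unittest.TestCase):
    @unittest.skip("demonstrating an unconditional skip")
    def test_always_skipped(self):
        self.fail("never runs")

    @unittest.skipIf(sys.version_info >= (3, 0), "skipped on Python 3")
    def test_skipped_on_python3(self):
        self.fail("never runs on Python 3")

    @unittest.skipUnless(sys.version_info >= (3, 0), "requires Python 3")
    def test_runs_on_python3(self):
        self.assertTrue(True)

    @unittest.expectedFailure
    def test_known_bug(self):
        # Fails on purpose; reported as an expected failure, not a failure.
        self.assertEqual(1, 2)

    def test_skip_at_runtime(self):
        # Raising SkipTest inside the body skips the test as well.
        raise unittest.SkipTest("decided to skip inside the test body")


def run_skip_demo():
    """Run SkipDemo quietly and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SkipDemo)
    return unittest.TextTestRunner(stream=io.StringIO()).run(suite)


if __name__ == "__main__":
    result = run_skip_demo()
    for test, reason in result.skipped:
        print("skipped:", test.id(), "-", reason)
    print("expected failures:", len(result.expectedFailures))
```

Under Python 3 the result holds three skipped tests and one expected failure, and the run as a whole still counts as successful.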

Subtest

This is specifically for tests that iterate over many values. Without subtests, such a test terminates at the first failure, often without showing which iteration failed. Wrapping each iteration in a subtest (the subTest() context manager) keeps the loop running to the end and reports every failing iteration, so abnormalities across the whole iteration are visible.

That is pretty much the essential background for using the Python unittest module. Most of the above content comes from the Python manual. Please refer to it for more details.