May 5, 2018

Exploring an Integer Overflow

I was lucky to find a ticket against Boost.Regex describing an integer overflow. Never having worked on Boost, I took the opportunity to create a patch.

Boost.Regex became the basis for std::regex in C++11, so I don’t know if my pull request will ever get merged. I provided a patch for the experience, and because the reporter may not be able to use C++11.

The source code:

  std::ptrdiff_t states = re.size();
  if(states == 0)
    states = 1;
  states *= states; // overflows here on 32 bit platforms if regex
                    // string length greater than 2**16

There are two challenges in tackling this bug.

  1. The signed integer overflow.

    Clearly, $2^{16} \times 2^{16}$ is $2^{32}$, so an integer overflow occurs. Unfortunately, states is a signed integral type, so the overflow occurs when the value of states is greater than $\sqrt{2^{31} - 1}$, or about 46340.

  2. Implementation-defined behaviour in some circumstances during initialization.

    It is implementation-defined because converting an unsigned value to a signed type preserves the value only if the signed type can represent it. Otherwise the result is implementation-defined.

    The declaration of re.size() gives it a return type of std::size_t. Unfortunately, whenever re.size() returns a value greater than std::numeric_limits<std::ptrdiff_t>::max(), the value of states is implementation-defined.

The issue exists on 64-bit platforms as well. Only the magnitude of the values changes.
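The thresholds for both platforms can be checked with a short sketch. The helper name largest_safe_operand is hypothetical and not part of Boost.Regex. It searches for the largest value whose square still fits in a signed type, using division so the search itself never overflows.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: largest n such that n * n still fits in
// the signed integral type T.
template <typename T>
T largest_safe_operand() {
  const T max = std::numeric_limits<T>::max();
  T lo = 1;
  T hi = max;
  while (lo < hi) {
    // Round the midpoint up so the loop always makes progress.
    const T mid = lo + (hi - lo) / 2 + 1;
    if (mid <= max / mid) {  // same test as mid * mid <= max, without overflow
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo;
}
```

For std::int32_t this returns 46340, and for std::int64_t it returns 3037000499. Squaring anything larger overflows the corresponding signed type.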

Signed integer overflow can be detected using CERT’s INT32-C secure coding rule. CERT provides a compliant solution that works with all compilers.

#include <limits.h>
void func(signed int si_a, signed int si_b) {
  signed int result;
  if (si_a > 0) {  /* si_a is positive */
    if (si_b > 0) {  /* si_a and si_b are positive */
      if (si_a > (INT_MAX / si_b)) {
        /* Handle error */
      }
    } else { /* si_a positive, si_b nonpositive */
      if (si_b < (INT_MIN / si_a)) {
        /* Handle error */
      }
    } /* si_a positive, si_b nonpositive */
  } else { /* si_a is nonpositive */
    if (si_b > 0) { /* si_a is nonpositive, si_b is positive */
      if (si_a < (INT_MIN / si_b)) {
        /* Handle error */
      }
    } else { /* si_a and si_b are nonpositive */
      if ( (si_a != 0) && (si_b < (INT_MAX / si_a))) {
        /* Handle error */
      }
    } /* End if si_a and si_b are nonpositive */
  } /* End if si_a is nonpositive */
  result = si_a * si_b;
}

Fortunately, the algorithm provided by CERT can be simplified in this case. The states variable is initialized from an unsigned integral type, its value is guaranteed to be greater than zero, and si_a = si_b. So we are really looking at:

#include <limits.h>
void func(signed int si_a, signed int si_b) {
  if (si_a > (INT_MAX / si_b)) {
    /* Handle error */
  }
}
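Applied to squaring a std::ptrdiff_t, the simplified check might look like the sketch below. The function name square_overflows is hypothetical, and the sketch assumes its argument is greater than zero, which holds for states in the code above.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: true when n * n cannot be represented in
// std::ptrdiff_t. Assumes n > 0, so the division is always valid.
bool square_overflows(std::ptrdiff_t n) {
  return n > std::numeric_limits<std::ptrdiff_t>::max() / n;
}
```

The division happens before any multiplication, so the check itself cannot overflow.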

The initialization of a std::ptrdiff_t with a std::size_t is strange. My best guess at the rationale is that no regex state space is expected to be greater than (std::numeric_limits<std::ptrdiff_t>::max)(). I never established why the declaration of states used a type unable to contain the size of the regex. I figure the pull request will tell the tale.

In any event, changing the type declaration of states to match that of the value used to initialize it permits unsigned integer overflow checks.
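A minimal sketch of what that could look like follows. It is an illustration of the idea, not the patch submitted to Boost. The name checked_square and the out-parameter interface are assumptions made for the example.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical helper: stores n * n in result and returns true, or
// returns false (leaving result untouched) when the product would wrap.
bool checked_square(std::size_t n, std::size_t& result) {
  if (n == 0) {
    n = 1;  // mirror the original code, which treats an empty regex as one state
  }
  if (n > std::numeric_limits<std::size_t>::max() / n) {
    return false;  // n * n would exceed the range of std::size_t
  }
  result = n * n;
  return true;
}
```

Because both operands are std::size_t, no signed conversion takes place, and a single comparison covers the whole range of re.size().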

Side note:

I’d be interested in understanding what the original reporter is doing that requires so many states in a regular expression.

April 29, 2018

The Temporary Scrum Master

I'm curious how rotating the Scrum Master role through the Development Team works out for the Development Team and the Organization as a whole. I'm not sure that rotating the Scrum Master role is healthy. Selecting one member of the Development Team to permanently become the Scrum Master seems the better choice.

The Scrum Guide permits the Product Owner and Scrum Master to execute work in the Sprint Backlog. I take this to mean both roles can be carried out by someone in the Development Team.

A review of the servant-leadership philosophy applied to the Scrum Master role provides insight on the challenges:
  • service to the Product Owner: the support the Scrum Master provides the Product Owner is not focused solely upon the domain. It includes Product Backlog management.
  • service to the Development Team: this focuses on the organization and the Scrum Team. It includes building bridges to other parts of the organization. It includes coaching the Development Team on self-organization and cross-functionality.
  • service to the Organization: this includes helping the organization leverage Scrum better.
When I hear about the Scrum Master role being fulfilled by the Development Team, it usually includes a concession to ensure the Scrum Master isn't taking on that role permanently. The motivation behind this concession is interesting.

Rotation implies that the organization isn't fully vested in Scrum. Further:
  • it implies the Scrum Master role is less valuable than the "other" role the Scrum Master has. 
  • it implies that Scrum Master isn't a good career choice for domain experts. 
  • it subjects the team to different Scrum Masters, each with their own set of values and approaches.
Different values and approaches aren't bad. They are opportunities for learning. But they may cause confusion if you are just rolling out your Scrum implementation.

In all, rotating the Scrum Master doesn't sit well with me. The Scrum Master seems better suited as a permanent role. Even if I assume the Developer turned Scrum Master is a domain expert, there are significant trade-offs involved in this approach, especially if you view Scrum as an important initiative that can benefit the entire organization.

April 6, 2018

GDB Command Files

A favorite feature of mine in GDB is the command file. A command file is a simple text file used to store GDB commands.

I usually use these files to store breakpoints, which simplifies setting up a debugging session.

For example, from within gdb run:

source command.txt

where command.txt contains:

set breakpoint pending on
b main

I use set breakpoint pending on so that GDB accepts breakpoints on symbols in shared libraries that have not been loaded yet.

The fantastic thing about these files is that you can store them under revision control and build them up while you chase down an issue, then use them to confirm the correction.

March 31, 2018

RBTLIB v0.3.0 On Travis CI

In RBTLIB v0.3.0 On Read The Docs, I discussed adding support for Read the Docs to RBTLIB. Recently, I added RBTLIB to Travis CI. Travis CI is super easy to work with. It provided the opportunity to eliminate deployment issues. This is important, as my ultimate goal for RBTLIB is availability through PyPI.

The main advantage Travis CI provides is the ability to test on different platforms and to eliminate portability issues. I lack experience with Python's setuptools, so there are likely to be issues as I move RBTLIB to PyPI.

All in all, v0.3.0 has significant infrastructure improvements over v0.2. Functionally, v0.3.0 targets posting of review requests through rbt.

March 8, 2018

Twitter Timelines on GitHub Pages

Adding a Twitter timeline to a Jekyll site hosted on GitHub is easy. The Twitter Developer Documentation provides instructions for this. See Embedded Timelines.

No Jekyll plugin required.