November 1, 2019

Class Breakdown for a JIRA Worklog

  —Domain entities are the most important elements of models.

I’ve had occasion to write some tools that explore ticket data using JIRA. JIRA provides a rich REST API for doing these kinds of explorations.

One of the problems I ran into was the concept of worklog. JIRA associates a worklog with each JIRA issue.

The basic relationship is that a project is associated with zero or more issues. Each issue is associated with a worklog. Each worklog has zero or more worklogs (I’ll call these worklog entries to differentiate them from the issue worklog).

My problem was finding a good class breakdown for this structure. I started by focusing on the collections of worklogs associated with a project.

The code looked something like this:

from collections import namedtuple

Worklog = namedtuple('Worklog', 'created comment timeSpentSeconds')

class IssueWorklog(object):
    def __init__(self, issue_key):
        # Build one Worklog tuple per entry returned for the issue.
        self._worklog = list(map(lambda worklog: Worklog(worklog['created'], worklog['comment'], worklog['timeSpentSeconds']), GetIssueWorklog(issue_key)))

    def get_worklog_entry(self):
        for worklog in self._worklog:
            yield worklog

class ProjectWorklog(object):
    def __init__(self, project_key):
        self._worklog = list(map(lambda issue: IssueWorklog(issue['key']), SearchUri(project_key)))

    def get_worklog_entry(self):
        for worklog in self._worklog:
            yield worklog

(GetIssueWorklog() and SearchUri() are functions that manage calls to the Get Issue Worklog and Search resources defined in JIRA’s REST API.)

Needless to say, I hated this implementation. I hated it because the use of classes in this instance is overkill. All this does is create a collection of worklog entries.

Other parts of my applications took the worklog entries and generated plots from them. In all, you could do this with three functions (or a nested for-loop):

  For each issue in the project:
     For each worklog entry in the issue:
         Add worklog entry to a container.
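
The nested loop above can be sketched in a few lines of Python. This is only a sketch: search_issues and get_issue_worklog are hypothetical callables standing in for SearchUri() and GetIssueWorklog(), passed in as parameters so the example runs without a JIRA server.

```python
from collections import namedtuple

WorklogEntry = namedtuple('WorklogEntry', 'created comment timeSpentSeconds')


def collect_worklog_entries(project_key, search_issues, get_issue_worklog):
    """Flatten every worklog entry in a project into one list.

    search_issues and get_issue_worklog are assumed stand-ins for the
    SearchUri() and GetIssueWorklog() helpers; they are not JIRA API calls.
    """
    entries = []
    for issue in search_issues(project_key):
        for raw in get_issue_worklog(issue['key']):
            entries.append(
                WorklogEntry(raw['created'], raw['comment'], raw['timeSpentSeconds'])
            )
    return entries
```

Nothing here needs a class. The function returns a plain list that the plotting code can consume directly.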

In my opinion, I’ve completely missed the point of good design.

An alternative design that I like much better looks like this.

from collections import namedtuple
from enum import Enum

import pandas as pd

WorklogEntry = namedtuple('WorklogEntry', 'created comment timeSpentSeconds')

class ResampleFrequency(Enum):
    MONTH_END = 'M'

class Worklog(object):
    """A collection of worklog entries from a project, an issue, or both."""

    def __init__(self, worklog_entries):
        self._df = pd.DataFrame(sorted(worklog_entries, key = lambda worklog: worklog.created), columns = WorklogEntry._fields)

    def doSumAndResample(self, new_frequency):
        return pd.DataFrame(self._df.resample(new_frequency).sum(), columns = [ 'timeSpentSeconds' ])

The important part here is the shift in emphasis from operations for the collection of worklog entries to operations on WorklogEntry objects. Here a worklog entry is a container and the Worklog class does all of the heavy lifting.

Can I do better than the doSumAndResample() method? Well, let’s see.

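from collections import namedtuple
from enum import Enum
from typing import List

import pandas as pd

# Worklog information collected from a JIRA issue.
WorklogEntry = namedtuple('WorklogEntry', 'key summary comment created timeSpentSeconds')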

class ResampleFrequency(Enum):
    """Enumerate different pandas resample frequencies using pandas offset aliases."""
    MONTH_END = 'M'

class Worklog(object):
    """Construct a pandas data frame from collected worklogs.

    A worklog can be viewed as a time series depicting time spent, along with other metadata.
    """

    def __init__(self, worklog_entries: List[WorklogEntry]):
        """Construct the worklog object using the provided worklog entries."""
        df = pd.DataFrame.from_records(sorted(worklog_entries, key = lambda worklog: worklog.created), columns = WorklogEntry._fields)
        self._df = df.set_index(pd.DatetimeIndex(pd.to_datetime(df['created'], utc = True)))

    def doResampleAndSumTimeSeries(self, resample_frequency: ResampleFrequency):
        """Resample and sum the worklog's time spent values using the provided resample frequency."""
        self._df = pd.DataFrame(self._df.resample(resample_frequency.value).sum(), columns = [ 'timeSpentSeconds' ])

    def doCalculateRollingAverage(self, window_size: int):
        """Calculate a rolling average of the worklog's time spent values."""
        self._df['rollingAverageTimeSpentSeconds'] = self._df['timeSpentSeconds'].rolling(window = window_size).mean()

    def doMakeDataFrame(self) -> pd.DataFrame:
        """Make a data frame from the worklog object."""
        return self._df

It’s not perfect, because I wrap the pandas data frame with operations that manipulate the JIRA worklogs. The advantage is that the manipulation of the data frame is contained within the class. The disadvantage is that doMakeDataFrame() exports the data frame, which is disappointing in terms of extending the class.
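
To make the wrapped pandas operations concrete, here is a minimal sketch of the same steps outside the class: index by creation time, sum per month, then take a rolling average. The entries are made up for illustration and are not real JIRA data. I also assume pd.offsets.MonthEnd() in place of the 'M' alias, because newer pandas releases deprecate that alias.

```python
import pandas as pd

# Made-up worklog entries for illustration only.
entries = [
    {'created': '2019-08-05T09:00:00+00:00', 'timeSpentSeconds': 3600},
    {'created': '2019-08-20T09:00:00+00:00', 'timeSpentSeconds': 1800},
    {'created': '2019-09-10T09:00:00+00:00', 'timeSpentSeconds': 7200},
    {'created': '2019-10-01T09:00:00+00:00', 'timeSpentSeconds': 600},
]

df = pd.DataFrame.from_records(entries)

# resample() needs a datetime-like index, so index the frame by creation time.
df = df.set_index(pd.DatetimeIndex(pd.to_datetime(df['created'], utc=True)))

# Sum the time spent within each calendar month.
monthly = df[['timeSpentSeconds']].resample(pd.offsets.MonthEnd()).sum()

# Two-month rolling average of the monthly totals; the first row has no
# full window, so its average is NaN.
monthly['rollingAverageTimeSpentSeconds'] = (
    monthly['timeSpentSeconds'].rolling(window=2).mean()
)
```

Selecting the timeSpentSeconds column before resampling keeps the text columns out of the sum.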

October 9, 2019

Craftsman is Sexist

  —A comment on craftsman is sexist.

I read this Twitter thread with interest/dismay/curiosity at various times. The point is words have power and people need to be cognizant of how they use language. Its focus is that language exerts the power to exclude.

In this case, the use of “craftsman” is asserted as exclusionary to women. To disagree requires some creativity. Or insensitivity.

You can understand this as follows.

If you identify as male and wouldn’t refer to yourself as a craftswoman, then you get it.

It’s an excellent point and worth remembering.

It would take considerable effort to understand everything else in this and related threads. I won’t try.

I still follow Sarah Mei. Can’t say I agree with everything she says.

I can say the same about Robert Martin.

I’ll continue to follow them both because I still have hope I can learn from them. Perhaps that’s enough.

A week on and this thread is still going. One argument that’s come up a couple of times is that Sarah is a troll. I think that’s patently ridiculous.

She isn’t a troll because she isn’t doing this anonymously. Read her blog. There is a consistent message there.

A better summary of Sarah’s points: More on why agile / XP so often fails heterogenous teams.

I get the pair programming issue: I haven’t seen it work whenever it’s been forced, and I don’t force it.

I don’t get the issue with TDD (Test-Driven Development). I treat TDD as a tool in the testing toolbox: use it if you like, but you must test your code.

Perhaps the difference here is the 100% component of the argument? Nah, the focus is on the power dynamics.

So where is the power dynamic in TDD? I always thought of TDD as a single-person activity, which is probably the gap in my understanding.

I wrote the original version of this post in 2018.

Only two things have really changed in the past year. Sarah’s points still make sense. Robert’s are harder to rationalize.

Still following both Sarah and Robert.


October 3, 2019

The Mythical Man-Month (Independent Subtasks, Too!)

  —The Mythical Man-Month on independent subtasks.

In The Mythical Man-Month (Worth Reading Again), I describe the value of re-reading this classic essay.

Some people on my team are fond of quoting Brooks’s statement that “the bearing of a child takes nine months, no matter how many women are assigned the task”. Unfortunately, they use this statement to deflect analysis of a work breakdown that would determine the optimal number of people on their projects. That’s disappointing on many levels.

It’s an example where the wisdom in Brooks’s essay is only partially understood or remembered. It impedes the creation of optimal schedules.

Brooks clearly points out that adding more people to a late project makes it later. He points out that you should resource your schedules so that the number of people equals the number of independent subtasks. Then decide if you want to adjust the number of people on the project.

I am disappointed that people don’t seem to read the full extent of his essay. It is worth reading again. And again. Or at least until someone says they’ve reached the optimal number of subtasks in a work breakdown.

September 10, 2019

Great Teams Make Great People

  —Being part of a great team amplifies what people accomplish.

This article’s title is taken from an article by Jessica Kerr entitled The Origins of Opera and the Future of Programming. Jessica’s article is well worth the read.

I like Jessica’s article for the juxtaposition of team, people, agility and the notion that we learn together. She calls out the notion that team includes community, and that community might extend beyond the people you work with on a daily basis. Your community might be made of a diverse group of people who share a common interest in solving a problem.

A community shares

  • process and values,
  • priority problems and
  • a shorthand for communication.

Communities place an emphasis on building mental models as a form of communication. Or more so, creating an environment wherein sharing mental models is rewarded. Recognizing generativity over personal productivity leads to symmathesy. (Symmathesy involves the notion that together we learn.)

What intrigued me most about the perspective brought in this article was that it shed light on a problem of my own. I’ve been working with my team for several months. I’ve begun looking deeply into my team’s values as a means to improve communication and create a shared understanding.

It’s striking that my search for values has largely produced a void. It’s not that I can’t identify values in other parts of my organization. I’ve been unable to find values that resonate with me and my perceptions of what the team values.

The notion of symmathesy might be part of the solution.

When I first joined the team, the major challenge they presented was one of learning. Groups of people felt that they weren’t part of a culture of learning. Others didn’t seem to notice.

The underlying problem was the power hierarchy between groups. The people complaining about lack of learning were newer than those that weren’t. This implied there was lots of tribal knowledge but no emphasis on sharing.

We had a poor understanding of our process. This wasn’t new to me but the process appeared “weaponized” in ways that I hadn’t encountered before.

A weaponized process is one where the process is used to shield people from responsibility, to stifle others, or somehow to introduce inequity between teams. I’d been in environments where the process felt weaponized but the challenge here was outside of my experience. The team was focused on following the process without apparent recognition they could do much more to move the business forward.

Another issue was one of shared understanding of the code base. This overlaps with the learning issue identified above, but seemed to go deeper and touch on empowerment. Empowerment in the sense that people didn’t feel like they could own what they didn’t understand.

Changing the values of the team so that generativity is emphasized looks like a good direction. The notion of symmathesy certainly resonates with me and is something I need to think deeply about.

What I see in Jessica’s article is an opportunity to emphasize shared learning for the team through the creation of a process that includes specific values. She has, in effect, provided new tools for framing my own challenges. These tools look powerful.

September 4, 2019

Locality and Design Rationale

  —Minimize documentation while simultaneously communicating design intent and history.

I had occasion to read Data Abstraction and Hierarchy and was struck by the concept of locality described in the paper. This paper was presented at OOPSLA ’87 and its focus is on using inheritance to define type hierarchies. It is the source of what became known as the Liskov Substitution Principle.

The notion of locality is enabled by abstraction supported by specification and encapsulation. The specification describes what the abstraction does, but omits any information about how it is implemented. Encapsulation guarantees that modules can be implemented and reimplemented independently. Encapsulation is related to the notion of information hiding described by Parnas.
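
A small Python sketch may help separate the two ideas. The names here (IntSet, ListIntSet, HashIntSet, count_distinct) are my own illustrations and are not taken from the paper. The abstract class plays the role of the specification, and each subclass hides its representation behind it.

```python
from abc import ABC, abstractmethod


class IntSet(ABC):
    """Specification: what a set of integers does, not how it is stored."""

    @abstractmethod
    def insert(self, value: int) -> None:
        """Add value to the set; inserting a duplicate has no effect."""

    @abstractmethod
    def size(self) -> int:
        """Return the number of distinct values inserted so far."""


class ListIntSet(IntSet):
    """One implementation: an unsorted list kept private to the class."""

    def __init__(self) -> None:
        self._items = []

    def insert(self, value: int) -> None:
        if value not in self._items:
            self._items.append(value)

    def size(self) -> int:
        return len(self._items)


class HashIntSet(IntSet):
    """A reimplementation behind the same specification, using a built-in set."""

    def __init__(self) -> None:
        self._items = set()

    def insert(self, value: int) -> None:
        self._items.add(value)

    def size(self) -> int:
        return len(self._items)


def count_distinct(values, int_set: IntSet) -> int:
    """Client code that depends only on the IntSet specification."""
    for value in values:
        int_set.insert(value)
    return int_set.size()
```

Because count_distinct() relies only on the specification, either implementation can be swapped in without touching the caller. That is the locality the paper describes.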

The paper goes on to discuss the benefits of using abstraction to define type hierarchies. It includes an example of incremental design in which the type hierarchy is refined as development of the data type progresses.

What’s interesting is that abstraction supported by specification and encapsulation during incremental design can be used to provide a design rationale.

The design rationale describes the decisions made at particular points in the design, and discusses why they were made and what alternatives exist. By maintaining the hierarchy to represent the decisions as they are made over time, we can avoid confusion and be more precise.

This provides insight into how to minimize documentation (specification) and capture the evolution of a design, while communicating intent and history.