Refactoring Kata - 6 Year Code Review

Going through my old repositories on Github made me realise I’d forgotten about a number of coding items I have done in the past. There was one which was a coding Kata originally created by Scott Allen around refactoring.

Refactoring is the ability to improve code readability to increase its simplicity and maintainability. If you write code that the rest of your development team can’t understand or spends 10 times as long to understand before they can alter it then it’s failed. It’s also a skill which needs to be learnt and practiced hence Katas. This also means that something doesn’t always have to use the latest language feature if it isn’t maintainable - you don’t want it to end up like Python aka “Write once, read never”.

The Read Me gives some starting points on where to go with the refactor but the interesting part is:

!!How to End!!

You can stop when you feel the code is good enough – something you can come back to in 6 months and understand.

I originally did this kata back in April 2012, almost 6 years ago, and my career and experience has advanced a long way since then. Due to this I decided I would open it up and see if I could still understand the updated code and if it could be improved. Not really surprising it wasn’t the easiest code to read (at least it made sense!) but it could be a lot better.

There were a number of SOLID Principles which were being violated and some parts I was violating the K.I.S.S principle - not good!

Finder class

The Finder class is the main entry into the functionality; from the list of People provided it is asked to return a specific Result.

In the original refactoring the Finder was cleaned up and different types of “finding” algorithm were split out into separate delegates; in fact they were Func<IEnumerable<Result>, Result>.

However there is too much responsibility in the single class. Let’s start off my looking at the main Find method.

    public Result Find(ResultType resultType)
    {
        var results = new List<Result>();

        foreach (var person in _people)
        {
            Person person1 = person;
            results.AddRange(_people.Where(x => !x.Equals(person1)).Select(other => new Result { PersonOne = person1, PersonTwo = other }));
        }

        return results.Any() ? _dictionary[resultType](results) : new Result();
    }

The main Find method is trying to do a number of jobs. It’s mapping the list of People into Result pairings. Using the Result pairings it is then defining the processing which needs to occur on this List<Result> to find the single response required.

    foreach (var person in _people)
    {
        Person person1 = person;
        results.AddRange(_people.Where(x => !x.Equals(person1)).Select(other => new Result { PersonOne = person1, PersonTwo = other }));
    }

This code is iterating over the _people list a lot more than I think I realised back in 2012 as well it is returning more results than required; for 4 people it returns 12 results. The act of converting People into a Result pairing means that each person in _people are mapped with everyone else; a pairing of “Sue” and “Greg” is being generated as well as a pairing of “Greg” and “Sue”. As the result pairing has no indication of direction or has to understand if the value is positive or negative multiple pairings are not required. It also looks a bit messy!

	return results.Any() ? _dictionary[resultType](results) : new Result();

The above line starts off with a valid check to see if there are any results, which is good, but then looks up a delegate in a dictionary using the resultType as the key. It does not check there is an entry for this enum value so if a new value was added in the future but no corresponding processing it would blow up. The _dictionary member variable isn’t named well and looking at this line of code isn’t clear what it stores. Only by looking at its declaration you have an idea of what it’s doing but it’s still a bit of a mouthful.

	Dictionary<ResultType, Func<IEnumerable<Result>, Result>> _dictionary =

This class violates a number of the SOLID principles:

Single responsibility principle

It maps the People into Result pairings
It looks up the type of result processing required
It processes the results

Open/Closed principle

If a new result processing needs to be defined the dictionary has to be updated and the implementation specified. This is not “open for extension” as it is modifying the class.

Dependency inversion principle

No dependencies, however the concrete implementations of the result processing should be abstracted away so I would class that as a violation.

Result class

There are a number of issues with this class and it makes me cringe when I look at it. From top to bottom:

The PersonOne and PersonTwo properties call a private method in the setter.
PersonOne and PersonTwo don’t really make sense or indicate their usage.
GetDelta() checks that the properties are not null before processing and then calculates the difference between the two people.
The private method DetermineOrder(), which is called from the properties, also checks that the properties are null and then swaps them around if required. This is duplicating logic.

The biggest issue I have with my 2012 refactored code is a “Result” should not change once set and this class is mutable on each Person property change.

Person class

No way to describe this other than it’s very heavily over engineered.

public class Person
{
    public string Name { get; set; }
    public DateTime BirthDate { get; set; }

    public bool Equals(Person other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return Equals(other.Name, Name) && other.BirthDate.Equals(BirthDate);
    }

    public override bool Equals(object obj)
    {
        if (ReferenceEquals(null, obj)) return false;
        if (ReferenceEquals(this, obj)) return true;
        if (obj.GetType() != typeof(Person)) return false;
        return Equals((Person)obj);
    }

    public override int GetHashCode()
    {
        unchecked
        {
            return ((Name != null ? Name.GetHashCode() : 0) * 397) ^ BirthDate.GetHashCode();
        }
    }
}

There is no need to over complicate such a simple concept. Creating a specific Equals method while overriding the base Equals method is going too far; there is just no need for it. As for generating my own hashcode, what was I thinking?!

Conclusion

The refactored implementation is a lot clearer than the original which was the aim. However another 6 years on and I can see there are a lot of improvements which should be made to make the code clean and maintainable. The code needs to be more readable, take less time to understand what it’s trying to do and allow for extensibility.

Following on from this I will refactor again and show how to apply a number of the SOLID principles to keep the functionality the same however make the code cleaner, leaner, more readable and above all else increase maintainability.

I’d be interested in hearing your thoughts. Please contact me on Twitter @WestDiscGolf

Finder class#

Single responsibility principle#

Open/Closed principle#

Dependency inversion principle#

Result class#

Person class#

Conclusion#