Going through my old repositories on Github made me realise I’d forgotten about a number of coding items I have done in the past. There was one which was a coding Kata originally created by Scott Allen around refactoring.
Refactoring is the ability to improve code readability to increase its simplicity and maintainability. If you write code that the rest of your development team can’t understand or spends 10 times as long to understand before they can alter it then it’s failed. It’s also a skill which needs to be learnt and practiced hence Katas. This also means that something doesn’t always have to use the latest language feature if it isn’t maintainable - you don’t want it to end up like Python aka “Write once, read never”.
The Read Me gives some starting points on where to go with the refactor but the interesting part is:
!!How to End!!
You can stop when you feel the code is good enough – something you can come back to in 6 months and understand.
I originally did this kata back in April 2012, almost 6 years ago, and my career and experience has advanced a long way since then. Due to this I decided I would open it up and see if I could still understand the updated code and if it could be improved. Not really surprising it wasn’t the easiest code to read (at least it made sense!) but it could be a lot better.
There were a number of SOLID Principles which were being violated and some parts I was violating the K.I.S.S principle - not good!
Finder class
The Finder class is the main entry into the functionality; from the list of People
provided it is asked to return a specific Result
.
In the original refactoring the Finder was cleaned up and different types of “finding” algorithm were split out into separate delegates; in fact they were Func<IEnumerable<Result>, Result>
.
However there is too much responsibility in the single class. Let’s start off my looking at the main Find
method.
public Result Find(ResultType resultType)
{
var results = new List<Result>();
foreach (var person in _people)
{
Person person1 = person;
results.AddRange(_people.Where(x => !x.Equals(person1)).Select(other => new Result { PersonOne = person1, PersonTwo = other }));
}
return results.Any() ? _dictionary[resultType](results) : new Result();
}
The main Find
method is trying to do a number of jobs. It’s mapping the list of People
into Result
pairings. Using the Result
pairings it is then defining the processing which needs to occur on this List<Result>
to find the single response required.
foreach (var person in _people)
{
Person person1 = person;
results.AddRange(_people.Where(x => !x.Equals(person1)).Select(other => new Result { PersonOne = person1, PersonTwo = other }));
}
This code is iterating over the _people
list a lot more than I think I realised back in 2012 as well it is returning more results than required; for 4 people it returns 12 results. The act of converting People
into a Result
pairing means that each person in _people
are mapped with everyone else; a pairing of “Sue” and “Greg” is being generated as well as a pairing of “Greg” and “Sue”. As the result pairing has no indication of direction or has to understand if the value is positive or negative multiple pairings are not required. It also looks a bit messy!
return results.Any() ? _dictionary[resultType](results) : new Result();
The above line starts off with a valid check to see if there are any results, which is good, but then looks up a delegate in a dictionary using the resultType
as the key. It does not check there is an entry for this enum value so if a new value was added in the future but no corresponding processing it would blow up. The _dictionary
member variable isn’t named well and looking at this line of code isn’t clear what it stores. Only by looking at its declaration you have an idea of what it’s doing but it’s still a bit of a mouthful.
Dictionary<ResultType, Func<IEnumerable<Result>, Result>> _dictionary =
This class violates a number of the SOLID principles:
Single responsibility principle
- It maps the
People
intoResult
pairings - It looks up the type of result processing required
- It processes the results
Open/Closed principle
- If a new result processing needs to be defined the dictionary has to be updated and the implementation specified. This is not “open for extension” as it is modifying the class.
Dependency inversion principle
- No dependencies, however the concrete implementations of the result processing should be abstracted away so I would class that as a violation.
Result class
There are a number of issues with this class and it makes me cringe when I look at it. From top to bottom:
- The
PersonOne
andPersonTwo
properties call a private method in the setter. PersonOne
andPersonTwo
don’t really make sense or indicate their usage.GetDelta()
checks that the properties are notnull
before processing and then calculates the difference between the two people.- The private method
DetermineOrder()
, which is called from the properties, also checks that the properties arenull
and then swaps them around if required. This is duplicating logic.
The biggest issue I have with my 2012 refactored code is a “Result” should not change once set and this class is mutable on each Person property change.
Person class
No way to describe this other than it’s very heavily over engineered.
public class Person
{
public string Name { get; set; }
public DateTime BirthDate { get; set; }
public bool Equals(Person other)
{
if (ReferenceEquals(null, other)) return false;
if (ReferenceEquals(this, other)) return true;
return Equals(other.Name, Name) && other.BirthDate.Equals(BirthDate);
}
public override bool Equals(object obj)
{
if (ReferenceEquals(null, obj)) return false;
if (ReferenceEquals(this, obj)) return true;
if (obj.GetType() != typeof(Person)) return false;
return Equals((Person)obj);
}
public override int GetHashCode()
{
unchecked
{
return ((Name != null ? Name.GetHashCode() : 0) * 397) ^ BirthDate.GetHashCode();
}
}
}
There is no need to over complicate such a simple concept. Creating a specific Equals
method while overriding the base Equals
method is going too far; there is just no need for it. As for generating my own hashcode, what was I thinking?!
Conclusion
The refactored implementation is a lot clearer than the original which was the aim. However another 6 years on and I can see there are a lot of improvements which should be made to make the code clean and maintainable. The code needs to be more readable, take less time to understand what it’s trying to do and allow for extensibility.
Following on from this I will refactor again and show how to apply a number of the SOLID principles to keep the functionality the same however make the code cleaner, leaner, more readable and above all else increase maintainability.
I’d be interested in hearing your thoughts. Please contact me on Twitter @WestDiscGolf