
Interesting bug - Line endings and Hash Code


I recently came across an interesting bug that shows how different line-ending formats can break a custom equality implementation if you do not account for them.

Context
We have an application that periodically updates its local assets with the latest resources. In a nutshell, it makes a Web API call to get the latest set of metadata and compares it against a locally stored metadata file. If they differ, we update the locally stored metadata file and download the new or updated resources.

Bug
For one particular asset, the associated metadata file was always getting updated, even though the revision history showed no visible changes.

Investigation
My obvious suspect was the code responsible for the equality check between the local metadata and the metadata received from the Web API.

To verify this, I set up a conditional breakpoint that would be hit whenever the equality check returned false. Once the debugger hit the breakpoint, I looked at all the properties and found that the comparison for one of them was returning false. It was a list of tag objects, and we were comparing the two lists with a HashSet equality check.
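A comparison of this kind might look like the following minimal sketch. The `MetadataComparer` class and `SameItems` method are hypothetical names chosen for illustration, not the application's actual code; the sketch only assumes that the comparison goes through `HashSet<T>.SetEquals`.

```csharp
using System.Collections.Generic;

public static class MetadataComparer
{
    // Hypothetical helper: returns true when both sequences contain the same
    // distinct items, ignoring order and duplicates.
    // HashSet<T> relies on T's GetHashCode and Equals, so two items that
    // differ by a single character are treated as different items.
    public static bool SameItems<T>(IEnumerable<T> local, IEnumerable<T> remote)
    {
        var localSet = new HashSet<T>(local);
        return localSet.SetEquals(remote);
    }
}
```

With plain strings as items, `SameItems(new[] { "a", "b" }, new[] { "b", "a" })` is true, while `SameItems(new[] { "a\r\nb" }, new[] { "a\nb" })` is false, because the two strings are not equal character for character.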


Although I had narrowed down the issue, it was still not clear which object in the list was causing the equality check to fail. So I decided to look at the hash codes for the two lists of objects.

Looking at the hash codes, I discovered that one of them differed between the compared objects: the one at position 3. A quick look at the value at that position and voilà! One of them contained '\r\n' (CR + LF) line endings, while the one from the Web API contained '\n' (LF).

Most text editors do not display line endings in their default view, and line endings can also be OS-specific. Hence, it looked as if nothing had changed in the file. However, because the two strings contain different characters, my custom equality implementation and its GetHashCode function produced different hashes for them, and so the equality check returned false.
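Since editors hide the difference, a small debugging aid can help when hunting for this kind of mismatch. The sketch below is an illustration only; `LineEndingDebug`, `ShowLineEndings` and `DifferingPositions` are hypothetical names, and it assumes the values being compared are plain strings held in two lists of the same order.

```csharp
using System;
using System.Collections.Generic;

public static class LineEndingDebug
{
    // Replaces CR and LF with the visible two-character sequences \r and \n,
    // so strings that differ only in line endings no longer look alike.
    public static string ShowLineEndings(string value)
    {
        return value.Replace("\r", "\\r").Replace("\n", "\\n");
    }

    // Returns the indexes at which the two lists hold different strings,
    // compared character by character (ordinal). Items beyond the length of
    // the shorter list are ignored.
    public static List<int> DifferingPositions(
        IReadOnlyList<string> local, IReadOnlyList<string> remote)
    {
        var positions = new List<int>();
        int count = Math.Min(local.Count, remote.Count);
        for (int i = 0; i < count; i++)
        {
            if (!string.Equals(local[i], remote[i], StringComparison.Ordinal))
            {
                positions.Add(i);
            }
        }
        return positions;
    }
}
```

Printing `ShowLineEndings(local[i])` and `ShowLineEndings(remote[i])` for each reported index makes a CR + LF versus LF difference obvious in the debugger or console output.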

Here is a simplified version of the original class, which implements IEquatable<T>.
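A class of this shape could be sketched as follows. This is a hypothetical stand-in rather than the author's class: the `Tag` name and its `Name` and `Description` properties are assumptions made for illustration, and the hash combination is one common pattern among several.

```csharp
using System;

// Hypothetical stand-in for the original class; property names are assumed.
public sealed class Tag : IEquatable<Tag>
{
    public string Name { get; set; }
    public string Description { get; set; }

    // Two tags are equal when both string properties match character by
    // character. string.Equals handles null values on either side.
    public bool Equals(Tag other)
    {
        if (ReferenceEquals(other, null)) return false;
        return string.Equals(Name, other.Name, StringComparison.Ordinal)
            && string.Equals(Description, other.Description, StringComparison.Ordinal);
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Tag);
    }

    // Combines the hash codes of the same properties used by Equals,
    // treating null as 0. Overflow is allowed to wrap around.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (Name?.GetHashCode() ?? 0);
            hash = hash * 31 + (Description?.GetHashCode() ?? 0);
            return hash;
        }
    }
}
```

With this sketch, a tag whose `Description` is `"line one\r\nline two"` does not equal a tag whose `Description` is `"line one\nline two"`, even though both render identically in most editors.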


Fix

The fix was to update the property getter so that it always returns a consistent line ending: basically a Regex replace that normalizes every line ending to '\r\n'.
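A normalizing getter along these lines could be sketched as shown here. The `NormalizedTag` class, the `Description` property and the exact pattern are assumptions; the post only states that a Regex replace with '\r\n' was used, so the pattern below is one plausible choice, and it also converts a lone CR.

```csharp
using System.Text.RegularExpressions;

// Hypothetical class illustrating a getter that normalizes line endings.
public sealed class NormalizedTag
{
    // Assumed pattern: matches CR + LF as one unit first, then a lone CR
    // or a lone LF, so an existing CR + LF is never doubled.
    private static readonly Regex AnyLineEnding = new Regex(@"\r\n|\r|\n");

    private string description;

    public string Description
    {
        // Normalize on read so equality and hashing always see CR + LF,
        // whichever format the value was stored or received in.
        get
        {
            return description == null
                ? null
                : AnyLineEnding.Replace(description, "\r\n");
        }
        set { description = value; }
    }
}
```

Normalizing in the getter means that any `Equals` and `GetHashCode` implementation reading the property sees the same text for both the local and the Web API value.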

Once the above change is in place, the hash code, and hence the equality result, will be the same for objects that differ only in their line-ending format. This gives us the expected result, and our metadata file is no longer updated unnecessarily.


Lesson Learned

Be mindful of different line endings, null values, and sometimes different culture settings when implementing custom equality based on string properties.

