Recently, I have dealt with an interesting performance issue with one of my SQL query and thought I will share the experience here.
Context:
We had a legacy stored procedure responsible for saving large amount of excel row data to our database tables. It was using User Defined Table Types as one of the parameter to get a list of row data from excel. However, the stored procedure was taking very long time to save the large data set.
Root Cause:
After quite a bit of investigation using execution plan in SSMS, I was able to narrow down the performance issue to the following:
After doing additional research using stack overflow, I was able to figure out that the primary reason for the poor performance doing a JOIN on Table Valued parameters is that : it does not keep statistics and appear to the Query Optimizer as having only a single row. However, in my case I had thousands of rows. As a result, joining on them was quite slow due to no optimization performed by query optimizer.
Resolution:
After I have dumped everything from the user defined table type parameters to a temporary table, performance improved drastically as Query Optimizer was able to keep up with the statistics. Also, I was able to compute the expensive hash function only once and saved the result as a column in the temp table. In that way I was able to save some additional compute time there too.
Doing all these, I was able to make my save operation more than twice as faster compared to previous time.
Lesson Learned:
If you have to save huge amount of row data using Table Valued parameters, consider dumping them to a temporary table first and then join with that temp table.
Context:
We had a legacy stored procedure responsible for saving large amount of excel row data to our database tables. It was using User Defined Table Types as one of the parameter to get a list of row data from excel. However, the stored procedure was taking very long time to save the large data set.
Root Cause:
After quite a bit of investigation using execution plan in SSMS, I was able to narrow down the performance issue to the following:
- Joining with User defined table type was taking >90 percent of the time
- A custom hash function which has been used multiple times as a join criteria was also quite expensive to compute.
After doing additional research using stack overflow, I was able to figure out that the primary reason for the poor performance doing a JOIN on Table Valued parameters is that : it does not keep statistics and appear to the Query Optimizer as having only a single row. However, in my case I had thousands of rows. As a result, joining on them was quite slow due to no optimization performed by query optimizer.
Resolution:
After I have dumped everything from the user defined table type parameters to a temporary table, performance improved drastically as Query Optimizer was able to keep up with the statistics. Also, I was able to compute the expensive hash function only once and saved the result as a column in the temp table. In that way I was able to save some additional compute time there too.
Doing all these, I was able to make my save operation more than twice as faster compared to previous time.
Lesson Learned:
If you have to save huge amount of row data using Table Valued parameters, consider dumping them to a temporary table first and then join with that temp table.
Comments