• Skip to primary navigation
  • Skip to main content

Collin M. Barrett

Collin M. Barrett is a staff software engineer in Memphis, TN.

  • Start
  • Blog
  • Resume
    • Projects
    • References
  • Contact
Home ❭ Code ❭ SELECT All HAVING Non-Distinct Composite Column with T-SQL
SELECT All HAVING Non-Distinct Composite Column with T-SQLPhoto by mcmurryjulie

SELECT All HAVING Non-Distinct Composite Column with T-SQL

Published: 6.11.17

Requirement

This week, the cross-functional team that is testing the application I have been building discovered a data issue where a table that should only contain one record for each ([Sku], [GroupCode]) pair had more than one in some cases. This table was not designed to have ([Sku], [GroupCode]) as a composite key (there is an [Id] identity column as a key), allowing this issue to occur.

Below is a sample of what the table [Prices] could have contained.

Id Sku GroupCode Price
1 1 1 1.00
2 1 1 2.00
3 1 2 1.00
4 2 1 1.00
5 2 1 2.00
6 2 2 1.00

To diagnose why this issue (such as records where [Id] IN (1,2)) was happening, I needed a query to select only the records that contained more than one ([Sku], [GroupCode]) composite.

A Solution

I did not have a lot of time to test my Google-Fu for this one, so I received assistance from a BI developer who sits across the hall from me. She suggested a query like below.

SELECT COUNT([Sku] + [GroupCode]) AS 'Count',
[Sku],
[GroupCode]
FROM [Prices]
GROUP BY [Sku], [GroupCode]
HAVING COUNT([Sku] + [GroupCode]) > 1

I had used the HAVING keyword before, but I had not used it extensively enough to think to apply it to this problem off the top of my head. The difference between HAVING and WHERE is quite subtle; but, for this situation, I think HAVING is required since I wanted to select duplicate composite columns after the GROUP BY has already been applied.

Furthermore, I was interested in treating [Sku] and [GroupCode] like a single composite column. So, concatenating the two columns fulfills this requirement. The query above, however, adds the two columns rather than concatenates because the columns are INTs. So, I cast the INTs to VARCHARs below.

SELECT COUNT(CAST([Sku] AS VARCHAR(20)) + CAST([GroupCode] AS VARCHAR(20))) AS 'Count',
[Sku],
[GroupCode]
FROM [Prices]
GROUP BY [Sku], [GroupCode]
HAVING COUNT(CAST([Sku] AS VARCHAR(20)) + CAST([GroupCode] AS VARCHAR(20))) > 1

The result set from the sample [Prices] table is below and is what I needed to diagnose further why the issue was occurring.

Count Sku GroupCode
2 1 1
2 2 1

Related

About Collin M. Barrett

Collin M. Barrett is a staff software engineer in Memphis, TN.

Subscribe! (max one per week)

Reader Interactions

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

Collin M. Barrett © 2023

  • LinkedIn
  • GitHub
  • Stack Overflow
  • DEV
  • Twitter
  • Goodreads
  • Sitemap