Problems with Group BY Clause

  • One or more of the columns in your group by has different values for one departure airport. Can't say which one without seeing a sample output from your query.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail,

    Many Thanks

  • Ok, so if we look at your Gatwick - Amsterdam lines. They differ in the DEP_DATE and the RtnDEP_DATE. Those two columns are included in the group by. Hence the group by returns 2 rows, because some of the data you're grouping by is different.

    If you only want 1 line per airport, which dates do you want to see? The highest, lowest, one corresonding to the cheapest price, any one at random?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • It would probably be the one corresponding to the cheapest price, but it would be handy to know how to display any one at random also.

    Thanks again

  • I'll have a go at something for you a bit later. Can you post the schema of the tables please?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail,

    Many Thanks

  • Could you possibly post it as a create table please? I want to have a test table to try the query out on, and it's going to take too long to turn that list into a table def.

    Thanks.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • No problem Gail,

    Thanks

  • How does this look?

    SELECT depAir.Name AS departureAirportName,

    dest.Name AS destination,

    TotalFlights,

    Offers.ARR_AIRPORT,

    Offers.DEP_AIRPORT,

    LowestPrice,

    LowestPrice/NO_ADULTS AS Price,

    Offers.DEP_DATE,

    Offers.RtnDEP_DATE,

    Offers.NO_ADULTS,

    DATEDIFF(d, Offers.DEP_DATE, Offers.RtnDEP_DATE) AS duration

    FROM cached_offers_flights Offers INNER JOIN

    (SELECT DEP_AIRPORT, ARR_AIRPORT, COUNT(*) AS TotalFlights,

    MIN(CAST(STANDARD_PRICE AS NUMERIC(7,2))) AS LowestPrice

    FROM cached_offers_flights GROUP BY DEP_AIRPORT, ARR_AIRPORT) CheapestOffer

    ON Offers.DEP_AIRPORT = CheapestOffer.DEP_AIRPORT AND Offers.Arr_AIRPORT = CheapestOffer.Arr_AIRPORT

    AND Offers.STANDARD_PRICE = CheapestOffer.LowestPrice

    INNER JOIN Airport AS depAir ON Offers.DEP_AIRPORT = depAir.Code

    INNER JOIN Airport AS destAir ON Offers.ARR_AIRPORT = destAir.Code

    INNER JOIN City AS dest ON destAir.fkCity = dest.ID

    WHERE (dest.Name LIKE 'a%')

    AND (Offers.DATETIME > '2008/05/08 12:00:00')

    AND (CAST(Offers.DEP_DATE AS datetime) >= '2008/05/16')

    AND (Offers.Website = 'www.test.com')

    ORDER BY Price

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The two Stansteds are because they have different destinations. You want only 1 row per departure airport, regardless of where the flight is going?

    I think the 2 rows for Gatwick are because there are 2 entries with same, lowest, price. If that case does happen, do you want to show the earliest one?

    Edit: To confirm, you are using SQL server 2005?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • The two Stansteds are because they have different destinations. You want only 1 row per departure airport, regardless of where the flight is going?

    Yes that would be great.

    I think the 2 rows for Gatwick are because there are 2 entries with same, lowest, price. If that case does happen, do you want to show the earliest one?

    Yes please 🙂

    Thanks again Gail

  • Ok, maybe this one...

    SELECT depAir.Name AS departureAirportName,

    dest.Name AS destination,

    TotalFlights,

    Offers.ARR_AIRPORT,

    Offers.DEP_AIRPORT,

    LowestPrice,

    LowestPrice/NO_ADULTS AS Price,

    Offers.DEP_DATE,

    Offers.RtnDEP_DATE,

    Offers.NO_ADULTS,

    DATEDIFF(d, Offers.DEP_DATE, Offers.RtnDEP_DATE) AS duration

    FROM (SELECT ARR_AIRPORT, cached_offers_flights.DEP_AIRPORT, NO_ADULTS, DEP_DATE,

    RtnDEP_DATE, [DATETIME], Website, TotalFlights, LowestPrice,

    ROW_NUMBER() OVER (PARTITION BY cached_offers_flights.DEP_AIRPORT ORDER BY DEP_DATE) RowNo

    FROM cached_offers_flights INNER JOIN

    (SELECT DEP_AIRPORT, COUNT(*) AS TotalFlights,

    MIN(CAST(STANDARD_PRICE AS NUMERIC(7,2))) AS LowestPrice

    FROM cached_offers_flights GROUP BY DEP_AIRPORT) CheapestOffer

    ON cached_offers_flights.DEP_AIRPORT = CheapestOffer.DEP_AIRPORT AND cached_offers_flights.STANDARD_PRICE = CheapestOffer.LowestPrice) Offers

    INNER JOIN Airport AS depAir ON Offers.DEP_AIRPORT = depAir.Code

    INNER JOIN Airport AS destAir ON Offers.ARR_AIRPORT = destAir.Code

    INNER JOIN City AS dest ON destAir.fkCity = dest.ID

    WHERE (dest.Name LIKE 'a%')

    AND (Offers.DATETIME > '2008/05/08 12:00:00')

    AND (CAST(Offers.DEP_DATE AS datetime) >= '2008/05/16')

    AND (Offers.Website = 'www.test.com')

    AND RowNo = 1

    ORDER BY Price

    If this doesn't do what you want, please give me some sample data for the cached_offers_flights table (in the form of insert statements)

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for posting that, gives me the results that I want.

    I've never used the sql snippet below before:

    ROW_NUMBER() OVER PARTITION BY

    so it's gave me something to look into as well...

    Thanks again for your help Gail

    Cheers Mark 😀

  • Hi Gail,

    Unfortunatley the query I thought was working the way I wanted it seems to not pull out all the flights available to the various destinations for each departure airport.

  • I'll take a look, but I'm going to need more than 1 row of data. Can you give me data for all the dest airports like 'b%' (the ones you illustrated in the last post?

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply