group-by interview questions
Top group-by frequently asked interview questions
How can I group by multiple columns in LINQ?
Something similar to this in SQL:
SELECT * FROM <TableName> GROUP BY <Column1>,<Column2>
How can I convert this to LINQ:
QuantityBreakdown
(
MaterialID int,
ProductID int,
Quantity float
)
INSERT INTO @QuantityBreakdown (MaterialID, ProductID, Quantity)
SELECT MaterialID, ProductID, SUM(Quantity)
FROM @Transactions
GROUP BY MaterialID, ProductID
Source: (StackOverflow)
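As a rough sketch of the grouping asked about above, the SQL form can be run against an in-memory SQLite database from Python. The table name `transactions` and the sample rows are assumptions made for illustration; the original uses a T-SQL table variable, which SQLite does not have. The point it shows is that grouping on two columns yields one row per distinct (MaterialID, ProductID) pair.

```python
import sqlite3

def quantity_breakdown(rows):
    """Sum Quantity per (MaterialID, ProductID) pair."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in for the @Transactions table variable.
    conn.execute(
        "CREATE TABLE transactions (MaterialID INT, ProductID INT, Quantity REAL)"
    )
    conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT MaterialID, ProductID, SUM(Quantity) "
        "FROM transactions "
        "GROUP BY MaterialID, ProductID "
        "ORDER BY MaterialID, ProductID"
    ).fetchall()
    conn.close()
    return result

# Made-up sample data: two rows share the pair (1, 10).
sample = [(1, 10, 2.0), (1, 10, 3.0), (1, 20, 1.5), (2, 10, 4.0)]
print(quantity_breakdown(sample))
# [(1, 10, 5.0), (1, 20, 1.5), (2, 10, 4.0)]
```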
For example, when I run
SELECT [Date]
FROM [FRIIB].[dbo].[ArchiveAnalog]
GROUP BY [Date]
How can I specify the group period?
MS SQL 2008
2nd Edit
I'm trying
SELECT MIN([Date]) AS RecT, AVG(Value)
FROM [FRIIB].[dbo].[ArchiveAnalog]
GROUP BY (DATEPART(MINUTE, [Date]) / 10)
ORDER BY RecT
I changed %10 to / 10. Is it possible to output the Date without milliseconds?
Source: (StackOverflow)
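One detail worth noting about the attempt above: DATEPART(MINUTE, [Date]) / 10 only takes the values 0 through 5, so rows from different hours that share a minute range land in the same group. The sketch below shows the intended bucketing in plain Python rather than T-SQL: each timestamp is floored to the start of its 10-minute window, which also removes seconds and fractions of a second from the output. The function name and the sample readings are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime

def average_per_ten_minutes(readings):
    """Average values per 10-minute window.

    readings: iterable of (datetime, float) pairs.
    Returns a list of (window_start, average) pairs sorted by window_start.
    """
    buckets = defaultdict(list)
    for stamp, value in readings:
        # Floor to the start of the 10-minute window; this keeps the date
        # and hour in the key and drops seconds and microseconds.
        start = stamp.replace(minute=stamp.minute - stamp.minute % 10,
                              second=0, microsecond=0)
        buckets[start].append(value)
    return [(start, sum(vals) / len(vals))
            for start, vals in sorted(buckets.items())]

# Made-up readings: 09:01 and 10:05 both have minute // 10 == 0,
# but they belong to different windows.
readings = [
    (datetime(2010, 1, 1, 9, 1, 30, 500000), 1.0),
    (datetime(2010, 1, 1, 9, 8, 0), 3.0),
    (datetime(2010, 1, 1, 9, 12, 0), 10.0),
    (datetime(2010, 1, 1, 10, 5, 0), 20.0),
]
for start, avg in average_per_ten_minutes(readings):
    print(start, avg)
```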
Just curious about SQL syntax. So if I have
SELECT
itemName as ItemName,
substring(itemName, 1,1) as FirstLetter,
Count(itemName)
FROM table1
GROUP BY itemName, FirstLetter
This would be incorrect because
GROUP BY itemName, FirstLetter
really should be
GROUP BY itemName, substring(itemName, 1,1)
But why can't we simply use the former for convenience?
Source: (StackOverflow)
What correction does Example 2 need in order to group by multiple columns?
Example 1
var query = from cm in cust
group cm by new { cm.Customer, cm.OrderDate } into cms
select
new
{ Key1 = cms.Key.Customer,Key2=cms.Key.OrderDate,Count=cms.Count() };
Example 2 (incorrect)
var qry =
cust.GroupBy(p => p.Customer, q => q.OrderDate, (k1, k2, group) =>
new { Key1 = k1, Key2 = k2, Count = group.Count() });
Source: (StackOverflow)
Someone sent me a SQL query where the GROUP BY clause consisted of the statement: GROUP BY 1.
This must be a typo right? No column is given the alias 1. What could this mean? Am I right to assume that this must be a typo?
Source: (StackOverflow)
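In several databases (MySQL, PostgreSQL and SQLite among them) an integer in GROUP BY is read as the position of a column in the select list, so GROUP BY 1 means "group by the first selected column". Not every database accepts this form. The sketch below checks the behaviour in SQLite from Python; the `orders` table, its rows and the helper name are assumptions for illustration.

```python
import sqlite3

def totals_by_customer(group_by_clause):
    """Run the same aggregate with a caller-chosen GROUP BY clause."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("Joe", 5), ("Sally", 3), ("Joe", 2)],
    )
    # The clause is interpolated only because this is a fixed demo string;
    # never build SQL this way from untrusted input.
    rows = conn.execute(
        "SELECT customer, SUM(amount) FROM orders "
        f"GROUP BY {group_by_clause} ORDER BY customer"
    ).fetchall()
    conn.close()
    return rows

by_position = totals_by_customer("1")         # ordinal: first select-list column
by_name = totals_by_customer("customer")      # explicit column name
print(by_position)             # [('Joe', 7), ('Sally', 3)]
print(by_position == by_name)  # True
```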
How do I write this query in LINQ (VB.NET)?
select B.Name
from Company B
group by B.Name
having COUNT(1) > 1
Source: (StackOverflow)
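The query above keeps only the names that occur more than once. As an in-memory analogue (in Python, not VB.NET LINQ), the same "group, count, then filter on the count" shape can be sketched as follows; the function name and the company names are made up for illustration.

```python
from collections import Counter

def duplicated_names(names):
    """Return the names that appear more than once, sorted alphabetically."""
    counts = Counter(names)  # plays the role of GROUP BY name + COUNT
    # The filter on the count corresponds to HAVING COUNT(1) > 1.
    return sorted(name for name, n in counts.items() if n > 1)

companies = ["Acme", "Globex", "Acme", "Initech", "Globex", "Acme"]
print(duplicated_names(companies))  # ['Acme', 'Globex']
```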
Suppose we have a class like
class Person {
internal int PersonID;
internal string car ;
}
Now I have a list of this class: List<Person> persons;
This list can contain multiple instances with the same PersonID, for example:
persons[0] = new Person { PersonID = 1, car = "Ferrari" };
persons[1] = new Person { PersonID = 1, car = "BMW" };
persons[2] = new Person { PersonID = 2, car = "Audi" };
Is there a way I can group by PersonID and get the list of all the cars each person has?
For example, the expected result would be
class Result {
int PersonID;
List<string> cars;
}
So after grouping by I would get:
results[0].PersonID = 1;
List<string> cars = results[0].cars;
results[1].PersonID = 2;
List<string> cars = results[1].cars;
From what I have done so far:
var results = from p in persons
group p by p.PersonID into g
select new { PersonID = g.Key, // this is where I am not sure what to do
Could someone please point me in the right direction?
Source: (StackOverflow)
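The shape of the result asked for above (one entry per person ID, each holding that person's cars) can be sketched in Python as a dictionary of lists. This is an analogue of the grouping, not the LINQ answer itself; the snake_case field names and the helper name are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Person:
    person_id: int
    car: str

def cars_by_person(persons):
    """Map each person_id to the list of cars, in input order."""
    grouped = defaultdict(list)
    for p in persons:
        grouped[p.person_id].append(p.car)
    return dict(grouped)

persons = [Person(1, "Ferrari"), Person(1, "BMW"), Person(2, "Audi")]
print(cars_by_person(persons))  # {1: ['Ferrari', 'BMW'], 2: ['Audi']}
```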
As the title suggests, I'd like to select the first row of each set of rows grouped with a GROUP BY.
Specifically, if I've got a purchases table that looks like this:
SELECT * FROM purchases;
id | customer | total
---+----------+------
1 | Joe | 5
2 | Sally | 3
3 | Joe | 2
4 | Sally | 1
I'd like to query for the id of the largest purchase (total) made by each customer. Something like this:
SELECT FIRST(id), customer, FIRST(total)
FROM purchases
GROUP BY customer
ORDER BY total DESC;
FIRST(id) | customer | FIRST(total)
----------+----------+-------------
1 | Joe | 5
2 | Sally | 3
Source: (StackOverflow)
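One common way to get the row with the largest total per customer is to compute the per-customer maximum in a subquery and join it back to the table. The sketch below runs that in SQLite from Python using the purchases data from the question; it is one possible approach, not the only one, and if a customer has two purchases tied for the maximum it returns both rows.

```python
import sqlite3

def largest_purchases(rows):
    """Return (id, customer, total) for each customer's largest purchase."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE purchases (id INT, customer TEXT, total INT)")
    conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT p.id, p.customer, p.total
        FROM purchases AS p
        JOIN (SELECT customer, MAX(total) AS max_total
              FROM purchases
              GROUP BY customer) AS m
          ON m.customer = p.customer AND m.max_total = p.total
        ORDER BY p.total DESC
        """
    ).fetchall()
    conn.close()
    return result

purchases = [(1, "Joe", 5), (2, "Sally", 3), (3, "Joe", 2), (4, "Sally", 1)]
print(largest_purchases(purchases))  # [(1, 'Joe', 5), (2, 'Sally', 3)]
```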
Is it possible to write a simple query that counts how many records I have in a given period of time, such as a year, month or day, using a TIMESTAMP field? For example:
SELECT COUNT(id)
FROM stats
WHERE record_date.YEAR = 2009
GROUP BY record_date.YEAR
Or even:
SELECT COUNT(id)
FROM stats
GROUP BY record_date.YEAR, record_date.MONTH
To have a monthly statistic.
Thanks!
Source: (StackOverflow)
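The usual approach is to group on expressions that extract the year and month from the timestamp. The extraction function differs between databases; the sketch below uses SQLite's strftime from Python, with timestamps stored as ISO-8601 text. The sample dates and the helper name are assumptions for illustration.

```python
import sqlite3

def monthly_counts(dates):
    """Count rows per (year, month); dates are 'YYYY-MM-DD HH:MM:SS' strings."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stats (id INTEGER PRIMARY KEY, record_date TEXT)")
    conn.executemany(
        "INSERT INTO stats (record_date) VALUES (?)", [(d,) for d in dates]
    )
    result = conn.execute(
        "SELECT strftime('%Y', record_date) AS yr, "
        "       strftime('%m', record_date) AS mon, "
        "       COUNT(id) "
        "FROM stats "
        "GROUP BY yr, mon "   # SQLite accepts select-list aliases here
        "ORDER BY yr, mon"
    ).fetchall()
    conn.close()
    return result

dates = [
    "2009-01-15 10:00:00",
    "2009-01-20 12:30:00",
    "2009-02-01 08:00:00",
    "2010-01-05 09:00:00",
]
print(monthly_counts(dates))
# [('2009', '01', 2), ('2009', '02', 1), ('2010', '01', 1)]
```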
There is a table messages that contains data as shown below:
Id Name Other_Columns
-------------------------
1 A A_data_1
2 A A_data_2
3 A A_data_3
4 B B_data_1
5 B B_data_2
6 C C_data_1
If I run a query select * from messages group by name, I will get the result as:
1 A A_data_1
4 B B_data_1
6 C C_data_1
What query will return the following result?
3 A A_data_3
5 B B_data_2
6 C C_data_1
That is, the last record in each group should be returned.
At present, this is the query that I use:
select * from (select * from messages ORDER BY id DESC) AS x GROUP BY name
But this looks highly inefficient. Any other ways to achieve the same result?
Source: (StackOverflow)
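Assuming "last" means the highest Id within each name, one alternative to sorting the whole table first is to pick the maximum Id per group in a subquery and filter on it. The sketch below checks this in SQLite from Python with the rows from the question; the snake_case column names are an adaptation, and relative performance is not measured here.

```python
import sqlite3

def last_messages(rows):
    """Return the row with the highest id for each name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (id INT, name TEXT, other_columns TEXT)")
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT id, name, other_columns "
        "FROM messages "
        "WHERE id IN (SELECT MAX(id) FROM messages GROUP BY name) "
        "ORDER BY id"
    ).fetchall()
    conn.close()
    return result

messages = [
    (1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
    (4, "B", "B_data_1"), (5, "B", "B_data_2"),
    (6, "C", "C_data_1"),
]
print(last_messages(messages))
# [(3, 'A', 'A_data_3'), (5, 'B', 'B_data_2'), (6, 'C', 'C_data_1')]
```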
I understand the point of group by x. But how does group by x, y work, and what does it mean?
Source: (StackOverflow)
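A small experiment may make the difference concrete: group by x produces one row per distinct value of x, while group by x, y produces one row per distinct (x, y) combination. The sketch below runs both against a made-up table `t` in SQLite from Python; the table and its rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x TEXT, y INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("a", 1), ("a", 1), ("a", 2), ("b", 1)],
)

# One output row per distinct x.
by_x = conn.execute(
    "SELECT x, COUNT(*) FROM t GROUP BY x ORDER BY x"
).fetchall()

# One output row per distinct (x, y) pair.
by_x_y = conn.execute(
    "SELECT x, y, COUNT(*) FROM t GROUP BY x, y ORDER BY x, y"
).fetchall()
conn.close()

print(by_x)    # [('a', 3), ('b', 1)]
print(by_x_y)  # [('a', 1, 2), ('a', 2, 1), ('b', 1, 1)]
```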
I am looking for a way to concatenate the strings of a field within a group by query. So for example, I have a table:
ID COMPANY_ID EMPLOYEE
1 1 Anna
2 1 Bill
3 2 Carol
4 2 Dave
and I wanted to group by company_id to get something like:
COMPANY_ID EMPLOYEE
1 Anna, Bill
2 Carol, Dave
MySQL has a built-in function for this: group_concat.
Source: (StackOverflow)
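SQLite happens to provide an aggregate with the same name, group_concat, which takes an optional separator as its second argument. The sketch below uses it from Python on the table from the question, as an illustration of the desired output rather than an answer for any particular database. SQLite does not guarantee the order of the concatenated names, so no exact output is shown.

```python
import sqlite3

def employees_by_company(rows):
    """Return (company_id, 'name, name, ...') pairs, one per company."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (id INTEGER, company_id INTEGER, employee TEXT)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT company_id, group_concat(employee, ', ') "
        "FROM employees "
        "GROUP BY company_id "
        "ORDER BY company_id"
    ).fetchall()
    conn.close()
    return result

employees = [(1, 1, "Anna"), (2, 1, "Bill"), (3, 2, "Carol"), (4, 2, "Dave")]
for company_id, names in employees_by_company(employees):
    print(company_id, names)
```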
I have a table which I want to get the latest entry for each group. Here's the table:
DocumentStatusLogs Table
|ID| DocumentID | Status | DateCreated |
| 2| 1 | S1 | 7/29/2011 |
| 3| 1 | S2 | 7/30/2011 |
| 6| 1 | S1 | 8/02/2011 |
| 1| 2 | S1 | 7/28/2011 |
| 4| 2 | S2 | 7/30/2011 |
| 5| 2 | S3 | 8/01/2011 |
| 6| 3 | S1 | 8/02/2011 |
The table will be grouped by DocumentID and sorted by DateCreated in descending order. For each DocumentID, I want to get the latest status.
My preferred output:
| DocumentID | Status | DateCreated |
| 1 | S1 | 8/02/2011 |
| 2 | S3 | 8/01/2011 |
| 3 | S1 | 8/02/2011 |
Is there any aggregate function to get only the top from each group? See pseudo-code GetOnlyTheTop below:
select DocumentID, GetOnlyTheTop(Status), GetOnlyTheTop(DateCreated)
from DocumentStatusLogs
group by DocumentID
order by DateCreated desc
If such function doesn't exist, is there any way I can achieve the output I want?
Or, in the first place, could this be caused by an unnormalized database? I'm thinking, since what I'm looking for is just one row, should that status also be located in the parent table?
Please see the parent table for more information:
Current Documents Table
| DocumentID | Title | Content | DateCreated |
| 1 | TitleA | ... | ... |
| 2 | TitleB | ... | ... |
| 3 | TitleC | ... | ... |
Should the parent table be like this so that I can easily access its status?
| DocumentID | Title | Content | DateCreated | CurrentStatus |
| 1 | TitleA | ... | ... | S1 |
| 2 | TitleB | ... | ... | S3 |
| 3 | TitleC | ... | ... | S1 |
UPDATE
I just learned how to use "apply" which makes it easier to address such problems.
Source: (StackOverflow)
public class ConsolidatedChild
{
public string School { get; set; }
public string Friend { get; set; }
public string FavoriteColor { get; set; }
public List<Child> Children { get; set; }
}
public class Child
{
public string School { get; set; }
public string Name { get; set; }
public string Address { get; set; }
public string Friend { get; set; }
public string Mother { get; set; }
public string FavoriteColor { get; set; }
}
Given the two classes above, I would like to use LINQ to create a List<ConsolidatedChild> from the List<Child>, grouped by the School, Friend and FavoriteColor properties. Is this possible with LINQ?
Please ignore the properties, the code has been written just to help with the question.
Source: (StackOverflow)
I learned something simple about SQL the other day:
SELECT c FROM myTbl GROUP BY c
Has the same result as:
SELECT DISTINCT c FROM myTbl
What I am curious about is whether there is any difference in the way an SQL engine processes the two commands, or whether they are truly the same thing.
I personally prefer the distinct syntax, but I am sure it's more out of habit than anything else.
EDIT: This is not a question about aggregates. The use of GROUP BY with aggregate functions is understood.
Source: (StackOverflow)