Wednesday, March 7, 2012

I need some help with a complex query

I've written a lot of queries in the past, but I'm having a lot of trouble with this one.

I have 3 tables: Thread, Reply, User

Thread has these relevant fields: ThreadID, UserID, DTStamp, Subject

Reply has these relevant fields: ThreadID, UserID, DTStamp

User has these relavent fields: UserID, Name

A few details:
- Thread and Reply connect with ThreadID
- Thread and Reply both connect to Person with UserID
- Thread and Reply share a 1 to many relationship (1 Thread with many Replies)
- It is also possible there are no replies

What I need is a query that looks at thedata and returns a set of records with the following data fields:

Subject: The Subject of the thread
CreationDate: The Date that Subject thread was created
Author: The Name of the UserID that created that thread
Replies: The Number of Replies to the Subject thread
LastPost: The Date of the Last Reply
LastPostAuthor: The Name of the UserID of the Last Reply

And I need this all sorted by Date of Last Reply

Is this even doable? Are there any suggestions on the best way to get started?


Thanks in advance,

Chris

Hi Chris my friend, I will help you on this.

First run the following SQL to create a function: -

CREATE FUNCTION fn_GetLastPostUserNameByThreadID
(
@.ThreadID AS INT
)

RETURNS varchar(30)

AS

BEGIN

DECLARE @.Author AS VARCHAR(30)

SET @.Author = (
SELECT Name FROM User WHERE UserID =
(SELECT TOP 1 USERID FROM Reply WHERE ThreadID = @.ThreadID
ORDER BY dtstamp DESC)
)

RETURN @.Author

END

Now for the SQL to get you the results: -

SELECT thread.Subject,
thread.dtstamp,
user.name,
COUNT(reply.threadid) AS Replies,
MAX(reply.dtstamp) AS LastPost,
dbo.fn_GetLastPostUserNameByThreadID(thread.threadid) AS LastPostAuthor

FROM thread

INNER JOIN user on thread.userid = user.userid
LEFT OUTER JOIN reply ON thread.threadid = reply.threadid

GROUP BY Subject,
thread.dtstamp,
user.name,
dbo.fn_GetLastPostUserNameByThreadID(thread.threadid)

ORDER BY reply.dtstamp DESC

|||

Thank you so much! I can't wait to try this out.

I have created stored procedures before, but not functions, so I have a couple follow-up questions.

When I look at my database through SQL Server Management Studio Express, I see funtions and under that 4 categories:

Table-valued Functions|||

Hi Chris,

A Table-valued function returns a table. To use one, you do "SELECT * FROM dbo.MyFunction()". The following example takes a string and turns it into a table.

create FUNCTION dbo.StringArrayIntoTable
(
@.String VARCHAR(8000),
@.Separator VARCHAR(1)
)
RETURNS @.tblStrings TABLE(Item VARCHAR(8000))

AS

BEGIN

DECLARE @.pos INT,
@.SubStr VARCHAR(10)


SET @.pos = CHARINDEX(@.Separator, @.String)

WHILE @.pos > 0
BEGIN
SET @.SubStr = SUBSTRING(@.String, 0, @.pos)

INSERT INTO @.tblStrings (Item) VALUES (@.SubStr)

SET @.String = SUBSTRING(@.String, LEN(@.SubStr) + 2, LEN(@.String) - LEN(@.SubStr) + 1)
SET @.pos = CHARINDEX(@.Separator, @.String)
END

INSERT INTO @.tblStrings (Item) VALUES (@.String)
RETURN
END

The first parameter is the string. The second is the separator. Test it with the following: -

select * from dbo.StringArrayIntoTable('red,blue,yellow', ',')
select * from dbo.StringArrayIntoTable('USA|Germany|Russia|UK', '|')

This is useful if you need to pass an array of values into a stored procedure. Just pass in a string that you can separate!

Scalar valued functions only return one value and when using them, you don't use "SELECT * FROM", just "SELECT FunctionName()". You would create one of these if you wanted a function that only returned one value, like the one I gave to you that returns 1 varchar; the author's name.

Aggregate functions are built-in scalar valued functions. For example, SUM and AVG; select SUM(SaleValue) AS Total, AVG(SaleValue) AS AverageSale FROM tblSales.

As for System functions, some are more useful than others. For example, if you want to return 0 for SaleValue if the field value is null, you can use SELECT IsNull(SaleValue, 0) AS SaleValue. It returns whatever the SalesValue is, but 0 if it is NULL.

As for your second question, always use a stored procedure. When a stored procedure cannot give you directly what you need without calling a function, as in the problem you posted, have the procedure call a function. Stored procedures have pre-compiled execution plans and execute more efficiently. On the other hand, performance hits are associated with functions so use them only when necessary.

By the way. I noticed that within the SQL I gave to you I used "user" to refer to the author table. This cannot be right because user is not a valid table name. I did not realize this at first because I did it in Notepad. Please substitute this with the correct table name and the SQL should work.

In return for all of this help, I only ask that you mark me as the answerer of your question in this forum.

Kind regards

Scotty

|||

Scotty,

Wow! I can't begin to thank you enough for your help. Your code and explanations are just what I needed.

FYI, my table is called User. I probably should change the name, but for now I just refer to it like this [User] and it works okay.

Also, I made one other change to the code, I changed the last line to "Order By LastPost Desc" as it had a problem with "reply.dtstamp" not being part of the result set.

Thanks again,

Chris

p.s. If you want to see your code in action, feel free to check out my site MoviePoet.com early next week.

No comments:

Post a Comment