Search code examples
sqlt-sqljoinwindow-functionscalculated-columns

How do I create a Query that calculates the adverage of a computed column


I need to create a query that shows the average productivity for editors that have worked on more than one book except for their first book published with a precision of 0.01 pages/day.

I'm having trouble finding a way to get the number of days since the last DateOfPublication column and use it to divide it from the NoOfPages column.

The columns to be displayed are

EditorName

BookName

computed column AverageProductivity

Here are the tables and their columns

AGENT  AgentID (PK,varchar(11), not null)
       AgentName (varchar(25), not null)

BOOK   BookName (PK, varchar(45), not null)
       Genre (varchar(25), not null)
       DateOfPublication (date, not null)
       NoOfPages (int, not null)
       WriterID (PK, FK,, varchar(11), not null)
       EditorID (FK, varchar(11), not null)

EDITOR EditorID (PK, varchar(11), not null)
       EditorName (varchar(25), not null)
       Mentors_EditorID (FK, varchar(11), null)

WRITER WriterID (PK, varchar(11), not null)
       WriterName (varchar(25), not null)
       AgentID (FK, varchar(11), not null)

Sample Data

insert into BOOK (BookName, WriterID, Genre, DateOfPublication, NoOfPages, EditorID)
values ('Valley of Heroes','10','Fiction','2010-01-12',874,'20');
insert into BOOK (BookName, WriterID, Genre, DateOfPublication, NoOfPages, EditorID)
values ('The Ruler''s Return','11','Fantasy','2012-03-14',765,'22');
insert into BOOK (BookName, WriterID, Genre, DateOfPublication, NoOfPages, EditorID)
values ('eRobot','11','Fantasy','2011-04-15',264,'20');
insert into BOOK (BookName, WriterID, Genre, DateOfPublication, NoOfPages, EditorID)
values ('An Uncle''s Letters','12','Fiction','2012-06-12',258,'20');
insert into BOOK (BookName, WriterID, Genre, DateOfPublication, NoOfPages, EditorID)
values ('Pretty flowers','13','Album','2013-01-31',148,'22');
insert into BOOK (BookName, WriterID, Genre, DateOfPublication, NoOfPages, EditorID)
values ('A Tale of Lions','12','Fantasy','2012-08-17',301,'21');
insert into BOOK (BookName, WriterID, Genre, DateOfPublication, NoOfPages, EditorID)
values ('eRobot','13','Sci Fi','2012-10-04',465,'23');

The Query is producing the correct Columns now, credit to GMB, but The calculated column is showing 0 values.

Here is the Query...

select * from (
    select 
    e.EditorName,
    b.BookName,
    round(
        NoOfPages/datediff(
            day, 
            lag(b.DateOfPublication) over(partition by b.EditorID order by b.DateOfPublication),
            DateOfPublication
        ),
        2
    ) AverageProductivity       
from book b
inner join editor e on e.EditorID = b.EditorID 
) x where AverageProductivity is not null

Results...

Melanie eRobot  0
Melanie An Uncle's Letters  0
George  Pretty flowers  0

Solution

  • You can use window function lag() to recover the date of the previous publication of the same editor.

    Then, datediff(day, ...) can give you the difference between the publication date of the last book and the current one, in days.

    Finally, divide the number of pages of the current book with the day difference, use round() to limit the number of decimals, and you are done.

    For the first book of an editor, lag() returns null, which will propagate in the computation, resulting in the computed column showing null as well.

    select 
        e.EditorName,
        b.BookName,
        round(
            NoOfPages/datediff(
                day, 
                lag(b.DateOfPublication) over(partition by b.EditorID order by b.DateOfPublication),
                DateOfPublication
            ),
            2
        ) AverageProductivity       
    from book b
    inner join editor e on e.EditorID = b.EditorID 
    

    If you want to skip the records that correspond to the first book of each editor, then you can wrap the query:

    select * from (
        -- above query
    ) x where AverageProductivity is not null