Discussion:
Assigning Display Lines to Date Ranges
(too old to reply)
Michael Cole
2015-11-12 00:25:23 UTC
Permalink
I have the following data: -

CREATE TABLE [dbo].[Test](
[DateFrom] [date] NOT NULL,
[DateTo] [date] NOT NULL,
[ExpectedLine] [int] NULL
) ON [PRIMARY]

GO
INSERT [dbo].[Test] ([DateFrom], [DateTo], [ExpectedLine]) VALUES
(CAST(N'2015-01-01' AS Date), CAST(N'2015-01-10' AS Date), 1)
GO
INSERT [dbo].[Test] ([DateFrom], [DateTo], [ExpectedLine]) VALUES
(CAST(N'2015-01-12' AS Date), CAST(N'2015-01-16' AS Date), 1)
GO
INSERT [dbo].[Test] ([DateFrom], [DateTo], [ExpectedLine]) VALUES
(CAST(N'2015-01-15' AS Date), CAST(N'2015-01-20' AS Date), 2)
GO
INSERT [dbo].[Test] ([DateFrom], [DateTo], [ExpectedLine]) VALUES
(CAST(N'2015-01-18' AS Date), CAST(N'2015-01-24' AS Date), 1)
GO
INSERT [dbo].[Test] ([DateFrom], [DateTo], [ExpectedLine]) VALUES
(CAST(N'2015-01-25' AS Date), CAST(N'2015-01-30' AS Date), 1)
GO
INSERT [dbo].[Test] ([DateFrom], [DateTo], [ExpectedLine]) VALUES
(CAST(N'2015-02-01' AS Date), CAST(N'2015-02-07' AS Date), 1)
GO
INSERT [dbo].[Test] ([DateFrom], [DateTo], [ExpectedLine]) VALUES
(CAST(N'2015-02-03' AS Date), CAST(N'2015-02-08' AS Date), 2)
GO
INSERT [dbo].[Test] ([DateFrom], [DateTo], [ExpectedLine]) VALUES
(CAST(N'2015-02-06' AS Date), CAST(N'2015-02-12' AS Date), 3)
GO
INSERT [dbo].[Test] ([DateFrom], [DateTo], [ExpectedLine]) VALUES
(CAST(N'2015-02-11' AS Date), CAST(N'2015-02-20' AS Date), 1)
GO
INSERT [dbo].[Test] ([DateFrom], [DateTo], [ExpectedLine]) VALUES
(CAST(N'2015-02-16' AS Date), CAST(N'2015-02-24' AS Date), 2)
GO

Note that I also have a Date Dimension table (and an Integer Dimension
table if it becomes needed)

I am trying to write a query to return the Line Number, as per the
expected results listed in the table above. This line number is
calculated as follows: -

For each date range, in sequence of the From Date, assign the minimum
line number such that it does not have the same line number as any
prior date range for a date within the respective ranges.

If that doesn't make sense, I'll try to illustrate using the above
data.

Note - all dates in dd/m format.

Range 1 - 1/1 to 10/1 - gets Line 1
Range 2 - 12/1 to 16/1 - also gets Line 1
Range 3 - 15/1 to 20/1 - gets Line 2, as Line 1 for the 15th and 16th
is taken
Range 4 - 18/1 to 24/1 - gets Line 1 - whilst 18-20 are also taken by
Range 3, it was assigned to Line 2, so Line 1 is available
Range 5 - 25/1 to 30/1 - gets Line 1 - no issues
Range 6 - 1/2 to 7/2 - gets Line 1 - no issues
Range 7 - 3/2 to 8/2 - gets Line 2 - 3rd to 7th (Line 1) are taken by
above
Range 8 - 6/2 to 12/2 - gets Line 3 - for 6th and 7th, both Lines 1 and
2 are already taken
Range 9 - 11/2 to 20/2 - gets Line 1 - even though 11th and 12th are
taken by range 8, it was assigned Line 3
Range 10 - 16/2 to 24/2 - gets Line 2

Does this make sense? If not, I shall try to explain it better.

I would prefer a set-based solution, as there will be a lot of data to
go though (about 1500 ranges) and it needs a response time of less than
15 seconds maximum (less than 10 would be better.)


I have tried quite a few things and just cannot get the proper approach
- any advice would be helpful.
--
Michael Cole
Erland Sommarskog
2015-11-12 21:44:33 UTC
Permalink
Post by Michael Cole
I would prefer a set-based solution, as there will be a lot of data to
go though (about 1500 ranges) and it needs a response time of less than
15 seconds maximum (less than 10 would be better.)
It is far from certain that a set-based query would be the fastest in
this case, although I have not tried anything yet. But if the ranges
are tied to different entities that are separate from each other, you could
iterate over the entities in parallel.

A key question: which version of SQL Server are you using?
--
Erland Sommarskog, Stockholm, ***@sommarskog.se
Michael Cole
2015-11-12 22:01:42 UTC
Permalink
Post by Erland Sommarskog
Post by Michael Cole
I would prefer a set-based solution, as there will be a lot of data to
go though (about 1500 ranges) and it needs a response time of less than
15 seconds maximum (less than 10 would be better.)
It is far from certain that a set-based query would be the fastest in
this case, although I have not tried anything yet. But if the ranges
are tied to different entities that are separate from each other, you could
iterate over the entities in parallel.
They are. We are looking at about 70 entities, with a complete date
range of about a year. Individual ranges of about a month. For the
record, they are for bricklayers on new house builds.
Post by Erland Sommarskog
A key question: which version of SQL Server are you using?
Its eventually going to end up on Azure. I'm not quite sure if that
makes it more difficult...
--
Michael Cole
Erland Sommarskog
2015-11-13 13:12:09 UTC
Permalink
Post by Michael Cole
They are. We are looking at about 70 entities, with a complete date
range of about a year. Individual ranges of about a month. For the
record, they are for bricklayers on new house builds.
OK, so we could add HouseID to your test table like the below?

CREATE TABLE [dbo].[Test](
HouseID int NOT NULL,
[DateFrom] [date] NOT NULL,
[DateTo] [date] NOT NULL,
[ExpectedLine] [int] NULL,
PRIMARY KEY (HouseID, DateFrom)
) ON [PRIMARY]

GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-01-01' AS Date), CAST(N'2015-01-10' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-01-12' AS Date), CAST(N'2015-01-16' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-01-15' AS Date), CAST(N'2015-01-20' AS Date), 2)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-01-18' AS Date), CAST(N'2015-01-24' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-01-25' AS Date), CAST(N'2015-01-30' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-02-01' AS Date), CAST(N'2015-02-07' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-02-03' AS Date), CAST(N'2015-02-08' AS Date), 2)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-02-06' AS Date), CAST(N'2015-02-12' AS Date), 3)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-02-11' AS Date), CAST(N'2015-02-20' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-02-16' AS Date), CAST(N'2015-02-24' AS Date), 2)

INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(2, CAST(N'2015-01-01' AS Date), CAST(N'2015-01-10' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(2, CAST(N'2015-01-12' AS Date), CAST(N'2015-01-16' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(2, CAST(N'2015-01-15' AS Date), CAST(N'2015-01-20' AS Date), 2)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(2, CAST(N'2015-01-18' AS Date), CAST(N'2015-01-24' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(2, CAST(N'2015-01-25' AS Date), CAST(N'2015-01-30' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(2, CAST(N'2015-02-01' AS Date), CAST(N'2015-02-07' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(2, CAST(N'2015-02-03' AS Date), CAST(N'2015-02-08' AS Date), 2)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(2, CAST(N'2015-02-06' AS Date), CAST(N'2015-02-12' AS Date), 3)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(2, CAST(N'2015-02-11' AS Date), CAST(N'2015-02-20' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(2, CAST(N'2015-02-16' AS Date), CAST(N'2015-02-24' AS Date), 2)


GO
select * FROM Test
go
drop table Test
--
Erland Sommarskog, Stockholm, ***@sommarskog.se
Erland Sommarskog
2015-11-13 22:18:58 UTC
Permalink
Here is a solution that uses set-based iteration. That is, it first
computes row numbers house. Then I iterate over the row numbers, so
that I can handle the first row for all houses in a single go, and
then move on the second row.

To compute the Line Number, we need a table of numbers. That is a table
with all numbers from 1 and up to some max. As it happens, the highest
number I will ever need, is also the highest row numbers, so I
extract all row numbers to the table #nums which then doubles as a table
of numbers and as a driver for the cursor.

I modified the dates for HouseID = 2, but I have not verified that the
values I compute the desired ones.

CREATE TABLE [dbo].[Test](
HouseID int NOT NULL,
[DateFrom] [date] NOT NULL,
[DateTo] [date] NOT NULL,
[ExpectedLine] [int] NULL,
ComputedValue int NULL,
PRIMARY KEY (HouseID, DateFrom)
) ON [PRIMARY]

GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-01-01' AS Date), CAST(N'2015-01-10' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-01-12' AS Date), CAST(N'2015-01-16' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-01-15' AS Date), CAST(N'2015-01-20' AS Date), 2)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-01-18' AS Date), CAST(N'2015-01-24' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-01-25' AS Date), CAST(N'2015-01-30' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-02-01' AS Date), CAST(N'2015-02-07' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-02-03' AS Date), CAST(N'2015-02-08' AS Date), 2)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-02-06' AS Date), CAST(N'2015-02-12' AS Date), 3)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-02-11' AS Date), CAST(N'2015-02-20' AS Date), 1)
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo], [ExpectedLine]) VALUES
(1, CAST(N'2015-02-16' AS Date), CAST(N'2015-02-24' AS Date), 2)

INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo]) VALUES
(2, CAST(N'2015-04-01' AS Date), CAST(N'2015-04-17' AS Date))
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo]) VALUES
(2, CAST(N'2015-04-12' AS Date), CAST(N'2015-04-23' AS Date))
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo]) VALUES
(2, CAST(N'2015-04-15' AS Date), CAST(N'2015-04-27' AS Date))
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo]) VALUES
(2, CAST(N'2015-04-18' AS Date), CAST(N'2015-04-24' AS Date))
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo]) VALUES
(2, CAST(N'2015-04-25' AS Date), CAST(N'2015-04-30' AS Date))
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo]) VALUES
(2, CAST(N'2015-05-05' AS Date), CAST(N'2015-05-07' AS Date))
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo]) VALUES
(2, CAST(N'2015-05-03' AS Date), CAST(N'2015-05-08' AS Date))
GO
INSERT [dbo].[Test] (HouseID, DateFrom, [DateTo]) VALUES
(2, CAST(N'2015-05-02' AS Date), CAST(N'2015-05-12' AS Date))
GO
CREATE TABLE #temp (HouseID int NOT NULL,
DateFrom date NOT NULL,
DateTo date NOT NULL,
rowno int NOT NULL,
PRIMARY KEY (rowno, HouseID))

INSERT #temp(HouseID, DateFrom, DateTo, rowno)
SELECT HouseID, DateFrom, DateTo,
row_number() OVER (PARTITION BY HouseID ORDER BY DateFrom)
FROM Test

CREATE TABLE #nums(num int NOT NULL PRIMARY KEY)
INSERT #nums (num)
SELECT DISTINCT rowno FROM #temp

DECLARE @rowno int

DECLARE cur CURSOR STATIC LOCAL FOR
SELECT num FROM #nums ORDER BY num

OPEN cur

WHILE 1 = 1
BEGIN
FETCH cur INTO @rowno
IF @@fetch_status <> 0
BREAK

UPDATE Test

SET ComputedValue = val.val
FROM Test a
JOIN #temp b ON a.HouseID = b.HouseID
AND a.DateFrom = b.DateFrom
CROSS APPLY (SELECT MIN(n.num) AS val
FROM #nums n
WHERE NOT EXISTS (SELECT *
FROM Test a2
WHERE a.HouseID = a2.HouseID
AND a.DateFrom < a2.DateTo
AND a.DateTo > a2.DateFrom
AND n.num = a2.ComputedValue)) AS val(val)
WHERE b.rowno = @rowno
END

DEALLOCATE cur
go
SELECT * FROM Test ORDER BY HouseID, DateFrom
go
drop table Test, #temp, #nums
--
Erland Sommarskog, Stockholm, ***@sommarskog.se
Michael Cole
2015-11-16 00:34:15 UTC
Permalink
Post by Erland Sommarskog
Here is a solution that uses set-based iteration. That is, it first
computes row numbers house. Then I iterate over the row numbers, so
that I can handle the first row for all houses in a single go, and
then move on the second row.
Erland,

Thank you very much. Works brilliantly and very quickly. Much
appreciated, and I also learnt a great deal from it.
--
Michael Cole
--CELKO--
2015-11-13 02:47:15 UTC
Permalink
Thank you for trying to do this right.Pleas e3learn thew ISO-11179 rules for data element names. Also, you need to read and download the PDF for: https:--www.simple-talk.com-books-sql-books-119-sql-code-smells-


We have no key, no constraints, etc. Is this what you were try to say?: Tables have keys; where were you on day one of RDBMS basics?

CREATE TABLE Tests
(test_start_date DATE NOT NULL PRIMARY KEY,
test_end_date DATE NOT NULL,
CHECK(test_start_date <= test_end_date),
expected_line_value INTEGER
);

Please learn the current SQL syntax. Why did you force casting?

INSERT INTO Tests
VALUES
('2015-01-01', '2015-01-10', 1),
('2015-01-12', '2015-01-16', 1),
('2015-01-15', '2015-01-20', 2),
('2015-01-18', '2015-01-24', 1),
('2015-01-25', '2015-01-30', 1),
('2015-02-01', '2015-02-07', 1),
('2015-02-03', '2015-02-08', 2),
('2015-02-06', '2015-02-12', 3),
('2015-02-11', '2015-02-20', 1),
('2015-02-16', '2015-02-24', 2);
Note that I also have a Date Dimension table (and an Integer Dimension table if it becomes needed) <<
WHY? Where?
I am trying to write a query to return the Line Number, as per the
expected results listed in the table above. This line number is
calculated as follows: <<

What is a "line number"? It sound likes a physical locator, but that would be soooo wrong.
For each date range, in sequence of the test_start_date to test_end_date, assign the minimum line number such that it does not have the same line number as any prior date range for a date within the respective ranges. <<
unh?
If that doesn't make sense, I'll try to illustrate using the above data. <<
Why did you fail to use ISO-8601 dates!? It is bitch for your free labor to re-type your postings for you!
Note - all dates in dd/m format. <<
Gee, you should follow ISO-8601 rules for displaying temporal data (https://xkcd.com/1179/). Why are you soooo special?

CREATE TABLE ReportRanges
(report_name VARCHAR(30) NOT NULL PRIMARY KEY,
report_start_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
report_end_date DATE NOT NULL,
CONSTRAINT date_ordering
CHECK (report_start_date <= report_end_date),
etc);

These reports can overlap; a fiscal quarter will be contained in the range of its fiscal year. There can be gaps between them; we have to wait a year between each "Annual Going out Of Business Sale!" and there might be long stretches of time without any special sales. But we want to know their ranges so that the table is fairly constant once it is created.

The basic skeleton for use with these tables is

SELECT R.report_name, << summary computations >>
FROM ReportRanges AS R, [Events] AS E
WHERE E.event_date BETWEEN R.report_start_date AND report_end_date
AND R.report_name IN (<<report name list>>)
GROUP BY R.report_name;

The <<report name list>> would probably be events nested inside each other, like fiscal quarters inside a fiscal year. While this is a useful and portable programming trick, you need to consider replacing it with the newer OLAP extensions to the GROUP BY clause such as ROLLUP and CUBE.
Loading...