find missing dates from date range
I guess you could always generate the date sequence and just use a NOT IN
to eliminate the dates that actually exist. This will max out at a 1024 day range, but is easy to shrink or extend, the date column is called "mydate" and is in the table "table1";
SELECT * FROM (
SELECT DATE_ADD('2013-08-01', INTERVAL t4+t16+t64+t256+t1024 DAY) day
FROM
(SELECT 0 t4 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 ) t4,
(SELECT 0 t16 UNION ALL SELECT 4 UNION ALL SELECT 8 UNION ALL SELECT 12 ) t16,
(SELECT 0 t64 UNION ALL SELECT 16 UNION ALL SELECT 32 UNION ALL SELECT 48 ) t64,
(SELECT 0 t256 UNION ALL SELECT 64 UNION ALL SELECT 128 UNION ALL SELECT 192) t256,
(SELECT 0 t1024 UNION ALL SELECT 256 UNION ALL SELECT 512 UNION ALL SELECT 768) t1024
) b
WHERE day NOT IN (SELECT mydate FROM Table1) AND day<'2013-08-13';
From the "I would add an SQLfiddle if it wasn't down" dept.
Thanks for help here is the query i am end up with and its working
SELECT * FROM
(
SELECT DATE_ADD('2013-08-01', INTERVAL t4+t16+t64+t256+t1024 DAY) missingDates
FROM
(SELECT 0 t4 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 ) t4,
(SELECT 0 t16 UNION ALL SELECT 4 UNION ALL SELECT 8 UNION ALL SELECT 12 ) t16,
(SELECT 0 t64 UNION ALL SELECT 16 UNION ALL SELECT 32 UNION ALL SELECT 48 ) t64,
(SELECT 0 t256 UNION ALL SELECT 64 UNION ALL SELECT 128 UNION ALL SELECT 192) t256,
(SELECT 0 t1024 UNION ALL SELECT 256 UNION ALL SELECT 512 UNION ALL SELECT 768) t1024
) b
WHERE
missingDates NOT IN (SELECT DATE_FORMAT(start_date,'%Y-%m-%d')
FROM
working GROUP BY start_date)
AND
missingDates < '2013-08-13';
My bet would be probably to create a dedicated Calendar
table just to be able to use it on a LEFT JOIN
.
You could create the table on per need basis, but as it will not represent a such large amount of data, the simplest and probably most efficient approach is to create it once for all, as I do below using a stored procedure:
--
-- Create a dedicated "Calendar" table
--
CREATE TABLE Calendar (day DATE PRIMARY KEY);
DELIMITER //
CREATE PROCEDURE init_calendar(IN pStart DATE, IN pEnd DATE)
BEGIN
SET @theDate := pStart;
REPEAT
-- Here I use *IGNORE* in order to be able
-- to call init_calendar again for extend the
-- "calendar range" without to bother with
-- "overlapping" dates
INSERT IGNORE INTO Calendar VALUES (@theDate);
SET @theDate := @theDate + INTERVAL 1 DAY;
UNTIL @theDate > pEnd END REPEAT;
END; //
DELIMITER ;
CALL init_calendar('2010-01-01','2015-12-31');
In this example, the Calendar hold 2191 consecutive days, which represent at a roughly estimate less that 15KB. And storing all the dates from the 21th century will represent less that 300KB...
Now, this is your actual data table as described in the question:
--
-- *Your* actual data table
--
CREATE TABLE tbl (theDate DATE);
INSERT INTO tbl VALUES
('2013-08-02'),
('2013-08-02'),
('2013-08-02'),
('2013-08-03'),
('2013-08-05'),
('2013-08-08'),
('2013-08-08'),
('2013-08-09'),
('2013-08-10'),
('2013-08-13'),
('2013-08-13'),
('2013-08-13');
And finally the query:
--
-- Now the query to find date not "in range"
--
SET @start = '2013-08-01';
SET @end = '2013-08-13';
SELECT Calendar.day FROM Calendar LEFT JOIN tbl
ON Calendar.day = tbl.theDate
WHERE Calendar.day BETWEEN @start AND @end
AND tbl.theDate IS NULL;
Producing:
+------------+
| day |
+------------+
| 2013-08-01 |
| 2013-08-04 |
| 2013-08-06 |
| 2013-08-07 |
| 2013-08-11 |
| 2013-08-12 |
+------------+
This is how i would do it:
$db_dates = array (
'2013-08-02',
'2013-08-03',
'2013-08-05',
'2013-08-08',
'2013-08-09',
'2013-08-10',
'2013-08-13'
);
$missing = array();
$month = "08";
$year = "2013";
$day_start = 1;
$day_end = 14
for ($i=$day_start; $i<$day_end; $i++) {
$day = $i;
if ($i<10) {
$day = "0".$i;
}
$check_date = $year."-".$month."-".$day;
if (!in_array($check_date, $db_dates)) {
array_push($missing, $check_date);
}
}
print_r($missing);
I made it just to that interval but you can just define another interval or make it work for the whole year.