How to calculate the slope in SQL
If you're defining slope as just the slope from the earliest point to the latest point, and if score only increases with date, then you can get the output above with this:
SELECT *
FROM scores
JOIN
(SELECT foo.keyword,
(MAX(score)-MIN(score)) / DATEDIFF(MAX(date),MIN(date)) AS score
FROM scores
GROUP BY keyword) a
USING(keyword);
However if you want linear regression, or if scores can decrease as well as increase with time, you'll need something more complex.
The cleanest one I could make:
SELECT
Scores.Date, Scores.Keyword, Scores.Score,
(N * Sum_XY - Sum_X * Sum_Y)/(N * Sum_X2 - Sum_X * Sum_X) AS Slope
FROM Scores
INNER JOIN (
SELECT
Keyword,
COUNT(*) AS N,
SUM(CAST(Date as float)) AS Sum_X,
SUM(CAST(Date as float) * CAST(Date as float)) AS Sum_X2,
SUM(Score) AS Sum_Y,
SUM(CAST(Date as float) * Score) AS Sum_XY
FROM Scores
GROUP BY Keyword
) G ON G.Keyword = Scores.Keyword;
It uses Simple Linear Regression to calculate the slope.
Result:
Date Keyword Score Slope
2012-01-22 water bottle 0,010885442 0,000334784345222076
2012-01-23 water bottle 0,011203949 0,000334784345222076
2012-01-24 water bottle 0,008460835 0,000334784345222076
2012-01-25 water bottle 0,010363991 0,000334784345222076
2012-01-26 water bottle 0,011800716 0,000334784345222076
2012-01-27 water bottle 0,012948411 0,000334784345222076
2012-01-28 water bottle 0,012732459 0,000334784345222076
2012-01-29 water bottle 0,011682568 0,000334784345222076
Every database system seems to have a different approach to converting dates to numbers:
- MySQL:
TO_SECONDS(date)
orTO_DAYS(date)
- Oracle:
TO_NUMBER(TO_CHAR(date, 'J'))
ordate - TO_DATE('1','yyyy')
- MS SQL Server:
CAST(date AS float)
(or equivalentCONVERT
)