Select a random sample of results from a query result

The SAMPLE clause will give you a random sample percentage of all rows in a table.

For example, here we obtain 25% of the rows:

SELECT * FROM emp SAMPLE(25)

The following SQL (using one of the analytical functions) will give you a random sample of a specific number of each occurrence of a particular value (similar to a GROUP BY) in a table.

Here we sample 10 of each:

SELECT * FROM (
SELECT job, sal, ROW_NUMBER()
OVER (
PARTITION BY job ORDER BY job
) SampleCount FROM emp
)
WHERE SampleCount <= 10

SELECT * FROM TABLE_NAME SAMPLE(1)

Will give you olny an approximate 1% share rather than exactly 1/100 of the number of observations. The likely reason is than Oracle generates a random flag for each observation as to whether include in in the sample that it generates. The argument 1 (1%) in such a generation process takes the role of probability of each observation's being selected into the sample.

If this is true, the actual distribution of sample sizes will be binomial.


This in not a perfect answer but will get much better performance.

SELECT  *
FROM    (
    SELECT  *
    FROM    mytable sample (0.01)
    ORDER BY
            dbms_random.value
    )
WHERE rownum <= 1000

Sample will give you a percent of your actual table, if you really wanted a 1000 rows you would need to adjust that number. More often I just need an arbitrary number of rows anyway so I don't limit my results. On my database with 2 million rows I get 2 seconds vs 60 seconds.

select * from mytable sample (0.01)

SELECT  *
FROM    (
        SELECT  *
        FROM    mytable
        ORDER BY
                dbms_random.value
        )
WHERE rownum <= 1000