How to iterate over all rows of a table in Oracle db

Recently I was writing a program to copy data from Oracle to BigQuery. This program needs to fetch all rows in a given table (i.e., perform a table scan) and then write them to the destination. We should not fetch all the rows in one call, since the table may have millions of rows corresponding to gigabytes of data. We want to batch the reads and fetch maybe a thousand records at a time. So how do we do this?

How not to do it

I was familiar with the limit ... offset technique of paging in MySQL. The equivalent syntax in Oracle is offset N rows fetch next M rows only. So we could do it like this, for example:

select * from table offset 0 rows fetch next 1000 rows only;
select * from table offset 1000 rows fetch next 1000 rows only;
select * from table offset 2000 rows fetch next 1000 rows only;
...

The first query will be very fast (the offset is 0), but if you run this on a table with more than a million rows you will notice the query time steadily increasing as you progress to higher and higher offsets. This is because when we do offset N rows fetch next M rows only the database still reads N+M rows; it just discards the first N rows before returning the rest to the client. The effect might be unnoticeable on small tables, but on tables containing more than a million records we could clearly see the difference.
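For illustration, here is roughly what this offset-based paging looks like from JDBC. It is a minimal sketch, assuming a table called my_table and a batch size of 1000; connection handling and the per-row work are placeholders.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class OffsetPagingExample {
    // Offset-based paging: the server still reads offset + batch rows for every
    // query and throws away the first `offset`, so later pages get slower.
    static void scan(Connection conn) throws SQLException {
        final int batchSize = 1000;
        long offset = 0;
        try (Statement stmt = conn.createStatement()) {
            while (true) {
                String sql = "select * from my_table offset " + offset
                        + " rows fetch next " + batchSize + " rows only";
                int rows = 0;
                try (ResultSet rs = stmt.executeQuery(sql)) {
                    while (rs.next()) {
                        rows++;
                        // ... write the row to the destination ...
                    }
                }
                if (rows < batchSize) {
                    break; // a short page means we reached the end of the table
                }
                offset += batchSize;
            }
        }
    }
}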

Using Seek Pagination

Searching online I came across the method of seek pagination. The idea is to use the where clause to quickly seek to a position in the database and then fetch next M rows only. Here is what the queries would look like:

select * from table order by col offset 0 rows fetch next 1000 rows only;
select * from table where col > last_read order by col fetch next 1000 rows only;
select * from table where col > last_read order by col fetch next 1000 rows only;
...

The first query gets the first batch of records. You have to keep track of the last value read and store it in a variable last_read; replace last_read above with the actual value. For this method to work, col (or a combination of columns) must be indexed with a B-tree index and have a unique constraint on it (meaning no two rows can have the same value of col). Unfortunately we could not satisfy this requirement in many of our tables.
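To make the bookkeeping concrete, here is a minimal sketch of seek pagination from JDBC. It assumes col is a unique, indexed numeric column on a table called my_table (names are illustrative); starting from a sentinel lower than any real value of col avoids special-casing the first batch.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

class SeekPaginationExample {
    // Seek pagination: remember the last value read from the ordering column
    // and start the next page just after it via the where clause.
    static void scan(Connection conn) throws SQLException {
        final int batchSize = 1000;
        String sql = "select * from my_table where col > ? "
                + "order by col fetch next " + batchSize + " rows only";
        long lastRead = Long.MIN_VALUE; // sentinel below any real value of col
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            while (true) {
                ps.setLong(1, lastRead);
                int rows = 0;
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        rows++;
                        lastRead = rs.getLong("col"); // track the seek position
                        // ... write the row to the destination ...
                    }
                }
                if (rows < batchSize) {
                    break; // a short page means we reached the end of the table
                }
            }
        }
    }
}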

Also, while this is supposed to work in theory, in practice as soon as we added an order by clause to our queries the query time jumped by 10x, even though we were ordering on an indexed column: select * from table order by col offset 0 rows fetch next 1000 rows only; was 10x slower than select * from table offset 0 rows fetch next 1000 rows only;

So we went back to the offset ... fetch method.

However …

When we don’t order the rows

select * from table offset 0 rows fetch next 1000 rows only;
select * from table offset 1000 rows fetch next 1000 rows only;
select * from table offset 2000 rows fetch next 1000 rows only;
...

it is possible (in theory) that the above queries may return some rows more than once while never returning others. In practice, I found that this almost never happens. To test it, I fetched the rowid pseudocolumn as follows:

select rowid from table offset 0 rows fetch next 1000 rows only
select rowid from table offset 1000 rows fetch next 1000 rows only
select rowid from table offset 2000 rows fetch next 1000 rows only
...

I put the rowid in a HashSet

Object obj = resultSet.getObject("rowid");
set.add(obj.toString());

and we can see if the rowids we are getting are distinct. What I found was surprising. I was getting distinct rowids until I reached the end of the table. Then strange things started happening and I started getting duplicate rowids. This is because, if the database is being written to while you are making the queries, the different queries run against different snapshots of the database.
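For reference, here is a minimal, self-contained sketch of this duplicate check. It reuses the same offset paging loop as before; the table name is illustrative.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.HashSet;
import java.util.Set;

class RowidCheckExample {
    // Page through the table with offset/fetch and record every rowid we see.
    // If offset paging returned each row exactly once, we would never see a duplicate.
    static void checkRowids(Connection conn) throws SQLException {
        final int batchSize = 1000;
        Set<String> seen = new HashSet<>();
        long duplicates = 0;
        long offset = 0;
        try (Statement stmt = conn.createStatement()) {
            while (true) {
                String sql = "select rowid from my_table offset " + offset
                        + " rows fetch next " + batchSize + " rows only";
                int rows = 0;
                try (ResultSet rs = stmt.executeQuery(sql)) {
                    while (rs.next()) {
                        rows++;
                        Object obj = rs.getObject("rowid");
                        if (!seen.add(obj.toString())) {
                            duplicates++; // same rowid returned more than once
                        }
                    }
                }
                if (rows < batchSize) {
                    break;
                }
                offset += batchSize;
            }
        }
        System.out.println("duplicate rowids seen: " + duplicates);
    }
}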

The Right Way

Turns out the right way to iterate over all the rows is much simpler. Just do:

select * from table

It is the simplest SQL query of all. You do not need to order the rows, and the above will return each row once and only once. When we run this query using the Oracle JDBC driver, it does not fetch all the rows at once, which was the original concern that set me off in the wrong direction. The rows are fetched in batches as we iterate over the ResultSet, and this happens invisibly to the programmer. The JDBC driver lets the programmer set the batch size using the all-important setFetchSize method.

So the right way is to fetch all the rows in a single query, select * from table, and use setFetchSize to control the number of rows you want to fetch in a batch. The JDBC driver will make multiple calls to the database, but all of those calls are tied to the single query. The best part is that when we used this method it was 100x faster than seek pagination using order by! The seek pagination method also suffers from the multiple-snapshots problem, since we are making multiple queries to the database.
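Here is a minimal sketch of this approach, again assuming a table called my_table; the driver, not our code, decides when to go back to the server for the next batch of rows.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class FullScanExample {
    // One query over the whole table; the Oracle JDBC driver fetches rows from
    // the server in batches of fetchSize as we iterate over the ResultSet.
    static void scan(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.setFetchSize(1000); // rows fetched per round trip, not per query
            try (ResultSet rs = stmt.executeQuery("select * from my_table")) {
                while (rs.next()) {
                    // ... write the row to the destination ...
                }
            }
        }
    }
}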

One problem

There is one problem you might run into when doing a select * from table on a big table (i.e., when you have a long-running query): the error oracle.jdbc.OracleDatabaseException: ORA-01555: snapshot too old:

Caused by: oracle.jdbc.OracleDatabaseException: ORA-01555: snapshot too old: rollback segment number 50 with name "_SYSSMU50_3232338357$" too small

	at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:513) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:461) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:1104) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:553) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:269) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:655) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:229) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.T4CStatement.fetch(T4CStatement.java:1037) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.OracleStatement.fetchMoreRows(OracleStatement.java:3456) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.InsensitiveScrollableResultSet.fetchMoreRows(InsensitiveScrollableResultSet.java:742) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.InsensitiveScrollableResultSet.absoluteInternal(InsensitiveScrollableResultSet.java:698) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at oracle.jdbc.driver.InsensitiveScrollableResultSet.next(InsensitiveScrollableResultSet.java:412) ~[ojdbc10-19.15.0.0.1.jar:19.15.0.0.1]
	at com.example.App.doTableScan(App.java:69) ~[classes/:?]
	at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1736) ~[?:?]
	... 3 more

By the way, from this call stack we can see that internally the JDBC driver calls oracle.jdbc.driver.OracleStatement.fetchMoreRows to fetch the next batch of rows as we iterate over a ResultSet.
