
ODBC driver slower than JDBC driver - How to pull 1 million rows from Oracle

I have a very basic but large query on an Oracle database that pulls back around 1 million rows.
I have timed this query in another application at around 4 minutes.
I've used the JDBC driver in MATLAB, which of course has to loop through each row and is therefore massively inefficient; this came in at around 15 minutes, which isn't bad considering it loops through everything, but that performance isn't good enough.
I then moved on to the ODBC driver within MATLAB. This seems to be even worse; it has actually hung after running the query for 20 minutes.
My question therefore is: why is the native client slower than the Java client?
Secondly, if that can't be answered, what's the quickest way for me to get the data from an Oracle database into MATLAB?
This should be a basic request but I seem to be hitting real limitations here.
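For reference, here is a minimal sketch of the kind of JDBC setup I'm using (the driver jar, host, service name, table and credentials below are just placeholders):
javaaddpath('ojdbc7.jar');   % Oracle thin JDBC driver on the MATLAB Java path
conn = database('', 'myuser', 'mypassword', ...
    'oracle.jdbc.driver.OracleDriver', ...
    'jdbc:oracle:thin:@//dbhost:1521/myservice');
curs = exec(conn, 'SELECT * FROM big_table');   % hypothetical query
curs = fetch(curs);                             % pulls the entire result set in one go
data = curs.Data;
close(curs);
close(conn);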
1 Comment
Alessandro Roux on 23 Mar 2016
Hi Simon,
I believe that you can set the number of rows that you "fetch" with JDBC by setting "FetchInBatches" or by specifying a "RowLimit" in the "fetch" command. Doing so will allow you to pull more information per "fetch", which can increase the performance of the query.
% Setting "fetch" to retrieve four rows at a time...
% Using FetchInBatches:
setdbprefs('FetchInBatches', 'yes');
setdbprefs('FetchBatchSize', 4);
% Using a row limit in the fetch command:
fetch(curs, 4);   % 4 is the row limit
In regard to the performance of native ODBC vs. JDBC, I would expect native ODBC to be quicker; the MathWorks documentation seems to suggest this is the case as well.
Could you post the code that you are using to query the database? It may help us suggest potential optimizations.
Hope this helps, Alessandro


Answers (1)

James Anderson on 1 Nov 2017
I spent a lot of time on this, although with SQL Server; I think the same logic applies, though. Firstly, I found that it is better to download in batches. MATLAB now has this option built into the fetch command; the line below will import all rows of data in batches of a specified number of rows at a time.
results = fetch(conn,sqlquery,fetchbatchsize)
Secondly, if you can import your data as numeric only, you will save a lot of time:
setdbprefs('DataReturnFormat','numeric');
but don't forget to set the return format back to the default when you are finished:
setdbprefs('DataReturnFormat','cellarray');
Below is a function you can use to do the whole job:
function Result = batchfetch(conn, sqlcmd, DataReturnFormat)
% Additional wrapper around the usual fetch command.
% Uses batch processing (for increased speed) and offers the
% option to download as numeric rather than cellarray.
% DataReturnFormat is either 'cellarray' (default) or 'numeric'.
try
    if nargin == 2
        setdbprefs('DataReturnFormat', 'cellarray');
    else
        switch DataReturnFormat
            case 'numeric'
                setdbprefs('DataReturnFormat', 'numeric');
            case 'cellarray'
                setdbprefs('DataReturnFormat', 'cellarray');
            otherwise
                error('DBContextBatchFetch:InputArguments', 'Unknown DataReturnFormat')
        end
    end
    % Batched fetch syntax available as of R2016a
    Result = fetch(conn, sqlcmd, 20000);
catch Mexc
end
% Always restore the default return format, even if the fetch failed
dbprefs = setdbprefs;
if ~strcmpi(dbprefs.DataReturnFormat, 'cellarray')
    setdbprefs('DataReturnFormat', 'cellarray');
end
if exist('Mexc', 'var') ~= 0
    rethrow(Mexc)
end
end
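For example, assuming conn is an open JDBC connection and the query returns only numeric columns, you would call it like this (table and column names are just examples):
data = batchfetch(conn, 'SELECT id, price, qty FROM big_table', 'numeric');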
We also tried ODBC and found that it wasn't necessarily faster, and you have the added inconvenience that you have to manually set up the ODBC connection on the Windows machine you are on, whereas JDBC can be defined purely in MATLAB. If you put your database connection settings in a config file, you can then easily change the database just by altering the config file, as in the sketch below.
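As a rough sketch of the config-file idea (assuming a plain-text dbconfig.txt with one key=value pair per line for url, driver, user and password; the file name and keys are just examples):
% Read key=value pairs from a hypothetical dbconfig.txt
cfg = struct();
fid = fopen('dbconfig.txt', 'r');
line = fgetl(fid);
while ischar(line)
    parts = strsplit(line, '=');
    cfg.(strtrim(parts{1})) = strtrim(parts{2});
    line = fgetl(fid);
end
fclose(fid);
% Open the JDBC connection using the settings read from the file
conn = database('', cfg.user, cfg.password, cfg.driver, cfg.url);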
