
ODBC driver slower than JDBC driver - How to pull 1 million rows from Oracle

I have a very basic but large query on an Oracle database that pulls back around 1 million rows.
I have timed this query in another application at around 4 minutes.
I've used the JDBC driver in MATLAB, which of course has to loop through each row and is therefore massively inefficient; this came in at around 15 minutes, which isn't bad considering it loops through everything, but that performance isn't good enough.
I then moved on to the ODBC driver within MATLAB. This seems to be even worse; it has actually hung after running the query for 20 minutes.
My question therefore is: why is the native client slower than the Java client?
Secondly, if that can't be answered, what's the quickest way for me to get the data from an Oracle database into MATLAB?
This should be a basic request but I seem to be hitting real limitations here.
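For reference, here is a minimal sketch of the kind of JDBC setup I'm using (the driver jar, host, service name, table and credentials below are just placeholders):
javaaddpath('ojdbc7.jar');   % Oracle thin JDBC driver on the MATLAB Java path
conn = database('', 'myuser', 'mypassword', ...
    'oracle.jdbc.driver.OracleDriver', ...
    'jdbc:oracle:thin:@//dbhost:1521/myservice');
curs = exec(conn, 'SELECT * FROM big_table');   % hypothetical query
curs = fetch(curs);                             % pulls the entire result set in one go
data = curs.Data;
close(curs);
close(conn);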
1 Comment
Alessandro Roux on 23 Mar 2016
Hi Simon,
I believe that you can set the number of rows that you "fetch" with JDBC by setting "FetchInBatches" or by specifying a "RowLimit" in the "fetch" command. Doing so will allow you to pull more information per "fetch", which can increase the performance of the query.
% Setting "fetch" to retrieve four rows at a time...
% Using FetchInBatches:
setdbprefs('FetchInBatches', 'yes');
setdbprefs('FetchBatchSize', 4);
% Using a row limit in the fetch command:
fetch(curs, 4);   % 4 is the row limit
In regard to the performance of native ODBC vs. JDBC, I would expect native ODBC to be quicker; the MathWorks documentation seems to suggest this is the case as well.
Could you post the code that you are using to query the database? It may help us suggest potential optimizations.
Hope this helps, Alessandro


Answers (1)

James Anderson on 1 Nov 2017
I spent a lot of time on this, although with SQL Server; I think the same logic applies, though. Firstly, I found that it is better to download in batches. MATLAB now has this option built into the fetch command; the line below will import all rows of data in batches of a specified number of rows at a time.
results = fetch(conn,sqlquery,fetchbatchsize)
Secondly, if you can import your data as numeric only, you will save a lot of time:
setdbprefs('DataReturnFormat','numeric');
but don't forget to set the return format back to the default when you are finished:
setdbprefs('DataReturnFormat','cellarray');
Below is a function you can use to do the whole job:
function Result = batchfetch(conn, sqlcmd, DataReturnFormat)
% Additional wrapper around the usual fetch command.
% Uses batch processing (for increased speed) and offers the
% option to download as numeric rather than cellarray.
% DataReturnFormat is either 'cellarray' (default) or 'numeric'.
try
    if nargin == 2
        setdbprefs('DataReturnFormat', 'cellarray');
    else
        switch DataReturnFormat
            case 'numeric'
                setdbprefs('DataReturnFormat', 'numeric');
            case 'cellarray'
                setdbprefs('DataReturnFormat', 'cellarray');
            otherwise
                error('DBContextBatchFetch:InputArguments', 'Unknown DataReturnFormat')
        end
    end
    % Batched fetch syntax available as of R2016a
    Result = fetch(conn, sqlcmd, 20000);
catch Mexc
end
% Always restore the default return format, even if the fetch failed
dbprefs = setdbprefs;
if ~strcmpi(dbprefs.DataReturnFormat, 'cellarray')
    setdbprefs('DataReturnFormat', 'cellarray');
end
if exist('Mexc', 'var') ~= 0
    rethrow(Mexc)
end
end
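For example, assuming conn is an open JDBC connection and the query returns only numeric columns, you would call it like this (table and column names are just examples):
data = batchfetch(conn, 'SELECT id, price, qty FROM big_table', 'numeric');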
We also tried ODBC and found that it wasn't necessarily faster, and you have the added inconvenience that you have to manually set up the ODBC connection on the Windows machine you are on, whereas JDBC can be defined purely in MATLAB. If you put your database connection settings in a config file, you can then easily change the database just by altering the config file, as in the sketch below.
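As a rough sketch of the config-file idea (assuming a plain-text dbconfig.txt with one key=value pair per line for url, driver, user and password; the file name and keys are just examples):
% Read key=value pairs from a hypothetical dbconfig.txt
cfg = struct();
fid = fopen('dbconfig.txt', 'r');
line = fgetl(fid);
while ischar(line)
    parts = strsplit(line, '=');
    cfg.(strtrim(parts{1})) = strtrim(parts{2});
    line = fgetl(fid);
end
fclose(fid);
% Open the JDBC connection using the settings read from the file
conn = database('', cfg.user, cfg.password, cfg.driver, cfg.url);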
