Finding value pairs in subsequent arrays

I am working with options data and am looking for call-put pairs in a huge data set. I am fairly new to MATLAB and in desperate need of help.
For example, I have a table with 4 columns and 11 rows, which looks like this:
What I need is to remove all rows that do not have a matching call ("C") or, vice versa, a matching put ("P") with the same "date", "exdate", and "strike_price".
So in the end, I would only need rows 10 and 11, as they match in columns 1, 2, and 4 and do not match in column 3.
Can anyone help me? Thanks in advance and all the best.

Accepted Answer

Cam Salzberger on 13 Sep 2021
Hello Kai,
If the order doesn't matter (i.e. the "call" doesn't need to be before the "put"), you could separate out the data into two tables, one for call entries and one for put entries. It might look something like:
whichCall = strcmp(dataTable.cp_flag, "C");
callTable = dataTable(whichCall, :);
putTable = dataTable(~whichCall, :);
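One caveat with the split above: if cp_flag is stored as an N-by-1 char array rather than a cell array or string array, strcmp compares the array as a whole and returns a single logical value instead of one per row. A minimal sketch of a workaround, assuming such a char column (the sample values here are hypothetical), is to convert with cellstr first:

```matlab
% Hypothetical sample column: cp_flag stored as an N-by-1 char array
cp_flag = ['C'; 'P'; 'C'];

% strcmp on a char array compares the array as a whole, so this is a scalar
scalarResult = strcmp(cp_flag, 'C');

% Converting to a cell array of character vectors gives one result per row
whichCall = strcmp(cellstr(cp_flag), 'C');
```

cellstr also accepts cell arrays of character vectors and string arrays, so the converted form should behave the same way for those column types.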
Then you can go through one of them row by row (probably best if it's whichever table is usually shorter) and see if there is a matching entry in the other table. If there's a matching entry, keep the row in the table you are iterating over. Otherwise, remove that row. I'll record which rows to keep in a logical vector and remove the rest all at once at the end, which keeps the indexing simple.
nCalls = size(callTable, 1);
whichCallRowsKeep = false(nCalls, 1);
for k = 1:nCalls
    % Use element-wise AND (&) to compare each row together
    whichMatchAllThree = ...
        putTable.date == callTable.date(k) & ...
        putTable.exdate == callTable.exdate(k) & ...
        putTable.strike_price == callTable.strike_price(k);
    % If there are any matching put rows, keep the call row
    whichCallRowsKeep(k) = any(whichMatchAllThree);
end
% Remove all call rows with no matching put entry
filteredCallTable = callTable(whichCallRowsKeep, :);
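The loop scans the whole put table once per call row, which may become slow for very large tables. A loop-free variant is possible with ismember, which compares tables row by row. The following is only a sketch under assumptions: the sample values are made up, cp_flag is assumed to be a string column, and the names keyVars, keep, and pairedTable are introduced here for illustration. It filters in both directions and then sorts so that each call sits next to its put.

```matlab
% Hypothetical sample data with the four columns described in the question
date = [20210901; 20210901; 20210902; 20210902];
exdate = [20211001; 20211001; 20211001; 20211101];
cp_flag = ["C"; "P"; "C"; "P"];
strike_price = [100; 100; 105; 105];
dataTable = table(date, exdate, cp_flag, strike_price);

% Columns that must agree for a call and a put to form a pair
keyVars = {'date', 'exdate', 'strike_price'};

isCall = dataTable.cp_flag == "C";
callKeys = dataTable(isCall, keyVars);
putKeys = dataTable(~isCall, keyVars);

% For tables, ismember returns one logical value per row
keep = false(height(dataTable), 1);
keep(isCall) = ismember(callKeys, putKeys);   % calls that have a matching put
keep(~isCall) = ismember(putKeys, callKeys);  % puts that have a matching call

% Keep only paired rows and sort so each call is followed by its put
pairedTable = sortrows(dataTable(keep, :), [keyVars, {'cp_flag'}]);
```

In this sample, the last two rows share a date and strike price but differ in exdate, so only the first two rows remain in pairedTable.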
-Cam
1 Comment
Kai Koslowsky on 14 Sep 2021
Edited: Kai Koslowsky on 14 Sep 2021
Hello Cam,
thanks for your quick response.
With a few adjustments in the beginning, your way worked! I used grp2idx (maybe not the easiest way, but it worked) to convert dataTable.cp_flag into an index and from there separated calls and puts into their own tables. The strcmp function only gave me the first row for both tables, with call 0 and put 1 (I do not know what I did wrong here).
As I have a data set of about 11 million rows, I tried your second part on only a small number of rows and it worked perfectly. I then did the same thing again, but this time looked for all the puts to keep by running the loop the other way around. In the end I combined both tables and used the sortrows function to put the call-put pairs back together.
You made my day Cam, thank you so very much.
All the best and take care,
Kai.
