Creating a variable problem

Alejandro on 14 Jan 2024
Commented: Shivam on 17 Jan 2024
I am trying to create an (instrumental) variable for my linear regression.
The variable is intended to be the number of generic drug products not offered by firm i. That is, I want to count all the generic products sold by any firm, excluding the firm's own.
The key variables are:
firm: the firm identifier; there are 44 firms, numbered from 1 to 44.
indicator: a dummy variable that takes the value 0 if the drug is generic and 1 if it is branded.
productid: the unique identifier of each product in my dataset.
My dataset is panel data, so I want to count only the first instance of a generic for each firm and productid. Ideally, I would iterate over each productid for each firm, take the first generic instance of each firm/productid combination, and sum those to get a count per firm. Once I have that count, I just have to subtract it from the total number of generics in my dataset (82). This is what I tried so far:
% Iterate over each firm
uniqueFirms = unique(m.firm);
for i = 1:length(uniqueFirms)
    firm = uniqueFirms(i);
    % Get unique product IDs for the current firm
    firmProductIDs = unique(m.productid(m.firm == firm));
    % Iterate over each productid for the firm
    for j = 1:length(firmProductIDs)
        pid = firmProductIDs(j);
        % Find the first generic product for the current productid within the firm
        firstGenericIndex = find(m.firm == firm & m.productid == pid & m.indicator == 0, 1, 'first');
        if ~isempty(firstGenericIndex)
            m.first_generic_by_firm(firstGenericIndex) = 1;
        end
    end
end
% Total number of generics in the dataset
totalGenerics = 82;
% Initialize a column to store the count of generics not offered by each firm
m.generics_not_offered_by_firm = zeros(height(m), 1);
% Iterate over each firm to perform the subtraction
for i = 1:length(uniqueFirms)
    firm = uniqueFirms(i);
    % Count the first instances of generics for the firm
    countGenericsByFirm = sum(m.first_generic_by_firm(m.firm == firm));
    % Subtract from total and assign to the relevant rows
    m.generics_not_offered_by_firm(m.firm == firm) = totalGenerics - countGenericsByFirm;
end
The final result is just a vector of zeros in the variable
m.generics_not_offered_by_firm
Also the variable
firstGenericIndex
only stores a vector of zeros.
Could anyone help me with that? Maybe you can propose another approach. If you need further information, just let me know.
Thanks,
Alejandro.

Accepted Answer

Shivam on 14 Jan 2024
Hi,
Based on the information provided, I understand that you want to calculate the number of generic drug products not offered by firm i. This involves finding the first generic occurrence of each distinct firm/productid combination, counting those occurrences per firm, and subtracting that count from the total number of generics.
You can try the following approach:
% Sort the table by firm, productid, and then by indicator to ensure generics come first
m = sortrows(m, {'firm', 'productid', 'indicator'});
% Row numbers (in m) of the generic products (indicator == 0)
genericRows = find(m.indicator == 0);
% Find the unique combinations of firm and productid among the generic rows
[~, ia] = unique(m(genericRows, {'firm', 'productid'}), 'rows', 'stable');
% Create a logical index for the first instance of each unique combination;
% ia indexes into genericRows, so map it back to row numbers of m
firstGenericIndex = false(height(m), 1);
firstGenericIndex(genericRows(ia)) = true;
% Use accumarray to count the number of first generics for each firm
% (sized by the largest firm id so firms without generics get a count of 0)
countGenericsByFirm = accumarray(m.firm(firstGenericIndex), 1, [max(m.firm) 1]);
% Total number of generics in the dataset
totalGenerics = 82;
% Initialize a column to store the count of generics not offered by each firm
m.generics_not_offered_by_firm = zeros(height(m), 1);
% Use the countGenericsByFirm to fill in the generics_not_offered_by_firm
uniqueFirms = unique(m.firm);
for i = 1:length(uniqueFirms)
    firm = uniqueFirms(i);
    m.generics_not_offered_by_firm(m.firm == firm) = totalGenerics - countGenericsByFirm(firm);
end
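If you prefer to avoid the loop, the same per-firm count can be sketched without it. This is only a sketch under assumptions: the toy table below is made up (your real m is not shown), firm ids are assumed to be positive integers, and the total is computed as the number of distinct generic productid values instead of hard-coding 82.

```matlab
% Hypothetical toy table standing in for m (the real data are not shown)
m = table([1;1;1;2;2;3], [10;10;11;10;12;13], [0;0;1;0;0;1], ...
    'VariableNames', {'firm', 'productid', 'indicator'});

% Distinct (firm, productid) pairs among the generic rows (indicator == 0)
genericPairs = unique(m(m.indicator == 0, {'firm', 'productid'}));

% Number of distinct generic products per firm id; firms without generics get 0
countGenericsByFirm = accumarray(genericPairs.firm, 1, [max(m.firm) 1]);

% Assumption: the total is the number of distinct generic product ids
% (82 in the real dataset)
totalGenerics = numel(unique(m.productid(m.indicator == 0)));

% Index the per-firm counts by the firm id of each row, so every row
% receives the value of its own firm
m.generics_not_offered_by_firm = totalGenerics - countGenericsByFirm(m.firm);
```

Indexing countGenericsByFirm with m.firm fills all rows at once, which replaces the final loop.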
I hope it helps.
Thanks
2 Comments
Alejandro on 17 Jan 2024
Hi! Thanks for your answer and your time. :)
I tried the code you provided. Something seems not to be working, because the variable m.generics_not_offered_by_firm ends up as a vector of zeros.
Also, countGenericsByFirm should be a vector of values, right? It shows up as a 1x1 value, 9. Maybe the problem is here.
Shivam on 17 Jan 2024
Hey,
Can you attach your files so that I can debug the issue? I tried it with dummy data and it worked.
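In the meantime, a direct per-firm count can serve as a reference to compare against countGenericsByFirm. The sketch below uses a made-up table (not the dummy data from my own test); replace it with your real m. It also assumes indicator is stored as numeric 0/1.

```matlab
% Hypothetical toy table; replace with the real m
m = table([1;1;2;2;3], [10;10;10;12;13], [0;0;0;0;1], ...
    'VariableNames', {'firm', 'productid', 'indicator'});

% Reference count per firm, computed directly from the definition:
% the number of distinct productid values among the firm's generic rows
firms = unique(m.firm);
refCount = zeros(numel(firms), 1);
for k = 1:numel(firms)
    isFirmGeneric = m.firm == firms(k) & m.indicator == 0;
    refCount(k) = numel(unique(m.productid(isFirmGeneric)));
end

% One row per firm, for comparison with countGenericsByFirm
disp(table(firms, refCount));
```

If refCount is also a single value or all zeros on your data, it is worth checking the class and the distinct values of m.firm and m.indicator.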

Release

R2023b
