MATLAB Answers

How to define the range for reading a csv file

Asked by papafos
on 14 Apr 2014
Latest activity Commented on by papafos
on 4 May 2014
Hello,
I have the following problem. I have a batch of .csv files with the following format:
[10 rows of text data]
[useful numerical data: varying number of rows x 40 columns]
[10 rows of text data, always starting with the specific word "ACP"]
[useful numerical data: varying number of rows x 40 columns]
I only need the numerical data. So, for every file, I need a way to detect where the second part of the text data starts and skip it. In other words, I want to find the row ranges of the first and the second part of the numerical data.
Thanks in advance.

2 Answers

Answer by Image Analyst
on 14 Apr 2014

I think that second batch of text lines in between the numerical data is what ruins it for you and prevents you from using functions like csvread(), readtable(), or importdata(). So I think you'll have to make a little custom routine to go through it line by line with textscan() or fgetl() and sscanf() to extract the data only when your line does not start with ACP. Use strfind() to see if ACP is on each line.
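A minimal sketch of that line-by-line approach, for illustration only. The function name `readNumericBlocks` and the `nCols` argument are hypothetical, and the sketch assumes that a data row is any line that parses completely into exactly `nCols` comma-separated numbers, while lines starting with "ACP" (or failing to parse) are text:

```matlab
function blocks = readNumericBlocks(filename, nCols)
% Hypothetical helper (not from the thread): returns a cell array with one
% numeric matrix per contiguous run of data rows found in the file.
fid = fopen(filename, 'r');
if fid == -1
    error('Could not open %s', filename);
end
cleaner = onCleanup(@() fclose(fid));   % close the file even on error

blocks  = {};
current = zeros(0, nCols);
tline   = fgetl(fid);
while ischar(tline)                      % fgetl returns -1 at end of file
    tline  = strtrim(tline);
    values = [];
    % Assumption: text rows start with "ACP" or do not parse as numbers.
    if ~strncmp(tline, 'ACP', 3)
        [values, ~, errmsg] = sscanf(strrep(tline, ',', ' '), '%f');
        if ~isempty(errmsg)
            values = [];                 % only partly numeric: treat as text
        end
    end
    if numel(values) == nCols
        current(end+1, :) = values.';    % append one data row
    elseif ~isempty(current)
        blocks{end+1} = current;         % a non-data line closes the block
        current = zeros(0, nCols);
    end
    tline = fgetl(fid);
end
if ~isempty(current)
    blocks{end+1} = current;             % file ended inside a data block
end
end
```

For the files described in the question, a call such as `blocks = readNumericBlocks('myfile.csv', 40)` would be expected to return two matrices, `blocks{1}` and `blocks{2}`; the file name here is a placeholder.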


Answer by Alberto
on 14 Apr 2014

The problem here is that you can't predict how many rows the numerical data will have.
I recommend reading everything with a generic command, such as textscan, and then trying to convert what you get to numeric. If the conversion succeeds, that is your desired data; if it fails, the line is text data and can be ignored.
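One way this read-everything-then-convert idea could look, as a sketch rather than a definitive implementation. The function name `findNumericRanges` and the `nCols` argument are hypothetical. The sketch assumes a line is data only if it splits into exactly `nCols` fields that all convert with `str2double`, so a data row containing a genuine NaN or an empty field would be misclassified as text:

```matlab
function [ranges, data] = findNumericRanges(filename, nCols)
% Hypothetical helper (not from the thread).
% ranges: one row [firstLine lastLine] per contiguous block of data lines.
% data:   numeric matrix with one row per file line (NaN rows for text).
fid = fopen(filename, 'r');
if fid == -1
    error('Could not open %s', filename);
end
raw = textscan(fid, '%s', 'Delimiter', '\n', 'Whitespace', '');
fclose(fid);
lines = raw{1};                          % one cell per line of the file

isData = false(numel(lines), 1);
data   = nan(numel(lines), nCols);
for k = 1:numel(lines)
    parts = strsplit(lines{k}, ',');
    vals  = str2double(strtrim(parts));  % NaN where conversion fails
    if numel(vals) == nCols && ~any(isnan(vals))
        isData(k)  = true;
        data(k, :) = vals;
    end
end

% Locate the start and end of each run of consecutive data lines.
d      = diff([0; double(isData); 0]);
starts = find(d == 1);
stops  = find(d == -1) - 1;
ranges = [starts, stops];
end
```

With the layout from the question, `ranges` would be expected to have two rows, and `data(ranges(1,1):ranges(1,2), :)` would give the first numeric part.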

  3 Comments

I don't agree because csvread() and readtable() don't need to know in advance how many rows there are, and they would work fine. That's no problem. The problem is the second set of "header" data stuck in the middle between the tables of numbers. But we do agree on using textscan() as one way to solve it, as long as the first argument is the line of text and not the file ID.
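To illustrate the point about passing the line of text rather than the file ID, here is a small sketch; the variable names and the sample line are made up for the example:

```matlab
% textscan also accepts a character vector as its first argument,
% so a single line already read with fgetl() can be parsed on its own.
tline  = '1.5,2,3.25';                        % made-up sample data line
c      = textscan(tline, '%f', 'Delimiter', ',');
values = c{1}.';                              % numbers as a row vector
```

Checking how many values come back for each line (40 are expected for the data rows described in the question) is one possible way to tell data rows from text rows.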
You are right, the problem with the first block is that we don't know when to stop scanning before the text.
I finally found a way to isolate the first and the second set of data. I used both importdata() and textscan() and took into account some conditions that hold for all the files regarding the rows of the second part of text. Thank you for the response.
