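I'm new to MATLAB, so this may be simpler than I realize.
I'm working with a large number of text files for some data analysis and would like to separate them into categories. The file names follow a format similar to Tp_angle_RunNum.txt and Ts_angle_RunNum.txt. I would like 4 different groups: the Tp and Ts files for angle1, and the same for angle2.
Any help would be greatly appreciated!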
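Posted on 2022-10-04 04:26:13
There are a few things going on here:
Let's look at a few of these concepts along with some example code. I don't have your files, but I can demonstrate with the output of the dir command and some dummy files. I'm going to split this into two parts: a dirTable function (a wrapper around dir that I like to use instead of dir, and that I keep on my path) and a script that uses it. I'd recommend copying the code and taking advantage of code sections to run one section at a time. If this is new to you, see the doc page on Create and Run Sections in Code.
dirTable.m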
% Filename: dirTable.m
function dirListing = dirTable( names )
    arguments
        names (:,1) string {mustBeText} = pwd()
    end
    dirListing = table();
    for nIdx = 1:numel( names )
        tempListing = dir( names( nIdx ) );
        dirListing = [ dirListing;...
            struct2table( tempListing,'AsArray', true ) ]; %#ok<AGROW>
    end
    if ~isempty( dirListing )
        %Adjust some of the types...
        dirListing.name = categorical( dirListing.name );
        dirListing.folder = categorical( dirListing.folder );
        dirListing.date = datetime( dirListing.date );
    end
end

Example script
%% Create some dummy files - assuming windows, or I'd use the "touch" cmd.
cmds = "type nul >> Ts_42_" + (1:3) + ".txt";
cmds = [cmds;"type nul >> Tp_42_" + (1:3) + ".txt"];
cmds = [cmds;"type nul >> Ts_21_" + (1:3) + ".txt"];
cmds = [cmds;"type nul >> Tp_21_" + (1:3) + ".txt"];
for idx = 1:numel(cmds)
    system(cmds(idx));
end
%% Get the directory listing for all the files
% Note: the filenames come out as categoricals by my design, though that
% doesn't help much for this example. In fact, I'll have to cast the
% categoricals to string a few times. That's OK, it's not a heavy lift.
% If you use dir directly, you'd not only be casting to string, but you'd
% also have to deal with the structure and clunky if/else conditions
% everywhere.
listing = dirTable();
%% Define patterns for each of the 4 groups
% - pretending the first code section doesn't exist.
Tp_Angle1_pattern = "Tp_21";
Ts_Angle1_pattern = "Ts_21";
Tp_Angle2_pattern = "Tp_42";
Ts_Angle2_pattern = "Ts_42";
%% Cycle a group's data, creating a single table from all the files
% I could be more clever here and loop through the patterns as well and
% create a table of tables; however, I am going to keep this code easier
% to read at the cost of repetitiveness. I will however use a local
% function to gather all the runs from one group into a single table.
Tp_Angle1_matches = string(listing.name).startsWith(Tp_Angle1_pattern);
Tp_Angle1_filenames = string(listing.name(Tp_Angle1_matches));
Tp_Angle1_data = aggregateDataFilesToTable(Tp_Angle1_filenames);
% Repeat for each group, or loop the above code to build a single table.
% If you loop to build a single table, make sure to add column(s) for the
% group information.
%% My local function for reading all the files in a group
function data_table = aggregateDataFilesToTable(filenames)
    arguments
        filenames (:,1) string
    end
    % We could assume that, since the run numbers are at the end of the
    % filename, the filenames arrive pre-sorted. If the numbers are not
    % zero padded, you would need to extract the run number to determine
    % the order in which to read the files. I'm going to be lazy and assume
    % zero padding for simplicity.
    data_table = table();
    for fileIdx = 1:numel(filenames)
        % For the following line, two things:
        % 1) The [data_table;readtable()] syntax appends the table from
        %    readtable to the end of data_table.
        % 2) The comment at the end acknowledges that this variable is
        %    growing in a loop, which is usually not the best practice;
        %    however, since I have no way of knowing the total table
        %    dimensions ahead of time, I cannot pre-allocate the table
        %    before the loop, hence the table() call before the for loop.
        %    If you have a way of knowing this ahead of time, do
        %    pre-allocate!
        data_table = [data_table;readtable(filenames(fileIdx))]; %#ok<AGROW>
    end
end

Note 1: Empty parentheses are not required on a function call with no arguments; however, I find the code easier to read when others can tell that a function is being called rather than a variable being read.
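Note 2: I know the dummy files are empty. That doesn't matter for this example, because an empty table appended to an empty table is another empty table, and the OP's question was about file manipulation, grouping, and so on.
Note 3: In case the syntax is new to you, both functions in my example use function argument blocks, which were introduced in R2019b. They make code easier to maintain and read than either not validating inputs or using more complicated approaches to validating inputs. I was going to remove them from this example, but they were already in my dirTable function, so I figured I'd just explain them.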
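As a rough sketch of the "more clever" variant hinted at in the script comments, the file names themselves can be parsed into group columns, which also gives a numeric run order when the run numbers are not zero padded. This is only an illustration under assumptions: the names list is made up, the column names prefix, angle and run are hypothetical, and it assumes every data file strictly follows the Tp_angle_RunNum.txt / Ts_angle_RunNum.txt pattern with integer angles.

```matlab
% Made-up file names for illustration; in practice this could be
% string(listing.name) from the dirTable call in the example script.
names = ["Tp_21_10.txt"; "Ts_42_1.txt"; "Tp_21_2.txt"; ...
         "Ts_21_1.txt"; "Tp_42_3.txt"; "notes.txt"];

% Keep only names that match <Tp|Ts>_<angle>_<run>.txt (assumed format).
expr   = '^T[ps]_\d+_\d+\.txt$';
isData = ~cellfun(@isempty, regexp(cellstr(names), expr, 'once'));
dataNames = names(isData);

% Split "Tp_21_10" into its three pieces. The reshape keeps the result
% N-by-3 even when only a single file matches.
parts = reshape(split(erase(dataNames, ".txt"), "_"), [], 3);

% One row per file, with the group information as separate columns.
files = table(dataNames, categorical(parts(:,1)), ...
              double(parts(:,2)), double(parts(:,3)), ...
              'VariableNames', {'name', 'prefix', 'angle', 'run'});

% Numeric sort, so run 2 comes before run 10 without zero padding.
files = sortrows(files, {'prefix', 'angle', 'run'});

% One group per (prefix, angle) combination.
[groupId, groupPrefix, groupAngle] = findgroups(files.prefix, files.angle);
for g = 1:max(groupId)
    groupFiles = files.name(groupId == g);
    fprintf("%s, angle %d: %s\n", string(groupPrefix(g)), ...
            groupAngle(g), join(groupFiles, ", "));
    % Each list could then be handed to the answer's local function, e.g.
    % groupData = aggregateDataFilesToTable(groupFiles);
end
```

With real files, the prefix and angle values of each row could also be copied into the table read from that file before appending, which is one way to add the group columns suggested in the script comments.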
https://stackoverflow.com/questions/73817241