The purpose of the regex search is to find all template class instances in C++ header files. The class instances can be formatted in ways such as:
CMyClass<int> myClassInstance;
CMyClass2<
int,
int
> myClass2Instacen;
The search is performed by loading the entire file into a string:
open(FILE, $file);
$string = join('',<FILE>);
close(FILE);
And the following regex is used to find the class instances, even if a class instance spans more than one line in the string:
$search_string = "\s*\w[^typename].*<(\s*\w\s*,?\n?)*)>\s*\w+.*";
$string =~ m/$search_string/;
The problem is that the search returns one hit only even though more class instances exist in the files.
Is it possible to get all hits with this approach, for example from one of the regex backreference variables?
-
What you require is the \G assertion, used together with the /g modifier. It anchors the next match at the position in your string where the last match ended. Here is the documentation from Perl Doc (SO is having trouble with the link, so you'll have to copy and paste):
http://perldoc.perl.org/perlfaq6.html#What-good-is-'%5cG'-in-a-regular-expression%3f
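As a minimal sketch of how \G behaves (the input string and the pattern below are made up for illustration and are not the asker's regex): with /gc, every match has to start exactly where the previous one ended, so the loop walks through the string declaration by declaration and stops at the first thing it cannot match.

```perl
use strict;
use warnings;

# Hypothetical input, not taken from the question's header files.
my $input = 'CMyClass<int> a; CMyClass2<int,int> b;';

my @names;

# \G pins each attempt to the end of the previous match;
# /c keeps pos() intact when a match finally fails.
while ( $input =~ /\G\s*(\w+)\s*<[^<>]*>\s*(\w+)\s*;/gc ) {
    push @names, "$1 $2";    # class name and instance name
}

print "$_\n" for @names;
```

Without \G, the same loop would skip over unmatched text between declarations instead of stopping there.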
Chas. Owens : Direct link to section referred to: http://perldoc.perl.org/perlfaq6.html#What-good-is-%27\G%27-in-a-regular-expression%3f
Gavin Miller : Thanks Chas :)
-
First, if you are going to slurp files, you should use File::Slurp. Then you can do:
my $contents = read_file $file;
read_file will croak on error.
Second, [^typename] does not exclude the string 'typename'. It is a negated character class, so it matches any single character other than t, y, p, e, n, a or m. Other than that, it is not obvious to me that the pattern you use will consistently match the things you want it to match, but I can't comment on that right now.
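To illustrate the difference, here is a small sketch comparing the character class with a negative lookahead, which is one common way to reject a whole word. The word list is made up, and the lookahead is only an assumption about what the original pattern was meant to do.

```perl
use strict;
use warnings;

# Made-up sample words for illustration.
my @words = qw(typename CMyClass table int);

# [^typename] rejects any word whose first character is one of t,y,p,e,n,a,m.
my @char_class = grep { /^[^typename]/ } @words;

# (?!typename\b) rejects only the whole word 'typename'.
my @lookahead = grep { /^(?!typename\b)\w+/ } @words;

print "char class: @char_class\n";
print "lookahead:  @lookahead\n";
```

The character class also throws away 'table', because it starts with a 't'; the lookahead keeps it.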
Finally, to get all the matches in the file one by one, use the g modifier in a loop:
my $source = '3 5 7';
while ( $source =~ /([0-9])/g ) {
    print "$1\n";
}
Now that I have had a chance to look at your pattern, I am still not sure of what to make of [^typename], but here is an example program that captures the part between the angle brackets (as that seems to be the only thing you are capturing above):
use strict;
use warnings;

use File::Slurp;

my $pattern = qr{
    ^
    \w+
    <\s*((?:\w+(?:,\s*)?)+)\s*>
    \s*
    \w+\s*;
}mx;

my $source = read_file \*DATA;

while ( $source =~ /$pattern/g ) {
    my $match = $1;
    $match =~ s/\s+/ /g;
    print "$match\n";
}

__DATA__
CMyClass<int> myClassInstance;
CMyClass2<
int,
int
> myClass2Instacen;

C:\Temp> t.pl
int
int, int
Now, I suspect you would prefer the following, however:
my $pattern = qr{
    ^
    (
        \w+
        <\s*(?:\w+(?:,\s*)?)+\s*>
        \s*
        \w+
    )
    \s*;
}mx;
which yields:
C:\Temp> t.pl
CMyClass<int> myClassInstance
CMyClass2< int, int > myClass2Instacen
-
I'd do something like this,
#!/usr/bin/perl -w
use strict;
use warnings;
local(*F);
open(F,$ARGV[0]);
# slurp the whole file by undefining the record separator
my $text = do{local($/);<F>};
# in list context, /g returns every match at once
my (@hits) = $text =~ m/([a-z]{3})/gsi;
print "@hits\n";
assuming you've got some text file like,
/home/user$ more a.txt
a bb dkl jidij lksj lai suj ldifk kjdfkj bb bb kdjfkal idjksdj fbb kjd fkjd fbb kadfjl bbb bb bb bbd i
this will print out all the hits from the regex:
/home/user$ ./a.pl a.txt
dkl jid lks lai suj ldi kjd fkj kdj fka idj ksd fbb kjd fkj fbb kad fjl bbb bbd
and a specific solution for your problem, using the same approach, might look like,
#!/usr/bin/perl -w
use strict;
use warnings;
my $text = <<ENDTEXT;
CMyClass<int> myClassInstance;
CMyClass2<
int,
int
> myClass2Instacen;
CMyClass35<
int,
int
> myClass35Instacen;
ENDTEXT
my $basename = "MyClass";
my (@instances) = $text =~ m/\s*(${basename}[0-9]*\s*\<.*?
          (?=\>\s*${basename})
          \>\s*${basename}.*?;)/xgsi;
for(my $i=0; $i<@instances; $i++){
    print $i."\t".$instances[$i]."\n\n";
}
of course you'll probably need to tweak the regex a bit more to fit all the edge cases in your data but that should be a pretty good start.
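For reading from a real file rather than a here-document, a variant of the slurp step with a lexical filehandle and an error check might look like the sketch below. The helper name slurp_file, the temporary file contents and the simplified pattern are all assumptions made for the demo, not part of the code above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole file into one string.
sub slurp_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;    # undefine the record separator so <$fh> reads everything
    my $text = <$fh>;
    close $fh;
    return $text;
}

# A temporary file stands in for a real header here.
my ( $tmp, $tmp_name ) = tempfile( UNLINK => 1 );
print {$tmp} "CMyClass<int> a;\nCMyClass2<\nint,\nint\n> b;\n";
close $tmp;

my $text = slurp_file($tmp_name);

# List context plus /g returns every captured declaration at once.
my @hits = $text =~ /^(\w+\s*<[^<>]*>\s*\w+)\s*;/mg;
s/\s+/ /g for @hits;    # collapse the line breaks inside each hit
print "$_\n" for @hits;
```

The [^<>]* part keeps the pattern short but means nested template arguments would not match.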
Alexandr Ciornii : open my $fh, $ARGV[0] is better than local(*F); open(F,$ARGV[0]); use Perl::Critic on your examples.
blackkettle : i tried Perl::Critic on my examples (bit of a hassle to install) but it doesn't give any comments/warnings/errors for my example. also, i noted that the pre and code block are not properly escaping my left-right angle brackets...