Using the MATLAB Regular Expression Function regexp

肆拾伍 · Published 2019-08-18 16:51:31

This article walks through the common ways of calling MATLAB's regular expression function regexp.

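1. startIndex = regexp(str,expression)
Returns the starting index of every substring of str that matches expression. Note that all matches are reported, not only the first one.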

str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';
startIndex = regexp(str,expression)
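
The result of the call above can be checked by hand. The following is a minimal sketch that re-runs the example; the variable noMatchIndex and the test string 'dog' are introduced here only for illustration.

```matlab
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';

% One index per match: 'cat' starts at 5 and 'coat' starts at 17.
% 'CUT' and 'CAT' are skipped because matching is case-sensitive by default.
startIndex = regexp(str,expression);
disp(startIndex)

% When nothing matches, the result is empty.
noMatchIndex = regexp('dog',expression);
disp(isempty(noMatchIndex))
```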

2. [startIndex,endIndex] = regexp(str,expression)
Returns both the starting and the ending index of each match, again for all matches.
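
As a minimal sketch, the two-output form can be tried on the same string used in section 1. The loop is added here only to show that each index pair delimits one match.

```matlab
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';
[startIndex,endIndex] = regexp(str,expression);

% Each pair (startIndex(k), endIndex(k)) delimits one match,
% so ordinary indexing recovers the matched text.
for k = 1:numel(startIndex)
    fprintf('%d-%d: %s\n', startIndex(k), endIndex(k), ...
            str(startIndex(k):endIndex(k)));
end
```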

Note: if the input is a cell array, the output is a cell array as well. For example:

str = {'Madrid, Spain','Romeo and Juliet','MATLAB is great'};
capExpr = '[A-Z]';
capStartIndex = regexp(str,capExpr);
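
To make the shape of the result concrete, here is a small sketch that prints the indices found for each input string. The loop and the use of mat2str are only for display.

```matlab
str = {'Madrid, Spain','Romeo and Juliet','MATLAB is great'};
capExpr = '[A-Z]';
capStartIndex = regexp(str,capExpr);

% capStartIndex is a 1x3 cell array; cell k holds the indices for str{k}.
for k = 1:numel(str)
    fprintf('%-18s -> %s\n', str{k}, mat2str(capStartIndex{k}));
end
```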

3. matchStr = regexp(str,expression,'match')
Returns the text of each match instead of its indices. The 'match' option extracts the matched substrings.

str = 'EXTRA! The regexp function helps you relax.';
expression = '\w*x\w*';
matchStr = regexp(str,expression,'match')
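
A short sketch of how the result can be inspected. Note which words are returned: the pattern requires a lowercase x.

```matlab
str = 'EXTRA! The regexp function helps you relax.';
expression = '\w*x\w*';
matchStr = regexp(str,expression,'match');

% matchStr is a cell array of character vectors, one cell per match.
% 'EXTRA' is not included: its 'X' is uppercase and the pattern asks for 'x'.
disp(numel(matchStr))
disp(matchStr{1})
```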

4. splitStr = regexp(str,expression,'split')
Splits the string at every position where the expression matches.

str = ['Split ^this text into ^several pieces'];
expression = '\^';
splitStr = regexp(str,expression,'split') % split at every ^
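
The pieces returned by 'split' keep any whitespace that surrounded the delimiter. A minimal sketch, where the extra variable trimmed is introduced only for illustration:

```matlab
str = 'Split ^this text into ^several pieces';
expression = '\^';              % the backslash makes ^ a literal character
splitStr = regexp(str,expression,'split');

% The delimiter itself is dropped, but surrounding spaces are kept.
% strtrim removes them from every cell.
trimmed = strtrim(splitStr);
disp(trimmed)
```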

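5. [match,noMatch] = regexp(str,expression,'match','split')
Uses 'match' and 'split' together and receives the results in two outputs. The first holds the matching substrings; the second holds the remaining text, split at the matches.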

str = 'She sells sea shells by the seashore.';
expression = '[Ss]h.';
[match,noMatch] = regexp(str,expression,'match','split')
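
One way to see how the two outputs relate is to interleave them again. This sketch uses strjoin with the matches as delimiters; combinedStr is a name chosen here for illustration.

```matlab
str = 'She sells sea shells by the seashore.';
expression = '[Ss]h.';
[match,noMatch] = regexp(str,expression,'match','split');

% noMatch always has one more cell than match; the first cell is empty
% here because the string begins with a match.
% Interleaving the two outputs rebuilds the original text.
combinedStr = strjoin(noMatch,match);
disp(strcmp(combinedStr,str))
```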

6. [tokens,matches] = regexp(str,expression,'tokens','match')
tokens returns the captured substrings, and matches returns the full matched substrings.

str = '<title>My Title</title><p>Here is some text.</p>';
expression = '<(\w+).*>.*</\1>';
[tokens,matches] = regexp(str,expression,'tokens','match');
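
The tokens output is nested, which is easy to overlook. A minimal sketch of how to reach the captured tag names:

```matlab
str = '<title>My Title</title><p>Here is some text.</p>';
expression = '<(\w+).*>.*</\1>';   % \1 refers back to the first captured group
[tokens,matches] = regexp(str,expression,'tokens','match');

% tokens is a cell array with one cell per match; each of those cells is
% itself a cell array holding the captured groups of that match.
for k = 1:numel(matches)
    fprintf('tag "%s" in match "%s"\n', tokens{k}{1}, matches{k});
end
```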

In the result, tokens contains the text captured by the parenthesized groups, whereas matches contains the text that matches the entire expression.

7. matchWithIgnorecase = regexp(str,expression,'match','ignorecase')
The 'ignorecase' option makes the match case-insensitive.
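
As a minimal sketch, the effect of 'ignorecase' can be compared with the default behavior on the string from section 1. The variable caseSensitive is introduced here only for the comparison.

```matlab
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';

caseSensitive = regexp(str,expression,'match');               % default behavior
matchWithIgnorecase = regexp(str,expression,'match','ignorecase');

disp(caseSensitive)
disp(matchWithIgnorecase)
```

With the option set, the uppercase words CUT and CAT are matched in addition to cat and coat.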

8. tokenNames = regexp(str,expression,'names');
Assigns a name to each captured token and returns the results in a structure.

str = '01/11/2000  20-02-2020  03/30/2000  16-04-2020';
expression = ['(?<month>\d+)/(?<day>\d+)/(?<year>\d+)|'...
              '(?<day>\d+)-(?<month>\d+)-(?<year>\d+)'];
tokenNames = regexp(str,expression,'names');  % (?<name>...) stores the text captured by that group under the field "name"
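
The named tokens can then be read like ordinary struct fields. A minimal sketch that prints each date in year-month-day order, whichever of the two input formats it came from:

```matlab
str = '01/11/2000  20-02-2020  03/30/2000  16-04-2020';
expression = ['(?<month>\d+)/(?<day>\d+)/(?<year>\d+)|'...
              '(?<day>\d+)-(?<month>\d+)-(?<year>\d+)'];
tokenNames = regexp(str,expression,'names');

% tokenNames is a struct array with one element per matched date and
% the fields month, day and year, whichever alternative matched.
for k = 1:numel(tokenNames)
    fprintf('%s-%s-%s\n', tokenNames(k).year, tokenNames(k).month, tokenNames(k).day);
end
```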