Lookahead and Lookbehind

Sometimes you need to find only those matches for a pattern that are followed or preceded by another pattern.

There is special syntax for that, called lookahead and lookbehind. Together they are referred to as lookaround.

Lookaround constructs do not consume characters: they check whether their pattern matches at the current position, then give that match up and return only the result, match or no match. That’s why they are called assertions. The text they test does not become part of the overall match.

To start, let’s find the price in the following string: 1 lesson costs 15€. The price is a number followed by the € sign.

Lookahead

The syntax of lookahead is the following:

X(?=Y)

It tells the engine to look for X, but to match only if it is followed by Y. Any patterns can be used in place of X and Y.

For an integer followed by €, the regular expression is \d+(?=€). Here is how it looks:

let str = "1 lesson costs 15€";
console.log(str.match(/\d+(?=€)/)); // 15, the number 1 is ignored, as it is not followed by the sign €

Lookahead is only a test, so the contents of the parentheses (?=…) are not included in the result 15.
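
That lookahead is only a test can be checked directly. The following is a minimal sketch (the variable names are illustrative and not part of the original example): the € sign is absent from the match, and because it was never consumed, a pattern can still match it right after the lookahead.

```javascript
const priceText = "1 lesson costs 15€";

// The lookahead checks for € but does not add it to the match.
const onlyNumber = priceText.match(/\d+(?=€)/);
console.log(onlyNumber[0]);        // 15
console.log(onlyNumber[0].length); // 2

// € is still available after the lookahead, so it can be matched normally.
const numberAndSign = priceText.match(/\d+(?=€)€/);
console.log(numberAndSign[0]);     // 15€
```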

When searching for X(?=Y), the regular expression engine finds X and then checks whether Y comes immediately after it. If it does not, the potential match is skipped and the search continues.
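
To see the skipping in action, here is a small sketch with the g flag and str.matchAll. The sample string and variable names are made up for illustration. Only numbers immediately followed by € are reported, together with their positions.

```javascript
const receipt = "3 pens cost 12€, 2 books cost 40€";

// Collect every number that is immediately followed by €.
const prices = [...receipt.matchAll(/\d+(?=€)/g)].map(
  (m) => ({ text: m[0], index: m.index })
);

console.log(prices);
// [ { text: '12', index: 12 }, { text: '40', index: 30 } ]
```

The quantities 3 and 2 are found by \d+ but rejected by the lookahead, so they never reach the result.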

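More complex tests are possible. A pattern like X(?=Y)(?=Z) means: find X, check that Y comes immediately after X, then check that Z also comes immediately after X. Both tests are applied at the same position, so a match is possible only when Y and Z are not mutually exclusive.

For example, \d+(?=\s)(?=.*15) looks for \d+ that is followed by a whitespace character and has 15 somewhere after it: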

let str = "1 lesson costs 15€";
console.log(str.match(/\d+(?=\s)(?=.*15)/)); // 1
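
Because consecutive lookaheads are all tested at the same position, they behave like conditions joined by "and". A short sketch, using a second made-up string for contrast (the name bothConditions is illustrative):

```javascript
// \d+ followed by whitespace AND with "15" somewhere later in the string.
const bothConditions = /\d+(?=\s)(?=.*15)/;

const withFifteen = "1 lesson costs 15€".match(bothConditions);
const withoutFifteen = "1 lesson costs 20€".match(bothConditions);

console.log(withFifteen[0]);  // 1
console.log(withoutFifteen);  // null
```

In the second string the first condition holds for 1, but the second one fails, so there is no match at all.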

Negative Lookahead

Now suppose you need the quantity rather than the price from the same string. That is a number \d+ that is not followed by €.

Negative lookahead serves that purpose.

The syntax is X(?!Y): look for X, but only if it is not followed by Y, as here:

let str = "2 lessons cost 30€";
console.log(str.match(/\d+(?!€)/)); // 2 (skipping the price)
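
One caveat is worth knowing, although the example above does not run into it because it returns only the first match. With the g flag, \d+(?!€) also finds 3: when 30 fails the test, the engine backtracks to the shorter 3, which is followed by 0 rather than €. A possible fix, offered here as a sketch rather than as part of the original example, is to forbid a following digit as well:

```javascript
const lessonText = "2 lessons cost 30€";

// Backtracking lets the engine accept "3" out of "30".
const naive = lessonText.match(/\d+(?!€)/g);
console.log(naive);  // [ '2', '3' ]

// Also rejecting a following digit keeps each number whole.
const strict = lessonText.match(/\d+(?![\d€])/g);
console.log(strict); // [ '2' ]
```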

Lookbehind

As noted above, lookahead adds a condition on what comes after the match. Lookbehind follows the same logic but looks backward: it allows matching a pattern only if there is something before it.

Lookbehind can also be positive or negative. The positive lookbehind syntax is (?<=Y)X: X is matched only if there is Y before it. The negative lookbehind syntax is (?<!Y)X: X is matched only if there is no Y before it.

Let’s check out an example:

let str = "1 lesson costs $15";
// the dollar sign is escaped \$
console.log(str.match(/(?<=\$)\d+/)); // 15, the standalone number 1 is skipped

In the example above, the price is given in US dollars. To get the quantity instead, use negative lookbehind, like this:

let str = "2 lessons cost $30";
console.log(str.match(/(?<!\$)\d+/)); // 2, the price is skipped
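
Note that with the g flag, (?<!\$)\d+ would also match the 0 in 30, because that digit is preceded by 3, not by $. One way to avoid this, shown as a sketch and not taken from the original example, is to exclude a preceding digit too:

```javascript
const dollarText = "2 lessons cost $30";

// "0" is preceded by "3", not "$", so it slips through.
const naiveBehind = dollarText.match(/(?<!\$)\d+/g);
console.log(naiveBehind);  // [ '2', '0' ]

// Rejecting a preceding digit as well keeps only standalone numbers.
const strictBehind = dollarText.match(/(?<![$\d])\d+/g);
console.log(strictBehind); // [ '2' ]
```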

Capturing Groups

Usually, the contents of the lookaround parentheses do not become part of the result.

But there are situations when it is necessary to capture the lookaround expression, or just a part of it, as well. That is possible by wrapping that part in additional parentheses.

In the example below, the currency sign (€|kr) is captured along with the amount:

let str = "1 lesson costs 15€";
let regexp = /\d+(?=(€|kr))/; // additional parentheses around €|kr
console.log(str.match(regexp)); // 15, €

The same works with lookbehind:

let str = "1 lesson costs $15";
let regexp = /(?<=(\$|£))\d+/;
console.log(str.match(regexp)); // 15, $
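
The captured part is available at index 1 of the match, as with any capturing group. As a variation that is not in the original example, a named group can make the intent clearer. The group name currency and the variable names are arbitrary choices:

```javascript
const dollarPrice = "1 lesson costs $15";

// Named capturing group inside the lookbehind.
const priceMatch = dollarPrice.match(/(?<=(?<currency>\$|£))\d+/);

console.log(priceMatch[0]);              // 15
console.log(priceMatch[1]);              // $
console.log(priceMatch.groups.currency); // $
```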

Summary

Lookaround syntax is used to match a pattern that is preceded and/or followed by another one. Lookahead matches something depending on the context after it, and lookbehind depending on the context before it.

For simple regular expressions the same thing can be done manually: match everything, in any context, and then filter the results by context in a loop.

This works because str.match (without the g flag) and str.matchAll return matches as arrays with an index property, so the exact position of each match is known and the text around it can be checked.
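
A minimal sketch of that manual approach, assuming the same task of finding numbers followed by €. The helper name findPrices is hypothetical:

```javascript
// Match every number, then keep only those directly followed by €.
function findPrices(text) {
  const result = [];
  for (const m of text.matchAll(/\d+/g)) {
    // m.index is where the match starts; the next character follows its end.
    const nextChar = text[m.index + m[0].length];
    if (nextChar === "€") {
      result.push(m[0]);
    }
  }
  return result;
}

console.log(findPrices("1 lesson costs 15€")); // [ '15' ]
```

The lookahead version, \d+(?=€) with the g flag, gives the same result in a single expression.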

Still, lookaround is generally more convenient, especially for more complex regular expressions.


