Python Data Analysis Technology Stack, Chapter 03, 01: Regular expressions

A regular expression is a pattern containing both characters (like letters and digits) and metacharacters (like the * and $ symbols). Regular expressions can be used whenever we want to search, replace, or extract data with an identifiable pattern, for example, dates, postal codes, HTML tags, phone numbers, and so on. They can also be used to validate fields like passwords and email addresses, by ensuring that the input from the user is in the correct format.

Support for regular expressions is provided by the re module in Python, which can be imported using the following statement:

import re

The re module is part of the Python standard library, so it needs no separate installation and is available as soon as Python is installed.

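Steps for solving problems with regular expressions

search_pattern=re.compile(r'and')  # step 2: compile the pattern

search_pattern.search('Today and tomorrow')  # step 3: search the string; the match spans (6, 9)

Shortcut (combining steps 2 and 3)

re.search('and','Today and tomorrow')  # compiles and searches in a single call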
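The search method returns a match object, or None when the pattern is not found. The following is a minimal sketch of inspecting that result, reusing the pattern above; the variable names are illustrative and not part of the original text.

```python
import re

search_pattern = re.compile(r'and')
match = search_pattern.search('Today and tomorrow')

# search returns None when nothing matches, so check before using the result
if match is not None:
    matched_text = match.group()   # the text that matched
    start, end = match.span()      # start index (inclusive) and end index (exclusive)

# A string without 'and' yields None instead of a match object
no_match = search_pattern.search('Today or tomorrow')
```

Checking for None first avoids an AttributeError when group() is called on a failed search.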

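Python functions for regular expressions

re.findall('3','98371234')  # all matches: ['3', '3']

re.search('3','98371234')  # first match anywhere in the string: span (2, 3)

re.match('3','98371234')  # matches only at the start of the string: None

re.split('3','98371234')  # splits at each match: ['98', '712', '4']

re.sub('3','three','98371234')  # replaces each match: '98three712three4'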
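To make the differences between these functions easier to see, here is a small sketch that applies all of them to one string. The sample text and variable names are assumptions made for illustration.

```python
import re

text = 'red green red blue'

first = re.search('red', text)            # first occurrence only
at_start = re.match('red', text)          # succeeds: the string starts with 'red'
not_at_start = re.match('green', text)    # None: 'green' is not at the start
all_found = re.findall('red', text)       # every non-overlapping occurrence
pieces = re.split(' ', text)              # split on each space
replaced = re.sub('red', 'RED', text, count=1)  # count limits the number of replacements
```

In short, match only looks at the beginning of the string, search stops at the first hit, and findall collects every hit.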

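Metacharacters

Dot (.) metacharacter

re.findall("ba.","bar bat bad ba. ban")  # . matches any single character except a newline, so all five words match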

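Square brackets ([]) as metacharacters

regex=re.compile(r'[crbmdhw]ash')  # [...] matches any one of the listed characters
regex.findall('cash rash bash mash dash hash wash crash ash')  # 'crash' contributes 'rash'; the bare 'ash' does not match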
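Square brackets can also hold ranges such as a-z, and a leading ^ inside the brackets negates the set. The sketch below is an addition for illustration, with a made-up sample string.

```python
import re

text = 'cash Bash 5ash _ash'

# A range matches any one character between its endpoints
lowercase = re.findall(r'[a-z]ash', text)          # ['cash']

# Several ranges can be combined in one set
alphanumeric = re.findall(r'[A-Za-z0-9]ash', text)  # ['cash', 'Bash', '5ash']

# A leading ^ inside the brackets matches any character NOT in the set
not_lowercase = re.findall(r'[^a-z]ash', text)      # ['Bash', '5ash', '_ash']
```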

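Question mark (?) metacharacter

regex=re.compile(r'Austr[a]?[l]?[a]?[s]?ia')  # ? makes the preceding element optional (zero or one occurrence)
regex.findall('Austria Australia Australasia Asia')  # ['Austria', 'Australia', 'Australasia']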

Asterisk (*) metacharacter

re.findall("abc[1]*","abc1 abc111 abc1 abc abc111111111111 abc01")  # * matches zero or more 1s, so a plain 'abc' also matches

Backslash (\) metacharacter

Commonly used character classes

\d Matches a digit (0–9)
\D Matches any character that is not a digit
\w Matches a word character: a lowercase letter (a–z), an uppercase letter (A–Z), a digit (0–9), or an underscore
\W Matches any character that is not a word character
\s Matches any whitespace character
\S Matches any non-whitespace character
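Each of these classes is written with a leading backslash, usually inside a raw string (r'...') so that Python does not interpret the backslash itself. A short sketch on an invented sample string follows; \w+ uses the plus quantifier to match one or more word characters.

```python
import re

sample = 'Room 42, floor_3!'

digits = re.findall(r'\d', sample)        # ['4', '2', '3']
non_word = re.findall(r'\W', sample)      # [' ', ',', ' ', '!']
whitespace = re.findall(r'\s', sample)    # [' ', ' ']
words = re.findall(r'\w+', sample)        # ['Room', '42', 'floor_3']
```

Note that the underscore in floor_3 counts as a word character, which is why it stays inside the third match.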

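Another usage of the backslash symbol: Escaping metacharacters

regex=re.compile(r'W\.H\.O')  # \. matches a literal period rather than any character
regex.search('W.H.O norms')  # matches 'W.H.O' at span (0, 5)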
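The effect of escaping is easiest to see by comparing an unescaped and an escaped pattern on the same text. The sketch below uses an invented sample string; re.escape is a standard-library helper that inserts the backslashes automatically.

```python
import re

text = 'W.H.O WxHyO'

# Unescaped: each . matches any character, so both words match
unescaped = re.findall(r'W.H.O', text)        # ['W.H.O', 'WxHyO']

# Escaped: \. matches only a literal period
escaped = re.findall(r'W\.H\.O', text)        # ['W.H.O']

# re.escape builds the escaped pattern from a literal string
auto = re.findall(re.escape('W.H.O'), text)   # ['W.H.O']
```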

Plus (+) metacharacter

re.findall("[a-z]+123","a123 b123 123 ab123 xyz123")  # + requires one or more letters, so the bare '123' is skipped
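The three quantifiers ?, * and + differ only in how many repetitions they allow. A side-by-side sketch on one invented string makes the contrast visible.

```python
import re

text = 'ac abc abbc abbbc'

zero_or_one = re.findall(r'ab?c', text)    # ['ac', 'abc']
zero_or_more = re.findall(r'ab*c', text)   # ['ac', 'abc', 'abbc', 'abbbc']
one_or_more = re.findall(r'ab+c', text)    # ['abc', 'abbc', 'abbbc']
```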

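Curly braces {} as metacharacters

regex=re.compile(r'[\d]{3}-[\d]{3}-[\d]{4}')  # {n} matches exactly n repetitions
regex.findall('987-999-8888 99122222 911-911-9111')  # ['987-999-8888', '911-911-9111']

regex=re.compile(r'[\w]{6,10}')  # {m,n} matches between m and n repetitions
regex.findall('abcd abcd1234,abc$$$$$,abcd12 abcdef')  # ['abcd1234', 'abcd12', 'abcdef']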
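One behaviour worth knowing is that a bounded quantifier can match part of a longer run, and findall then continues with what is left. The sketch below, on an invented string, also shows the open-ended form {m,}, which is an addition to the text above.

```python
import re

text = '1 12 123 12345'

# {2,3} takes up to three digits at a time, so '12345' yields '123' and then '45'
two_to_three = re.findall(r'\d{2,3}', text)   # ['12', '123', '123', '45']

# {4,} means four or more repetitions with no upper limit
at_least_four = re.findall(r'\d{4,}', text)   # ['12345']
```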

Dollar ($) metacharacter

re.search(r'[\d]$','aa*5')  # $ anchors the match at the end of the string: matches '5'

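Caret (^) metacharacter

re.search(r'^[\s]',' a bird')  # ^ anchors the match at the start of the string: matches the leading space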
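Combining ^ and $ forces a pattern to cover the whole string, which is how regular expressions are typically used to validate input. Below is a minimal sketch based on the phone-number pattern shown earlier; the function name is_valid_phone is hypothetical.

```python
import re

# ^ and $ together require the entire string to fit the pattern
phone_pattern = re.compile(r'^\d{3}-\d{3}-\d{4}$')

def is_valid_phone(value):
    """Return True when the whole string has the form ddd-ddd-dddd."""
    return phone_pattern.search(value) is not None

valid = is_valid_phone('987-999-8888')            # whole string fits
embedded = is_valid_phone('call 987-999-8888')    # extra text before the number
too_short = is_valid_phone('987-999-888')         # last group has only three digits
```

Because $ also matches just before a trailing newline, re.fullmatch is a stricter alternative when that matters.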
