Python | Regular Expressions (re)


2022.03.05 / 2025.11.05

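Regex Templates for Everyday Work (Python)
1. Email addresses
2. Japanese phone numbers (with area code)
3. Dates (YYYY-MM-DD)
4. Postal codes (Japan)
5. URLs
6. Half-width digits (digit-only values such as phone numbers and IDs)
7. Half-width alphanumerics (simple username and password checks)
8. Practice: extracting several patterns at once
   Handy usage notes
A Python script that extracts and organizes emails, phone numbers, and dates in one pass
   Features of this script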

Regex Templates for Everyday Work (Python)

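This article collects regular expression templates that can be used right away in practical Python work.
It covers common patterns such as email addresses, phone numbers, dates, postal codes, and URLs. Each template comes with usage code that can be copied as-is, and the examples are commented so that beginners can follow along. Every example assumes the import below.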

import re


1. Email addresses

# Pattern
email_pattern = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

# Usage
text = "Please send inquiries to taro@example.com"
emails = re.findall(email_pattern, text)
print(emails)  # ['taro@example.com']


💡 Key points:

[A-Za-z0-9._%+-]+ → the user name (local part)
@ → the literal @ sign
[A-Za-z0-9.-]+ → the domain name
\.[A-Za-z]{2,} → the final dot and the TLD (e.g. .com)
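For form-style validation, where the whole input has to be an address rather than merely contain one, the same pattern can be applied with re.fullmatch(). The following is a minimal sketch: the helper name is_valid_email is hypothetical, and the check is deliberately loose rather than a complete implementation of the email address standard.

```python
import re

# Same template as above, compiled once for reuse.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def is_valid_email(value: str) -> bool:
    """Return True only if the entire string looks like an email address."""
    return EMAIL_RE.fullmatch(value) is not None

print(is_valid_email("taro@example.com"))        # True
print(is_valid_email("taro@example"))            # False (no TLD)
print(is_valid_email("mail: taro@example.com"))  # False (extra text around it)
```

re.findall() answers "which addresses appear in this text?", while re.fullmatch() answers "is this field an address?".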

2. Japanese phone numbers (with area code)

# Pattern (hyphenated numbers)
phone_pattern = r"\b0\d{1,4}-\d{1,4}-\d{4}\b"

# Usage
text = "Office: 03-1234-5678, mobile: 090-9876-5432"
phones = re.findall(phone_pattern, text)
print(phones)  # ['03-1234-5678', '090-9876-5432']


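💡 Key points:

\b → word boundary, so the match cannot start or end inside a longer run of letters or digits. Python 3 treats kana and kanji as word characters, so \b does not match between a Japanese character and a digit (for example in 電話03-1234-5678)
0\d{1,4} → the area code, starting with 0
\d{1,4}-\d{4} → the middle and final digit groups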
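When phone numbers are written directly next to Japanese text with no spaces, \b finds no boundary and the template misses them. One possible workaround, sketched below, is to replace \b with digit lookarounds ("no digit immediately before or after"). The sample sentence and variable names are made up for illustration.

```python
import re

# Digit lookarounds instead of \b: the number must not touch another digit.
PHONE_RE = re.compile(r"(?<!\d)0\d{1,4}-\d{1,4}-\d{4}(?!\d)")

# Numbers touch kana/kanji directly, with no spaces in between.
text = "電話03-1234-5678まで。携帯は090-9876-5432です。"

# The \b version finds nothing, because 話 and は count as word characters.
print(re.findall(r"\b0\d{1,4}-\d{1,4}-\d{4}\b", text))  # []

phones = PHONE_RE.findall(text)
print(phones)  # ['03-1234-5678', '090-9876-5432']

# Normalizing to digits only is often convenient for storage or comparison.
digits_only = [p.replace("-", "") for p in phones]
print(digits_only)  # ['0312345678', '09098765432']
```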

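3. Dates (YYYY-MM-DD)

# (?<!\d) and (?!\d) are used instead of \b: Python 3 treats kana and kanji
# as word characters, so \b finds no boundary between は and 2 below.
date_pattern = r"(?<!\d)\d{4}-\d{2}-\d{2}(?!\d)"

# "Today is 2025-11-05, tomorrow is 2025-11-06" (no spaces around the dates)
text = "今日は2025-11-05、明日は2025-11-06です"
dates = re.findall(date_pattern, text)
print(dates)  # ['2025-11-05', '2025-11-06']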


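💡 Key points:

\d{4} → year (4 digits)
\d{2} → month and day (fixed at 2 digits each)
The parts are separated by hyphens
The pattern checks the format only, so an impossible date such as 2025-13-40 also matches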
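Because the pattern only checks the shape of the string, it can help to pass each candidate through datetime.strptime() and keep only real calendar dates. This is a sketch under that assumption; the function name extract_valid_dates and the sample text are hypothetical.

```python
import re
from datetime import datetime

DATE_RE = re.compile(r"(?<!\d)\d{4}-\d{2}-\d{2}(?!\d)")

def extract_valid_dates(text):
    """Return date objects for every YYYY-MM-DD match that is a real date."""
    valid = []
    for candidate in DATE_RE.findall(text):
        try:
            valid.append(datetime.strptime(candidate, "%Y-%m-%d").date())
        except ValueError:
            # Right shape, but not a real date (e.g. month 13): skip it.
            pass
    return valid

dates = extract_valid_dates("OK: 2025-11-05, typo: 2025-13-40, leap: 2024-02-29")
print([d.isoformat() for d in dates])  # ['2025-11-05', '2024-02-29']
```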

4. Postal codes (Japan)

zipcode_pattern = r"\b\d{3}-\d{4}\b"

text = "Address: 2-8-1 Nishi-Shinjuku, Shinjuku-ku, Tokyo 〒160-0023"
zipcodes = re.findall(zipcode_pattern, text)
print(zipcodes)  # ['160-0023']


💡 Key points:

Japanese postal codes have 7 digits and are usually written as XXX-XXXX
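Postal codes are also sometimes entered without the hyphen. For a single form field, one option is to accept both spellings and normalize to XXX-XXXX. The sketch below assumes that requirement; normalize_zipcode is a hypothetical helper name.

```python
import re

# Two capture groups with an optional hyphen; re.ASCII limits \d to 0-9.
ZIP_RE = re.compile(r"(\d{3})-?(\d{4})", re.ASCII)

def normalize_zipcode(value):
    """Return 'XXX-XXXX', or None if the input is not a 7-digit postal code."""
    m = ZIP_RE.fullmatch(value.strip())
    return f"{m.group(1)}-{m.group(2)}" if m else None

print(normalize_zipcode("1600023"))   # 160-0023
print(normalize_zipcode("160-0023"))  # 160-0023
print(normalize_zipcode("160-002"))   # None
```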

5. URLs

url_pattern = r"https?://[A-Za-z0-9./?=&_%+-]+"

text = "See the official site https://example.com/page?query=1 for details"
urls = re.findall(url_pattern, text)
print(urls)  # ['https://example.com/page?query=1']


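💡 Key points:

https? → http or https
[A-Za-z0-9./?=&_%+-]+ → the characters accepted in the rest of the URL (a simplified set: characters outside it, such as #, :, and ~, end the match)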
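Since the character class includes ".", a URL at the end of a sentence picks up the closing period. A small post-processing step can trim that, and urllib.parse.urlsplit() can then break the URL into parts. This is only a sketch; the sample text is made up, and stripping "." and "?" is an assumed cleanup rule.

```python
import re
from urllib.parse import urlsplit

URL_RE = re.compile(r"https?://[A-Za-z0-9./?=&_%+-]+")

text = "Docs: https://example.com/page?query=1. Mirror: http://example.org/a/b."

# Drop sentence punctuation that the character class swept up at the end.
cleaned = [raw.rstrip(".?") for raw in URL_RE.findall(text)]
print(cleaned)  # ['https://example.com/page?query=1', 'http://example.org/a/b']

for url in cleaned:
    parts = urlsplit(url)
    print(parts.scheme, parts.netloc, parts.path, parts.query)
```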

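6. Half-width digits (digit-only values such as phone numbers and IDs)

# [0-9] matches half-width digits only; in Python 3, \d would also match full-width digits
number_pattern = r"[0-9]+"

text = "ID: 12345, amount: 6789 yen"
numbers = re.findall(number_pattern, text)
print(numbers)  # ['12345', '6789']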
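In Python 3, \d in a str pattern matches any Unicode decimal digit, which includes full-width digits. When only half-width digits should match, either [0-9] or the re.ASCII flag can be used. The short comparison below uses a made-up sample string.

```python
import re

text = "half-width 12345, full-width １２３４５"

# \d matches both kinds of digits by default.
print(re.findall(r"\d+", text))                  # ['12345', '１２３４５']

# An explicit range matches half-width digits only.
print(re.findall(r"[0-9]+", text))               # ['12345']

# re.ASCII restricts \d to 0-9 as well.
print(re.findall(r"\d+", text, flags=re.ASCII))  # ['12345']
```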


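7. Half-width alphanumerics (simple username and password checks)

alnum_pattern = r"[A-Za-z0-9]+"

text = "User ID: Alice123"
ids = re.findall(alnum_pattern, text)
# findall returns every alphanumeric run, including the words of the label
print(ids)  # ['User', 'ID', 'Alice123']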
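For an actual username check, the value is usually tested on its own rather than searched for inside a sentence. A minimal sketch with re.fullmatch() follows; the 4 to 16 character limit and the name is_valid_username are assumptions chosen for the example.

```python
import re

# Assumed rule: 4 to 16 half-width letters or digits, nothing else.
USERNAME_RE = re.compile(r"[A-Za-z0-9]{4,16}")

def is_valid_username(value):
    """Return True if the whole value satisfies the assumed username rule."""
    return USERNAME_RE.fullmatch(value) is not None

print(is_valid_username("Alice123"))   # True
print(is_valid_username("Al"))         # False (too short)
print(is_valid_username("Alice_123"))  # False ("_" is not allowed)
```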


8. Practice: extracting several patterns at once

# Reuses email_pattern, phone_pattern, and date_pattern defined in sections 1 to 3
text = "Contact: taro@example.com, 090-1234-5678, 2025-11-05"
emails = re.findall(email_pattern, text)
phones = re.findall(phone_pattern, text)
dates = re.findall(date_pattern, text)

print("Email:", emails)
print("Phone:", phones)
print("Date:", dates)

Output:

Email: ['taro@example.com']
Phone: ['090-1234-5678']
Date: ['2025-11-05']
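Running one findall() per pattern loses the order in which the items appear in the text. If that order matters, one alternative is to join the templates into a single pattern with named groups and walk through it with finditer(). This is a sketch of that idea, not part of the original templates; the group names are chosen for the example, and the phone and date parts use digit lookarounds in place of \b.

```python
import re

# One alternation with a named group per template.
COMBINED_RE = re.compile(
    r"(?P<email>[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,})"
    r"|(?P<phone>(?<!\d)0\d{1,4}-\d{1,4}-\d{4}(?!\d))"
    r"|(?P<date>(?<!\d)\d{4}-\d{2}-\d{2}(?!\d))"
)

text = "Contact: taro@example.com, 090-1234-5678, 2025-11-05"

# m.lastgroup is the name of the group that matched.
found = [(m.lastgroup, m.group()) for m in COMBINED_RE.finditer(text)]
print(found)
# [('email', 'taro@example.com'), ('phone', '090-1234-5678'), ('date', '2025-11-05')]
```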

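Handy usage notes

re.search() → when only the first match is needed
re.findall() → when all matches are needed
re.sub() → handy for bulk replacement
Always write patterns as raw strings r"..." → backslashes need no extra escaping
Compile a pattern with re.compile() and keep it in a variable to reuse it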
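The notes above can be seen side by side in one short example. It compiles a phone pattern once and then calls search(), findall(), and sub() on the compiled object; the sample sentence and the "[phone]" placeholder are made up for illustration.

```python
import re

text = "Call 03-1234-5678 or 090-9876-5432."

# Compile once as a raw string, then reuse the compiled object.
phone_re = re.compile(r"0\d{1,4}-\d{1,4}-\d{4}")

first = phone_re.search(text)            # first match only (or None)
print(first.group())                     # 03-1234-5678

all_phones = phone_re.findall(text)      # every match as a list of strings
print(all_phones)                        # ['03-1234-5678', '090-9876-5432']

masked = phone_re.sub("[phone]", text)   # replace every match
print(masked)                            # Call [phone] or [phone].
```

Note that search() returns None when nothing matches, so real code should check the result before calling group().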

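💡 Further ideas

Form input validation (checking emails, phone numbers, and postal codes)
Log analysis (extracting timestamps, IP addresses, and status codes)
Extracting only the lines that match a given pattern from CSV or text files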
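As an illustration of the log analysis idea, the sketch below pulls the IP address, timestamp, and status code out of each line with named groups. The log format, the sample lines, and all names are assumptions invented for this example; a real log will need its own pattern.

```python
import re

# Assumed line format: "<ip> [<YYYY-MM-DD HH:MM:SS>] <status> ..."
LOG_RE = re.compile(
    r"(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) "
    r"\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] "
    r"(?P<status>\d{3})"
)

log_lines = [
    "192.168.0.10 [2025-11-05 09:15:02] 200 GET /index.html",
    "10.0.0.7 [2025-11-05 09:15:09] 404 GET /missing",
    "malformed line",
]

records = []
for line in log_lines:
    m = LOG_RE.match(line)
    if m:  # lines that do not fit the format are skipped
        records.append(m.groupdict())

# Keep only client and server errors (4xx and 5xx).
errors = [r for r in records if r["status"].startswith(("4", "5"))]
print(errors)
# [{'ip': '10.0.0.7', 'timestamp': '2025-11-05 09:15:09', 'status': '404'}]
```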

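A Python script that extracts and organizes emails, phone numbers, and dates in one pass

Building on the regex templates above, this section puts together a practical Python script that extracts emails, phone numbers, and dates in one pass and organizes the results.

Works on multi-line text and on text read from files
Organizes the results in a dictionary of lists and prints them
Commented for beginners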

# =============================================
# Practical use: extract emails, phones, and dates in one pass
# =============================================
import re

# -----------------------------
# 1. Regex templates
# -----------------------------
email_pattern   = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"
phone_pattern   = r"\b0\d{1,4}-\d{1,4}-\d{4}\b"
# Digit lookarounds instead of \b, so dates touching kana/kanji still match
date_pattern    = r"(?<!\d)\d{4}-\d{2}-\d{2}(?!\d)"

# -----------------------------
# 2. Sample input text
# -----------------------------
text = """
Please send inquiries to taro@example.com by email.
Office: 03-1234-5678, mobile: 090-9876-5432
Contract date: 2025-11-05, payment due: 2025-12-01
Other: hanako@example.co.jp
"""

# -----------------------------
# 3. Extract with regular expressions
# -----------------------------
emails = re.findall(email_pattern, text)
phones = re.findall(phone_pattern, text)
dates  = re.findall(date_pattern, text)

# -----------------------------
# 4. Organize and print
# -----------------------------
results = {
    "email": emails,
    "phone": phones,
    "date": dates
}

# Show the results
for key, lst in results.items():
    print(f"{key}:", lst)

# -----------------------------
# 5. Going further: de-duplication and formatting
# -----------------------------
# set() removes duplicates; sorted() makes the order deterministic
emails_unique = sorted(set(emails))
phones_unique = sorted(set(phones))
dates_unique  = sorted(set(dates))

print("\n---After de-duplication---")
print("email:", emails_unique)
print("phone:", phones_unique)
print("date:", dates_unique)


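Features of this script

Multi-line text support: matches are found anywhere in text
Reusable regex templates: the email, phone, and date patterns are managed separately
Organized output: the results are collected in a dictionary and printed in a readable form
Duplicate removal (optional extra): set() keeps each value only once, although it does not preserve the original order
Extensible: postal codes, URLs, and other patterns can be added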
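The extensibility point can be sketched by moving the templates into a dictionary and looping over it, so that adding a pattern is a one-line change. The function name extract_all, the dictionary keys, and the sample string are hypothetical. The postal code entry here uses lookarounds that also exclude hyphens, an assumed tweak so that it does not fire inside a phone number, and dict.fromkeys() is used for duplicate removal because it keeps first-seen order.

```python
import re

# Name -> template. Adding a new kind of item means adding one entry here.
PATTERNS = {
    "email": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    "phone": r"(?<!\d)0\d{1,4}-\d{1,4}-\d{4}(?!\d)",
    "date": r"(?<!\d)\d{4}-\d{2}-\d{2}(?!\d)",
    "zipcode": r"(?<![\d-])\d{3}-\d{4}(?![\d-])",
    "url": r"https?://[A-Za-z0-9./?=&_%+-]+",
}

def extract_all(text, patterns=PATTERNS):
    """Return {name: unique matches in order of first appearance}."""
    return {
        name: list(dict.fromkeys(re.findall(pattern, text)))
        for name, pattern in patterns.items()
    }

sample = "taro@example.com 〒160-0023 03-1234-5678 https://example.com taro@example.com"
result = extract_all(sample)
for name, values in result.items():
    print(f"{name}:", values)
```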

💡 Example application:

To read the text from a file and run the same processing:

with open("sample.txt", "r", encoding="utf-8") as f:
    text = f.read()
# → then run the same re.findall() calls as above
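For larger files, or when the line number of each hit is useful, the file can be scanned line by line instead of being read in one go. The sketch below writes a small temporary file so that it runs on its own; the helper name find_emails_by_line and the file contents are made up for the example.

```python
import re
import tempfile
from pathlib import Path

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def find_emails_by_line(path):
    """Return (line number, email) pairs for every address in the file."""
    hits = []
    with open(path, "r", encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            for email in EMAIL_RE.findall(line):
                hits.append((lineno, email))
    return hits

# Create a throwaway sample file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("first line\ncontact: taro@example.com\n", encoding="utf-8")
    hits = find_emails_by_line(sample)

print(hits)  # [(2, 'taro@example.com')]
```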
