Python | "Regular Expressions × Raw Strings" Practice Problems

2015.03.01 / 2025.11.03

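This is a set of practice problems dedicated to regular expressions combined with raw strings.
It is aimed at beginners and steps up from basic to intermediate to advanced problems, each with a sample answer and an explanation.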

【Basic Level】 (backslashes and raw string basics)

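Problem 1

Extract only the numbers from the sentence.

import re

# Sample text: "I am 30 years old this year, and my younger brother is 12."
text = "私は今年30歳で、弟は12歳です。"
pattern = r"..."  # complete this pattern
numbers = re.findall(pattern, text)
print(numbers)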

Sample answer

pattern = r"\d+"
numbers = re.findall(pattern, text)
print(numbers)  # ['30', '12']

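Explanation

\d → any digit
+ → one or more repetitions of the preceding element
With a raw string, \d can be written exactly as it appears. In a normal string, "\d" is an invalid escape sequence, and recent Python versions emit a warning for it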
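
To make the raw string point concrete, here is a minimal sketch that writes the same pattern once as a raw string and once as a normal string with a doubled backslash. The variable names raw_pattern and plain_pattern are illustrative only.

```python
import re

raw_pattern = r"\d+"      # raw string: the backslash is kept as written
plain_pattern = "\\d+"    # normal string: the backslash must be doubled

# Both spellings produce the same three characters: \, d, +
print(raw_pattern == plain_pattern)
print(len(raw_pattern))

text = "私は今年30歳で、弟は12歳です。"
print(re.findall(raw_pattern, text) == re.findall(plain_pattern, text))
```

Both forms reach the regular expression engine as identical strings, so the r prefix only changes how the pattern is typed, not what it matches.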

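Problem 2

Write a regular expression that extracts the Windows file path C:\Users\Python.

import re

# Sample text: "The file is saved in C:\Users\Python."
# This is a normal string, so each \\ below is a single backslash in the text.
text = "ファイルはC:\\Users\\Pythonに保存されています"
pattern = r"..."  # complete this pattern
path = re.findall(pattern, text)
print(path)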

Sample answer

pattern = r"C:\\Users\\Python"
path = re.findall(pattern, text)
print(path)  # ['C:\\Users\\Python'] (the list display doubles each backslash)

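Explanation

In a raw string, \\ stays as two backslash characters, and the regular expression engine reads that pair as one literal \
Without a raw string the same pattern would have to be written as "C:\\\\Users\\\\Python", so the raw form is easier to read and harder to get wrong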
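
The following sketch compares three ways of spelling the same path pattern: a raw string, a normal string, and re.escape applied to the literal path. The names raw_pattern, plain_pattern, and escaped_pattern are chosen for illustration.

```python
import re

text = "ファイルはC:\\Users\\Pythonに保存されています"

raw_pattern = r"C:\\Users\\Python"               # raw string: 2 backslashes per separator
plain_pattern = "C:\\\\Users\\\\Python"          # normal string: 4 backslashes per separator
escaped_pattern = re.escape(r"C:\Users\Python")  # let re.escape add the escapes

for p in (raw_pattern, plain_pattern, escaped_pattern):
    match = re.search(p, text)
    print(match.group())
```

All three patterns find the same substring. re.escape is a reasonable choice when the path arrives as data rather than being typed into the source code.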

【Intermediate Level】 (email addresses, word extraction, and more)

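Problem 3

Extract every email address from the sentence.

import re

# Sample text: "Please contact info@example.com or support@python.org."
text = "連絡は info@example.com または support@python.org までお願いします。"
pattern = r"..."  # complete this pattern
emails = re.findall(pattern, text)
print(emails)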

Sample answer

pattern = r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"
emails = re.findall(pattern, text)
print(emails)  # ['info@example.com', 'support@python.org']

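Explanation

[a-zA-Z0-9_.+-]+ → the user name part of the address
@ → a literal @
[a-zA-Z0-9-]+ → the first label of the domain name
\. → a literal dot (the raw string lets the backslash be written as is)
[a-zA-Z0-9-.]+ → the rest of the domain; the - here is literal because it follows a complete range (0-9), although [a-zA-Z0-9.-] is the less ambiguous spelling
This is a simplified pattern for practice, not a full validator of email addresses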
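
One consequence of allowing a dot in the final character class is that sentence punctuation can be swept into the match. The sketch below uses a hypothetical English sentence to show the effect, along with one possible cleanup step; rstrip is only one of several ways to handle it.

```python
import re

pattern = r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+"

# Hypothetical sentence where the address is followed by a period.
text = "Write to info@example.com."

raw_matches = re.findall(pattern, text)
print(raw_matches)  # the trailing period is part of the match

# One possible cleanup: drop dots left over at the end of each match.
cleaned = [m.rstrip(".") for m in raw_matches]
print(cleaned)
```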

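Problem 4

Extract only the word at the start of the sentence (Python in this example).

import re

# Sample text: "Python is a fun language."
text = "Pythonは楽しい言語です。"
pattern = r"..."  # complete this pattern
first_word = re.findall(pattern, text)
print(first_word)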

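Sample answer

pattern = r"^\w+"
# re.ASCII limits \w to [a-zA-Z0-9_]; without it, \w also matches the Japanese characters
first_word = re.findall(pattern, text, flags=re.ASCII)
print(first_word)  # ['Python']

Explanation

^ → the start of the string (with re.MULTILINE, the start of each line)
\w+ → one or more word characters
re.ASCII → restricts \w to ASCII letters, digits, and _. In Python 3, \w in a str pattern matches Unicode word characters by default, so without the flag the result would be ['Pythonは楽しい言語です']
With a raw string, \w can be written as is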
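
The ^ anchor is worth a short experiment of its own. This sketch uses a hypothetical three-line English text to show how ^ behaves with and without re.MULTILINE.

```python
import re

# Hypothetical multi-line text for illustration.
text = "Python is fun\nRegex is handy\nRaw strings help"

first_of_string = re.findall(r"^\w+", text)                      # ^ = start of the string only
first_of_lines = re.findall(r"^\w+", text, flags=re.MULTILINE)   # ^ = start of every line

print(first_of_string)
print(first_of_lines)
```

Without the flag only the very first word is returned; with re.MULTILINE the pattern yields the first word of each line.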

【Advanced Level】 (practical pattern extraction)

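Problem 5

Extract the Japanese phone numbers (for example, 090-1234-5678) from the sentence.

import re

# Sample text: "My phone number is 090-1234-5678, and my friend's is 070-9876-5432."
text = "私の電話は090-1234-5678、友達は070-9876-5432です。"
pattern = r"..."  # complete this pattern
phones = re.findall(pattern, text)
print(phones)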

Sample answer

pattern = r"\d{3}-\d{4}-\d{4}"
phones = re.findall(pattern, text)
print(phones)  # ['090-1234-5678', '070-9876-5432']

Explanation

\d{3} → exactly 3 digits
- → a literal hyphen
\d{4} → exactly 4 digits
With a raw string, \d can be written as is
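
Because \d{3} and \d{4} only count digits, the pattern can also match inside a longer run of digits. As a possible refinement (not part of the original answer), the sketch below adds lookarounds so that no digit may sit directly before or after the number. The sample text and the names loose and strict are hypothetical.

```python
import re

loose = r"\d{3}-\d{4}-\d{4}"
strict = r"(?<!\d)\d{3}-\d{4}-\d{4}(?!\d)"  # no digit directly before or after

# Hypothetical text: the first item is an ID that merely contains a phone-like run.
text = "ID 12090-1234-567899, phone 090-1234-5678"

print(re.findall(loose, text))
print(re.findall(strict, text))
```

The loose pattern reports two matches, one of them cut out of the ID, while the strict pattern reports only the standalone number.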

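Problem 6

Extract the URL from the sentence (for example, https://example.com).

import re

# Sample text: "The official site is https://example.com."
text = "公式サイトは https://example.com です"
pattern = r"..."  # complete this pattern
urls = re.findall(pattern, text)
print(urls)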

Sample answer

pattern = r"https?://[a-zA-Z0-9./-]+"
urls = re.findall(pattern, text)
print(urls)  # ['https://example.com']

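Explanation

https? → http or https (the ? makes the preceding s optional)
:// → literal characters
[a-zA-Z0-9./-]+ → characters allowed in the domain and path
This pattern contains no backslashes, so the r prefix is not strictly required here; keeping it means an escape such as \. can be added later without changing the string style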
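
The sketch below tries the same pattern on a few hypothetical URLs to show what the optional s and the character class do and do not cover.

```python
import re

pattern = r"https?://[a-zA-Z0-9./-]+"

# Hypothetical sample text with three schemes.
text = "See http://example.org/docs, https://example.com/a-b/index.html and ftp://example.net"
urls = re.findall(pattern, text)
print(urls)

# The character class has no ? or =, so a query string is cut off.
with_query = re.findall(pattern, "https://example.com/search?q=raw")
print(with_query)
```

The ftp URL is skipped because only http and https are spelled out, and the query string is dropped because its characters are outside the class. Extending the class is one way to capture more of the URL if that is needed.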

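Problem 7 (challenge)

Extract only the whitespace between the words in the sentence.

import re

# Sample text: "Python is a fun language", written with a space between the Japanese words
text = "Python は 楽しい 言語 です"
pattern = r"..."  # complete this pattern
spaces = re.findall(pattern, text)
print(spaces)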

Sample answer

pattern = r"\s+"
spaces = re.findall(pattern, text)
print(spaces)  # [' ', ' ', ' ', ' ']

Explanation

\s → a whitespace character (space, tab, newline, and so on)
+ → one or more repetitions
With a raw string, \s can be written as is
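
Japanese text often contains the full-width space (U+3000). In Python 3, \s in a str pattern also matches it, which this sketch checks on a hypothetical string; re.split with the same pattern is shown as a related use.

```python
import re

# Hypothetical text mixing a full-width space (U+3000), a tab, and a normal space.
text = "Python\u3000は\t楽しい 言語"

gaps = re.findall(r"\s+", text)   # the whitespace runs themselves
words = re.split(r"\s+", text)    # the pieces between the whitespace runs

print(gaps)
print(words)
```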

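Key points from the practice problems

A raw string lets you write \ exactly as it appears
Regular expressions use many backslash sequences such as \d and \w, so raw strings are strongly recommended for patterns
They are handy for extracting string patterns that come up often in real work: file paths, email addresses, phone numbers, and URLs
Without raw strings, patterns fill up with \\ and become hard to read, so take care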
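
Readability is not the only reason to prefer raw strings. Some sequences, such as \b, are valid escapes in normal Python strings, so omitting the r prefix silently changes the pattern. The sketch below illustrates this with a hypothetical text; the names plain and raw are illustrative.

```python
import re

text = "a cat sat"

plain = "\bcat\b"    # normal string: \b becomes the backspace character (U+0008)
raw = r"\bcat\b"     # raw string: \b reaches the regex engine as a word boundary

print(re.findall(plain, text))
print(re.findall(raw, text))
print(len(plain), len(raw))
```

The normal string produces no matches and no warning, which makes this kind of mistake easy to miss.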
