Python Excel Recipes by Task | Using column converter functions at read time

Introduction to column converter functions when reading Excel in Python: converters={'col': lambda x: x.strip()}

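When an Excel column mixes in stray whitespace, strings with units, or numbers with thousands separators, it is hard to aggregate or analyze as-is. The converters argument of pandas.read_excel applies a conversion function to each cell of a column at read time, so the values are already clean when they land in the DataFrame.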

Basic usage

import pandas as pd

# Read the Product column with leading and trailing whitespace stripped
df = pd.read_excel(
    "report.xlsx",
    converters={"Product": lambda x: str(x).strip()}
)

print(df.head())

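Effect: a value such as " apple " is read as "apple", with the surrounding whitespace removed.
Key points:
- converters is a dict that maps a column name to a function.
- The function is called once per cell, and the value it returns is what ends up in the DataFrame.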
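
One caveat with the lambda above: str(x) turns a missing value into the literal text "nan" (or "None"), which then sits in the column as a real string. Below is a minimal sketch of a blank-aware variant. The helper names is_blank and strip_text are made up for illustration, and the code assumes an empty cell may arrive as None, NaN, or an empty string depending on the pandas version and engine.

```python
import math


def is_blank(x):
    """Return True for None, a float NaN, or an empty/whitespace-only string."""
    if x is None:
        return True
    if isinstance(x, float) and math.isnan(x):
        return True
    return isinstance(x, str) and x.strip() == ""


def strip_text(x):
    """Strip surrounding whitespace; keep blank cells as None, not the text 'nan'."""
    return None if is_blank(x) else str(x).strip()


# The converter is called once per cell; emulate that on a few raw values.
raw_cells = ["  apple ", "banana", float("nan"), "", 42]
cleaned = [strip_text(v) for v in raw_cells]
print(cleaned)

# Hypothetical usage with the file from the example above:
# df = pd.read_excel("report.xlsx", converters={"Product": strip_text})
```

Because the helper is an ordinary function, it can be tried on sample values like this before it is wired into read_excel.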

Common conversion patterns and templates

1. Strip whitespace (strip)

df = pd.read_excel("data.xlsx", converters={"商品名": lambda x: str(x).strip()})

2. Convert numbers containing commas to numeric values

df = pd.read_excel(
    "sales.xlsx",
    converters={"金額": lambda x: float(str(x).replace(",", "")) if pd.notna(x) else None}
)

3. Remove the unit from a string and convert it to a number

df = pd.read_excel(
    "items.xlsx",
    converters={"重量": lambda x: float(str(x).replace("kg", "").strip()) if pd.notna(x) else None}
)

4. Keep a code column as strings (preserve leading zeros)

df = pd.read_excel(
    "codes.xlsx",
    converters={"商品コード": lambda x: str(x).zfill(6) if pd.notna(x) else None}
)
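
If the code cells are stored as numbers in the workbook, a value may arrive as a float such as 123.0, and str(x).zfill(6) would then produce "0123.0". The following is a sketch that normalizes whole-number floats before padding. The function name to_code and the 6-digit width are illustrative assumptions.

```python
import math


def to_code(x, width=6):
    """Format a cell as a zero-padded string code; blank cells become None."""
    if x is None or (isinstance(x, float) and math.isnan(x)):
        return None
    # A numeric cell can arrive as 123.0; drop the ".0" before padding.
    if isinstance(x, float) and x.is_integer():
        x = int(x)
    text = str(x).strip()
    return text.zfill(width) if text else None


print(to_code(123))       # int cell
print(to_code(123.0))     # float cell
print(to_code(" 00456"))  # text cell with a stray space

# Hypothetical usage:
# df = pd.read_excel("codes.xlsx", converters={"商品コード": to_code})
```

Padding can only restore zeros when the intended width is known, so the width parameter should match the real code format.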

5. Convert date strings safely

df = pd.read_excel(
    "orders.xlsx",
    converters={"注文日": lambda x: pd.to_datetime(x, errors="coerce")}
)
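
With errors="coerce", values that cannot be parsed become NaT instead of raising. To make the same idea visible without letting pandas guess the format, here is a rough standard-library sketch that tries an explicit list of formats. It is an alternative to pd.to_datetime, not what the example above does; the name parse_order_date and the format list are assumptions.

```python
from datetime import date, datetime

# Formats to try, in order. Adjust to what the workbook actually contains.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%Y年%m月%d日")


def parse_order_date(x):
    """Return a datetime for recognizable values, or None when parsing fails."""
    if isinstance(x, datetime):
        return x  # real Excel date cells usually arrive already parsed
    if isinstance(x, date):
        return datetime(x.year, x.month, x.day)
    text = str(x).strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None  # plays the role of NaT under errors="coerce"


print(parse_order_date("2025/01/15"))
print(parse_order_date("2025年3月1日"))
print(parse_order_date("not a date"))
```

An explicit format list is stricter than automatic parsing: anything outside the list becomes None, which makes unexpected date formats easier to spot.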

Combining with post-load cleanup

Applying converters to several columns:

df = pd.read_excel(
    "report.xlsx",
    converters={
        "商品名": lambda x: str(x).strip(),
        "数量": lambda x: int(str(x).replace(",", "")) if str(x).strip() else None,
        "金額": lambda x: float(str(x).replace(",", "")) if str(x).strip() else None
    }
)
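
The quantity lambda above calls int() on the cleaned text, which raises for a cell that arrives as the float 12.0 (its text is "12.0") or as NaN (its text is "nan"). One possible way to harden it is sketched below. The name to_int is hypothetical, and returning None for fractional values is a design choice made for this sketch, not something the original prescribes.

```python
import math
from decimal import Decimal, InvalidOperation


def to_int(x):
    """Convert '1,200', 1200, or 1200.0 to 1200; blank or invalid cells become None."""
    if x is None or (isinstance(x, float) and math.isnan(x)):
        return None
    text = str(x).replace(",", "").strip()
    if not text:
        return None
    try:
        number = Decimal(text)
    except InvalidOperation:
        return None
    if not number.is_finite():
        return None
    # Reject values with a fractional part instead of silently truncating.
    if number != number.to_integral_value():
        return None
    return int(number)


print(to_int("1,200"))
print(to_int(1200.0))
print(to_int("12.5"))

# Hypothetical usage:
# df = pd.read_excel("report.xlsx", converters={"数量": to_int})
```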

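Using it together with dtype:
dtype simply pins a column's type, while converters clean the value and convert it. Use them on different columns: when the same column appears in both, pandas applies only the converter and ignores dtype for that column.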
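
The pandas documentation for read_excel notes that converters, when specified, are applied instead of dtype conversion, so it can help to confirm that no column is listed in both. A small sketch follows; the helper find_overlap, the column choices, and the file name are illustrative.

```python
def find_overlap(dtype, converters):
    """Return the column names that appear in both mappings, sorted."""
    return sorted(set(dtype) & set(converters))


def clean_amount(x):
    """Drop thousands separators and convert to float (no blank handling here)."""
    return float(str(x).replace(",", ""))


dtype = {"商品コード": str}          # plain type pinning
converters = {"金額": clean_amount}  # cleanup plus conversion

overlap = find_overlap(dtype, converters)
if overlap:
    raise ValueError(f"columns in both dtype and converters: {overlap}")

# With no overlap, both can be passed in one call (file name is hypothetical):
# df = pd.read_excel("report.xlsx", dtype=dtype, converters=converters)
```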

Common pitfalls and workarounds

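Passing a missing cell (NaN) straight into the conversion can raise an error
→ Check with if pd.notna(x) before processing.
Convert to a string before using string methods
→ Calling str(x) first lets replace and strip work on numeric cells too. Note that str() turns NaN into the text "nan", so do the missing-value check first.
For complex conversions, define a named function; it is easier to read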

import pandas as pd

def clean_amount(x):
    if pd.isna(x):
        return None
    # Drop thousands separators and the yen suffix, then convert to float
    return float(str(x).replace(",", "").replace("円", "").strip())

df = pd.read_excel("sales.xlsx", converters={"金額": clean_amount})
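
A named converter still raises when a cell holds something unexpected, such as "TBD", and a single bad cell can abort the whole read. One possible approach is a small wrapper that returns None when the conversion fails, similar in spirit to errors="coerce". The sketch below uses a hypothetical decorator name, coerce_errors.

```python
import functools


def coerce_errors(converter):
    """Wrap a converter so that ValueError/TypeError yields None instead of raising."""
    @functools.wraps(converter)
    def wrapper(x):
        try:
            return converter(x)
        except (ValueError, TypeError):
            return None
    return wrapper


@coerce_errors
def clean_amount(x):
    """Remove commas and the yen suffix, then convert to float."""
    return float(str(x).replace(",", "").replace("円", "").strip())


print(clean_amount("1,200円"))
print(clean_amount("TBD"))

# Hypothetical usage:
# df = pd.read_excel("sales.xlsx", converters={"金額": clean_amount})
```

Swallowing errors hides bad data, so it is worth counting the None values after loading to see how many cells failed to convert.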

Practical templates

Template 1: Convert the amount column to numbers and compute monthly totals

import pandas as pd

df = pd.read_excel(
    "monthly.xlsx",
    parse_dates=["日付"],
    converters={"金額": lambda x: float(str(x).replace(",", "")) if pd.notna(x) else None}
)

monthly = (
    df.assign(month=df["日付"].dt.to_period("M"))
      .groupby("month", as_index=False)["金額"].sum()
)
print(monthly)

Template 2: Zero-pad product codes and strip whitespace from product names

import pandas as pd

df = pd.read_excel(
    "products.xlsx",
    converters={
        "商品コード": lambda x: str(x).zfill(6) if pd.notna(x) else None,
        "商品名": lambda x: str(x).strip()
    }
)
print(df.head())

Template 3: Remove "kg" from the weight column and convert it to numbers

import pandas as pd

df = pd.read_excel(
    "weights.xlsx",
    converters={"重量": lambda x: float(str(x).replace("kg","").strip()) if pd.notna(x) else None}
)
print(df.describe())

Mini exercises (for practice)

Exercise 1: Strip whitespace from product names and show the first 10 rows

import pandas as pd
df = pd.read_excel("items.xlsx", converters={"商品名": lambda x: str(x).strip()})
print(df.head(10))

Exercise 2: Remove commas from the amount column and compute the total

import pandas as pd
df = pd.read_excel("sales.xlsx", converters={"金額": lambda x: float(str(x).replace(",", "")) if pd.notna(x) else None})
print("Total:", df["金額"].sum())

Exercise 3: Convert the order date to a datetime type and filter by period

import pandas as pd
df = pd.read_excel("orders.xlsx", converters={"注文日": lambda x: pd.to_datetime(x, errors="coerce")})
mask = (df["注文日"] >= "2025-01-01") & (df["注文日"] < "2025-07-01")
print(df.loc[mask].head())

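Summary

To clean values at read time, use converters.
It is handy for stripping whitespace, removing commas and units, zero-padding, and converting dates.
Do not forget the missing-value check; pd.notna(x) keeps the converter safe.
For complex processing, define a function and pass it in; it reads better.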
