Python Excel Recipes by Task | Renaming columns while reading a sheet

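Renaming columns at read time in Python: an introduction to names=[…]
Basic usage
Overwriting existing headers (header=0 together with names)
Templates for common patterns
Advanced: combining with MultiIndex (multiple header rows)
Post-read cleanup techniques (processing that pairs well)
Common pitfalls and how to avoid them
Practical templates
Mini exercises (for practice)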

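Renaming columns at read time in Python: an introduction to names=[…]

"The Excel column names are awkward to use", "there are no column names", "I want the same names every time": passing names=[...] at read time sets all DataFrame column names in one go. This beginner-oriented guide collects working code, templates, and fixes for common pitfalls.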

Basic usage

import pandas as pd

# When the sheet has no header row (or you do not want to use it), set the column names at read time
df = pd.read_excel(
    "report.xlsx",
    header=None,                       # do not treat any row as a header
    names=["date", "item", "qty", "price"]  # the column names are decided here
)

print(df.head())
print(list(df.columns))  # the four names passed above

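Key points:
- header=None: no row is treated as a header, and names is used as the column names exactly as given.
- Matching column count: keep the number of names equal to the number of columns being read. Depending on the pandas version, a mismatch may raise an error or may silently misalign the data (for example, surplus leading columns can end up in the index).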
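
Because a count mismatch is not always reported clearly, one option is to read the sheet without names first and assign them only after checking the width. The following is a minimal sketch under that assumption; apply_names is a hypothetical helper (not part of pandas), and the in-memory frame stands in for the result of pd.read_excel("report.xlsx", header=None).

```python
import pandas as pd


def apply_names(raw: pd.DataFrame, names: list[str]) -> pd.DataFrame:
    """Assign names to a frame read with header=None, failing loudly on a count mismatch."""
    if raw.shape[1] != len(names):
        raise ValueError(
            f"expected {len(names)} columns, but the sheet has {raw.shape[1]}"
        )
    out = raw.copy()
    out.columns = names
    return out


# Stand-in for: raw = pd.read_excel("report.xlsx", header=None)
raw = pd.DataFrame(
    [["2024-01-05", "apple", 3, 120], ["2024-01-06", "pear", 1, 200]]
)

df = apply_names(raw, ["date", "item", "qty", "price"])
print(list(df.columns))  # ['date', 'item', 'qty', 'price']
```

The trade-off is an extra step after reading, in exchange for an error message that states both counts.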

Overwriting existing headers (header=0 together with names)

import pandas as pd

df = pd.read_excel(
    "messy.xlsx",
    header=0,  # treat the first row as the header row...
    names=["date", "product", "quantity", "amount"]  # ...and replace its labels with these names
)

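Effect: even if the original headers are "日付", "商品名", and so on, the columns carry English or analysis-friendly names from the moment the sheet is loaded.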

Templates for common patterns

Skip the explanatory rows and assign your own column names

import pandas as pd

df = pd.read_excel(
    "monthly.xlsx",
    skiprows=5,                         # skip the first 5 rows; the data starts on row 6
    header=None,
    names=["date", "product", "qty", "amount"]
)

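Aim: the standard fix for reports that have a logo or notes above the table.

Read only the columns you need and name them at the same time (selection by column position)

import pandas as pd

df = pd.read_excel(
    "no_header.xlsx",
    header=None,
    usecols=[0, 2, 4],                  # only the columns at positions 0, 2, and 4
    names=["date", "item", "amount"]    # one name per selected column (3 in total)
)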

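Aim: when the header text of a report is unstable, selecting by position keeps the load stable.

Overwrite existing headers with stable names (guarding against header variants)

import pandas as pd

df = pd.read_excel("japanese.xlsx", header=0, names=["date", "product", "qty", "amount_jpy"])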

Aim: eliminate header variants such as "金額" and "金額(円)" at read time.
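
Overwriting by position ignores what the header actually says, so a reordered sheet would be relabeled without any warning. A name-based mapping is one possible alternative when the variants are known in advance. The sketch below assumes a hypothetical alias table (ALIASES) and a hypothetical helper (normalize_columns); the in-memory frame stands in for pd.read_excel("japanese.xlsx", header=0).

```python
import pandas as pd

# Hypothetical alias table: every known header spelling maps to one stable name
ALIASES = {
    "日付": "date",
    "商品名": "product",
    "数量": "qty",
    "金額": "amount_jpy",
    "金額(円)": "amount_jpy",
}


def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Rename headers via ALIASES and refuse to continue on an unknown header."""
    cleaned = [str(c).strip() for c in df.columns]
    unknown = [c for c in cleaned if c not in ALIASES]
    if unknown:
        raise KeyError(f"unrecognized headers: {unknown}")
    out = df.copy()
    out.columns = [ALIASES[c] for c in cleaned]
    return out


# Stand-in for: raw = pd.read_excel("japanese.xlsx", header=0)
raw = pd.DataFrame(
    {"日付": ["2024-01-05"], "商品名": ["りんご"], "数量": [3], "金額(円)": [360]}
)

df = normalize_columns(raw)
print(list(df.columns))  # ['date', 'product', 'qty', 'amount_jpy']
```

Compared with names=[...], this needs the table maintained by hand, but a new or misspelled header stops the load instead of passing through under the wrong label.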

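Advanced: combining with MultiIndex (multiple header rows)

Flattening without using the multi-level header:
- Note: the information in the upper header level is discarded, so add it back afterwards as a separate column if you need it.

import pandas as pd
# Even for a sheet that would normally be read with header=[0,1], skip both header rows and apply your own names
df = pd.read_excel("multi.xlsx", header=None, skiprows=2, names=["date","A","B","C"])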

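Flattening after reading the multi-level header (for reference):
To keep the multi-level information, read with header=[0,1] and flatten afterwards with "_".join(...), which is the safer route.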
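
The "_".join(...) step can be sketched as follows. The frame is built in memory as a stand-in for pd.read_excel("multi.xlsx", header=[0, 1]), and the level labels ("sales", "cost", and so on) are made up for illustration.

```python
import pandas as pd

# Stand-in for: df = pd.read_excel("multi.xlsx", header=[0, 1])
columns = pd.MultiIndex.from_tuples(
    [("sales", "qty"), ("sales", "amount"), ("cost", "amount")]
)
df = pd.DataFrame([[3, 360, 200], [1, 200, 150]], columns=columns)

# Each column label is a tuple such as ("sales", "qty");
# join its non-empty parts with "_" to get a single flat name
df.columns = [
    "_".join(str(part).strip() for part in col if str(part).strip())
    for col in df.columns
]

print(list(df.columns))  # ['sales_qty', 'sales_amount', 'cost_amount']
```

Real sheets with merged header cells may produce less tidy labels, so it is worth printing df.columns before and after flattening.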

Post-read cleanup techniques (processing that pairs well)

Stripping whitespace from column names (normalization):

df = df.rename(columns=lambda c: str(c).strip())

The standard set of type conversions:

df["date"] = pd.to_datetime(df["date"], errors="coerce")
df["qty"] = pd.to_numeric(df["qty"], errors="coerce")
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
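
To make the effect of errors="coerce" concrete, here is a small sketch on made-up data: values that cannot be parsed become NaT or NaN instead of raising, which makes the affected rows easy to find afterwards.

```python
import pandas as pd

# Made-up sample with one unparseable value in each column
df = pd.DataFrame(
    {
        "date": ["2024-01-05", "not a date", "2024-02-10"],
        "qty": ["3", "n/a", "5"],
    }
)

df["date"] = pd.to_datetime(df["date"], errors="coerce")  # bad values become NaT
df["qty"] = pd.to_numeric(df["qty"], errors="coerce")     # bad values become NaN

# Rows where at least one conversion failed
bad_rows = df[df["date"].isna() | df["qty"].isna()]
print(len(bad_rows))  # 1
```

Counting or inspecting bad_rows right after the conversion is a cheap way to notice when a report contains more unparseable cells than expected.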

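Keeping code columns as strings (leading zeros): the line below converts the column after loading, so it fixes the type but cannot bring back zeros that were already dropped when a value was parsed as a number. To keep the zeros of text cells, pass dtype={"product": "string"} to read_excel instead.

df["product"] = df["product"].astype("string")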
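
If the zeros are already gone and the codes are known to have a fixed width, they can be padded back. This is a sketch under that assumption only; CODE_WIDTH and the sample values are hypothetical.

```python
import pandas as pd

CODE_WIDTH = 5  # assumption: every product code is exactly 5 characters wide

# Codes as they might look after being parsed as numbers
df = pd.DataFrame({"product": [123, 45678, 7]})

# Convert to string first, then left-pad with zeros up to the assumed width
df["product"] = df["product"].astype("string").str.zfill(CODE_WIDTH)

print(df["product"].tolist())  # ['00123', '45678', '00007']
```

Padding is only safe when the width really is fixed; for variable-length codes, reading the column as a string in the first place is the more reliable choice.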

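Common pitfalls and how to avoid them

The number of names does not match the number of columns:
- Fix: narrow the columns with usecols first so that the count matches names.
Forgetting header=None: read_excel defaults to header=0, so the first row is consumed as a header and replaced by names, even on a sheet that has no header row.
- Fix: to ignore the first row as a header, always pass header=None. To overwrite an existing header row, use header=0 together with names.
Dates read with dtype=str are awkward to work with:
- Fix: read dates with parse_dates=["date"]; when the values are mixed, post-process with pd.to_datetime(..., errors="coerce").
Columns shift in reports with multi-level headers:
- Fix: use skiprows to line up with the row where the header starts. Merged cells easily cause shifts, so read a small sample first and check df.columns.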
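
Finding the right skiprows value can also be done on a small raw sample instead of by counting rows in Excel. The sketch below uses a hypothetical helper, find_header_row, and an in-memory frame standing in for pd.read_excel("report.xlsx", header=None, nrows=20); the labels are sample values.

```python
import pandas as pd


def find_header_row(raw: pd.DataFrame, expected: list[str]):
    """Return the position of the first row containing every expected label, or None."""
    wanted = set(expected)
    for pos, (_, row) in enumerate(raw.iterrows()):
        labels = {str(v).strip() for v in row if pd.notna(v)}
        if wanted <= labels:
            return pos
    return None


# Stand-in for: raw = pd.read_excel("report.xlsx", header=None, nrows=20)
raw = pd.DataFrame(
    [
        ["Monthly report", None, None],
        [None, None, None],
        ["日付", "商品名", "金額"],
        ["2024-01-05", "りんご", 360],
    ]
)

header_row = find_header_row(raw, ["日付", "金額"])
print(header_row)  # 2
```

With the position known, a second read can pass skiprows=header_row + 1 together with header=None and names=[...], so the header row itself is skipped and the data starts on the next row.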

Practical templates

Skip the explanatory rows, read with English column names, and total by month

import pandas as pd

df = pd.read_excel(
    "sales.xlsx",
    skiprows=4,
    header=None,
    names=["date", "product", "qty", "amount"],
    parse_dates=["date"]
)
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")

monthly = (
    df.assign(m=df["date"].dt.to_period("M"))
      .groupby("m", as_index=False)["amount"].sum()
)
print(monthly)

Select by column position and assign the final names at load time

import pandas as pd

df = pd.read_excel(
    "no_header.xlsx",
    header=None,
    usecols=[0, 3, 5],
    names=["order_id", "order_date", "amount"],
    parse_dates=["order_date"]
)
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
print(df.head())

Overwrite the existing headers, then fix the types

import pandas as pd

df = pd.read_excel(
    "report.xlsx",
    header=0,
    names=["date", "customer", "region", "revenue"]
)
df["date"] = pd.to_datetime(df["date"], errors="coerce")
df["revenue"] = pd.to_numeric(df["revenue"], errors="coerce")
df.info()  # info() prints its report itself and returns None, so no print() is needed

Mini exercises (for practice)

Exercise 1: Add column names to a sheet without a header and show the first 10 rows

import pandas as pd
df = pd.read_excel("sales.xlsx", header=None, names=["date","item","qty","amount"])
print(df.head(10))

Exercise 2: Skip the explanatory rows, select by column position, and assign names

import pandas as pd
df = pd.read_excel("report.xlsx", skiprows=3, header=None, usecols=[0,2,4], names=["date","qty","amount"])
print(df.head())

Exercise 3: Unify the existing headers to English names and aggregate

import pandas as pd
df = pd.read_excel("japanese.xlsx", header=0, names=["date","product","qty","amount"])
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
print("Total:", df["amount"].sum())
