Python Excel Recipes | Ignoring Comment Rows (# etc.) When Reading

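An Introduction to Reading Excel in Python While Ignoring Comment Rows: comment='#'

Excel and CSV files sometimes carry comment rows that start with "#" or "//" ahead of the data. Read as-is, those rows land in the DataFrame as strings and get in the way of aggregation and type conversion. pandas.read_excel takes a comment argument: everything from the given string to the end of the row is ignored, so a row that starts with it is skipped entirely.

Basic usage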

import pandas as pd

# Treat '#' as the comment marker; rows starting with it are skipped
df = pd.read_excel("data.xlsx", comment="#")

print(df.head())

Effect: rows that start with # are not loaded, so only the data ends up in the DataFrame.
Check: use df.head() or df.info() to confirm that the stray string rows are gone.
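
To try this without a real file, the sketch below builds a small workbook in memory and reads it back. The sample rows, the BytesIO buffer and the final dropna step are assumptions added for illustration, and an Excel engine such as openpyxl is assumed to be installed.

```python
import io

import pandas as pd

# Hypothetical sheet contents: two comment rows, then the header and data.
rows = [
    ["# exported 2024-01-05", None],
    ["# values in JPY", None],
    ["Item", "Amount"],
    ["apple", 120],
    ["pear", 80],
]

# Write the rows to an in-memory .xlsx file without adding a header or index.
buf = io.BytesIO()
pd.DataFrame(rows).to_excel(buf, index=False, header=False)
buf.seek(0)

# Rows starting with '#' are ignored; the first remaining row becomes the header.
df = pd.read_excel(buf, comment="#")

# Defensive step (an addition, not part of the recipe above):
# drop any fully empty rows that might be left behind.
df = df.dropna(how="all").reset_index(drop=True)

print(df)
```

Keeping the comment rows at the top of the sample makes the check easy: the column names should come out as Item and Amount rather than the text of the first comment.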

Templates for common patterns

1. Skip comment rows and read an ordinary table

df = pd.read_excel("report.xlsx", comment="#", header=0)

Aim: cleanly load a report that opens with explanatory rows such as "# This file is ...".

2. Skip comment rows and description rows together

df = pd.read_excel("report.xlsx", comment="#", skiprows=3, header=0)

Aim: skip the comment rows and the title rows in one go and start reading at the table itself.
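
How skiprows and comment combine can be checked with a small in-memory workbook. In this sketch skiprows is used to drop the title row and comment to discard the "#" row that follows it. The sample rows and the buffer are made up for illustration, and an Excel engine such as openpyxl is assumed to be installed.

```python
import io

import pandas as pd

# Hypothetical layout: a title row, a comment row, then the table.
rows = [
    ["Monthly report", None],
    ["# draft, do not distribute", None],
    ["Item", "Amount"],
    ["apple", 120],
    ["pear", 80],
]

buf = io.BytesIO()
pd.DataFrame(rows).to_excel(buf, index=False, header=False)
buf.seek(0)

# skiprows=1 drops the title row; comment="#" drops the row after it,
# so the "Item / Amount" row is picked up as the header.
df = pd.read_excel(buf, comment="#", skiprows=1, header=0)

print(df.columns.tolist())
print(len(df))
```

When adapting this to a real report, it is worth counting the leading rows in the sheet itself and confirming the resulting column names, since a wrong skiprows value silently shifts the header.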

3. Use a different comment marker

df = pd.read_excel("data.xlsx", comment="//")

Aim: handle files that use their own comment marker, such as "//" or ";".

Post-load cleanup tips

Stripping whitespace from column names:

df = df.rename(columns=lambda c: str(c).strip())

Numeric conversion:

df["Amount"] = pd.to_numeric(df["Amount"], errors="coerce")

Date conversion:

df["Date"] = pd.to_datetime(df["Date"], errors="coerce")
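
The three cleanup steps can be tried together on an in-memory frame. The column names and values below are hypothetical and stand in for what a freshly loaded sheet might look like; errors="coerce" turns anything unparseable into NaN or NaT instead of raising.

```python
import pandas as pd

# Hypothetical frame as it might look right after read_excel:
# padded column names, numbers stored as text, one bad value per column.
df = pd.DataFrame({
    " Date ": ["2024-01-05", "2024-01-06", "n/a"],
    "Amount ": ["1200", "abc", "300"],
})

# 1) Strip surrounding whitespace from the column names.
df = df.rename(columns=lambda c: str(c).strip())

# 2) Convert to numbers; "abc" becomes NaN.
df["Amount"] = pd.to_numeric(df["Amount"], errors="coerce")

# 3) Convert to datetimes; "n/a" becomes NaT.
df["Date"] = pd.to_datetime(df["Date"], errors="coerce")

print(df.dtypes)
print(df)
```

Counting the missing values afterwards, for example with df.isna().sum(), is a quick way to see how many cells failed to convert.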

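Common pitfalls and how to avoid them

The comment marker is not limited to the start of a row
→ pandas ignores everything from the marker to the end of the row. A whole row disappears only when its first cell starts with "#"; a "#" inside a later text cell (for example "Item #3") truncates that cell and drops the cells to its right.
Excel's own cell comments (notes) are out of scope
→ The comment argument acts only on marker strings found in cell values. Notes attached to cells in Excel are a different thing, and pandas does not read them.
Several different comment markers cannot be given at once
→ To ignore both "#" and "//", for example, read the sheet without the comment argument first and filter afterwards, e.g. df = df[~df.iloc[:, 0].astype(str).str.startswith(("#", "//"))]. The astype(str) keeps the test from failing on empty or numeric cells.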
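
The post-filter from the last item can be sketched on an in-memory frame. The rows below stand in for what pd.read_excel(..., header=None) might return; the data and variable names are hypothetical, and passing a tuple to str.startswith assumes a reasonably recent pandas.

```python
import pandas as pd

# Stand-in for a sheet read with header=None (hypothetical contents).
raw = pd.DataFrame([
    ["# exported by tool", None],
    ["// internal use only", None],
    ["Item", "Amount"],
    ["apple", 120],
    ["// checked by hand", None],
    ["pear", 80],
])

# Flag rows whose first cell starts with either marker.
# astype(str) keeps the test safe for empty or numeric cells.
is_comment = raw.iloc[:, 0].astype(str).str.startswith(("#", "//"))
body = raw[~is_comment].reset_index(drop=True)

# Promote the first remaining row to the header.
df = body.iloc[1:].reset_index(drop=True)
df.columns = body.iloc[0].tolist()
df["Amount"] = pd.to_numeric(df["Amount"])

print(df)
```

Reading with header=None first matters here: if a comment row sits above the header, a plain read would otherwise take the comment text as the column names.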

Practical templates

Monthly aggregation with comment rows skipped

import pandas as pd

df = pd.read_excel(
    "monthly.xlsx",
    comment="#",
    parse_dates=["Date"],
    usecols=["Date", "Revenue"]
)

monthly = (
    df.assign(month=df["Date"].dt.to_period("M"))
      .groupby("month", as_index=False)["Revenue"].sum()
)
print(monthly)
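
To see what the aggregation step produces without a workbook at hand, the same groupby can be run on an in-memory frame. The sample dates and revenue figures below are made up for illustration.

```python
import pandas as pd

# Hypothetical data in the shape the template expects after loading.
df = pd.DataFrame({
    "Date": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-03"]),
    "Revenue": [100, 250, 80],
})

# Bucket each date into its calendar month, then sum revenue per month.
monthly = (
    df.assign(month=df["Date"].dt.to_period("M"))
      .groupby("month", as_index=False)["Revenue"].sum()
)

print(monthly)
```

The month column holds pandas Period values; monthly["month"].astype(str) gives plain "YYYY-MM" labels if the result is going back into a report.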

Skip comment and description rows and assign your own column names

import pandas as pd

df = pd.read_excel(
    "survey.xlsx",
    comment="#",
    skiprows=2,
    header=None,
    names=["id", "age", "gender", "score"]
)
print(df.head())

Mini exercises (for practice)

Exercise 1: Ignore comment rows and show the first 10 rows

import pandas as pd
df = pd.read_excel("sales.xlsx", comment="#")
print(df.head(10))

Exercise 2: Skip comment and description rows and assign column names

import pandas as pd
df = pd.read_excel("report.xlsx", comment="#", skiprows=3, header=None, names=["date","item","qty","price"])
df.info()  # info() prints the summary itself; wrapping it in print() would also print None

Exercise 3: Read while ignoring comment rows that start with "//"

import pandas as pd
df = pd.read_excel("data.xlsx", comment="//")
print(df.head())

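Summary

Ignore comment rows → comment='#'
Skip description rows as well → combine with skiprows
No column names in the sheet → header=None + names=[...]
Excel cell comments (notes) are out of scope (only marker strings in cell values count)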
