Python Business Automation | File and Folder Automation: Basic Operations - File Splitting

2026.03.16

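File splitting is an essential automation technique for breaking oversized files into manageable pieces

In day-to-day work you often have to handle huge log files, CSVs, and text data.
When a file is too large, however, problems like these arise:

It cannot be attached to an email
Applications fail to load it
Processing takes a long time
It eats up server storage

This is where file splitting comes in.
With Python you can split flexibly: by line count, by size, or by any condition you define.

This article walks through everything from the basics to practical templates, at a pace beginners can follow.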

Basics of splitting a file by line count (the most common case for CSVs and logs)

Template for splitting by line count

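def split_by_lines(src, lines_per_file):
    """Split a text file into parts of at most lines_per_file lines each."""
    file_index = 0
    count = 0
    dst = None

    with open(src, "r", encoding="utf-8") as f:
        try:
            for line in f:
                # Open the next part only when there is a line to write,
                # so no empty trailing file is left behind.
                if dst is None:
                    file_index += 1
                    dst = open(f"{src}_{file_index}.txt", "w", encoding="utf-8")
                    count = 0
                dst.write(line)
                count += 1
                if count >= lines_per_file:
                    dst.close()
                    dst = None
        finally:
            # Close the last part even if an error interrupts the loop.
            if dst is not None:
                dst.close()

    print("Split complete")

split_by_lines("bigfile.txt", 1000)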

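Key points

Passing 1000 as lines_per_file splits the file every 1,000 lines; the last part holds whatever lines remain
Output files are numbered sequentially: bigfile.txt_1.txt, bigfile.txt_2.txt, and so on
Line-based splitting is the most practical approach for CSV and log files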
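
The same line-count split can also be written with itertools.islice, which pulls one batch of lines at a time. The following is a minimal sketch rather than a definitive implementation; the function name split_by_lines_islice and its return value are assumptions made for illustration, and the output naming simply mirrors the template above.

```python
from itertools import islice
from pathlib import Path


def split_by_lines_islice(src, lines_per_file, encoding="utf-8"):
    """Split src into parts of at most lines_per_file lines.

    Hypothetical helper: returns the list of part paths that were written.
    """
    if lines_per_file < 1:
        raise ValueError("lines_per_file must be at least 1")

    parts = []
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding=encoding, newline="") as f:
        while True:
            # islice takes up to lines_per_file lines and leaves the rest unread.
            batch = list(islice(f, lines_per_file))
            if not batch:
                break  # nothing left, so no empty trailing file is created
            part = Path(f"{src}_{len(parts) + 1}.txt")
            with open(part, "w", encoding=encoding, newline="") as dst:
                dst.writelines(batch)
            parts.append(part)
    return parts
```

Each batch is held in memory as a list, so this variant suits moderate values of lines_per_file; for very large batches, the line-by-line loop in the template uses less memory.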

Splitting a file by size (for email attachments and transfers)

Template for splitting at a specified size (MB)

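def split_by_size(src, max_mb):
    """Split a file into parts of at most max_mb megabytes each."""
    max_bytes = max_mb * 1024 * 1024
    file_index = 1
    written = 0

    dst = open(f"{src}_{file_index}", "wb")
    try:
        with open(src, "rb") as f:
            # Read in 1 KB chunks so memory use stays small.
            for chunk in iter(lambda: f.read(1024), b""):
                # Start a new part if this chunk would exceed the limit.
                if written > 0 and written + len(chunk) > max_bytes:
                    dst.close()
                    file_index += 1
                    dst = open(f"{src}_{file_index}", "wb")
                    written = 0

                dst.write(chunk)
                written += len(chunk)
    finally:
        # Close the current part even if reading or writing fails.
        dst.close()
    print("Split complete")

split_by_size("large.bin", 5)  # split every 5 MB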

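Key points

Because the file is handled in binary mode (rb / wb), images, videos, and ZIP archives can be split as well
Reading and writing one chunk at a time keeps memory usage low
The part size can be matched to an email attachment limit (for example, 25 MB)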
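
Before splitting, it can help to know how many parts a given size limit will produce. The sketch below estimates that with ceiling division; count_parts is a hypothetical helper name, not something from the original article.

```python
import math
import os


def count_parts(src, max_mb):
    """Return how many parts splitting src at max_mb megabytes would produce."""
    max_bytes = max_mb * 1024 * 1024
    if max_bytes <= 0:
        raise ValueError("max_mb must be positive")
    size = os.path.getsize(src)
    # Ceiling division: a final partial chunk still needs its own file.
    return math.ceil(size / max_bytes)
```

For a non-empty file and a whole-megabyte limit, this should match the number of files the split_by_size template writes, since that template fills each part up to the limit before starting the next one.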

Splitting a CSV with its header (critical in practice)

Template that adds the CSV header to every file

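def split_csv_with_header(src, lines_per_file):
    """Split a CSV into parts of lines_per_file data lines, each with the header."""
    # newline="" preserves the original line endings in every part.
    with open(src, "r", encoding="utf-8", newline="") as f:
        header = f.readline()

        file_index = 0
        count = 0
        dst = None
        try:
            for line in f:
                # Open the next part lazily so no header-only file is left behind.
                if dst is None:
                    file_index += 1
                    dst = open(f"{src}_{file_index}.csv", "w", encoding="utf-8", newline="")
                    dst.write(header)
                    count = 0
                dst.write(line)
                count += 1
                if count >= lines_per_file:
                    dst.close()
                    dst = None
        finally:
            # Close the last part even if an error interrupts the loop.
            if dst is not None:
                dst.close()

split_csv_with_header("data.csv", 5000)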

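Key points

Many tools cannot interpret a CSV correctly when the header row is missing
In practice, every split file should carry the header
Combined with line-based splitting, this keeps large-volume data processing stable
The template treats each physical line as one record, so it assumes no field contains a line break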
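
The template above counts physical lines, which can cut a record in half when a quoted field contains a line break. One way to handle that case is to let the standard csv module parse the records. This is a sketch under that assumption, not the article's method; split_csv_rows and rows_per_file are hypothetical names.

```python
import csv


def split_csv_rows(src, rows_per_file, encoding="utf-8"):
    """Split a CSV by records (not physical lines), repeating the header.

    Returns the list of output paths. Assumes the first record is the header.
    """
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")

    outputs = []
    dst = None
    writer = None
    count = 0
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with open(src, "r", encoding=encoding, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return outputs  # empty input: nothing to split
        try:
            for row in reader:
                if dst is None:
                    path = f"{src}_{len(outputs) + 1}.csv"
                    dst = open(path, "w", encoding=encoding, newline="")
                    writer = csv.writer(dst)
                    writer.writerow(header)
                    outputs.append(path)
                    count = 0
                writer.writerow(row)
                count += 1
                if count >= rows_per_file:
                    dst.close()
                    dst = None
        finally:
            if dst is not None:
                dst.close()
    return outputs
```

Because csv.writer re-serializes every row, quoting and line endings in the parts may differ from the source file even though the parsed values are the same. If byte-for-byte fidelity matters, the plain line-based template is the safer choice for files without embedded line breaks.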

Splitting by condition (example: one file per date)

Example: splitting a log file by date

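from contextlib import ExitStack

def split_by_date(src):
    """Write each log line to a file named after the date at the start of the line."""
    files = {}

    # ExitStack closes every opened output file, even if an error occurs.
    with ExitStack() as stack, open(src, "r", encoding="utf-8") as f:
        for line in f:
            date = line[:10]  # e.g. "2024-03-15"
            if date not in files:
                files[date] = stack.enter_context(
                    open(f"{src}_{date}.log", "w", encoding="utf-8")
                )
            files[date].write(line)

split_by_date("access.log")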

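Key points

Assumes every log line starts with a date in YYYY-MM-DD form (the first 10 characters)
A file is created automatically for each date
Very useful for log analysis, auditing, and maintenance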

Splitting binary files (images, videos, ZIP) safely

Basics of binary splitting

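def split_binary(src, chunk_size=1024 * 1024):
    """Write each chunk_size-byte block of src to its own numbered file."""
    file_index = 1

    with open(src, "rb") as f:
        while True:
            # Each read returns at most chunk_size bytes; b"" signals end of file.
            chunk = f.read(chunk_size)
            if not chunk:
                break

            with open(f"{src}_{file_index}", "wb") as dst:
                dst.write(chunk)

            file_index += 1

split_binary("movie.mp4", 10 * 1024 * 1024)  # one part per 10 MB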

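Key points

Binary data has no concept of a "line", so it is split by size
Changing chunk_size is all it takes to adjust the part size
Useful for transferring or backing up large files; the parts must be concatenated in order to restore the original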
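
Split parts of a binary file are only useful once they are put back together, so a restore step usually accompanies the split. The following sketch rejoins parts named with the same "<src>_<index>" pattern used above and compares SHA-256 digests to confirm the result. join_parts and sha256_of are hypothetical helper names.

```python
import hashlib
import os


def join_parts(src, dst):
    """Concatenate src_1, src_2, ... into dst and return the number of parts joined."""
    index = 1
    with open(dst, "wb") as out:
        while os.path.exists(f"{src}_{index}"):
            with open(f"{src}_{index}", "rb") as part:
                # Copy in 1 MB blocks so a large part is never fully in memory.
                for block in iter(lambda: part.read(1024 * 1024), b""):
                    out.write(block)
            index += 1
    return index - 1


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()
```

After a transfer, comparing sha256_of on the restored file with the digest of the original is a simple way to check that no part was lost or reordered. Note that join_parts stops at the first missing index, so a gap in the numbering ends the join early.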

Readable file splitting with pathlib

Writing it intuitively with Path objects

from pathlib import Path

def split_text(path: Path, lines_per_file: int):
    """Split a text file into parts named <stem>_<index><suffix>."""
    file_index = 0
    count = 0
    dst = None

    with path.open("r", encoding="utf-8") as f:
        try:
            for line in f:
                # Open the next part only when there is a line to write.
                if dst is None:
                    file_index += 1
                    part = path.parent / f"{path.stem}_{file_index}{path.suffix}"
                    dst = part.open("w", encoding="utf-8")
                    count = 0
                dst.write(line)
                count += 1
                if count >= lines_per_file:
                    dst.close()
                    dst = None
        finally:
            # Close the last part even if an error interrupts the loop.
            if dst is not None:
                dst.close()

split_text(Path("big.txt"), 1000)
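
The part-name expression in the pathlib version is worth isolating, because it shows how stem, suffix, and the / operator combine. A minimal sketch, assuming a hypothetical helper named part_path with an optional out_dir parameter that the article's code does not have:

```python
from pathlib import Path
from typing import Optional


def part_path(path: Path, index: int, out_dir: Optional[Path] = None) -> Path:
    """Build the output path for part number index, e.g. big.txt -> big_1.txt."""
    target_dir = path.parent if out_dir is None else out_dir
    # stem is the name without its final suffix; suffix keeps the leading dot.
    return target_dir / f"{path.stem}_{index}{path.suffix}"
```

Only the final suffix is treated as the extension, so a name like archive.tar.gz becomes archive.tar_1.gz. Unlike the string-based templates earlier, which produce names such as bigfile.txt_1.txt, this keeps the original extension at the end of each part name.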
