Add option to output digest to stdout #264

Merged: 3 commits, Jun 19, 2025
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
@@ -44,7 +44,7 @@ repos:
- id: black

- repo: https://github.com/asottile/pyupgrade
rev: v3.19.1
rev: v3.20.0
hooks:
- id: pyupgrade
description: "Automatically upgrade syntax for newer versions."
@@ -73,7 +73,7 @@ repos:
- id: djlint-reformat-jinja

- repo: https://github.com/igorshubovych/markdownlint-cli
rev: v0.44.0
rev: v0.45.0
hooks:
- id: markdownlint
description: "Lint markdown files."
@@ -88,7 +88,7 @@ repos:
files: ^src/

- repo: https://github.com/pycqa/pylint
rev: v3.3.6
rev: v3.3.7
hooks:
- id: pylint
name: pylint for source
31 changes: 26 additions & 5 deletions README.md
@@ -78,26 +78,35 @@ Issues and feature requests are welcome to the repo.
The `gitingest` command line tool allows you to analyze codebases and create a text dump of their contents.

```bash
# Basic usage
# Basic usage (writes to digest.txt by default)
gitingest /path/to/directory

# From URL
gitingest https://github.com/cyclotruc/gitingest
```

For private repositories, use the `--token/-t` option.

# For private repositories, use the --token option
```bash
# Get your token from https://github.com/settings/personal-access-tokens
gitingest https://github.com/username/private-repo --token github_pat_...

# Or set it as an environment variable
export GITHUB_TOKEN=github_pat_...
gitingest https://github.com/username/private-repo
```

# See more options
By default, the digest is written to a text file (`digest.txt`) in your current working directory. You can customize the output in two ways:

- Use `--output/-o <filename>` to write to a specific file.
- Use `--output/-o -` to output directly to `STDOUT` (useful for piping to other tools).
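
The three cases described here (default file, named file, `-` for `STDOUT`) amount to a small dispatch on the output target. A minimal sketch of that idea follows, assuming a hypothetical helper `write_digest` and a stand-in constant `DEFAULT_OUTPUT`; it illustrates the convention and is not gitingest's actual code.

```python
import sys
from pathlib import Path
from typing import Optional, TextIO

# Stand-in for the project's default file name (assumption for this sketch).
DEFAULT_OUTPUT = "digest.txt"


def write_digest(digest: str, output: Optional[str] = None, stdout: TextIO = sys.stdout) -> Optional[Path]:
    """Write `digest` to a file, or to `stdout` when `output` is "-".

    Returns the path written to, or None when the digest went to stdout.
    """
    target = output if output is not None else DEFAULT_OUTPUT
    if target == "-":
        # Conventional "-" placeholder: stream the digest instead of creating a file.
        stdout.write(digest)
        stdout.flush()
        return None
    path = Path(target)
    path.write_text(digest, encoding="utf-8")
    return path
```

Passing the stream as a parameter keeps the helper easy to test with an in-memory buffer.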

See more options and usage details with:

```bash
gitingest --help
```

This will write the digest in a text file (default `digest.txt`) in your current working directory.
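
When the digest goes to `STDOUT`, this change sends progress and summary messages to `STDERR` so they do not end up in the piped data. The sketch below shows how a Python consumer could capture the two streams separately. The child process is a stand-in written for this example (a one-line Python script, not `gitingest` itself), so the block runs without the tool installed; with the real CLI the command would be `gitingest <source> -o -`.

```python
import subprocess
import sys
from typing import List, Tuple

# Stand-in child process (assumption for this sketch): it prints a fake digest
# on stdout and a status line on stderr, the way `gitingest <source> -o -` does.
FAKE_CLI = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write('FAKE DIGEST\\n'); sys.stderr.write('Analysis complete!\\n')",
]


def capture_digest(command: List[str]) -> Tuple[str, str]:
    """Run `command` and return (stdout, stderr) as separate strings."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout, result.stderr


digest, status = capture_digest(FAKE_CLI)
```

Because the status text arrives on a different stream, `digest` can be passed to another tool without filtering.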

## 🐍 Python package usage

```python
@@ -110,6 +119,18 @@ summary, tree, content = ingest("path/to/directory")
summary, tree, content = ingest("https://github.com/cyclotruc/gitingest")
```

For private repositories, you can pass a token:

```python
# Using token parameter
summary, tree, content = ingest("https://github.com/username/private-repo", token="github_pat_...")

# Or set it as an environment variable
import os
os.environ["GITHUB_TOKEN"] = "github_pat_..."
summary, tree, content = ingest("https://github.com/username/private-repo")
```
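
The two options above imply a lookup order between the explicit argument and the environment. A minimal sketch, assuming that an explicit `token` takes precedence over `GITHUB_TOKEN`; the helper `resolve_token` is hypothetical, and the precedence is an assumption rather than something the README states.

```python
import os
from typing import Mapping, Optional


def resolve_token(token: Optional[str] = None, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the explicit token if given, else GITHUB_TOKEN from `env`, else None."""
    environment = os.environ if env is None else env
    if token:
        return token
    # An empty variable is treated the same as an unset one.
    return environment.get("GITHUB_TOKEN") or None
```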

By default, this won't write a file but can be enabled with the `output` argument.

```python
32 changes: 22 additions & 10 deletions src/gitingest/cli.py
@@ -73,7 +73,8 @@ def main(
source : str
A directory path or a Git repository URL.
output : str, optional
Output file path. Defaults to `<repo_name>.txt`.
The path where the output file will be written. If not specified, the output will be written
to a file named `<repo_name>.txt` in the current directory. Use '-' to output to stdout.
max_size : int
Maximum file size (in bytes) to consider.
exclude_pattern : Tuple[str, ...]
@@ -113,14 +114,16 @@ async def _async_main(
Analyze a directory or repository and create a text dump of its contents.

This command analyzes the contents of a specified source directory or repository, applies custom include and
exclude patterns, and generates a text summary of the analysis which is then written to an output file.
exclude patterns, and generates a text summary of the analysis which is then written to an output file
or printed to stdout.

Parameters
----------
source : str
A directory path or a Git repository URL.
output : str, optional
Output file path. Defaults to `<repo_name>.txt`.
The path where the output file will be written. If not specified, the output will be written
to a file named `<repo_name>.txt` in the current directory. Use '-' to output to stdout.
max_size : int
Maximum file size (in bytes) to consider.
exclude_pattern : Tuple[str, ...]
@@ -143,23 +146,32 @@ async def _async_main(
exclude_patterns = set(exclude_pattern)
include_patterns = set(include_pattern)

# Choose a default output path if none provided
if output is None:
output = OUTPUT_FILE_NAME
output_target = output if output is not None else OUTPUT_FILE_NAME

if output_target == "-":
click.echo("Analyzing source, preparing output for stdout...", err=True)
else:
click.echo(f"Analyzing source, output will be written to '{output_target}'...", err=True)

summary, _, _ = await ingest_async(
source=source,
max_file_size=max_size,
include_patterns=include_patterns,
exclude_patterns=exclude_patterns,
branch=branch,
output=output,
output=output_target,
token=token,
)

click.echo(f"Analysis complete! Output written to: {output}")
click.echo("\nSummary:")
click.echo(summary)
if output_target == "-": # stdout
click.echo("\n--- Summary ---", err=True)
click.echo(summary, err=True)
click.echo("--- End Summary ---", err=True)
click.echo("Analysis complete! Output sent to stdout.", err=True)
else: # file
click.echo(f"Analysis complete! Output written to: {output_target}")
click.echo("\nSummary:")
click.echo(summary)

except Exception as exc:
# Convert any exception into Click.Abort so that exit status is non-zero
8 changes: 7 additions & 1 deletion src/gitingest/entrypoint.py
@@ -4,6 +4,7 @@
import inspect
import os
import shutil
import sys
from typing import Optional, Set, Tuple, Union

from gitingest.cloning import clone_repo
@@ -93,7 +94,12 @@ async def ingest_async(

summary, tree, content = ingest_query(query)

if output is not None:
if output == "-":
loop = asyncio.get_running_loop()
output_data = tree + "\n" + content
await loop.run_in_executor(None, sys.stdout.write, output_data)
await loop.run_in_executor(None, sys.stdout.flush)
elif output is not None:
with open(output, "w", encoding="utf-8") as f:
f.write(tree + "\n" + content)
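
The stdout branch in this hunk hands the blocking `sys.stdout.write` call to the default thread-pool executor, so the event loop is not stalled while a large digest is written. A self-contained sketch of that pattern, with a hypothetical `emit` coroutine and an in-memory stream standing in for `sys.stdout`:

```python
import asyncio
import io
import sys
from typing import TextIO


async def emit(data: str, stream: TextIO = sys.stdout) -> None:
    """Write `data` to `stream` from a worker thread, then flush."""
    loop = asyncio.get_running_loop()
    # run_in_executor(None, ...) uses the loop's default ThreadPoolExecutor.
    await loop.run_in_executor(None, stream.write, data)
    await loop.run_in_executor(None, stream.flush)


# Usage with an in-memory stream (stand-in for sys.stdout in this sketch).
buffer = io.StringIO()
asyncio.run(emit("tree\ncontent", buffer))
```

The two awaits run one after the other, so the flush always follows the write.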

105 changes: 72 additions & 33 deletions tests/test_cli.py
@@ -1,41 +1,80 @@
"""Tests for the gitingest cli."""
"""Tests for the Gitingest CLI."""

import os
from inspect import signature
from pathlib import Path
from typing import List

from click.testing import CliRunner
import pytest
from _pytest.monkeypatch import MonkeyPatch
from click.testing import CliRunner, Result

from gitingest.cli import main
from gitingest.config import MAX_FILE_SIZE, OUTPUT_FILE_NAME


def test_cli_with_default_options():
runner = CliRunner()
result = runner.invoke(main, ["./"])
output_lines = result.output.strip().split("\n")
assert f"Analysis complete! Output written to: {OUTPUT_FILE_NAME}" in output_lines
assert os.path.exists(OUTPUT_FILE_NAME), f"Output file was not created at {OUTPUT_FILE_NAME}"

os.remove(OUTPUT_FILE_NAME)


def test_cli_with_options():
runner = CliRunner()
result = runner.invoke(
main,
[
"./",
"--output",
str(OUTPUT_FILE_NAME),
"--max-size",
str(MAX_FILE_SIZE),
"--exclude-pattern",
"tests/",
"--include-pattern",
"src/",
],
)
output_lines = result.output.strip().split("\n")
assert f"Analysis complete! Output written to: {OUTPUT_FILE_NAME}" in output_lines
assert os.path.exists(OUTPUT_FILE_NAME), f"Output file was not created at {OUTPUT_FILE_NAME}"

os.remove(OUTPUT_FILE_NAME)
@pytest.mark.parametrize(
"cli_args, expect_file",
[
pytest.param(["./"], True, id="default-options"),
pytest.param(
[
"./",
"--output",
str(OUTPUT_FILE_NAME),
"--max-size",
str(MAX_FILE_SIZE),
"--exclude-pattern",
"tests/",
"--include-pattern",
"src/",
],
True,
id="custom-options",
),
],
)
def test_cli_writes_file(tmp_path: Path, monkeypatch: MonkeyPatch, cli_args: List[str], expect_file: bool) -> None:
"""Run the CLI and verify that the digest file is created (or not)."""
# Work inside an isolated temp directory
monkeypatch.chdir(tmp_path)

result = _invoke_isolated_cli_runner(cli_args)

assert result.exit_code == 0, result.stderr

# Summary line should be on STDOUT
stdout_lines = result.stdout.splitlines()
assert f"Analysis complete! Output written to: {OUTPUT_FILE_NAME}" in stdout_lines

# File side-effect
output_file = tmp_path / OUTPUT_FILE_NAME
assert output_file.exists() is expect_file, f"{OUTPUT_FILE_NAME} existence did not match expectation"


def test_cli_with_stdout_output() -> None:
"""Test CLI invocation with output directed to STDOUT."""
result = _invoke_isolated_cli_runner(["./", "--output", "-", "--exclude-pattern", "tests/"])

# ─── core expectations (stdout) ─────────────────────────────────────
assert result.exit_code == 0, f"CLI exited with code {result.exit_code}, stderr: {result.stderr}"
assert "---" in result.stdout, "Expected file separator '---' not found in STDOUT"
assert "src/gitingest/cli.py" in result.stdout, "Expected content (e.g., src/gitingest/cli.py) not found in STDOUT"
assert not os.path.exists(OUTPUT_FILE_NAME), f"Output file {OUTPUT_FILE_NAME} was unexpectedly created."

# ─── the summary must *not* pollute STDOUT, must appear on STDERR ───
summary = "Analysis complete! Output sent to stdout."
stdout_lines = result.stdout.splitlines()
stderr_lines = result.stderr.splitlines()
assert summary not in stdout_lines, "Unexpected summary message found in STDOUT"
assert summary in stderr_lines, "Expected summary message not found in STDERR"
assert f"Output written to: {OUTPUT_FILE_NAME}" not in stderr_lines


def _invoke_isolated_cli_runner(args: List[str]) -> Result:
"""Invoke the CLI with a CliRunner that keeps stderr separate on Click 8.0-8.1 and return the Result."""
kwargs = {}
if "mix_stderr" in signature(CliRunner.__init__).parameters:
kwargs["mix_stderr"] = False # Click 8.0–8.1
runner = CliRunner(**kwargs)
return runner.invoke(main, args)
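
The helper above checks for the `mix_stderr` parameter at runtime instead of pinning a Click version. The same feature-detection idea is sketched below in isolation, using two hypothetical classes (`OldRunner`, `NewRunner`) in place of `CliRunner`; neither class is part of Click.

```python
from inspect import signature
from typing import Any, Dict, Type


class OldRunner:
    """Hypothetical runner whose constructor still accepts `mix_stderr`."""

    def __init__(self, mix_stderr: bool = True) -> None:
        self.mix_stderr = mix_stderr


class NewRunner:
    """Hypothetical runner whose constructor no longer has that parameter."""

    def __init__(self) -> None:
        self.mix_stderr = False


def make_runner(runner_cls: Type[Any]) -> Any:
    """Pass `mix_stderr=False` only when the constructor accepts it."""
    kwargs: Dict[str, Any] = {}
    if "mix_stderr" in signature(runner_cls.__init__).parameters:
        kwargs["mix_stderr"] = False
    return runner_cls(**kwargs)
```

Inspecting the signature avoids a `TypeError` on versions where the keyword has been removed.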