Place repository root level readme first in file content output #46

Merged on Dec 25, 2024 (2 commits)

bouzaghrane
Contributor

@bouzaghrane bouzaghrane commented Dec 24, 2024

Modified create_file_content_string() to place the root-level README.md first in the repository content output file, using a two-pass approach: the first pass emits the root README, the second emits every other file. This is motivated by the structure AnswerDotAI proposes for llms.txt files.
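
For readers without the diff at hand, here is a minimal sketch of what a two-pass ordering could look like. It is not the merged gitingest code: the function names `is_root_readme` and `build_content_string`, the `path`/`content` dict keys, and the separator format are all assumptions made for illustration.

```python
from typing import Any

SEPARATOR = "=" * 48  # illustrative separator; the real output format may differ


def is_root_readme(path: str) -> bool:
    """Return True only for a README.md sitting directly at the repository root."""
    return path.strip("/").lower() == "readme.md"


def build_content_string(files: list[dict[str, Any]]) -> str:
    """Concatenate file contents, emitting the root-level README.md first.

    Hypothetical helper: each entry in `files` is assumed to be a dict
    with a `path` and a `content` key.
    """
    chunks: list[str] = []

    def emit(file: dict[str, Any]) -> None:
        chunks.append(
            f"{SEPARATOR}\nFile: {file['path']}\n{SEPARATOR}\n{file['content']}\n"
        )

    # Pass 1: the root-level README, if present.
    for file in files:
        if is_root_readme(file["path"]):
            emit(file)

    # Pass 2: every other file, in its original order.
    for file in files:
        if not is_root_readme(file["path"]):
            emit(file)

    return "\n".join(chunks)
```

Two passes keep the relative order of the remaining files untouched, and a nested file such as docs/README.md is not promoted because only the root path matches.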

Ran the test suite and all 11 tests pass.

@bouzaghrane bouzaghrane changed the title Place ingested repository readme first in file content output Place repository root level readme first in file content output Dec 24, 2024
@cyclotruc
Member

Thank you very much, and thanks for pointing me to llms.txt files.
I'm merging this, and I'll probably add further improvements around this idea later.

@cyclotruc cyclotruc merged commit b315d52 into coderamp-labs:main Dec 25, 2024
4 checks passed
@rawwerks

rawwerks commented Feb 12, 2025

hey @cyclotruc - i think "llms.txt" would be a really easy addition / mode to add to gitingest: AnswerDotAI/llms-txt#37

the format is gaining in popularity.

for example, i could pay to crawl my own website: https://llmstxt.firecrawl.dev/

or i could just do something like gitingest --llmstxt on the code base to generate this directly.

@cyclotruc
Member

@rawwerks that's a very good point, but do you mean adding an llms.txt to gitingest.com, or an option to generate llms.txt files with gitingest?

I'm not sure what I would need to include in the digest to make it useful for this format. Do you have any pointers on that?

@rawwerks

i mean "add an option to generate llms.txt files with gitingest"

here's the spec => https://github.com/AnswerDotAI/llms-txt

  • An H1 with the name of the project or site. This is the only required section
  • A blockquote with a short summary of the project, containing key information necessary for understanding the rest of the file
  • Zero or more markdown sections (e.g. paragraphs, lists, etc) of any type except headings, containing more detailed information about the project and how to interpret the provided files
  • Zero or more markdown sections delimited by H2 headers, containing "file lists" of URLs where further detail is available
  • Each "file list" is a markdown list, containing a required markdown hyperlink name, then optionally a : and notes about the file.

Here is a mock example:

# Title

> Optional description goes here

Optional details go here

## Section name

- [Link title](https://link_url): Optional link details

## Optional

- [Link title](https://link_url)

Note that the "Optional" section has a special meaning: if it's included, the URLs provided there can be skipped if a shorter context is needed. Use it for secondary information which can often be skipped.
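
To make the structure above concrete, here is a rough sketch of a renderer that emits a file in that layout. It is only an illustration: `render_llms_txt` and its parameters are hypothetical names, nothing like this exists in gitingest today, and it makes no attempt to validate input against the full spec.

```python
from typing import Optional

# (link title, URL, optional notes) for one entry in a "file list"
Link = tuple[str, str, Optional[str]]


def render_llms_txt(
    title: str,
    summary: Optional[str] = None,
    details: Optional[str] = None,
    sections: Optional[dict[str, list[Link]]] = None,
) -> str:
    """Render an llms.txt-style document (hypothetical helper).

    Only the H1 title is required. The summary becomes a blockquote,
    details are free-form markdown, and each section becomes an H2
    followed by a markdown list of links.
    """
    lines = [f"# {title}"]
    if summary:
        lines += ["", f"> {summary}"]
    if details:
        lines += ["", details]
    for name, links in (sections or {}).items():
        lines += ["", f"## {name}", ""]
        for link_title, url, notes in links:
            entry = f"- [{link_title}]({url})"
            if notes:
                # Notes follow the link after a colon, as in the spec.
                entry += f": {notes}"
            lines.append(entry)
    return "\n".join(lines) + "\n"
```

Passing the values from the mock example, with a section named "Optional" listed last, reproduces that example. A caller that needs a shorter context could simply leave the "Optional" entry out of `sections`.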
