Convert an entire site into an AI-ready txt file

https://github.com/egoist/sitefetch

sitefetch

Fetch an entire site and save it as a text file (to be used with AI models).


Install

One-off usage (choose one of the following):

bunx sitefetch
npx sitefetch
pnpx sitefetch

Install globally (choose one of the following):

bun i -g sitefetch
npm i -g sitefetch
pnpm i -g sitefetch

Usage

sitefetch https://egoist.dev -o site.txt

# or with higher concurrency
sitefetch https://egoist.dev -o site.txt --concurrency 10

Match specific pages

Use the -m, --match flag to specify the pages you want to fetch:

sitefetch https://vite.dev -m "/blog/**" -m "/guide/**"

The match pattern is tested against the pathname of each target page. Matching is powered by micromatch, so you can check out its documentation for all the supported matching features.
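To make "tested against the pathname" concrete, here is a minimal sketch in TypeScript. It is not micromatch and not sitefetch's own code: `globToRegExp` and `shouldFetch` are hypothetical helpers that handle only `*` and `**`, and the example URLs are illustrative. Real micromatch supports many more features (braces, negation, extglobs) and may treat edge cases such as a bare `/blog` differently.

```ts
// Simplified stand-in for glob matching (an assumption, NOT micromatch itself).
// Supports only `**` (any characters, including "/") and `*` (any characters except "/").
function globToRegExp(pattern: string): RegExp {
  let source = ""
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i)
    if (ch === "*") {
      if (pattern.charAt(i + 1) === "*") {
        source += ".*" // `**` may cross path segment boundaries
        i++ // skip the second asterisk
      } else {
        source += "[^/]*" // `*` stays within a single path segment
      }
    } else {
      // Escape characters that are special in regular expressions
      source += ch.replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    }
  }
  return new RegExp(`^${source}$`)
}

// Hypothetical helper: decide whether a page URL passes the -m/--match patterns.
// Only the pathname is tested; host and query string are ignored.
function shouldFetch(url: string, patterns: string[]): boolean {
  const { pathname } = new URL(url)
  // With no patterns given, every page is treated as a candidate
  if (patterns.length === 0) return true
  return patterns.some((p) => globToRegExp(p).test(pathname))
}

const patterns = ["/blog/**", "/guide/**"]
console.log(shouldFetch("https://vite.dev/blog/some-post", patterns)) // true
console.log(shouldFetch("https://vite.dev/guide/api/plugin", patterns)) // true
console.log(shouldFetch("https://vite.dev/config/", patterns)) // false
```

Repeating `-m` on the command line behaves like the `patterns` array here: a page is kept if its pathname matches any one of the patterns.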

Content selector

We use mozilla/readability to extract readable content from each web page, but on some pages it might return irrelevant content. In that case, you can specify a CSS selector so we know where to find the readable content:

sitefetch https://vite.dev --content-selector ".content"

Plug

If you like this, please check out my LLM chat app: https://chatwise.app

API

import { fetchSite } from "sitefetch"

await fetchSite("https://egoist.dev", {
  //...options
})

Check out options in types.ts.

License

MIT.
