robotstxt

Crates.io Docs.rs Apache 2.0

A native Rust port of Google's robots.txt parser and matcher C++ library.

  • Native Rust port with no third-party crate dependencies
  • Zero unsafe code
  • Preserves all behavior of the original library
  • API consistent with the original library
  • Passes 100% of Google's original tests

Installation

[dependencies]
robotstxt = "0.3.0"

Quick start

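use robotstxt::DefaultMatcher;

fn main() {
    let mut matcher = DefaultMatcher::default();
    // A robots.txt body that bars FooBot from the entire site.
    let robots_body = "user-agent: FooBot\n\
                       disallow: /\n";
    // FooBot must not fetch the site root, so the check returns false.
    assert_eq!(false, matcher.one_agent_allowed_by_robots(robots_body, "FooBot", "https://foo.com/"));
}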

About

Quoting the README from Google's robots.txt parser and matcher repo:

The Robots Exclusion Protocol (REP) is a standard that enables website owners to control which URLs may be accessed by automated clients (i.e. crawlers) through a simple text file with a specific syntax. It's one of the basic building blocks of the internet as we know it and what allows search engines to operate.

Because the REP was only a de-facto standard for the past 25 years, different implementers implement parsing of robots.txt slightly differently, leading to confusion. This project aims to fix that by releasing the parser that Google uses.

The library is slightly modified (i.e. some internal headers and equivalent symbols) production code used by Googlebot, Google's crawler, to determine which URLs it may access based on rules provided by webmasters in robots.txt files. The library is released open-source to help developers build tools that better reflect Google's robots.txt parsing and matching.

Crate robotstxt aims to be a faithful conversion, from C++ to Rust, of Google's robots.txt parser and matcher.
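
To make the idea of the REP concrete, here is a minimal, std-only sketch of how a crawler might evaluate allow/disallow rules for one user agent. It is an illustration only and is not the code of this crate or of Google's library: `toy_allowed` is a hypothetical helper, it takes a URL path rather than a full URL, it matches plain path prefixes only, and it ignores `*` groups, wildcards, `$` anchors, and percent-encoding. The one rule it does model is that the longest matching pattern wins, with allow preferred on a tie.

```rust
/// Toy REP check (hypothetical helper, NOT part of the robotstxt crate).
/// Returns true if `path` may be fetched by `agent` under `robots_body`.
fn toy_allowed(robots_body: &str, agent: &str, path: &str) -> bool {
    let mut in_group = false; // does the current group name `agent`?
    let mut last_was_agent = false; // consecutive user-agent lines share one group
    let mut best: Option<(usize, bool)> = None; // (pattern length, is allow rule)

    for line in robots_body.lines() {
        // Strip trailing comments and surrounding whitespace.
        let line = line.split('#').next().unwrap_or("").trim();
        // Lines without a "key: value" shape are skipped.
        let Some((key, value)) = line.split_once(':') else { continue };
        let key = key.trim().to_ascii_lowercase();
        let value = value.trim();

        match key.as_str() {
            "user-agent" => {
                let hit = value.eq_ignore_ascii_case(agent);
                // A new group starts unless the previous line was also user-agent.
                in_group = if last_was_agent { in_group || hit } else { hit };
                last_was_agent = true;
            }
            "allow" | "disallow" => {
                last_was_agent = false;
                // Empty patterns match nothing in this sketch.
                if in_group && !value.is_empty() && path.starts_with(value) {
                    let allow = key == "allow";
                    let better = match best {
                        None => true,
                        Some((len, prev_allow)) => {
                            value.len() > len || (value.len() == len && allow && !prev_allow)
                        }
                    };
                    if better {
                        best = Some((value.len(), allow));
                    }
                }
            }
            _ => last_was_agent = false,
        }
    }

    // No matching rule means the path is allowed.
    best.map_or(true, |(_, allow)| allow)
}
```

With the Quick start body (`user-agent: FooBot` followed by `disallow: /`), `toy_allowed(body, "FooBot", "/")` is false, while any agent not named in the file is allowed. For real crawling decisions, use the crate's matcher, which follows Google's parsing and matching behavior.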

Testing

$ git clone https://github.com/Folyd/robotstxt
Cloning into 'robotstxt'...
$ cd robotstxt/tests 
...
$ mkdir c-build && cd c-build
...
$ cmake ..
...
$ make
...
$ make test
Running tests...
Test project ~/robotstxt/tests/c-build
    Start 1: robots-test
1/1 Test #1: robots-test ......................   Passed    0.33 sec

License

The robotstxt parser and matcher Rust library is licensed under the terms of the Apache license. See LICENSE for more information.
