
File crawling performance #2

@ehmicky

Description


Crawling the publish directory might be slow for some big sites. There might be a few opportunities to optimize it:

  • Each readdir already performs a stat syscall, so doing it again in
    const { mtime } = await stat(file);
    might be redundant
  • If no exclude input is specified, there is no need to perform a test() on the filename. Even though the default regular expression a^ should be fast and never match, it might become more expensive when performed thousands of times.
  • Directories that are part of exclude might not need to be crawled at all (a rough sketch combining these points follows this list)
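
For illustration, here is a minimal sketch of those ideas combined, assuming the crawler only needs paths and entry types, and that exclude is passed as an array of regular expressions (the actual plugin code may differ):

    // Hedged sketch, not the plugin's actual implementation
    const { readdir } = require('fs/promises')
    const { join } = require('path')

    const crawl = async function (dir, exclude = [], files = []) {
      // `withFileTypes` returns Dirent objects, so the type of each entry is
      // known without an extra stat() call per entry
      const entries = await readdir(dir, { withFileTypes: true })

      for (const entry of entries) {
        const path = join(dir, entry.name)

        // Only run the regular expressions when an exclude input was given
        if (exclude.length !== 0 && exclude.some((regExp) => regExp.test(path))) {
          continue
        }

        if (entry.isDirectory()) {
          // Excluded directories were skipped above, so their contents are
          // never crawled
          await crawl(path, exclude, files)
        } else if (entry.isFile()) {
          files.push(path)
        }
      }

      return files
    }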

There might also be some potential bugs in the directory crawling. For example, if a file were a symlink to one of its parent directories, would the crawling keep running until memory is exhausted?
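
One possible guard, sketched below, is to track the real path of every directory already visited; the crawlSafe name, the seen set, and the recursion shape are hypothetical, not the plugin's current code:

    const { readdir, realpath, stat } = require('fs/promises')
    const { join } = require('path')

    const crawlSafe = async function (dir, seen = new Set(), files = []) {
      // Resolve symlinks first: a link pointing back at a parent directory
      // resolves to a real path that was already visited, which stops the
      // recursion instead of looping until memory is exhausted
      const realDir = await realpath(dir)
      if (seen.has(realDir)) {
        return files
      }
      seen.add(realDir)

      const entries = await readdir(dir, { withFileTypes: true })
      for (const entry of entries) {
        const path = join(dir, entry.name)
        const stats = await stat(path) // stat() follows symlinks

        if (stats.isDirectory()) {
          await crawlSafe(path, seen, files)
        } else if (stats.isFile()) {
          files.push(path)
        }
      }
      return files
    }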

I am wondering whether using a tried-and-tested library like readdirp might fix all of this and also simplify the code. What are your thoughts?
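
For reference, a rough sketch of what the crawl could look like with readdirp v3's documented options (directoryFilter, alwaysStat); the excludeGlobs name and the glob patterns are assumptions, not the plugin's existing inputs:

    const readdirp = require('readdirp')

    // `excludeGlobs` is a hypothetical array of negated glob patterns,
    // e.g. ['!node_modules', '!.git'], standing in for the exclude input
    const crawl = async function (dir, excludeGlobs) {
      // `alwaysStat: true` attaches a `stats` object to each entry, so `mtime`
      // is available without a separate stat() call in the crawler itself
      const options = { alwaysStat: true }

      // Only prune directories when an exclude input was actually given
      if (excludeGlobs !== undefined && excludeGlobs.length !== 0) {
        options.directoryFilter = excludeGlobs
      }

      const files = []
      for await (const entry of readdirp(dir, options)) {
        files.push({ path: entry.fullPath, mtime: entry.stats.mtime })
      }
      return files
    }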
