plocate, a much faster locate

plocate is a locate(1) based on posting lists, completely replacing mlocate with a much faster (and smaller) index. It is suitable as a default locate on your system.


  cassarossa:~> time mlocate movit-fosdem-talk
  /export/cassarossa/itk/sesse/public_html/movit-fosdem-talk.odp
  /export/cassarossa/itk/sesse/public_html/movit-fosdem-talk.pdf
  mlocate movit-fosdem-talk  19.75s user 0.33s system 99% cpu 20.118 total
  
  cassarossa:~> time plocate movit-fosdem-talk
  /export/cassarossa/itk/sesse/public_html/movit-fosdem-talk.odp
  /export/cassarossa/itk/sesse/public_html/movit-fosdem-talk.pdf
  plocate movit-fosdem-talk  0.01s user 0.00s system 78% cpu 0.008 total

  cassarossa:~> ls -lh /var/lib/[mp]locate/*.db
  -rw-r----- 1 root mlocate 1.1G Apr  2 06:26 /var/lib/mlocate/mlocate.db
  -rw-r----- 1 root plocate 466M Apr  2 06:28 /var/lib/plocate/plocate.db
        

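In the example above, plocate found two files out of 27 million in about 8 milliseconds of wall-clock time, where mlocate needed about 20 seconds. Its index is also less than half the size (466M versus 1.1G).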

plocate works by creating an inverted index over trigrams (combinations of three bytes) in the indexed path names. A search pattern is broken into its own trigrams, and intersecting their posting lists rapidly narrows the set of candidates to a very small list, instead of linearly scanning through every entry. It does nearly all I/O asynchronously using io_uring if available (Linux 5.1+), which reduces the impact of seek latency on systems without SSDs. Like mlocate and slocate, the returned file set is user-dependent, i.e. a user will only see a file if find(1) would list it (all directories from the root down to the file have +rx permissions for that user).
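
The trigram idea can be illustrated with a small in-memory sketch in Python. This is not plocate's actual code or on-disk format: the real index is compressed, stored on disk, and read asynchronously. The function names (trigrams, build_index, search) are made up for this illustration. The sketch only shows how intersecting posting lists cuts the candidate set before a final substring check, which is still needed because a path can contain every trigram of the pattern without containing the pattern itself.

```python
from collections import defaultdict


def trigrams(data: bytes) -> set:
    """Return the set of all 3-byte substrings of data."""
    return {data[i:i + 3] for i in range(len(data) - 2)}


def build_index(paths):
    """Map each trigram to a posting list of path ids (hypothetical layout)."""
    index = defaultdict(list)
    for path_id, path in enumerate(paths):
        # Using a set per path means each id is appended at most once
        # per trigram, and ids arrive in increasing order.
        for tri in trigrams(path.encode()):
            index[tri].append(path_id)
    return index


def search(index, paths, needle):
    """Return every path containing needle, in index order."""
    tris = trigrams(needle.encode())
    if not tris:
        # Patterns shorter than three bytes have no trigrams, so this
        # sketch simply falls back to a linear scan.
        return [p for p in paths if needle in p]

    # Intersect posting lists, starting with the shortest one so the
    # candidate set is as small as possible from the beginning.
    posting_lists = sorted((index.get(t, []) for t in tris), key=len)
    candidates = set(posting_lists[0])
    for plist in posting_lists[1:]:
        candidates &= set(plist)
        if not candidates:
            break

    # Trigram hits are necessary but not sufficient: verify each
    # surviving candidate with a real substring test.
    return [paths[i] for i in sorted(candidates) if needle in paths[i]]
```

Calling build_index once on a list of paths and then search(index, paths, "movit-fosdem-talk") only inspects the few paths that share all of the pattern's trigrams, rather than every entry. The permission filtering described above is not modelled here.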