First, thanks for your work :)
I'm trying to silence llama.cpp output and keep only the answer.
I've temporarily closed stderr while loading the model (not a nice approach, but it works), keeping a duplicate of the descriptor so it can be restored afterwards:
```rust
// Keep a duplicate of stderr so it can be restored after loading.
let saved_stderr = unsafe { libc::dup(libc::STDERR_FILENO) };
unsafe {
    libc::close(libc::STDERR_FILENO);
}
let llama = LLama::new(model, &options);
// Restore the original stderr and drop the duplicate.
unsafe {
    libc::dup2(saved_stderr, libc::STDERR_FILENO);
    libc::close(saved_stderr);
}
```
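(A possibly safer variant, just as a sketch: instead of closing fd 2 outright, redirect it to `/dev/null` and restore it afterwards, so nothing the library opens in the meantime can accidentally land on fd 2. `with_stderr_silenced` is a hypothetical helper name of mine, and this is Unix-only.)

```rust
use std::ffi::CString;

/// Hypothetical helper (Unix-only): run `f` with stderr pointed at
/// /dev/null, then restore the original descriptor.
fn with_stderr_silenced<T>(f: impl FnOnce() -> T) -> T {
    unsafe {
        let saved = libc::dup(libc::STDERR_FILENO);
        let devnull = CString::new("/dev/null").unwrap();
        let null_fd = libc::open(devnull.as_ptr(), libc::O_WRONLY);
        libc::dup2(null_fd, libc::STDERR_FILENO);
        libc::close(null_fd);
        let result = f();
        // Put the real stderr back and drop the saved duplicate.
        libc::dup2(saved, libc::STDERR_FILENO);
        libc::close(saved);
        result
    }
}
```

Loading would then be `let llama = with_stderr_silenced(|| LLama::new(model, &options));`.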
Even so, when I call `predict` I still get one unwanted line of output: `count 0`.
Maybe you could change it to `log::debug!("count {}", reverse_count);`?
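That way the line would stay hidden by default and only show up when the caller enables debug logging. A minimal sketch of the caller side, assuming the standard `log` facade with `env_logger` (both are my assumption, not something the crate currently depends on):

```rust
// Hypothetical caller setup: with the `log` facade, debug messages are
// dropped unless a logger is installed and the level is enabled,
// e.g. by running with RUST_LOG=debug when using env_logger.
fn main() {
    env_logger::init();
    log::debug!("count {}", 0); // hidden by default, shown with RUST_LOG=debug
}
```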