Commit 3a9910b
sched_ext: Implement scx_bpf_now()
Returns a high-performance monotonically non-decreasing clock for the current
CPU. The clock returned is in nanoseconds.
It provides the following properties:
1) High performance: Many BPF schedulers call bpf_ktime_get_ns() frequently
to account for execution time and track tasks' runtime properties.
Unfortunately, on some hardware platforms, bpf_ktime_get_ns() -- which
eventually reads a hardware timestamp counter -- is neither performant nor
scalable. scx_bpf_now() aims to provide a high-performance clock by
using the rq clock in the scheduler core whenever possible.
2) High enough resolution for the BPF scheduler use cases: In most BPF
scheduler use cases, the required clock resolution is lower than that of
the most accurate hardware clock (e.g., rdtsc in x86). scx_bpf_now() uses
the rq clock in the scheduler core whenever it is valid. It considers
the rq clock valid from the time the rq clock is updated
(update_rq_clock) until the rq is unlocked (rq_unpin_lock).
3) Monotonically non-decreasing clock for the same CPU: scx_bpf_now()
guarantees the clock never goes backward when comparing values read on the
same CPU. On the other hand, when comparing clocks read on different CPUs,
there is no such guarantee -- the clock can go backward. The clock is
monotonically *non-decreasing* rather than strictly increasing because two
scx_bpf_now() calls on the same CPU return the same value while the rq
clock remains valid.
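Because of property 3, subtracting two timestamps that were read on different CPUs can make the later sample look earlier than the first one. The following is a minimal sketch of one way a scheduler might guard against that. The helper names ts_before() and elapsed_ns() are hypothetical and are not part of this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helpers for illustration; not part of this commit.
 *
 * ts_before() compares two nanosecond timestamps through a signed
 * difference, so the result stays correct even if the 64-bit counter wraps.
 */
static inline bool ts_before(uint64_t a, uint64_t b)
{
	return (int64_t)(a - b) < 0;
}

/*
 * Elapsed time from @then to @now, clamped to 0 when @now appears to be
 * earlier than @then (possible when the two samples come from different
 * CPUs, where no ordering is guaranteed).
 */
static inline uint64_t elapsed_ns(uint64_t now, uint64_t then)
{
	if (ts_before(now, then))
		return 0;
	return now - then;
}
```

In a BPF scheduler, @now would typically be the value returned by scx_bpf_now() and @then a timestamp stored earlier, possibly by another CPU. Clamping to 0 is one design choice; a scheduler could equally skip the accounting step for that sample.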
An rq clock becomes valid when it is updated using update_rq_clock()
and invalidated when the rq is unlocked using rq_unpin_lock().
Let's suppose the following timeline in the scheduler core:
T1. rq_lock(rq)
T2. update_rq_clock(rq)
T3. a sched_ext BPF operation
T4. rq_unlock(rq)
T5. a sched_ext BPF operation
T6. rq_lock(rq)
T7. update_rq_clock(rq)
For [T2, T4), we consider the rq clock valid (SCX_RQ_CLK_VALID is
set), so scx_bpf_now() calls during [T2, T4) (including T3) will
return the rq clock updated at T2. During [T4, T7), when a BPF
scheduler can still call scx_bpf_now() (T5), we consider the rq clock
invalid (SCX_RQ_CLK_VALID is unset at T4). So when scx_bpf_now() is
called at T5, it returns a fresh clock value by calling
sched_clock_cpu() internally. Also, to prevent getting outdated rq clocks
from a previous scx scheduler, invalidate all the rq clocks when unloading
a BPF scheduler.
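The timeline above can be modeled in user space to make the valid and invalid windows concrete. This is only a sketch under stated assumptions: every name below (struct model_rq, model_now(), and so on) is hypothetical, the clock source is a fake counter standing in for sched_clock_cpu(), and none of it is the kernel implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* User-space model only; every name here is hypothetical. */
struct model_rq {
	uint64_t clock;     /* rq clock cached by the last update */
	bool     clk_valid; /* stands in for SCX_RQ_CLK_VALID */
};

/* Fake clock source standing in for sched_clock_cpu(): +10ns per read. */
static uint64_t fake_hw_ns;

static uint64_t model_sched_clock(void)
{
	fake_hw_ns += 10;
	return fake_hw_ns;
}

/* T2 / T7: refresh the cached rq clock and mark it valid. */
static void model_update_rq_clock(struct model_rq *rq)
{
	rq->clock = model_sched_clock();
	rq->clk_valid = true;
}

/* T4: unlocking the rq invalidates the cached clock. */
static void model_rq_unlock(struct model_rq *rq)
{
	rq->clk_valid = false;
}

/* T3 / T5: return the cached clock while valid, else a fresh reading. */
static uint64_t model_now(const struct model_rq *rq)
{
	if (rq->clk_valid)
		return rq->clock;
	return model_sched_clock();
}

/* Scheduler unload: invalidate the cached clock of every rq. */
static void model_invalidate_all(struct model_rq *rqs, int nr)
{
	for (int i = 0; i < nr; i++)
		rqs[i].clk_valid = false;
}
```

In this model, two model_now() calls between model_update_rq_clock() and model_rq_unlock() return the same value, while calls made after the unlock each take a fresh reading.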
One example of calling scx_bpf_now() when the rq clock is invalid
(like T5) is in scx_central [1]. The scx_central scheduler uses a BPF
timer for preemptive scheduling. Every msec, the timer callback checks
whether the currently running tasks have exceeded their timeslice. At the
beginning of the BPF timer callback (central_timerfn in
scx_central.bpf.c), scx_central gets the current time. When the BPF timer
callback runs, the rq clock could be invalid, as at T5. In this case,
scx_bpf_now() returns a fresh clock value rather than the old one (T2).
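The timeslice check described above might look roughly like the following user-space sketch. It is not scx_central's code: MODEL_NR_CPUS, model_started_at[], and both functions are hypothetical, and the `now` argument stands in for a value that a real BPF timer callback would read once, at its start, from scx_bpf_now().

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4 /* hypothetical CPU count for the sketch */

/* Hypothetical: when the task on each CPU started running; 0 means idle. */
static uint64_t model_started_at[MODEL_NR_CPUS];

/* True if a task that started at @started_at has used up @slice_ns by @now. */
static bool model_slice_expired(uint64_t now, uint64_t started_at,
				uint64_t slice_ns)
{
	if (!started_at)
		return false;
	/* Signed difference keeps the comparison safe across wraparound. */
	return (int64_t)(now - (started_at + slice_ns)) >= 0;
}

/* One timer tick: return a bitmask of CPUs whose task should be preempted. */
static uint32_t model_timer_tick(uint64_t now, uint64_t slice_ns)
{
	uint32_t mask = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_slice_expired(now, model_started_at[cpu], slice_ns))
			mask |= 1u << cpu;
	return mask;
}
```

Reading the time once per tick and reusing it for every CPU keeps the comparison consistent within the callback, which matters here because the timer runs outside the window in which the rq clock is valid.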
[1] https://github.com/sched-ext/scx/blob/main/scheds/c/scx_central.bpf.c
Signed-off-by: Changwoo Min <changwoo@igalia.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Andrea Righi <arighi@nvidia.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
1 parent ea9b262, commit 3a9910b
3 files changed, 101 insertions(+), 4 deletions(-)