@@ -35,7 +35,14 @@ TBD: document the exact limitations and differences.
3535
3636## threadCPUTime# prim op
3737
38- Available in the 9.2.8 RTS patch.
38+ Available in the
39+ [GHC 9.2.8 RTS patch](https://github.com/composewell/ghc/releases/tag/ghc-9.2.8-perf-counters-1-rc1).
40+
41+ Install the patched GHC using:
42+
43+ ```
44+ ghcup install -u https://github.com/composewell/ghc/releases/download/ghc-9.2.8-perf-counters-1-rc1/ghc-9.2.8.20231130-x86_64-unknown-linux.tar.xz ghc
45+ ```
3946
4047This is a very simple and easy-to-use mechanism. The RTS is modified
4148such that we record the accurate time and allocation information in a
@@ -49,11 +56,14 @@ and B in a program, diff will tell us the time spent and allocations
4956between the two points.
5057
5158We have to ensure that we are diffing the data for the same thread id at
52- both the points. See examples directory for a working example.
53-
54- The API has some measurement overhead but it is not very high. If we are
55- nesting measurements be aware that outer measurement will measure the
56- measurement overhead of the inner one.
59+ both points. See [this example program](./threadCPUTime.hs).
60+
61+ The API has some measurement overhead, but it is not very high. When
62+ measurements are nested, be aware that the outer measurement includes
63+ the measurement overhead of the inner one. If you are measuring a
64+ relatively small amount of time, subtract the overhead from the result
65+ (approximately 2 microseconds and 300 bytes of allocation; measure the
66+ exact value using an empty code block).
5767
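The snapshot-and-diff pattern described above, including the empty-block overhead baseline, can be sketched as follows. This is a minimal illustration and not the patched API: `Snapshot`, `snapshot`, `diff` and `measure` are hypothetical names, and the stock `getCPUTime` (process-wide CPU time) and `getAllocationCounter` (per-thread allocation counter) from `base` stand in for the `threadCPUTime#` prim op, whose exact signature is not shown here. With the stock functions the clock resolution is coarse, so small measurements may read as zero.

```haskell
import Control.Concurrent (ThreadId, myThreadId)
import Control.Exception (evaluate)
import Data.Int (Int64)
import System.CPUTime (getCPUTime)
import System.Mem (getAllocationCounter)

-- | Counters captured at one point in a thread (hypothetical type).
data Snapshot = Snapshot
    { snapThread :: ThreadId
    , snapCpuPs  :: Integer -- CPU time in picoseconds
    , snapAllocs :: Int64   -- allocation counter; counts DOWN as the thread allocates
    }

-- | Stand-in for the patched prim op: uses stock base functions instead.
snapshot :: IO Snapshot
snapshot = do
    tid <- myThreadId
    t <- getCPUTime
    a <- getAllocationCounter
    return (Snapshot tid t a)

-- | Cost between two snapshots: (CPU picoseconds, bytes allocated).
-- Both snapshots must come from the same thread.
diff :: Snapshot -> Snapshot -> (Integer, Int64)
diff start end
    | snapThread start /= snapThread end =
        error "diff: snapshots were taken in different threads"
    | otherwise =
        ( snapCpuPs end - snapCpuPs start
        , snapAllocs start - snapAllocs end -- counter decreases on allocation
        )

-- | Run an action between two snapshots and return its result and cost.
measure :: IO a -> IO (a, (Integer, Int64))
measure action = do
    s <- snapshot
    r <- action
    e <- snapshot
    return (r, diff s e)

main :: IO ()
main = do
    -- An empty block gives the measurement overhead itself.
    (_, (t0, a0)) <- measure (return ())
    -- The code segment of interest.
    (_, (t1, a1)) <- measure (evaluate (sum [1 .. 1000000 :: Int]))
    putStrLn $ "overhead: " ++ show t0 ++ " ps, " ++ show a0 ++ " bytes"
    putStrLn $ "net cost: " ++ show (t1 - t0) ++ " ps, "
        ++ show (a1 - a0) ++ " bytes"
```

Subtracting the baseline matters most for short segments, where the overhead is a large fraction of the reading. If `measure` calls are nested, the outer reading still contains the inner call's overhead, so it would need the baseline subtracted once per nested measurement.
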
5868This is very useful in micro-measurements and analysis of the CPU cost
5969of different segments of code in a particular Haskell thread without worrying