============================
README file for GASNet tools
============================
GASNet tools specification version: 1.21 // work-in-progress, not "closed"
Authors: Dan Bonachea <dobonachea@lbl.gov>
Paul H. Hargrove <PHHargrove@lbl.gov>
The GASNet tools are a set of communication-independent utilities that are used
to implement GASNet, and constitute a useful portability tool for GASNet
clients and even for other portable software that might not require the GASNet
communication services. The GASNet tools are available to all regular GASNet
clients, and are also available in a stripped-down "tools-only" software
distribution which is intended for bundling with third-party software that does
not require communication services. This file documents both.
============================================
Tools-Only Distribution Install Instructions
============================================
The GASNet_Tools distribution contains just the sources required to build and
use the GASNet tools, without any GASNet conduit support. It can only be used
in executables which do not link a GASNet conduit (a conduit already provides
the tools).
* Step 1: Configure
Unpack the distribution and run:
configure (options)
A few of the more important options available:
--help : display all available configure options
--prefix=/install/path : set the directory where GASNet Tools will be installed
--enable-debug : Enables hundreds of system-wide sanity checks, at a cost in performance.
Highly recommended during software development.
--disable-pthreads: Can be used to disable the thread-safe version of the tools.
* Step 2: Build
Use make to build the tools library:
make
make check (optional, but recommended - builds and runs a correctness test)
* Step 3: Install
Use make to install the tools library:
make install
* Step 4: Use the library
Once installed, client code should #include <gasnet_tools.h> from
$prefix/include and link the appropriate library in $prefix/lib.
Clients which use multiple pthreads should link the thread-safe library,
and define GASNETT_THREAD_SAFE before including gasnet_tools.h, eg:
cc -o myprogram myprogram.c -I$prefix/include -DGASNETT_THREAD_SAFE=1 \
-L$prefix/lib -lgasnet_tools-par -lpthread
Where $prefix is the prefix passed during Step 1.
Clients which never use pthreads may link the single-threaded version of the
tools using -lgasnet_tools-seq and the GASNETT_THREAD_SINGLE define.
Additionally, client code used to build shared libraries (compiled with
-fPIC, -KPIC or a similar compiler-dependent flag) should pass
-DGASNETI_FORCE_PIC=1 to ensure use of PIC-safe code in gasnet_tools.h.
On a few platforms, additional system libraries or compiler flags may be required
for gasnet_tools to work correctly. Clients seeking maximum portability
should obtain their compiler and linker flags by including the
generated Makefile fragments $prefix/include/gasnet_tools-{seq,par}.mak
in their Makefile and using the make variables those fragments define.
See the comments at the top of those files for exact usage documentation.
Alternatively, pkg-config files for the gasnet_tools-{seq,par} packages
are installed in $prefix/lib/pkgconfig and provide the same build information.
See README for pkg-config usage instructions.
===============================
GASNet Tools User Documentation
===============================
The remainder of this file documents the usage of GASNet tools, regardless of
which distribution is in use.
-------------
General Usage
-------------
* All clients of GASNet tools should #include <gasnet_tools.h> before any other header.
The only exception is source files that use both a GASNet conduit and GASNet
tools, which must include gasnetex.h before gasnet_tools.h.
* All of the supported public interfaces in GASNet tools are named using the
'gasnett_' or 'GASNETT_' prefix. Clients of GASNet tools should *ONLY* invoke
names with this prefix. Use of names with any other prefix (notably including
'gasneti_') is totally unsupported and subject to change and breakage without
notice.
* Many of the 'functions' provided by GASNet tools are actually implemented as
macros or inline functions for performance reasons. This distinction is explicitly
undocumented and open to change without notice, and may even differ across
platforms in a given release. To ensure correctness, clients should never
attempt to take the address of a GASNet tool 'function' or #undef its name.
* The GASNet tools have been ported to all the platforms listed in the main
GASNet README. They may work on others as well. Please contact us if you have
a new platform you'd like to see supported.
* For questions on using the GASNet tools, contact gasnet-users@lbl.gov.
It's especially recommended to contact us before bundling the tools in your
software package.
------
Timers
------
GASNet tools provides high-granularity, low-overhead wall-clock timers using
system-specific support, where available.
gasnett_tick_t - timer datatype representing an integer number of "ticks"
where a "tick" has a system-specific interpretation
safe to be handled using integer operations (+,-,<,>,==)
gasnett_tick_t gasnett_ticks_now() - returns the current tick count
Note that tick values are THREAD-specific, and do NOT represent a globally-synchronized timer.
In particular, tick values are very likely to have a different base value across nodes, and
might even advance at substantially different rates on different nodes.
Therefore tick values and tick intervals from different threads should never be directly compared or
arithmetically combined without first converting the relevant tick intervals to wall time intervals.
uint64_t gasnett_ticks_to_ns(gasnett_tick_t ticks) - convert ticks to nanoseconds as a uint64_t
GASNETT_TICK_MIN - a value representing the minimum value storable in a gasnett_tick_t
GASNETT_TICK_MAX - a value representing the maximum value storable in a gasnett_tick_t
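As a rough sketch of how the timer calls above fit together, the helper below times one function call and reports nanoseconds. HAVE_GASNET_TOOLS is a hypothetical flag assumed to be set by your own build; when it is absent, the block falls back to illustrative stand-ins (built on C11 timespec_get) that are not part of GASNet. The names time_call_ns and spin_work are likewise invented for this example.

```c
#ifdef HAVE_GASNET_TOOLS              /* hypothetical flag set by your own build */
#include <gasnet_tools.h>             /* included before any other header */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

#ifndef HAVE_GASNET_TOOLS
/* Stand-ins for illustration only: one "tick" is one nanosecond of C11 time. */
typedef uint64_t gasnett_tick_t;
static inline gasnett_tick_t gasnett_ticks_now(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (gasnett_tick_t)ts.tv_sec * 1000000000u + (gasnett_tick_t)ts.tv_nsec;
}
static inline uint64_t gasnett_ticks_to_ns(gasnett_tick_t ticks) { return ticks; }
#endif

/* Time one call to fn(arg) and return the elapsed wall time in nanoseconds.
 * Both tick samples are taken on the same thread, and the tick interval is
 * converted to nanoseconds before it is used for anything else. */
uint64_t time_call_ns(void (*fn)(void *), void *arg) {
    gasnett_tick_t start = gasnett_ticks_now();
    fn(arg);
    gasnett_tick_t end = gasnett_ticks_now();
    return gasnett_ticks_to_ns(end - start);
}

/* Hypothetical workload, present only so there is something to time. */
void spin_work(void *arg) {
    volatile unsigned long sum = 0;
    unsigned long n = *(unsigned long *)arg;
    for (unsigned long i = 0; i < n; i++) sum += i;
}
```

A caller would pass its own function and argument, for example time_call_ns(spin_work, &n), and compare only the resulting nanosecond values across threads or nodes.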
Environment:
For Linux on x86, x86-64 or MIC processors, the default timer is the TSC
which requires choosing a mechanism for calibration. This can be controlled
via environment variables:
* GASNET_TSC_RATE
GASNET_TSC_RATE=walltime
Measures the TSC tick rate against the OS-provided wallclock time.
This is the default.
GASNET_TSC_RATE=cpuinfo
Obtains the TSC tick rate from information in /proc/cpuinfo
This is known to be incorrect for certain recent CPU models.
GASNET_TSC_RATE=[Hz]
If given an integer in the range 1 Million to 100 Billion, this will be
taken as the TSC tick rate in Hz (cycles per second). To avoid the
ambiguity between binary (M=2^20) and decimal (M=10^6), no suffixes are
accepted.
* GASNET_TSC_RATE_TOLERANCE
This is a floating-point value (read by gasnett_getenv_dbl_withdefault())
which indicates the relative error permitted in the calibration of the
TSC. For instance the value 0.001 would permit a relative error as
large as 1 part in 1000, or 0.1%. Exceeding this level of permitted
relative error will result in a warning message.
* GASNET_TSC_RATE_HARD_TOLERANCE
This environment variable functions like GASNET_TSC_RATE_TOLERANCE, except
that exceeding this value results in a fatal error.
* GASNET_TSC_VERBOSE
This boolean setting requests console output regarding calibration activity
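The rules above can be illustrated with a small stand-alone sketch. This is not GASNet's parser: parse_rate_hz, relative_error and classify_calibration are hypothetical helpers, and treating the 1 Million and 100 Billion bounds as inclusive is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 and stores the rate if s is a plain decimal integer between
 * 1 million and 100 billion (assumed inclusive); returns 0 otherwise,
 * which includes suffixed forms such as "2400M". */
int parse_rate_hz(const char *s, uint64_t *hz_out) {
    char *end = NULL;
    if (s == NULL || *s < '0' || *s > '9') return 0;   /* no sign, no blanks */
    errno = 0;
    unsigned long long v = strtoull(s, &end, 10);
    if (errno != 0 || *end != '\0') return 0;           /* overflow or suffix */
    if (v < 1000000ULL || v > 100000000000ULL) return 0;
    *hz_out = (uint64_t)v;
    return 1;
}

/* Relative error of a measured rate against a reference rate. */
double relative_error(double measured_hz, double reference_hz) {
    double diff = measured_hz - reference_hz;
    if (diff < 0) diff = -diff;
    return diff / reference_hz;
}

/* 0 = within tolerance, 1 = warning, 2 = fatal, following the descriptions
 * of GASNET_TSC_RATE_TOLERANCE and GASNET_TSC_RATE_HARD_TOLERANCE. */
int classify_calibration(double rel_err, double soft_tol, double hard_tol) {
    if (rel_err > hard_tol) return 2;
    if (rel_err > soft_tol) return 1;
    return 0;
}
```

With a tolerance of 0.001, a measured rate of 2400500000 Hz against a reference of 2400000000 Hz has a relative error of about 0.0002 and would pass without a warning.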
-----
Sleep
-----
int gasnett_nsleep(uint64_t ns_delay) - nanosecond resolution sleep
Sleep for at least the indicated number of nanoseconds. If interrupted by a
signal, the sleep may terminate early, returning non-zero with errno set to EINTR.
If ns_delay is zero, this function returns without sleeping.
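Because the sleep may end early on a signal, callers that need the full delay can retry with the remaining time. The following is a minimal sketch of that loop, assuming the hypothetical build flag HAVE_GASNET_TOOLS; without it, the block uses illustrative stand-ins (a busy-wait that never reports EINTR) which are not part of GASNet. The name sleep_uninterrupted_ns is invented for this example.

```c
#ifdef HAVE_GASNET_TOOLS              /* hypothetical flag set by your own build */
#include <gasnet_tools.h>
#endif
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

#ifndef HAVE_GASNET_TOOLS
/* Stand-ins for illustration only: ticks are nanoseconds, sleep busy-waits. */
typedef uint64_t gasnett_tick_t;
static inline gasnett_tick_t gasnett_ticks_now(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (gasnett_tick_t)ts.tv_sec * 1000000000u + (gasnett_tick_t)ts.tv_nsec;
}
static inline uint64_t gasnett_ticks_to_ns(gasnett_tick_t ticks) { return ticks; }
static inline int gasnett_nsleep(uint64_t ns_delay) {
    gasnett_tick_t start = gasnett_ticks_now();
    while (gasnett_ticks_now() - start < ns_delay) { /* spin */ }
    return 0;
}
#endif

/* Sleep for at least ns nanoseconds, resuming after signal interruptions.
 * Returns 0 on success, or -1 if gasnett_nsleep fails for another reason. */
int sleep_uninterrupted_ns(uint64_t ns) {
    gasnett_tick_t start = gasnett_ticks_now();
    uint64_t remaining = ns;
    while (remaining > 0) {
        if (gasnett_nsleep(remaining) == 0) return 0;   /* slept the full delay */
        if (errno != EINTR) return -1;                  /* unexpected failure */
        uint64_t elapsed = gasnett_ticks_to_ns(gasnett_ticks_now() - start);
        if (elapsed >= ns) return 0;                    /* already waited long enough */
        remaining = ns - elapsed;
    }
    return 0;
}
```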
---------------
Memory barriers
---------------
Memory barriers are used to implement lock-free synchronization and data sharing across
the threads of a process.
gasnett_local_wmb:
A local memory write barrier - ensure all stores to local mem from this thread are
globally completed across this SMP before issuing any subsequent loads or stores.
(i.e. all loads issued from any CPU subsequent to this call
returning will see the new value for any previously issued
stores from this proc, and any subsequent stores from this CPU
are guaranteed to become globally visible after all previously issued
stores from this CPU)
This must also include whatever is needed to prevent the compiler from reordering
loads and stores across this point.
gasnett_local_rmb:
A local memory read barrier - ensure all subsequent loads from local mem from this thread
will observe previously issued stores from any CPU which have globally completed.
For instance, on the Alpha this ensures
that queued cache invalidations are processed and on the PPC this discards any loads
that were executed speculatively.
This must also include whatever is needed to prevent the compiler from reordering
loads and stores across this point.
gasnett_local_mb:
A "full" local memory barrier. This is equivalent to both a wmb() and rmb().
All outstanding loads and stores must be completed before any subsequent ones
may begin.
gasnett_weak_wmb:
gasnett_weak_rmb:
gasnett_weak_mb:
These are equivalent to the corresponding gasnett_local_* except that in a build
without threads these compile away to nothing.
gasnett_compiler_fence:
A barrier to compiler optimizations that would reorder any memory references across
this point in the code.
Note that for all of the memory barriers, we require only that a given architecture's
"normal" loads and stores are ordered as required. "Extended" instructions such as
MMX, SSE, SSE2, Altivec and vector ISAs on various other machines often bypass some
or all of the machine's memory hierarchy and therefore may not be ordered by the same
instructions. Authors of MMX-based memcpy and similar code must therefore take care
to add appropriate flushes to their code.
For more info on memory barriers: http://gee.cs.oswego.edu/dl/jmm/cookbook.html
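A common use of the write and read barriers is handing ordinary data from one thread to another through a flag. The sketch below assumes the barriers are invoked with empty parentheses and that HAVE_GASNET_TOOLS is a hypothetical flag set by your own build; otherwise it substitutes C11 fences as stand-ins. mailbox_t, mailbox_post and mailbox_poll are names invented for this example.

```c
#ifdef HAVE_GASNET_TOOLS              /* hypothetical flag set by your own build */
#include <gasnet_tools.h>
#endif
#include <assert.h>

#ifndef HAVE_GASNET_TOOLS
/* Stand-ins for illustration only, built on C11 fences. */
#include <stdatomic.h>
#define gasnett_local_wmb() atomic_thread_fence(memory_order_release)
#define gasnett_local_rmb() atomic_thread_fence(memory_order_acquire)
#endif

typedef struct {
    int payload;          /* ordinary data being handed over */
    volatile int ready;   /* set to 1 once payload is valid */
} mailbox_t;

/* Producer: store the data, then the flag, with a write barrier between
 * them so the flag cannot become visible before the data. */
void mailbox_post(mailbox_t *mb, int value) {
    mb->payload = value;
    gasnett_local_wmb();
    mb->ready = 1;
}

/* Consumer: check the flag, then read the data, with a read barrier between
 * them so the data load cannot be satisfied before the flag load.
 * Returns 1 and fills *value_out if a value was available, else 0. */
int mailbox_poll(mailbox_t *mb, int *value_out) {
    if (!mb->ready) return 0;
    gasnett_local_rmb();
    *value_out = mb->payload;
    return 1;
}
```

The barrier pairing is the point of the sketch: a wmb on the producer side is only useful if the consumer issues the matching rmb.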
-----------------
Atomic operations
-----------------
GASNet tools provides portable atomic memory operations for efficient inter-thread coordination.
Note the default atomic operations exposed by GASNet tools only expand to architecturally
atomic instructions in GASNETT_THREAD_SAFE mode. In single-threaded mode, they all expand to
appropriate regular (non-atomic) operations, which are often more efficient
than their atomic equivalents and should be indistinguishable in behavior for
programs with no concurrency.
The default atomics exposed by GASNet tools are *not* guaranteed to be atomic with respect
to signal handlers, and therefore should not be used for synchronizing with signal handlers.
If you need signal-safe atomics or atomic memory access in single-threaded codes, see
the section on strong atomics below.
* gasnett_atomic_t
This interface provides a special datatype (gasnett_atomic_t) representing an atomically
updated unsigned integer value, and a set of atomic operations on it.
Atomicity is guaranteed only if ALL accesses to the gasnett_atomic_t data happen
through the provided operations (i.e. it is an error to directly access the
contents of a gasnett_atomic_t), and if the gasnett_atomic_t data is only
addressable by the current process (e.g. not in a System V shared memory segment).
It is also an error to access an uninitialized gasnett_atomic_t with any operation
other than gasnett_atomic_set().
We define an unsigned type (gasnett_atomic_val_t) and a signed type
(gasnett_atomic_sval_t) and provide the following operations on all platforms:
gasnett_atomic_init(gasnett_atomic_val_t v)
Static initializer (macro) for a gasnett_atomic_t to value v.
void gasnett_atomic_set(gasnett_atomic_t *p,
gasnett_atomic_val_t v,
int flags);
Atomically sets *p to value v.
gasnett_atomic_val_t gasnett_atomic_read(gasnett_atomic_t *p, int flags);
Atomically read and return the value of *p.
void gasnett_atomic_increment(gasnett_atomic_t *p, int flags);
Atomically increment *p (no return value).
void gasnett_atomic_decrement(gasnett_atomic_t *p, int flags);
Atomically decrement *p (no return value).
int gasnett_atomic_decrement_and_test(gasnett_atomic_t *p, int flags);
Atomically decrement *p, return non-zero iff the new value is 0.
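As an illustration of the always-available operations, the sketch below keeps a count of outstanding tasks and detects the last completion with decrement-and-test. HAVE_GASNET_TOOLS is a hypothetical flag set by your own build; without it, the block uses non-atomic stand-ins that mirror the documented single-threaded behaviour. The task_* names are invented for this example, and a flags value of 0 requests no fences.

```c
#ifdef HAVE_GASNET_TOOLS              /* hypothetical flag set by your own build */
#include <gasnet_tools.h>
#endif
#include <assert.h>

#ifndef HAVE_GASNET_TOOLS
/* Non-atomic stand-ins for illustration only. */
typedef unsigned int gasnett_atomic_val_t;
typedef struct { gasnett_atomic_val_t v; } gasnett_atomic_t;
#define gasnett_atomic_init(v) { (v) }
static inline gasnett_atomic_val_t gasnett_atomic_read(gasnett_atomic_t *p, int flags) {
    (void)flags; return p->v;
}
static inline void gasnett_atomic_increment(gasnett_atomic_t *p, int flags) {
    (void)flags; p->v++;
}
static inline int gasnett_atomic_decrement_and_test(gasnett_atomic_t *p, int flags) {
    (void)flags; return --p->v == 0;
}
#endif

/* Statically initialized: three tasks are outstanding at start-up. */
static gasnett_atomic_t pending = gasnett_atomic_init(3);

/* Register one more outstanding task. */
void task_added(void) { gasnett_atomic_increment(&pending, 0); }

/* Called once per finished task; returns non-zero only for the final one. */
int task_done(void) { return gasnett_atomic_decrement_and_test(&pending, 0); }

/* Snapshot of the number of outstanding tasks. */
gasnett_atomic_val_t tasks_pending(void) { return gasnett_atomic_read(&pending, 0); }
```

If the last finisher must also observe data written by the other tasks, the flags argument would need fence flags rather than 0.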
* Semi-portable atomic operations
The following two groups of useful atomic operations are available on most
platforms, but not all. Preprocessor definitions indicate what is available.
+ Group 1: add and subtract
gasnett_atomic_val_t gasnett_atomic_add(gasnett_atomic_t *p,
gasnett_atomic_val_t op,
int flags);
gasnett_atomic_val_t gasnett_atomic_subtract(gasnett_atomic_t *p,
gasnett_atomic_val_t op,
int flags);
These implement atomic (unsigned) addition and subtraction.
If the result would lie outside the range of gasnett_atomic_val_t,
then the excess high-order bits of the exact result are truncated.
Both return the value after the addition or subtraction.
GASNETT_HAVE_ATOMIC_ADD_SUB will be defined to 1 when these operations are available.
They are always either both available, or neither is available.
+ Group 2: conditional and unconditional swap
int gasnett_atomic_compare_and_swap(gasnett_atomic_t *p,
gasnett_atomic_val_t oldval,
gasnett_atomic_val_t newval,
int flags);
This operation is the atomic equivalent of:
if (*p == oldval) {
*p = newval;
return NONZERO;
} else {
return 0;
}
gasnett_atomic_val_t gasnett_atomic_swap(gasnett_atomic_t *p,
gasnett_atomic_val_t newval,
int flags);
This operation is the atomic equivalent of:
gasnett_atomic_val_t oldval = *p;
*p = newval;
return oldval;
GASNETT_HAVE_ATOMIC_CAS will be defined to 1 when these operations are available.
They are always either both available, or neither is available.
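Since these operations are optional, client code can guard their use with the availability defines. The sketch below shows a compare-and-swap retry loop and a use of the post-operation return value of add. HAVE_GASNET_TOOLS is a hypothetical flag set by your own build; the fallback definitions are non-atomic stand-ins, and atomic_max and reserve_slots are names invented for this example.

```c
#ifdef HAVE_GASNET_TOOLS              /* hypothetical flag set by your own build */
#include <gasnet_tools.h>
#endif
#include <assert.h>

#ifndef HAVE_GASNET_TOOLS
/* Non-atomic stand-ins for illustration only. */
typedef unsigned int gasnett_atomic_val_t;
typedef struct { gasnett_atomic_val_t v; } gasnett_atomic_t;
#define GASNETT_HAVE_ATOMIC_ADD_SUB 1
#define GASNETT_HAVE_ATOMIC_CAS     1
static inline void gasnett_atomic_set(gasnett_atomic_t *p, gasnett_atomic_val_t v, int flags) {
    (void)flags; p->v = v;
}
static inline gasnett_atomic_val_t gasnett_atomic_read(gasnett_atomic_t *p, int flags) {
    (void)flags; return p->v;
}
static inline gasnett_atomic_val_t gasnett_atomic_add(gasnett_atomic_t *p,
                                                      gasnett_atomic_val_t op, int flags) {
    (void)flags; p->v += op; return p->v;
}
static inline int gasnett_atomic_compare_and_swap(gasnett_atomic_t *p,
                                                  gasnett_atomic_val_t oldval,
                                                  gasnett_atomic_val_t newval, int flags) {
    (void)flags;
    if (p->v != oldval) return 0;
    p->v = newval;
    return 1;
}
#endif

#if GASNETT_HAVE_ATOMIC_CAS
/* Raise *p to at least candidate; returns the resulting value. */
gasnett_atomic_val_t atomic_max(gasnett_atomic_t *p, gasnett_atomic_val_t candidate) {
    for (;;) {
        gasnett_atomic_val_t cur = gasnett_atomic_read(p, 0);
        if (cur >= candidate) return cur;
        if (gasnett_atomic_compare_and_swap(p, cur, candidate, 0)) return candidate;
        /* *p changed between the read and the swap: retry with a fresh value. */
    }
}
#endif

#if GASNETT_HAVE_ATOMIC_ADD_SUB
/* Reserve n consecutive slots and return the index of the first one.
 * add returns the value after the addition, so subtract n to get the start. */
gasnett_atomic_val_t reserve_slots(gasnett_atomic_t *next, gasnett_atomic_val_t n) {
    return gasnett_atomic_add(next, n, 0) - n;
}
#endif
```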
* Range of atomic type
Internally a gasnett_atomic_t is an unsigned type of at least 24 bits. No special
action is needed to store signed values via gasnett_atomic_set(), however because
the type may use less than a full word, gasnett_atomic_signed() is provided to
perform any required sign extension if a value read from a gasnett_atomic_t is
to be used as a signed type.
gasnett_atomic_signed(v) Converts a gasnett_atomic_val_t returned by
gasnett_atomic_{read,add,subtract} to a signed
gasnett_atomic_sval_t.
GASNETT_ATOMIC_MAX The largest representable unsigned value
(the smallest representable unsigned value is always 0).
GASNETT_ATOMIC_SIGNED_MIN The smallest (most negative) representable signed value.
GASNETT_ATOMIC_SIGNED_MAX The largest (most positive) representable signed value.
The atomic type is guaranteed to wrap around at its minimum and maximum values in
the normal manner expected of two's-complement integers. This includes the 'oldval'
and 'newval' arguments to gasnett_atomic_compare_and_swap(), and the 'v' arguments
to gasnett_atomic_init() and gasnett_atomic_set() which are wrapped (not clipped)
to the proper range prior to assignment (for 'newval' and 'v') or comparison (for
'oldval').
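To make the wrap-around and sign-extension behaviour concrete, here is a stand-alone sketch of what sign extension means for a value narrower than a full word. The helper sign_extend is hypothetical and the 24-bit width is only an assumed example; real code should call gasnett_atomic_signed(), which knows the actual width on the platform.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low `bits` bits of v as a two's-complement signed value.
 * `bits` must be between 1 and 32. Hypothetical helper, not part of GASNet. */
int32_t sign_extend(uint32_t v, unsigned bits) {
    uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    uint32_t sign = 1u << (bits - 1);      /* position of the sign bit */
    v &= mask;                             /* wrap (not clip) into the range */
    /* Flipping the sign bit and subtracting its weight maps the upper half
     * of the unsigned range onto the negative numbers. */
    return (int32_t)((int64_t)(v ^ sign) - (int64_t)sign);
}
```

For a 24-bit width, the stored pattern 0xFFFFFF reads back as -1, and 0x800000 as the most negative value, -8388608.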
* Memory fence properties of atomic operations
NOTE: Atomic operations have no default memory fence properties, as this
varies by platform. Every atomic operation except _init() includes a 'flags'
argument to indicate the caller's minimum fence requirements.
Depending on the platform, the implementation may use fences stronger than
those requested, but never weaker.
Most cases where atomics are used to implement thread synchronization (eg where
the atomic operation indicates the availability or consumption of other data)
will need to include some fences to ensure consistency of other data (this includes
both non-atomic data, and other atomic variables).
Specifying the necessary fence properties
as arguments to the atomic operation helps to reduce duplication of fences on some
platforms (relative to issuing explicit fences before/after the atomic op), because it
allows the data fence to be combined with whatever fences are used to implement the
atomic operation.
The following fence flags are recognized and may be OR'd together for the flags argument of any
atomic operation:
GASNETT_ATOMIC_NONE - no fence (equivalent to passing 0)
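GASNETT_ATOMIC_RMB_PRE - enforce a read fence before the atomic operation
GASNETT_ATOMIC_WMB_PRE - enforce a write fence before the atomic operation
GASNETT_ATOMIC_MB_PRE - enforce a full fence before the atomic operation
GASNETT_ATOMIC_RMB_POST - enforce a read fence after the atomic operation
GASNETT_ATOMIC_WMB_POST - enforce a write fence after the atomic operation
GASNETT_ATOMIC_MB_POST - enforce a full fence after the atomic operation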
GASNETT_ATOMIC_RMB_POST_IF_TRUE
GASNETT_ATOMIC_RMB_POST_IF_FALSE
- These enforce a read fence after a boolean atomic operation that succeeds (true) or
fails (false).
- The boolean atomic operations are compare-and-swap and decrement-and-test.
Convenience names for specifying acquire/release semantics in critical sections built from atomics:
GASNETT_ATOMIC_REL equivalent to: GASNETT_ATOMIC_WMB_PRE
GASNETT_ATOMIC_ACQ equivalent to: GASNETT_ATOMIC_RMB_POST
GASNETT_ATOMIC_ACQ_IF_TRUE equivalent to: GASNETT_ATOMIC_RMB_POST_IF_TRUE
GASNETT_ATOMIC_ACQ_IF_FALSE equivalent to: GASNETT_ATOMIC_RMB_POST_IF_FALSE
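The acquire/release convenience names can be sketched with a simple try-lock built from compare-and-swap. HAVE_GASNET_TOOLS is a hypothetical flag set by your own build; without it the block uses non-atomic stand-ins, and the flag values defined there are placeholders rather than GASNet's real constants. lock_try and lock_release are names invented for this example.

```c
#ifdef HAVE_GASNET_TOOLS              /* hypothetical flag set by your own build */
#include <gasnet_tools.h>
#endif
#include <assert.h>

#ifndef HAVE_GASNET_TOOLS
/* Non-atomic stand-ins for illustration only; flag values are placeholders. */
typedef unsigned int gasnett_atomic_val_t;
typedef struct { gasnett_atomic_val_t v; } gasnett_atomic_t;
#define gasnett_atomic_init(v)      { (v) }
#define GASNETT_HAVE_ATOMIC_CAS     1
#define GASNETT_ATOMIC_REL          0x1
#define GASNETT_ATOMIC_ACQ_IF_TRUE  0x2
static inline void gasnett_atomic_set(gasnett_atomic_t *p, gasnett_atomic_val_t v, int flags) {
    (void)flags; p->v = v;
}
static inline int gasnett_atomic_compare_and_swap(gasnett_atomic_t *p,
                                                  gasnett_atomic_val_t oldval,
                                                  gasnett_atomic_val_t newval, int flags) {
    (void)flags;
    if (p->v != oldval) return 0;
    p->v = newval;
    return 1;
}
#endif

#if GASNETT_HAVE_ATOMIC_CAS
/* Try to take the lock (0 = free, 1 = held). The read fence is requested
 * only on success, which is the only case that enters the critical section. */
int lock_try(gasnett_atomic_t *lock) {
    return gasnett_atomic_compare_and_swap(lock, 0, 1, GASNETT_ATOMIC_ACQ_IF_TRUE);
}

/* Release the lock. The write fence before the store makes writes done in
 * the critical section visible before the lock appears free. */
void lock_release(gasnett_atomic_t *lock) {
    gasnett_atomic_set(lock, 0, GASNETT_ATOMIC_REL);
}
#endif
```

Passing the fences as flags, rather than issuing separate barriers around the calls, lets the implementation merge them with whatever the atomic operation already needs.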
* Storage of atomic type
Internally an atomic type may use storage significantly larger than the number
of significant bits. This additional space may be needed, for instance, to
meet platform-specific alignment constraints, or to hold a mutex on platforms
lacking any other means of ensuring atomicity.
* Fixed-width atomic types
The following fixed-width (32- and 64-bit) types/operations are available
on all platforms. These are guaranteed to consume exactly the "natural"
storage, without padding or any extra alignment. However, one or both may
use mutexes or lack signal-safety, even where gasnett_atomic_t does not.
Additionally, unlike gasnett_atomic_t, the same set of operations is present
on all platforms, even if that requires a mutex-based approach to support
the full range of operations. Therefore, there are no GASNETT_HAVE_ defines
for the fixed-width atomic operations.
gasnett_atomic32_t
gasnett_atomic64_t
Typedef
gasnett_atomic32_init(uint32_t v)
gasnett_atomic64_init(uint64_t v)
Static initializer (macro).
void gasnett_atomic32_set(gasnett_atomic32_t *p, uint32_t v, int flags);
void gasnett_atomic64_set(gasnett_atomic64_t *p, uint64_t v, int flags);
Atomically set *p to value v.
uint32_t gasnett_atomic32_read(gasnett_atomic32_t *p, int flags);
uint64_t gasnett_atomic64_read(gasnett_atomic64_t *p, int flags);
Atomically read and return the value of *p.
int gasnett_atomic32_compare_and_swap(gasnett_atomic32_t *p, uint32_t oldval,
uint32_t newval, int flags);
int gasnett_atomic64_compare_and_swap(gasnett_atomic64_t *p, uint64_t oldval,
uint64_t newval, int flags);
Atomic compare-and-swap of *p from oldval to newval.
uint32_t gasnett_atomic32_swap(gasnett_atomic32_t *p, uint32_t v, int flags);
uint64_t gasnett_atomic64_swap(gasnett_atomic64_t *p, uint64_t v, int flags);
Atomically set *p to value v, returning the previous value.
uint32_t gasnett_atomic32_add(gasnett_atomic32_t *p, uint32_t v, int flags);
uint64_t gasnett_atomic64_add(gasnett_atomic64_t *p, uint64_t v, int flags);
Atomically add value v to *p, returning the new value.
uint32_t gasnett_atomic32_subtract(gasnett_atomic32_t *p, uint32_t v,
int flags);
uint64_t gasnett_atomic64_subtract(gasnett_atomic64_t *p, uint64_t v,
int flags);
Atomically subtract value v from *p, returning the new value.
void gasnett_atomic32_increment(gasnett_atomic32_t *p, int flags);
void gasnett_atomic64_increment(gasnett_atomic64_t *p, int flags);
Atomically add 1 to *p.
void gasnett_atomic32_decrement(gasnett_atomic32_t *p, int flags);
void gasnett_atomic64_decrement(gasnett_atomic64_t *p, int flags);
Atomically subtract 1 from *p.
int gasnett_atomic32_decrement_and_test(gasnett_atomic32_t *p, int flags);
int gasnett_atomic64_decrement_and_test(gasnett_atomic64_t *p, int flags);
Atomically subtract 1 from *p, returning non-zero if *p becomes zero.
While some platforms do not enforce the same alignment constraints for all
types of a given width, the implementation of the fixed-width atomics
guarantees correct atomic operations on storage declared as any of the 4-byte
and 8-byte integer or floating point scalar types on a given platform. So,
assuming 4-byte float and 8-byte double, fixed-width atomic operations via
pointers generated by the following casts are correct:
(int32_t *) or (float *) cast to (gasnett_atomic32_t *)
(int64_t *) or (double *) cast to (gasnett_atomic64_t *)
where any signed or unsigned integral type of the same width may be used in
place of int32_t and int64_t. However, the fixed-width atomic operations do
NOT guarantee correct operation on arbitrarily aligned blocks of data. For
instance the following two examples are NOT permitted
EX1:
struct { int16_t a, b; } X;
gasnett_atomic32_set((gasnett_atomic32_t *)&X, 0, 0);
EX2:
struct { float Real, Img; } Y;
gasnett_atomic64_set((gasnett_atomic64_t *)&Y, 0, 0);
because some platforms might align these structures less strictly than the
integral and floating point types of equal size. However, since in C
unions are always aligned by their most-restrictive constituent type,
the following two examples ARE legal:
EX3:
union { float f;
struct { int16_t a, b; } u16s;
} X2;
gasnett_atomic32_set((gasnett_atomic32_t *)&X2, 0, 0);
EX4:
union { uint64_t u64;
struct { float Real, Img; } cplx32;
} Y2;
gasnett_atomic64_set((gasnett_atomic64_t *)&Y2, 0, 0);
Additionally, casts from (gasnett_atomic32_t *) or (gasnett_atomic64_t *) to
pointers to other types are NOT safe in general, because the alignment of
the atomic type might be less than required for the other type. When this
under-alignment occurs such casts could result in a fatal SIGBUS when the
pointer is dereferenced. To avoid this problem apply this rule-of-thumb:
Storage to be accessed via both a pointer to a fixed-width atomic
type and another pointer type must be declared as the non-atomic type,
or as a union of both types.
This will ensure the storage is suitably aligned for accesses via pointers
to both the atomic and non-atomic types (assuming, of course, that the
non-atomic type is one allowed by the previous paragraph.) Note that as a
general rule, use of union types may be preferable because they avoid
running afoul of pointer-aliasing rules in C/C++ that might otherwise
lead to incorrect behavior at high optimization levels.
It is not safe to concurrently access the same memory location as both
an atomic type and a non-atomic type. For the purpose of this distinction
only references using the gasnett_atomic32_ or gasnett_atomic64_ prefixes
are atomic. All non-GASNet references and any other GASNet references are
non-atomic (including all gets and puts, Active Message calls, etc.).
The client code is responsible for providing sufficient synchronization
(such as barriers or mutexes) to prevent the concurrent use of any given
memory location as both atomic and non-atomic. Use of non-atomic "flag"
variables is not sufficient synchronization (even when volatile) in the
presence of certain compiler optimizations. Additionally, use of an
atomic variable as a "flag" is only sufficient when memory fences are
used correctly. When practical, one possible mechanism is to have the
client code separate the atomic and non-atomic treatment of memory into
distinct phases of the computation, separated by a barrier.
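The union rule of thumb can be illustrated with an atomically updated float. This is a sketch under stated assumptions: float is 4 bytes, HAVE_GASNET_TOOLS is a hypothetical flag set by your own build, and the fallback definitions are non-atomic stand-ins. The atomic_float_* names are invented for this example, and all accesses go through the atomic view only.

```c
#ifdef HAVE_GASNET_TOOLS              /* hypothetical flag set by your own build */
#include <gasnet_tools.h>
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>

#ifndef HAVE_GASNET_TOOLS
/* Non-atomic stand-ins for illustration only. */
typedef struct { uint32_t v; } gasnett_atomic32_t;
static inline void gasnett_atomic32_set(gasnett_atomic32_t *p, uint32_t v, int flags) {
    (void)flags; p->v = v;
}
static inline uint32_t gasnett_atomic32_read(gasnett_atomic32_t *p, int flags) {
    (void)flags; return p->v;
}
static inline int gasnett_atomic32_compare_and_swap(gasnett_atomic32_t *p, uint32_t oldval,
                                                    uint32_t newval, int flags) {
    (void)flags;
    if (p->v != oldval) return 0;
    p->v = newval;
    return 1;
}
#endif

/* Compile-time check of the 4-byte float assumption. */
typedef char float_is_4_bytes[(sizeof(float) == 4) ? 1 : -1];

/* The union is aligned for both members, as the rule of thumb requires. */
typedef union {
    float              f;   /* non-atomic view, for phases without concurrency */
    gasnett_atomic32_t a;   /* atomic view */
} atomic_float_t;

static inline uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static inline float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

void atomic_float_store(atomic_float_t *x, float v) {
    gasnett_atomic32_set(&x->a, float_bits(v), 0);
}

float atomic_float_load(atomic_float_t *x) {
    return bits_float(gasnett_atomic32_read(&x->a, 0));
}

/* Add delta with a compare-and-swap retry loop; returns the new value. */
float atomic_float_add(atomic_float_t *x, float delta) {
    for (;;) {
        uint32_t old_bits = gasnett_atomic32_read(&x->a, 0);
        float    new_val  = bits_float(old_bits) + delta;
        if (gasnett_atomic32_compare_and_swap(&x->a, old_bits, float_bits(new_val), 0))
            return new_val;
    }
}
```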
* Strong atomics
GASNet tools offers a "strong" atomics interface, which expands to the strongest available
atomic operations on a given platform, even in single-threaded codes. The syntax and semantics
for these operations are identical to those described above, with all name prefixes changed as follows:
gasnett_atomic_X to gasnett_strongatomic_X
gasnett_atomic32_X to gasnett_strongatomic32_X
gasnett_atomic64_X to gasnett_strongatomic64_X
On most, but not all, platforms, operations on gasnett_strongatomic_t are signal safe.
On the few platforms where this is not the case GASNETT_STRONGATOMIC_NOT_SIGNALSAFE
will be defined to 1.
Similarly, GASNETT_STRONGATOMIC32_NOT_SIGNALSAFE and GASNETT_STRONGATOMIC64_NOT_SIGNALSAFE
are defined to 1 IFF the implementation of the fixed-width atomics is not signal-safe.
Note that these two are set independently.
GASNETT_HAVE_STRONGATOMIC_ADD_SUB will be defined to 1 when gasnett_strongatomic_add()
and gasnett_strongatomic_subtract() are available. GASNETT_HAVE_STRONGATOMIC_CAS will
be defined to 1 when gasnett_strongatomic_compare_and_swap() and gasnett_strongatomic_swap()
are available. As with the non-strong case, these operations are always available for
the fixed-width types and thus there are no GASNETT_HAVE_ defines for the fixed-width
strong atomic operations.
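One use of the strong atomics is counting events from a signal handler. The sketch below assumes the strong names follow the prefix rule given above (gasnett_strongatomic_init, _increment, _read) and that HAVE_GASNET_TOOLS is a hypothetical flag set by your own build; the fallback is a stand-in built on sig_atomic_t. The function names are invented for this example.

```c
#ifdef HAVE_GASNET_TOOLS              /* hypothetical flag set by your own build */
#include <gasnet_tools.h>
#endif
#include <assert.h>
#include <signal.h>

#ifndef HAVE_GASNET_TOOLS
/* Stand-ins for illustration only, built on sig_atomic_t. */
typedef struct { volatile sig_atomic_t v; } gasnett_strongatomic_t;
#define gasnett_strongatomic_init(v) { (v) }
static inline void gasnett_strongatomic_increment(gasnett_strongatomic_t *p, int flags) {
    (void)flags; p->v = p->v + 1;
}
static inline unsigned long gasnett_strongatomic_read(gasnett_strongatomic_t *p, int flags) {
    (void)flags; return (unsigned long)p->v;
}
#endif

static gasnett_strongatomic_t signals_seen = gasnett_strongatomic_init(0);

/* Signal handler: bump the counter, then re-arm, because some systems
 * reset the disposition to the default when the handler is entered. */
void count_signal(int signo) {
    gasnett_strongatomic_increment(&signals_seen, 0);
    signal(signo, count_signal);
}

/* Install the counting handler. Returns -1 if the platform reports that
 * strong atomics are not signal-safe, or if installation fails. */
int install_signal_counter(int signo) {
#if GASNETT_STRONGATOMIC_NOT_SIGNALSAFE
    (void)signo;
    return -1;
#else
    return (signal(signo, count_signal) == SIG_ERR) ? -1 : 0;
#endif
}

/* Number of signals counted so far. */
unsigned long signal_count(void) {
    return (unsigned long)gasnett_strongatomic_read(&signals_seen, 0);
}
```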
-------------------------
Portable platform defines
-------------------------
Most systems have predefined preprocessor tokens for identifying the compiler, OS and architecture
in use. However, there is no uniform naming convention for such platform features, and often
a given feature (such as CPU family) will be indicated using a different name under
different combinations of OS and compiler.
GASNet tools provides a uniform naming scheme for detecting these preprocessor-provided
platform features, so that #if tests can be written concisely with expressions like:
#if PLATFORM_COMPILER_GNU && PLATFORM_OS_SOLARIS && PLATFORM_ARCH_X86
See the comments in gasnet_portable_platform.h for the details of the provided defines.
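A short sketch of such a test in context follows. It uses only the three PLATFORM_ names shown above; HAVE_GASNET_TOOLS is a hypothetical flag set by your own build, and build_target is a name invented for this example. Without the header the PLATFORM_ names are undefined, which the preprocessor treats as 0 in #if expressions.

```c
#ifdef HAVE_GASNET_TOOLS              /* hypothetical flag set by your own build */
#include <gasnet_tools.h>
#endif
#include <assert.h>
#include <string.h>

/* Describe the build target using the uniform platform defines. */
const char *build_target(void) {
#if PLATFORM_COMPILER_GNU && PLATFORM_OS_SOLARIS && PLATFORM_ARCH_X86
    return "GNU compiler, Solaris, x86";
#elif PLATFORM_COMPILER_GNU
    return "GNU compiler, other OS or architecture";
#else
    return "unrecognized or header not included";
#endif
}
```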
----------------------------------
Portable fixed-width integer types
----------------------------------
inttypes.h is part of the POSIX and C99 specs, but in practice support for it
varies wildly across systems. GASNet tools portably provides the fixed-bit-width
integral types via the following typedefs:
int8_t, uint8_t signed/unsigned 8-bit integral types
int16_t, uint16_t signed/unsigned 16-bit integral types
int32_t, uint32_t signed/unsigned 32-bit integral types
int64_t, uint64_t signed/unsigned 64-bit integral types
intptr_t, uintptr_t signed/unsigned types big enough to hold any pointer offset
--------------------
Compiler annotations
--------------------
Many compilers have pragmas, attributes or other compiler-specific mechanisms for annotating
declarations and code in useful ways which are not standardized by the C specification.
The following macros expand to appropriate annotations when available, or to safe, unannotated
versions when the given annotation is unavailable. See also "Feature control", below.
GASNETT_INLINE(fnname)
definition
Most forceful inlining demand available.
Might generate errors in cases where inlining is semantically impossible
(eg recursive functions, varargs fns)
fnname should be the name of the function, and definition should be the actual
definition of the function (declaration and body)
GASNETT_NEVER_INLINE(fnname,definition)
Most forceful demand available to disable inlining for function.
GASNETT_RESTRICT
The C99 'restrict' keyword, if supported by the compiler, or empty otherwise.
GASNETT_FORMAT_PRINTF(fnname,fmtarg,firstvararg,declarator)
GASNETT_FORMAT_PRINTF_FUNCPTR(fnname,fmtarg,firstvararg,declarator)
Annotate function fnname (declared by declarator) as a printf-like function,
whose arguments should be checked for type compatibility with a format string whenever possible.
fmtarg is the 1-based index of the argument providing the format character string,
firstvararg is the 1-based index of the first ... argument which corresponds to
arguments to the format string.
declaration GASNETT_NORETURN;
GASNETT_NORETURNP(fnname)
Declare the given function as one that will never return (ie program will exit before return)
GASNETT_MALLOC
declarator
GASNETT_MALLOCP(fnname)
Declare the given function as one that returns new, unaliased memory (as with malloc)
GASNETT_PURE
declarator
GASNETT_PUREP(fnname)
Declare as pure function: one with no effects except the return value, whose
return value depends only on the parameters and/or global variables.
A pure function is prohibited from performing volatile accesses, compiler fences, I/O,
changing any global variables (including statically scoped ones), or
calling any functions that do so.
GASNETT_CONST
declarator
GASNETT_CONSTP(fnname)
Declare as const function: a more restricted form of pure function, with all the
same restrictions, except additionally the return value must NOT
depend on global variables or anything pointed to by the arguments
GASNETT_HOT
declarator
Declare a function as frequently called.
Compilers may do many different things with this information.
GASNETT_COLD
declarator
Declare a function as infrequently called.
Compilers may do many different things with this information.
GASNETT_DEPRECATED
declarator
Declare a function as deprecated (subject to future removal).
Attempts to generate a warning if the function is called.
GASNETT_WARN_UNUSED_RESULT
declarator
Attempt to generate a warning if the return value of the declared function is ignored by caller.
GASNETT_USED
declarator
Declare the given function as one that must not be omitted, even if the compiler
believes the function cannot ever be called.
GASNETT_PREDICT_TRUE(expr)
GASNETT_PREDICT_FALSE(expr)
These macros yield a non-zero value if and only if expr has non-zero value.
Additionally, they pass a hint to the compiler that one expects the value to
be non-zero or zero, respectively. Use them to wrap a branch-controlling
expression when you have strong reason to believe the branch will frequently
go in one direction and that the branch is a bottleneck.
The macros if_pf() and if_pt() are implemented in terms of these macros.
Examples:
do { S; } while(GASNETT_PREDICT_FALSE(expr)); // single-trip is common case
V = GASNETT_PREDICT_TRUE(expr) ? (val1) : (val2); // val1 is common case
if_pf(cond) S;
if_pt(cond) S;
Drop-in replacements for the standard C 'if' keyword with branch-prediction hints.
if_pf and if_pt behave just like 'if' except they give the C compiler a hint that
the condition is predicted to be false (if_pf) and the branch not taken,
or predicted to be true (if_pt) and the branch taken.
These are equivalent to
if(GASNETT_PREDICT_FALSE(cond)) S;
and
if(GASNETT_PREDICT_TRUE(cond)) S;
respectively.
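As an illustration of where these hints typically go, the sketch below marks a rare error check with if_pf. The fallback macro definitions (built on the GCC/Clang __builtin_expect builtin) are stand-ins assumed for this example so that it compiles without the GASNet tools header; they are not necessarily how GASNet implements the macros. sum_nonnegative is a hypothetical function.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in definitions for illustration only, used when the GASNet tools
 * header has not been included. The real macros may expand differently. */
#ifndef GASNETT_PREDICT_TRUE
  #if defined(__GNUC__)
    #define GASNETT_PREDICT_TRUE(expr)  __builtin_expect(!!(expr), 1)
    #define GASNETT_PREDICT_FALSE(expr) __builtin_expect(!!(expr), 0)
  #else
    #define GASNETT_PREDICT_TRUE(expr)  (!!(expr))
    #define GASNETT_PREDICT_FALSE(expr) (!!(expr))
  #endif
  #define if_pt(cond) if (GASNETT_PREDICT_TRUE(cond))
  #define if_pf(cond) if (GASNETT_PREDICT_FALSE(cond))
#endif

/* Hypothetical helper: sums n non-negative values, returning -1 if a
 * negative value is encountered (expected to be rare). */
long sum_nonnegative(const int *vals, size_t n) {
  long total = 0;
  size_t i;
  assert(vals != NULL || n == 0);
  for (i = 0; i < n; i++) {
    if_pf (vals[i] < 0) return -1;   /* rare error path */
    total += vals[i];
  }
  return total;
}
```

The hint only affects code layout and optimization; the result is the same as with a plain 'if'.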
gasnett_constant_p(expr)
This expands to use of __builtin_constant_p() on compilers with the necessary
support, or to the constant 0 otherwise.
gasnett_unreachable()
This annotation marks the current code location as unreachable (using compiler-specific
mechanisms), to assist optimization of surrounding code.
gasnett_assume(cond)
States that the simple expression cond is always true, as an annotation directive to guide compiler analysis.
Becomes an assertion in DEBUG mode and an analysis directive (when available) in NDEBUG mode.
This notably differs from typical assertions in that the expression must remain valid in NDEBUG mode
(because it is not preprocessed away), and furthermore may or may not be evaluated at runtime.
To ensure portability and performance, cond should NOT contain any function calls or side-effects.
WARNING: passing a cond which is ever false in NDEBUG mode could lead the compiler
to mis-optimize surrounding code in unintuitive/unexpected ways.
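The debug/NDEBUG behavior described above can be sketched roughly as follows. The macro my_assume is a hypothetical stand-in written for this example, not the GASNet definition; it assumes GCC/Clang __builtin_unreachable is the available analysis directive, and sum_words is a hypothetical caller.

```c
#include <assert.h>

/* Hypothetical stand-in for illustration; NOT the GASNet definition. */
#ifdef NDEBUG
  #if defined(__GNUC__)
    /* analysis directive: tell the compiler the false case cannot happen */
    #define my_assume(cond) ((cond) ? (void)0 : __builtin_unreachable())
  #else
    /* no directive available: cond is still compiled, but never evaluated */
    #define my_assume(cond) ((void)(0 && (cond)))
  #endif
#else
  #define my_assume(cond) assert(cond)   /* debug mode: plain assertion */
#endif

/* Hypothetical function: the caller guarantees that len is a positive
 * multiple of 4, which may help the compiler unroll or vectorize the loop. */
unsigned sum_words(const unsigned *p, unsigned len) {
  unsigned s = 0;
  unsigned i;
  my_assume(len > 0 && (len % 4) == 0);  /* no function calls, no side effects */
  for (i = 0; i < len; i++) s += p[i];
  return s;
}
```

Note that the condition uses only local values and arithmetic, in keeping with the advice to avoid function calls and side effects.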
GASNETT_FALLTHROUGH
This annotation is used to annotate a deliberate/explicit fallthrough of control from
one case statement to another, suppressing warnings about implicit fallthrough (the absence
of a break statement), such as those generated by GCC's -Wimplicit-fallthrough (also -Wextra).
Example:
switch (whatever) {
case 1: do_something(); GASNETT_FALLTHROUGH
case 2: something_else();
}
Note it should be placed after the last statement in the case that
falls-through and should NOT be followed by a semi-colon.
-----------------------------------------------------
Error-checking System Mutexes and Condition Variables
-----------------------------------------------------
GASNet tools provides convenience wrappers around the system's pthread mutexes
and condition variables. In debug mode, these wrappers add error checking
capabilities to detect common usage violations (such as attempts to recursively
acquire a mutex, or release a mutex that has not been acquired). The wrappers
also implement workarounds for known bugs in the pthread implementations of
several systems.
In non-threaded builds, these wrappers still compile and expand to
appropriate no-ops, unless compiled with -DGASNETT_USE_TRUE_MUTEXES=1
which will force gasnett_mutex_t to always use true locking (even
without -DGASNETT_THREAD_SAFE=1).
Unlike pthread_mutex_t, these locks may NEVER be obtained recursively, and
in debug builds this is detected as a usage violation. Similarly, they are
not safe to use for inter-process synchronization in shared memory segments.
* Otherwise, the following function similarly to the pthread_mutex symbols of the same name:
gasnett_mutex_t
GASNETT_MUTEX_INITIALIZER
void gasnett_mutex_init(gasnett_mutex_t *)
void gasnett_mutex_destroy(gasnett_mutex_t *)
int gasnett_mutex_destroy_ignoreerr(gasnett_mutex_t *)
mutex creation and destruction, as with pthread_mutex_t
gasnett_mutex_destroy_ignoreerr performs no error checking and silently returns any errors
(eg as may occur when attempting to destroy a locked mutex)
void gasnett_mutex_lock(gasnett_mutex_t *)
void gasnett_mutex_unlock(gasnett_mutex_t *)
lock and unlock (checks for recursive locking errors)
int gasnett_mutex_trylock(gasnett_mutex_t *)
non-blocking trylock - returns EBUSY on failure, 0 on success
* Additional mutex utilities:
void gasnett_mutex_assertlocked(gasnett_mutex_t *)
void gasnett_mutex_assertunlocked(gasnett_mutex_t *)
In debug builds, these functions respectively assert that the given mutex is
currently locked or not locked by the calling thread, generating a fatal error
if the assertion is violated. They have no effect in non-debug builds.
* The following function identically to the pthread_cond symbols of the same name:
gasnett_cond_t
GASNETT_COND_INITIALIZER
void gasnett_cond_init(gasnett_cond_t *pc)
void gasnett_cond_destroy(gasnett_cond_t *pc)
condition variable creation and destruction, as with pthread_cond_t
void gasnett_cond_signal(gasnett_cond_t *pc)
void gasnett_cond_broadcast(gasnett_cond_t *pc)
signal at least one / all current waiters on a gasnett_cond_t, while holding the associated mutex
void gasnett_cond_wait(gasnett_cond_t *pc, gasnett_mutex_t *pl)
release gasnett_mutex_t pl (which must be held) and block WITHOUT POLLING
until gasnett_cond_t pc is signalled by another thread, or until the system
decides to wake this thread for no good reason (which it may or may not do).
Upon wakeup for any reason, the mutex will be reacquired before returning.
It's an error to wait if there is only one thread, and can easily lead to
deadlock if the last thread goes to sleep. No thread may call wait unless it
can guarantee that (A) some other thread will eventually signal it to wake
up and (B) some other thread is still polling (except in tools-only mode,
where there is no polling). The system may or may not also randomly signal
threads to wake up for no good reason, so upon awaking the thread MUST
verify using its own means that the condition it was waiting for has
actually been signalled (ie that the client-level "outer" condition has
been set).
In order to prevent races leading to missed signals and deadlock, signaling
threads must always hold the associated mutex while signaling, and ensure the
outer condition is set *before* releasing the mutex. Additionally, all waiters
must check the outer condition *after* acquiring the same mutex and *before*
calling wait (which atomically releases the lock and puts the thread to sleep).
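The signaling protocol above can be sketched as follows. To keep the example compilable on its own it is written against plain pthreads; since the gasnett_cond and gasnett_mutex symbols are described as functioning like the pthread symbols of the same name, the corresponding GASNet tools code would presumably use the gasnett_-prefixed names instead. The names ready, payload, producer and wait_for_payload are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int ready = 0;     /* the client-level "outer" condition */
static int payload = 0;   /* data protected by the mutex */

/* Signaler: sets the outer condition and signals while holding the mutex. */
static void *producer(void *arg) {
  (void)arg;
  pthread_mutex_lock(&lock);
  payload = 42;
  ready = 1;                      /* set *before* releasing the mutex */
  pthread_cond_signal(&cond);
  pthread_mutex_unlock(&lock);
  return NULL;
}

/* Waiter: checks the outer condition after acquiring the mutex and before
 * waiting, and re-checks it after every wakeup (which may be spurious). */
int wait_for_payload(void) {
  pthread_t t;
  int result;
  int rc = pthread_create(&t, NULL, producer, NULL);
  assert(rc == 0);
  (void)rc;
  pthread_mutex_lock(&lock);
  while (!ready) pthread_cond_wait(&cond, &lock);
  result = payload;
  pthread_mutex_unlock(&lock);
  pthread_join(t, NULL);
  return result;
}
```

The while loop is what makes the waiter robust: if the producer runs first, the waiter never sleeps, and if the wakeup is spurious, the waiter simply goes back to sleep.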
-------------------
Reader/Writer locks
-------------------
As with the gasnett_mutex_t wrappers in the previous section, we also provide
wrappers around POSIX reader/writer locks (pthread_rwlock_t). In a nutshell,
these allow multiple threads to concurrently acquire a "read" lock (for
concurrent read-only access to the protected data structures), but provide
mutual exclusion when a thread obtains a "write" lock to update the shared data.
CAUTION: The additional opportunities for concurrency provided by reader/writer
locks come at a SIGNIFICANT cost in additional serial overhead, relative to
simple mutexes. Obtaining and releasing a read lock on an
uncontended pthread_rwlock_t is commonly 50%-300% more expensive than the
corresponding operation on a simple mutex. Also, write locks still need to enforce
mutual exclusion, thus frequent write locks can sharply degrade achieved concurrency.
Consequently, rwlocks are only expected to provide a net performance win
relative to mutexes when there is a high degree of concurrency for long-running
reader critical sections, and writers are VERY infrequent. In all other cases,
one should probably be using a mutex instead.
On systems lacking reader/writer locks (or when configured with --disable-rwlock),
these compile down to regular gasnett_mutex_t operations - with full
serialization and no read concurrency. Some implementations also have a limit
on the number of threads that can concurrently obtain a reader lock. For these
reasons, client code should be designed to remain deadlock-free when some or
all read locks are serialized, even lacking writers.
Unlike pthread_rwlock_t, these locks may NOT be obtained recursively, and
in debug builds this is detected as a usage violation. Similarly, they are
not safe to use for inter-process synchronization in shared memory segments.
* Otherwise, the following function similarly to the pthread_rwlock symbols of the same name:
gasnett_rwlock_t
GASNETT_RWLOCK_INITIALIZER
void gasnett_rwlock_init(gasnett_rwlock_t *)
void gasnett_rwlock_destroy(gasnett_rwlock_t *)
rwlock creation and destruction, as with pthread_rwlock_t
void gasnett_rwlock_rdlock(gasnett_rwlock_t *)
void gasnett_rwlock_wrlock(gasnett_rwlock_t *)
void gasnett_rwlock_unlock(gasnett_rwlock_t *)
blocking read lock, blocking write lock and unlock
POSIX errors due to reader concurrency limits are masked as blocking
int gasnett_rwlock_tryrdlock(gasnett_rwlock_t *)
int gasnett_rwlock_trywrlock(gasnett_rwlock_t *)
non-blocking trylock - returns EBUSY or EAGAIN on failure, 0 on success
* Additional rwlock utilities:
void gasnett_rwlock_assertrdlocked(gasnett_rwlock_t *)
void gasnett_rwlock_assertwrlocked(gasnett_rwlock_t *)
void gasnett_rwlock_assertlocked(gasnett_rwlock_t *)
void gasnett_rwlock_assertunlocked(gasnett_rwlock_t *)
In debug builds, these functions respectively assert that the given rwlock is
currently locked (for read, write or either) or not locked by the calling
thread, generating a fatal error if the assertion is violated. They have no effect
in non-debug builds.
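A minimal usage sketch of the read/write/try pattern follows. It is written against plain pthread_rwlock_t so that it compiles standalone; the gasnett_rwlock symbols are described above as functioning similarly to the same-named pthread symbols, so the GASNet tools version would presumably substitute the gasnett_-prefixed names. The table and its accessor functions are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose pthread_rwlock_t under strict ISO C modes */
#endif
#include <assert.h>
#include <pthread.h>

#define TABLE_SZ 4
static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;
static int table[TABLE_SZ];

/* Reader: may run concurrently with other readers. */
int table_read(int idx) {
  int v;
  assert(idx >= 0 && idx < TABLE_SZ);
  pthread_rwlock_rdlock(&table_lock);
  v = table[idx];
  pthread_rwlock_unlock(&table_lock);
  return v;
}

/* Writer: excludes all readers and other writers. */
void table_write(int idx, int v) {
  assert(idx >= 0 && idx < TABLE_SZ);
  pthread_rwlock_wrlock(&table_lock);
  table[idx] = v;
  pthread_rwlock_unlock(&table_lock);
}

/* Non-blocking reader: returns 0 and stores the value on success, or the
 * non-zero error code (eg EBUSY or EAGAIN) if the lock was unavailable. */
int table_tryread(int idx, int *out) {
  int rc;
  assert(idx >= 0 && idx < TABLE_SZ);
  rc = pthread_rwlock_tryrdlock(&table_lock);
  if (rc != 0) return rc;
  *out = table[idx];
  pthread_rwlock_unlock(&table_lock);
  return 0;
}
```

Each function acquires the lock exactly once and releases it before returning, so the code stays correct even if read locks are fully serialized.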
--------------------
Thread-specific data
--------------------
GASNet tools provides wrappers to define and access pointers to thread-specific data,
using an interface that expands to the fastest available mechanism provided by the
current platform for thread-specific data on threaded configurations
(eg __thread or pthread_getspecific()), and expands to simple dereference of
process-global storage for non-threaded configurations.
Automatically handles the hassle of pthread key creation if required.
A thread-specific data pointer (mykey) must be declared as:
GASNETT_THREADKEY_DEFINE(mykey); - must be defined in exactly one C file at global scope
GASNETT_THREADKEY_DECLARE(mykey); - optional, use in headers to reference externally-defined key
and then can be used as:
void *val = gasnett_threadkey_get(mykey);
gasnett_threadkey_set(mykey,val);
no initialization is required (happens automatically on first access).
Initialization can optionally be performed using:
gasnett_threadkey_init(mykey);
which then allows subsequent calls to:
void *val = gasnett_threadkey_get_noinit(mykey);
gasnett_threadkey_set_noinit(mykey,val);
these save a branch by avoiding the initialization check.
gasnett_threadkey_init is permitted to be called multiple times and
from multiple threads - calls after the first one will be ignored.
---------------------
Environment utilities
---------------------
The following utilities support querying the environment and manipulating the result.
Most of the query functions will report their actions to the console when the user
selects verbose reporting mode, to support self-documenting environment settings.
char *gasnett_format_number(int64_t val, char *buf, size_t bufsz, int is_mem_size);
format an integer value as a human-friendly string, with appropriate mem suffix
int64_t gasnett_parse_int(const char *str, uint64_t mem_size_multiplier);
parse an integer value back out again
if mem_size_multiplier==0, it's a unitless quantity
otherwise, it's a memory size quantity, and mem_size_multiplier provides the
default memory unit (ie 1024=1KB) if the string provides none
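To illustrate the role of mem_size_multiplier, here is a simplified sketch of this kind of parsing. parse_size_sketch is a hypothetical function written for this example and is not the GASNet implementation: it assumes decimal integers only, examines only the first letter of the suffix, and performs no error or overflow checking.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical sketch, NOT gasnett_parse_int itself. */
int64_t parse_size_sketch(const char *str, uint64_t mem_size_multiplier) {
  char *end;
  int64_t val;
  uint64_t mult;
  assert(str != NULL);
  val = (int64_t)strtoll(str, &end, 10);
  if (mem_size_multiplier == 0) return val;     /* unitless quantity */
  while (isspace((unsigned char)*end)) end++;   /* allow "64 KB" */
  switch (toupper((unsigned char)*end)) {
    case 'B': mult = 1;                 break;
    case 'K': mult = (uint64_t)1 << 10; break;
    case 'M': mult = (uint64_t)1 << 20; break;
    case 'G': mult = (uint64_t)1 << 30; break;
    case 'T': mult = (uint64_t)1 << 40; break;
    default:  mult = mem_size_multiplier; break; /* no unit given: use default */
  }
  return val * (int64_t)mult;
}
```

With a multiplier of 1024, the string "64" is read as 64KB, whereas "2MB" and "16B" carry their own unit and ignore the default.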
void gasnett_setenv(const char *key, const char *value);
void gasnett_unsetenv(const char *key);
set/unset an environment variable, for the local process ONLY
char *gasnett_getenv(const char *keyname);
raw environment query function, bypasses reporting
uses the gasnet conduit-provided global environment if available or regular getenv otherwise
legal to call before gasnet_init, but may malfunction if
the conduit has not yet established the contents of the environment
char *gasnett_getenv_withdefault(const char *keyname, const char *defaultval);
environment query for a string parameter
if user has set value the return value indicates their selection
if value is not set, the provided default value is returned
call is reported to the console in verbose-environment mode,
(only the first call with a given key is reported)
legal to call before gasnet_init, but may malfunction if
the conduit has not yet established the contents of the environment
int gasnett_getenv_yesno_withdefault(const char *keyname, int defaultval);
environment query for a yes/no parameter
if user has set value to 'Y|YES|y|yes|1' or 'N|n|NO|no|0',
the return value indicates their selection
if value is not set, the provided default value is returned
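The yes/no matching described above might look roughly like the following sketch. Both functions are hypothetical and use the standard getenv rather than the conduit-provided environment. The text above does not say what happens when the value is set to something unrecognized; this sketch assumes the default is returned in that case, which may differ from the real function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sketch: interpret val (NULL if unset) as a yes/no setting. */
int yesno_sketch(const char *val, int defaultval) {
  static const char *yes[] = { "Y", "YES", "y", "yes", "1" };
  static const char *no[]  = { "N", "n", "NO", "no", "0" };
  size_t i;
  if (val == NULL) return defaultval;            /* value is not set */
  for (i = 0; i < sizeof(yes)/sizeof(yes[0]); i++)
    if (strcmp(val, yes[i]) == 0) return 1;
  for (i = 0; i < sizeof(no)/sizeof(no[0]); i++)
    if (strcmp(val, no[i]) == 0) return 0;
  return defaultval;   /* unrecognized value: an assumption of this sketch */
}

/* Hypothetical wrapper that queries the process environment directly. */
int getenv_yesno_sketch(const char *keyname, int defaultval) {
  assert(keyname != NULL);
  return yesno_sketch(getenv(keyname), defaultval);
}
```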
int64_t gasnett_getenv_int_withdefault(const char *keyname, int64_t defaultval, uint64_t mem_size_multiplier);
environment query for an integral parameter
if mem_size_multiplier non-zero, expect a (possibly fractional) memory size with suffix (B|KB|MB|GB|TB)
and the default multiplier is mem_size_multiplier (eg 1024 for KB)
otherwise, expect a positive or negative integer in decimal or hex ("0x" prefix)
the return value indicates their selection
if value is not set, the provided default value is returned
double gasnett_getenv_dbl_withdefault(const char *keyname, double defaultval);
environment query for a floating-point parameter
if user has set value the return value indicates their selection
which must be a valid floating-point value or a fraction (e.g. "1.5", "-1e4", or "3/8")
if value is not set, the provided default value is returned
call is reported to the console in verbose-environment mode,
(only the first call with a given key is reported)
legal to call before gasnet_init, but may malfunction if
the conduit has not yet established the contents of the environment
int gasnett_verboseenv();
returns true iff GASNET_VERBOSEENV reporting is enabled on this node
note the answer may change during initialization
void gasnett_envint_display(const char *key, int64_t val, int is_dflt, int is_mem_size);
void gasnett_envstr_display(const char *key, const char *val, int is_dflt);